Optimize website Performance

Mike Peters, 07-16-2007
In a recent interview with John Reese, Jeremy Schoemaker of AuctionAds mentioned their site is currently handling 75 million requests per day, or close to 900 requests per second.
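The per-second figure follows directly from the daily total. A quick check in Python, assuming traffic is spread evenly over the day (real traffic peaks well above the average, which is likely why the number is rounded up to 900):

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400

requests_per_day = 75_000_000
average_rps = requests_per_day / SECONDS_PER_DAY

print(f"{average_rps:.0f} requests per second on average")  # 868
```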

While you are probably operating in a completely different traffic "category", continually striving to increase traffic to your site, it helps to understand the basic principles behind optimizing your website to effectively handle any unexpected traffic spikes.

The "Slashdot effect" and the "Digg effect" are two popular terms for a website crashing under load because it wasn't designed to handle spikes in traffic.

The most important principle you have to understand, if you want to optimize website performance for effectively handling spikes in traffic, is this:

READ is cheap
WRITE is expensive


READ is the operation of accessing a static piece of data that was already calculated/stored and is available for immediate use.

WRITE is updating a database record, writing to a database socket stream (even if it's just for fetching data) or anything that involves some form of computation before the output can be sent back to the user.
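To make the distinction concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not the setup described in this post: a plain dictionary stands in for a shared cache such as memCached, and `render_page` is a hypothetical placeholder for the database queries and templating a real page would need.

```python
# In-process dict standing in for a shared cache such as memcached (assumption).
page_cache = {}
render_count = {"calls": 0}

def render_page(slug):
    """Hypothetical expensive step: database queries plus template rendering."""
    render_count["calls"] += 1
    return f"<html><body>Post: {slug}</body></html>"

def get_page(slug):
    """Serve from the cache (READ); render only on a miss (WRITE)."""
    if slug not in page_cache:
        page_cache[slug] = render_page(slug)
    return page_cache[slug]

# Simulate 2,000 visitors requesting the same post.
for _ in range(2000):
    get_page("gadget-review")

print(render_count["calls"])  # 1
```

Only the first visitor pays the WRITE cost. The other 1,999 requests are READs of a value that is already computed.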

--

To use a real life example, one of our clients had a WordPress blog. They had some basic posts about gadgets & electronics, a MyBlogLog widget and a few other WordPress widgets like the popular posts one and the "buy me a beer" widget.

In the never ending quest for traffic, they tried Digg. The owner posted a niche article to Digg, linking back to the main site.

The article made it to the front page of Digg within a few hours. Two thousand hits later, the site stopped responding. Traffic lost.

Why?

Because they did not optimize website performance for READ vs WRITE operations. All users hitting the site were competing over the allowed 60 MySQL database connections and the server's limited CPU cycles.

Here's how we took over and optimized their site:

We set up two database clusters. One handles all READ requests (end users hitting the site to view pages), and all of its pages are served by memCached.

The WordPress administrator interface runs on a separate database cluster, and all WRITE operations are applied to that cluster. MySQL replication transmits the updates to the READ cluster.
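The split can be pictured as a routing decision made per request. The sketch below is only an assumption about how such routing might look: it uses the standard WordPress admin paths (`/wp-admin` and `/wp-login.php`), and the cluster labels `"read"` and `"write"` are hypothetical names, not taken from the client's setup.

```python
# Paths that belong to the WordPress administrator interface.
WRITE_PATH_PREFIXES = ("/wp-admin", "/wp-login.php")

def cluster_for(path):
    """Return which cluster (hypothetical labels) should serve a request path."""
    if path.startswith(WRITE_PATH_PREFIXES):
        return "write"  # admin traffic: may insert or update records
    return "read"       # visitor traffic: served from cache and replicas

for path in ["/", "/2007/07/gadget-review/", "/wp-admin/post.php"]:
    print(path, "->", cluster_for(path))
```

Visitors never touch the WRITE cluster, so a spike in page views cannot starve the administrator interface of database connections.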

Going back to AuctionAds, to effectively process all requests, I'm guessing all AuctionAds widgets are filtered through a dedicated READ cluster. Most if not all data is processed without requiring a single on-demand database query.

All stats and ad building probably take place in the background (offline processing), so that the amount of time each connection is open is kept to the bare minimum.
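As a rough illustration of offline processing (a guess at the general pattern, not AuctionAds' actual design), the request path below only drops an event on an in-memory queue, and a background worker does the counting. The names `record_hit` and `stats_worker` and the ad IDs are hypothetical.

```python
import queue
import threading
from collections import Counter

hits = queue.Queue()
stats = Counter()

def record_hit(ad_id):
    """Called while serving a request: enqueue and return immediately."""
    hits.put(ad_id)

def stats_worker():
    """Background worker: aggregates hits outside the request path."""
    while True:
        ad_id = hits.get()
        if ad_id is None:  # sentinel value: stop the worker
            hits.task_done()
            break
        stats[ad_id] += 1
        hits.task_done()

worker = threading.Thread(target=stats_worker, daemon=True)
worker.start()

for ad_id in ["ad-1", "ad-2", "ad-1"]:
    record_hit(ad_id)

hits.join()     # wait until every queued hit has been counted
hits.put(None)  # ask the worker to exit
worker.join()

print(dict(stats))
```

In production the queue would more likely be a log file or a message broker, and the worker would write aggregated totals to the database in batches.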

Here are a few more tips for the hardcore optimizer:

* For the READ cluster - Ditch Apache. Apache+PHP is a resource hog. Look into writing your own proprietary database cache layer.
* For the READ cluster - Forget about Perl/PHP. Stick to precompiled C.
* For the READ cluster - Use UltraDNS for basic load balancing.
* For the WRITE cluster - Minimize the number of queries and the connection time required to handle every request.
* For the WRITE cluster - Switch everything possible to offline processing.
* Use replication (MySQL's built-in or your own) to improve scalability and service reliability.

--

Please use the comments section to highlight anything else that can help optimize for 100 million pageviews per day.

KiLVaiDeN, 07-17-2007
Most sites also generate static HTML in order to avoid the cost of replicating a database.

In terms of response time, it's even faster to serve static HTML than to serve from a READ-only database cluster.

It's even quite simple to generate with some careful thought, and it might even be much better for referencing ( since the static content is supposed to be static and not changing ).
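For readers who want to see what static generation can look like, here is a minimal Python sketch. The template, the `publish` helper and the file layout are all hypothetical. The idea is that the page is rendered once at publish time, and the web server then serves the resulting file without touching a database.

```python
import html
import tempfile
from pathlib import Path
from string import Template

PAGE = Template(
    "<html><head><title>$title</title></head>"
    "<body><h1>$title</h1><p>$body</p></body></html>"
)

def publish(output_dir, slug, title, body):
    """Render a post once and write it to disk as a static HTML file."""
    page = PAGE.substitute(title=html.escape(title), body=html.escape(body))
    path = Path(output_dir) / f"{slug}.html"
    path.write_text(page, encoding="utf-8")
    return path

site_dir = tempfile.mkdtemp()  # stand-in for the web server's document root
written = publish(site_dir, "gadget-review", "Gadgets & Electronics", "First post.")
print(written.name)  # gadget-review.html
```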

Mike Peters, 07-17-2007
KiLVaiDeN - the cache layer in the READ cluster handles most requests from memory without ever accessing the database.

Pure static HTMLs are great but in many cases you cannot afford to serve stale content and need a mechanism that will continually update your pages.

Some basic examples - forums listing "total number of visitors online", blogs listing "most popular outgoing links" etc.

KiLVaiDeN, 07-17-2007
The cache layer will handle already processed requests though, but it's an acceptable READ charge.

I would rather work with static HTML and AJAX requests to handle the data that needs to be up to date. But I fully agree that this system won't work properly for data which changes often; I have more in mind those business applications with content which could be generated the day before.

Have you ever looked at the possibilities of a cache like JBoss TreeCache ? We are on our way to building a cluster framework for one of our applications which will be accessed by potentially millions of users, therefore we will invest in a clustered cache like JBoss TreeCache.

An advantage I see in the cache possibilities of Java over those of PHP is that with PHP ( tell me if I'm wrong ) your application has no way of keeping data in memory, while Java runs a JVM where you can store objects for later access.

Can you explain how the cache layer works, please ?

Adrian Singer, 07-17-2007
This is the first time I've heard Java mentioned at the same time with performance.

If you're looking for performance, stay away from Java.

To answer your question - if you use memCached, you can do in-memory caching regardless of the language you pick.

Memcached is still pretty much the king of cache even in the Java world. The fact that it's written in C is a good thing.

As for using static HTMLs with Ajax - if you have 1,000 users hitting a page with Ajax, the Ajax code is still going to query your server 1,000 times asking for data. If you get that data using database queries, you're back to square one and you have no caching.
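A small Python sketch of the point above, under stated assumptions: `TTLCache` is a hypothetical in-process stand-in for a shared cache such as memCached, and `count_visitors_online` is a placeholder for a real database query. With a short expiry in front of the query, 1,000 Ajax calls cost one database hit instead of 1,000.

```python
import time

class TTLCache:
    """Tiny in-process cache whose entries expire after ttl_seconds."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self.ttl = ttl_seconds
        self.clock = clock
        self._store = {}

    def get_or_compute(self, key, compute):
        now = self.clock()
        entry = self._store.get(key)
        if entry is not None and entry[0] > now:
            return entry[1]  # still fresh: no database work
        value = compute()    # expired or missing: recompute once
        self._store[key] = (now + self.ttl, value)
        return value

db_queries = {"count": 0}

def count_visitors_online():
    """Hypothetical placeholder for a database query."""
    db_queries["count"] += 1
    return 42

cache = TTLCache(ttl_seconds=5)

# Simulate 1,000 Ajax requests arriving within the expiry window.
for _ in range(1000):
    cache.get_or_compute("visitors_online", count_visitors_online)

print(db_queries["count"])
```

The trade-off is freshness: the counter can be up to five seconds stale, which is usually acceptable for figures like "visitors online".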

Tekrat, 07-17-2007
I love to use either flat files or server-side XML files when I need speed. XML is nice for sorting and structuring data, but when you hit that 100-300 parent node range XML starts dropping in performance fast.

A nifty trick for storing a lot of data in a flat file is using your language's version of URL encode/decode. By encoding the data you can easily parse out the data you need. This example is all the data for a single record on one line. A pipe (|) and an equal sign are used as the delimiters:

a=2|b=852|url=http%3A%2F%2Fwww%2Emyplace%2Ecom
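In Python, the same record format can be read and written with `urllib.parse`. This is a sketch of the trick described above, and `encode_record` and `decode_record` are hypothetical helper names. Note that Python's `quote` leaves `.` unencoded, so its output for the URL differs slightly from the line above, but both forms decode to the same value.

```python
from urllib.parse import quote, unquote

def encode_record(fields):
    """Join key=value pairs with '|', percent-encoding both delimiters."""
    return "|".join(
        f"{quote(str(key), safe='')}={quote(str(value), safe='')}"
        for key, value in fields.items()
    )

def decode_record(line):
    """Split one flat-file line back into a dict of decoded strings."""
    record = {}
    for pair in line.rstrip("\n").split("|"):
        key, _, value = pair.partition("=")
        record[unquote(key)] = unquote(value)
    return record

record = decode_record("a=2|b=852|url=http%3A%2F%2Fwww%2Emyplace%2Ecom")
print(record["url"])  # http://www.myplace.com
```

Because `|` and `=` inside a value are themselves percent-encoded, splitting on the raw delimiters is always safe.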

Tekrat, 07-17-2007
Something I forgot in my earlier post is that there are solid alternatives to C/C++:
- VB 6 was great on Windows systems, but of course it has run out its life cycle and was/is a memory hog.
- If you're a PHP purist then you could compile your code to an executable with Roadsend (www.roadsend.com). They have a compiler for just about any system.
- If you know Pascal you can use BloodShed's devPascal for your binaries (www.bloodshed.net).
- If you just want to stick with Basic there are things like FreeBasic (www.freebasic.net), Emergence Basic (www.ionicwind.com), and PowerBasic (www.powerbasic.com). The first two can compile on most OSes. The latter is a solid Windows-only language.

KiLVaiDeN, 07-17-2007
Answer to Adrian Singer :

"This is the first time I've heard Java mentioned at the same time with performance. If you're looking for performance, stay away from Java."

Java performance was bad in earlier versions of the JVM, but now it's quite fast and performs pretty well, even against natively compiled programs; this is because the current JVM is very well optimized in its JIT compilation process. I wouldn't say Java is slow; it's FAST. It's in fact the fastest ( and by far ) server-side language, if you exclude CGI ( who would code a CGI webapp nowadays anyway ? ), and you can be convinced of it with this language benchmark : http://shootout.alioth.debian.org/gp4/

Java is not top ranked, and is often outclassed by its C or C++ equivalents, but it still performs quite well in my opinion. Anyway, even if this proves that the statement that Java is slow is totally wrong ( check PHP rankings there.. ), performance is not the only parameter to take into account when coding an application. Maintainability and ease of use are others, and Java is by far more convenient than older languages, both because it's a well-conceived language and because it has a HUGE library of well thought-out software ready to be used.

"To answer your question - if you use memCached, you can do in-memory caching regardless of the language you pick. Memcached is still pretty much the king of cache even in the Java world. The fact that it's written in C is a good thing."

I didn't know about Memcached before reading this article.. I'll have to take a look into it :)

"As for using static HTMLs with Ajax - if you have 1,000 users hitting a page with Ajax, the Ajax code is still going to query your server 1,000 times asking for data. If you get that data using database queries, you're back to square one and you have no caching."

In a page with static content and, let's say, 10 "dynamic" elements, I would say that you could get them all with a single AJAX request. Those elements can be cached too. It's much faster to get 10 tiny elements from a cache, and have everything else displayed from static HTML, than to gather everything from the cache. As I stated though, I am totally aware that the static HTML solution isn't compatible with every kind of webapp you could build. One easy example that defeats it is a forum. No way you could do a forum with static HTML !

Chris Tata, 07-17-2007
KiLVaiDeN - good pointers!

Here's a guide I wrote about How to install memCached

Robert, 07-17-2007
In taking over your client's site, did you think about doing something simple first, like using the WP-Cache plugin? Surely the HTTP server could handle the 20k pageviews on a simple HTML file couldn't it?

Using things like clusters and memory caches is fine, but for something like a blog that doesn't constantly get that kind of traffic, wouldn't a simple solution have worked just fine?

Mike Peters, 07-18-2007
Excellent question Robert.

WP-Cache might be the right solution for a basic blog that gets digs from time to time. Our client's blog was doing some price comparisons and heavy lifting on each page, in an attempt to present the most up to date affiliate offers to every visitor.

I recall we looked into WP-Cache in the past for another client, but found too many reported problems to a point where we didn't feel comfortable implementing it.

The platform we created here for this client, albeit not the simplest to comprehend, will serve their needs for years to come.

Alessandra Grieco, 07-18-2007
For anyone using PHP, eAccelerator is a great PHP optimizer that pre-compiles PHP pages so that they can be instantly fetched similar to static files.

Here's a guide about How to install eAccelerator

Geeks are Sexy, 08-18-2008
Mike Peters: Just to let you know, since we've upgraded to WP Super Cache, the problem has stopped occurring completely. (You linked to the WP-Cache problem post on our site)

Cheers.

K. [GAS]

medicine of herbal, 09-23-2011
ok thx.....
but iam still dizzy about it...
completely got lost...ahahaha