Full-service Internet Marketing & Web Development
|
redis: a persistent key-value store
Mike Peters, November 5, 2010 -- Posted under Programming
redis is a key-value store, similar to memcached but with data persistence.
redis supports multiple value types, counters with atomic increment/decrement and built-in key expiration.
To achieve persistence without sacrificing speed, redis, like Cassandra, performs updates in memory and also adds them to an append-only file, which is synced to disk from time to time.
redis is fast (110,000 writes per second, 81,000 reads per second) and supports sharding and master-slave replication (no master-master yet).
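To make the append-only idea concrete, here is a minimal sketch in Python. It is a toy model of the technique, not how redis itself is implemented: the class name `AppendOnlyStore`, the JSON log format and the `sync_every` knob are all assumptions made up for illustration. Updates go to an in-memory dict first and are appended to a log file, which is only fsynced every so often. On startup the log is replayed to rebuild the state.

```python
import json
import os


class AppendOnlyStore:
    """Toy key-value store: memory first, append-only log second (hypothetical)."""

    def __init__(self, path, sync_every=100):
        self.path = path
        self.sync_every = sync_every  # assumed knob: fsync after this many writes
        self.data = {}
        self._unsynced = 0
        self._replay()
        self._log = open(path, "a", encoding="utf-8")

    def _replay(self):
        """Rebuild the in-memory state by replaying the log from the start."""
        if not os.path.exists(self.path):
            return
        with open(self.path, encoding="utf-8") as log:
            for line in log:
                op, key, value = json.loads(line)
                if op == "set":
                    self.data[key] = value
                elif op == "del":
                    self.data.pop(key, None)

    def _append(self, record):
        """Append one record to the log and sync to disk from time to time."""
        self._log.write(json.dumps(record) + "\n")
        self._unsynced += 1
        if self._unsynced >= self.sync_every:
            self.sync()

    def set(self, key, value):
        self.data[key] = value            # update memory first
        self._append(["set", key, value])

    def delete(self, key):
        self.data.pop(key, None)
        self._append(["del", key, None])

    def get(self, key, default=None):
        return self.data.get(key, default)  # reads never touch the disk

    def sync(self):
        """Force buffered log entries onto the disk."""
        self._log.flush()
        os.fsync(self._log.fileno())
        self._unsynced = 0

    def close(self):
        self.sync()
        self._log.close()
```

The trade-off is visible in `sync_every`: a larger value means faster writes, but more recent updates can be lost if the machine dies before the next sync.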
Why redis?
Those of you keeping track know we've always been big fans of MySQL, but at the same time we keep writing about migrating different parts of our application to Memcached, Cassandra, Lucene and ElasticSearch.
Why do we keep jumping from one storage engine to another? Can't we make up our minds already and settle on the "best" storage engine that meets our needs?
In short, no.
A common misconception is the belief that all storage engines are created equal, all designed to simply "store stuff" and provide fast access to your data. Unless your application performs one clearly defined simple task, it is a dire mistake to expect a single storage engine will effectively fulfill all of your data warehousing and processing needs.
* MySQL is great when you need ad-hoc queries and you're dealing with a relatively small data set.
* Memcached comes into play when you have a read-heavy environment and need a quick volatile cache to avoid querying MySQL a dozen times per page.
* Lucene and ElasticSearch are your friends when you need fulltext search, or when your MySQL data set grows to a point where running the filters and joins in MySQL becomes slow like a snail.
* Cassandra is amazing when you have a write-heavy environment, need to be capable of scaling writes linearly and supporting a huge data set.
* redis works particularly well as a state machine, when you need counters with atomic increment/decrement. Typical uses: "how many users are on my website?" à la ChartBeat, "how many jobs are waiting to be processed?", etc.
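The counter use case from the last bullet can be sketched in a few lines of Python. This is a toy in-memory model of the semantics (atomic increment/decrement plus key expiration), not a redis client. The class `CounterStore` and its method names are hypothetical, loosely modeled on the idea of increment, decrement and expire operations.

```python
import threading
import time


class CounterStore:
    """Toy model of atomic counters with key expiration (hypothetical)."""

    def __init__(self, clock=time.monotonic):
        self._lock = threading.Lock()
        self._values = {}
        self._deadlines = {}
        self._clock = clock  # injectable so expiry can be tested without sleeping

    def _purge_if_expired(self, key):
        deadline = self._deadlines.get(key)
        if deadline is not None and self._clock() >= deadline:
            self._values.pop(key, None)
            self._deadlines.pop(key, None)

    def incr(self, key, amount=1):
        """Read-modify-write under one lock, so concurrent callers never lose updates."""
        with self._lock:
            self._purge_if_expired(key)
            self._values[key] = self._values.get(key, 0) + amount
            return self._values[key]

    def decr(self, key, amount=1):
        return self.incr(key, -amount)

    def expire(self, key, seconds):
        """Make the key disappear after `seconds`; False if the key does not exist."""
        with self._lock:
            self._purge_if_expired(key)
            if key not in self._values:
                return False
            self._deadlines[key] = self._clock() + seconds
            return True

    def get(self, key):
        with self._lock:
            self._purge_if_expired(key)
            return self._values.get(key)
```

A "users online" counter would call `incr` on each new session and `decr` when it ends. A "jobs waiting" counter would do the same around enqueue and dequeue.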
Architecture
redis is written in ANSI C and runs as a single lightweight daemon on your machine. All updates are done in memory first and saved to disk later, asynchronously.
Supported languages: C, C#, Erlang, Java, JavaScript, Lua, Perl, PHP, Python, Ruby, Scala, Go, and Tcl.
As of 15 March 2010, development of redis is funded by VMware.
Installing redis
Step 1
Download the redis tarball and extract it
wget "http://redis.googlecode.com/files/redis-2.0.3.tar.gz"
tar xvzf redis-2.0.3.tar.gz
Step 2
Compile redis and install it
cd redis-2.0.3
gmake
gmake install
Step 3
Run redis
/usr/local/bin/redis-server
Once running you can use redis-benchmark to run some benchmarks.
Started this way, with no config file argument, redis uses its default settings. You're going to want to study the config options and set them up.
Sample redis config file here
Configuration options
A few important redis.conf options you're going to want to set:
* First, if you will only be connecting to a local Redis instance, uncomment the bind configuration in the sample file:
bind 127.0.0.1
That tells Redis not to listen for external connections.
* redis supports multiple databases, but for most use cases, you're only going to need one. Change the default 16 to 1:
databases 1
* You can set the maximum number of bytes Redis can allocate, after which it will start purging volatile keys. If it cannot reclaim any more memory it will start refusing write commands. Here's a sample setting for a 100MB limit:
maxmemory 104857600
* The server will periodically fork and asynchronously dump the current contents of the database to disk. The dump is actually made to a temporary file and then moved to replace any older dump, so the operation is atomic and won't leave you with a partially dumped database. If Redis is eventually shut down and reloaded, it will restore from this dump file.
How often it dumps the keys is configurable by the amount of time that passes and the number of changes that have been made to the data. For example, the following settings tell Redis to dump the database after 60 seconds if at least 100 changes have been made, or after five minutes if there has been at least 1 change:
save 300 1
save 60 100
* By default redis starts in foreground mode. To run it in the background instead, change the daemonize option in the redis.conf file to "yes":
daemonize yes
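The snapshot behavior described in the save settings above boils down to two small ideas: a rule check ("enough time and enough changes?") and a write-to-temp-then-rename dump. Here is a rough Python sketch of both. It illustrates the logic only and is not redis code. The function names, the JSON dump format and the `SAVE_RULES` list are assumptions for the example.

```python
import json
import os
import tempfile

# Mirrors the "save 300 1" and "save 60 100" lines: (seconds, minimum changes)
SAVE_RULES = [(300, 1), (60, 100)]


def should_snapshot(seconds_since_save, changes_since_save, rules=SAVE_RULES):
    """True if any rule has both its time and its change threshold met."""
    return any(
        seconds_since_save >= seconds and changes_since_save >= changes
        for seconds, changes in rules
    )


def atomic_dump(data, path):
    """Write the dump to a temporary file, then move it over the old dump.

    The rename is atomic, so a reader sees either the old dump or the new
    one, never a partially written file.
    """
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix="temp-dump-")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as tmp:
            json.dump(data, tmp)
            tmp.flush()
            os.fsync(tmp.fileno())
        os.replace(tmp_path, path)
    except BaseException:
        # Clean up the temporary file if anything went wrong before the rename
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise
```

With these rules, 100 changes trigger a dump once a minute has passed, while a single change waits for the five-minute mark.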
To Redis or not to Redis?
If you have a large data set that cannot comfortably fit into RAM, Redis is not the key value store for you to use, but if you have smaller sets, and if you can live with the asynchronous write behavior, then, for me, the answer is definitely "to Redis."
As an alternative, Tokyo Cabinet is very fast for a synchronous key value store, and it does support some features that Redis does not, such as tables. Redis permits a master/slave setup, which can alleviate fears of data loss from failure, but it's not as certain as something like Tokyo Cabinet, which will write the data as soon as it gets it. On the other hand, Redis is blazingly fast, incredibly easy to use, and will support just about anything you can think of doing with your data.
More resources:
* Try redis in your browser
* Download redis
* Retwis - a PHP twitter clone using redis
|
Using Nginx as a Reverse Proxy to IIS
Adrian Singer, November 4, 2010 -- Posted under Traffic
We were recently approached by a client who's using a legacy Content Management system running on Microsoft IIS that is becoming painfully slow, hurting their business.
The system was not keeping up with their traffic.
Typically, in a situation like this, we would recommend re-architecting the application, piece by piece, replacing IIS with LAMP and optimizing database access.
In this case, the client was low on budget and didn't want to make too many changes. They were looking for a quick fix.
Following careful review of their .asp application, it became clear we were dealing with a chaotic, buggy system and that we would have to cut deep if we wanted to optimize the existing code.
So we decided to go with a different approach.
Keep everything as is and use Nginx to reverse-proxy all incoming requests.
What is a Reverse Proxy?
A Reverse Proxy is a web server that handles all incoming requests from end-users, caching, load balancing and communicating with your back end primary servers as necessary.
IIS is slow. Nginx is super fast.
If we can't rewrite the code, let's have Nginx handle all traffic, connect to IIS internally and then cache the response from IIS, so that future requests can be fulfilled without ever hitting IIS.
The idea is to replace 1 million users downloading an image from IIS with those same users downloading it from Nginx directly. Nginx is faster, lightweight and scales easily.
Why Nginx?
Whenever we set up reverse proxies, one of our favorite options is Squid.
Squid has been around for a long time, is very easy to set up and provides a good reverse-proxy caching solution.
In this case however, incoming requests required further logic before a request could be routed to IIS. Nginx is just as fast and offers greater flexibility by letting us use PHP.
Setting up
We provisioned a new dedicated server for the client and installed Nginx with PHP-FPM.
We analyzed all possible requests the IIS system was handling. They were all HTTP GET requests, with varying parameters. IIS handled several vhosts, so we had to properly handle http://DomainA.com/dosomething?a=b and http://DomainB.com/dosomething?a=b
We configured Nginx to rewrite all requests for files that did not exist, to go to a notfound.php script:
location /
{
    if (-d $request_filename)
    {
        break;
    }
    if (!-f $request_filename)
    {
        rewrite ^(.*)$ /notfound.php?$1 last;
        break;
    }
}
In notfound.php, we would connect to IIS to retrieve the image / static page / dynamic content, then save it locally.
The IIS system served different content based on the user's ip address and origin, so we had to take that into account when saving file names. (/us/google/welcome.gif vs /canada/yahoo/welcome.gif)
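The original notfound.php is not shown here, so what follows is only a sketch of the caching logic just described, written in Python rather than PHP. Every name in it (`cache_path`, `fetch_cached`, the `fetch_from_backend` callback, the directory layout) is an assumption for illustration. It assumes host, country and origin have already been validated as short labels. The saved file name includes the vhost, the visitor's country and origin, the path and a hash of the query string, so different variants never overwrite each other.

```python
import hashlib
import os
from urllib.parse import urlsplit


def cache_path(cache_root, host, country, origin, request_uri):
    """Map a request to a local file, e.g. <root>/domaina.com/us/google/welcome.gif."""
    parts = urlsplit(request_uri)
    # Drop empty, "." and ".." segments so the path cannot escape cache_root
    segments = [s for s in parts.path.split("/") if s not in ("", ".", "..")]
    if not segments:
        segments = ["index"]
    if parts.query:
        # Different parameters must map to different files
        digest = hashlib.sha1(parts.query.encode("utf-8")).hexdigest()[:16]
        segments[-1] += "__" + digest
    return os.path.join(cache_root, host.lower(), country.lower(),
                        origin.lower(), *segments)


def fetch_cached(cache_root, host, country, origin, request_uri, fetch_from_backend):
    """Serve from the local copy if present, otherwise ask the backend once and save."""
    path = cache_path(cache_root, host, country, origin, request_uri)
    if os.path.isfile(path):
        with open(path, "rb") as cached:
            return cached.read()
    body = fetch_from_backend(host, request_uri)  # hypothetical call to IIS
    os.makedirs(os.path.dirname(path), exist_ok=True)
    with open(path, "wb") as out:
        out.write(body)
    return body
```

Once a file has been saved, the web server can serve it directly on the next request, since the "file does not exist" rewrite no longer fires for that URL.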
Going live
After testing everything locally, we had the client update their DNS, sending all traffic to Nginx instead of IIS.
The impact on performance has been very noticeable.
IIS CPU utilization went down from 70% to below 5% at all times and Nginx was barely breaking a sweat, handling the majority of the requests locally, reverting to IIS only when presented with a new combination of parameters that was never seen before.
We later developed a simple way to "expire" content on Nginx so that whenever the client updated the IIS Content Management System, the changes would propagate properly.
There is one aspect of this solution that is still lacking and worth mentioning. In the event of a sudden burst of new requests with never-seen-before parameters, the current implementation will revert all requests to IIS until files are created locally. A better approach would be to queue requests for new content, avoiding hitting IIS more than once when there's a sudden burst of new requests.
Implementing a RabbitMQ/Cassandra queue for new requests would be the next step here, so we can avoid an initial slowdown when hit with a burst of new requests.
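A full queue is one way to get there. A lighter-weight alternative, and a different technique from the RabbitMQ/Cassandra queue mentioned above, is to coalesce identical in-flight requests inside the proxy process, sometimes called "single flight". The Python sketch below shows the idea under that assumption. The class and method names are hypothetical, and it only helps within a single process.

```python
import threading


class _Call:
    """Bookkeeping for one in-flight backend request."""

    def __init__(self):
        self.done = threading.Event()
        self.result = None
        self.error = None


class SingleFlight:
    """Coalesce concurrent requests for the same key into one backend call."""

    def __init__(self):
        self._lock = threading.Lock()
        self._in_flight = {}  # key -> _Call

    def do(self, key, fetch):
        with self._lock:
            call = self._in_flight.get(key)
            leader = call is None
            if leader:
                call = _Call()
                self._in_flight[key] = call
        if leader:
            # The first caller performs the real fetch on behalf of everyone
            try:
                call.result = fetch()
            except Exception as exc:
                call.error = exc
            finally:
                with self._lock:
                    del self._in_flight[key]
                call.done.set()
        else:
            # Later callers just wait for the leader's result
            call.done.wait()
        if call.error is not None:
            raise call.error
        return call.result
```

With this in front of the backend fetch, a burst of requests for the same never-seen-before URL results in one request to IIS, and the rest wait for that response.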
In Conclusion
SPI engineers came up with a quick fix that didn't involve any changes to the original application and made a huge impact on throughput and the number of concurrent connections the service can handle.
If you're dealing with massive traffic and you're not using Nginx yet, you owe it to yourself to take it for a spin.
|
How to hide .php extension in your urls with Nginx
Adrian Singer, November 4, 2010 -- Posted under Programming
Looking for clean urls (/hello instead of /hello.php)?
Here's how to set it up:
Step 1
Create a notfound.php script and place it in your root web server folder
<?php
// Set these for easier access (does not rely on register_globals)
$REQUEST_URI = $_SERVER['REQUEST_URI'];
$DOCUMENT_ROOT = $_SERVER['DOCUMENT_ROOT'];
$url = substr($REQUEST_URI, 1);
// Strip parameters
$url_parameters = "";
if (($pos = strpos($url, "?")) !== false)
{
    $url_parameters = substr($url, $pos+1);
    $url = substr($url, 0, $pos);
}
$url = trim(strtolower($url));
// Any ".." in the url is a hack attempt, so strip it wherever it appears
$url = str_replace("..", "", $url);
// Strip prefix and suffix '/'
$url = trim($url, "/");
// If we have a php script with this name
if ($url != "" && file_exists($DOCUMENT_ROOT."/".$url.".php"))
{
    // Set PHP_SELF and REQUEST_URI to point to the real script
    $_SERVER['PHP_SELF'] = $PHP_SELF = $_SERVER['REQUEST_URI'] = $REQUEST_URI = "/".$url;
    if (!empty($url_parameters)) $_SERVER['REQUEST_URI'] = $REQUEST_URI .= "?".$url_parameters;
    // Load real php script
    require($DOCUMENT_ROOT."/".$url.".php");
    return;
}
Step 2
Update your Nginx nginx.conf file, rewriting all urls where the file is not found, to notfound.php
location /
{
    if (-d $request_filename)
    {
        break;
    }
    if (!-f $request_filename)
    {
        rewrite ^(.*)$ /notfound.php?$1 last;
        break;
    }
}
Note: This is different from doing an error_page 404 redirect. With a 404 redirect, HTTP POST data is not preserved.
|
iWeb down for 4 hours, 3000 servers affected
Adrian Singer, November 4, 2010 -- Posted under Get Online
Earlier tonight, at 11:00PM EDT, iWeb, a large data center headquartered in Canada, experienced a power outage due to a nearby fire, taking a third of its data center completely down.
As of this writing, 372 servers are still affected and iWeb is working frantically to bring machines back up.

-
No matter which hosting provider you use, Murphy's law will have every single one of them go down on you, when you least expect it.
Five 9's reliability is no longer enough.
Customers demand 100% uptime and unless you're Twitter or Facebook, customers will go elsewhere when your service availability begins to deteriorate.
If 100% uptime is important to you, contact us. We can help.
|
Three different ways of handling a problem
Mike Peters, October 25, 2010 -- Posted under Programming
Presented with a new problem, I noticed engineers can be divided into three groups:
1. I don't know, don't understand, not familiar with this part of the code
Unless you're in your first few weeks with the company, this statement is totally inexcusable. It's lazy and pathetic.
Be a Problem solver.
Waving your hands in the air and announcing to the world that you're not familiar with some code does nothing more than exhibit your incompetence in picking up something new and running with it.
How comfortable will your manager, peers and clients feel, once you've made such a statement?
Be a Problem solver. Study the code, follow the execution process flow, find engineers who are more familiar than you and ask them specific questions.
Don't stop probing and asking until you've mastered the code.
Unless you fully understand the code, problem at hand and the big picture implications, you're better off not touching it at all.
2. Quick and dirty: Let me patch it up real nice. In & out as quickly as possible.
Time is of the essence, right?
How about doing the absolute minimum, so the problem can be patched up and you can move on with more important pressing items on your todo list?
Wrong.
I'm a big proponent of code clarity and elegance.
Taking shortcuts, especially when working in an agile development environment, will come back to bite you in the butt.
Why I hate patches:
* Patches clutter the code, which means the next guy will have to struggle twice as hard to understand what's going on
* Patches are often specific to dealing with one edge case, one facet of the problem. Slightly different variation of the same problem and you're back to square one.
* Patching up code is often a lazy act done by someone who didn't want to take the time to understand the big picture. Which means, chances are the patch can break other perfectly valid scenarios.
Take the time to do things right, the first time around.
The difference between patching up a problem and doing it right? That brings me to the third type of engineers, those who still have a job.
3. The right way: Complete, Elegant and Short
How do you know when you've fully mastered a problem and come up with the best possible solution?
Think Occam's Razor
* When your solution is clean, simple and easy to understand;
* When you could not make it any better even if you had unlimited time on your hands;
* When your solution doesn't just handle a single edge case of the problem, but rather eliminates the problem completely;
* When you feel your code should go up on the code hall of fame;
...that's when you're a real code ninja.
Be careful though. It's very easy to take this the way of over-engineering things.
If you're adding complexity instead of taking things away and simplifying, you're doing it all wrong.
This doesn't mean you should re-engineer an entire architecture from scratch, only to make things cleaner.
It's okay to take baby steps, just make sure your contribution to the code is the single most Elegant, Complete and Short solution.
-
A simple example
MySQL database server used to store timestamps in this format:
YYYYMMDDHHIISS
For example, 7am October 25th, 2010, would be represented as: 20101025070000
Starting with MySQL 5.1, the way timestamps are represented in the database changed to: YYYY-MM-DD HH:II:SS
The same date, is now represented as: 2010-10-25 07:00:00
When no date is set, MySQL 5.1 used 0000-00-00 00:00:00 whereas previous versions of the MySQL database used 00000000000000.
A few sections of our code had to test whether or not a timestamp was set.
The original php code looked something like this:
// If no timestamp
if ($timestamp=="00000000000000")
{
// Do something
}
With the transition to MySQL 5.1, a new problem surfaced, where new records used "0000-00-00 00:00:00" to indicate no date was set, while old denormalized data still used "00000000000000".
Three different engineers approached this simple problem of supporting both the old and new timestamp formats.
I've included an excerpt of what each had to say below.
Engineer 1:
"I'm not too familiar with MySQL 5.1, but I don't think we can support the old timestamps any more."
Engineer 2:
"Fixed! And it only took a minute to do. I'm now testing for both cases:"
// If no timestamp
if ($timestamp=="00000000000000" || $timestamp=="0000-00-00 00:00:00")
Engineer 3:
"To keep the code clean, handle other cases (where we have 0000-00-00 with no hour/minute/second) and make it easy to adjust in the future, I created a new function IsEmptyTimestamp() and updated the code accordingly:"
// If no timestamp
if (IsEmptyTimestamp($timestamp))
* First guy is out.
* Second guy cluttered the code, didn't properly handle other edge cases and didn't think about the future.
* Third one passed with flying colors.
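The body of IsEmptyTimestamp() isn't included in the excerpt, so here is one guess at what such a helper might look like, sketched in Python rather than PHP. The approach (strip everything that isn't a digit, then check that only zeros remain) is an assumption, not the third engineer's actual code.

```python
import re


def is_empty_timestamp(timestamp):
    """True if the timestamp is unset in any of the old or new formats.

    Handles "00000000000000", "0000-00-00 00:00:00", "0000-00-00",
    an empty string and None. Anything without digits counts as unset.
    """
    if timestamp is None:
        return True
    digits = re.sub(r"\D", "", str(timestamp))  # keep digits only
    return digits.strip("0") == ""
```

Because the check lives in one place, supporting yet another "empty" format later means changing one function instead of every if statement.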
Closing Thoughts
When I originally wrote this piece, I was thinking of software development.
I realize now, after reading this again, that the same traits described here apply to any area in business.
Be a true Problem solver.
It's the single most important skill you'll ever develop.
|
How to mount /proc on FreeBSD
Michel Nadeau, September 27, 2010 -- Posted under Programming
There are a few commands under FreeBSD that depend on procfs (process file system).
FreeBSD doesn't mount it by default.
This tutorial describes how to mount /proc on FreeBSD and how to get FreeBSD to do it automatically when rebooting.
1. Mounting /proc
To mount /proc, run the following command:
mount -t procfs proc /proc
Applications and commands depending on procfs will now work correctly.
2. Mount /proc automatically when rebooting
To get /proc to be mounted automatically when rebooting, simply add this line in /etc/fstab:
proc /proc procfs rw 0 0
There you go, /proc will now be mounted automatically at boot time.
