redis: a persistent key-value store
Mike Peters, 11-05-2010
redis is a key-value store, similar to memcached but with data persistence.
redis supports multiple value types, counters with atomic increment/decrement and built-in key expiration.
To achieve persistence without sacrificing speed, redis, like Cassandra, performs updates in memory and also appends them to an append-only file, which is synced to disk from time to time.
redis is fast (110,000 writes per second, 81,000 reads per second) and supports sharding and master-slave replication (no master-master yet).
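To make the "update memory, append to a log, sync now and then" idea concrete, here is a minimal Python sketch. It is a toy model and not how redis is implemented: the TinyStore class, its JSON log format and the file name are all made up for illustration.

```python
import json
import os
import tempfile


class TinyStore:
    """Toy key-value store: updates hit memory first and are also appended to a log."""

    def __init__(self, log_path):
        self.log_path = log_path
        self.data = {}
        self._replay()
        self._log = open(log_path, "a", encoding="utf-8")

    def _replay(self):
        # Rebuild the in-memory state by re-applying every logged update in order.
        if not os.path.exists(self.log_path):
            return
        with open(self.log_path, encoding="utf-8") as f:
            for line in f:
                op, key, value = json.loads(line)
                if op == "set":
                    self.data[key] = value
                elif op == "del":
                    self.data.pop(key, None)

    def _append(self, op, key, value=None):
        # Append only: existing log entries are never rewritten.
        self._log.write(json.dumps([op, key, value]) + "\n")

    def set(self, key, value):
        self.data[key] = value
        self._append("set", key, value)

    def delete(self, key):
        self.data.pop(key, None)
        self._append("del", key)

    def sync(self):
        # Meant to be called "from time to time": flush buffers and force the log to disk.
        self._log.flush()
        os.fsync(self._log.fileno())

    def close(self):
        self.sync()
        self._log.close()
```

Writes only touch a dict and a buffered file, which is why they are cheap. Anything written since the last sync() can be lost in a crash, which is the trade-off discussed at the end of this post.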
Why redis?
Those of you keeping track know we've always been big fans of MySQL, but at the same time we keep writing about migrating different parts of our application to Memcached, Cassandra, Lucene and ElasticSearch.
Why do we keep jumping from one storage engine to another? Can't we make up our minds already and settle on the "best" storage engine that meets our needs?
In short, no.
A common misconception is the belief that all storage engines are created equal, all designed to simply "store stuff" and provide fast access to your data. Unless your application performs one clearly defined simple task, it is a dire mistake to expect that a single storage engine will effectively fulfill all of your data warehousing and processing needs.
* MySQL is great when you need ad-hoc queries and you're dealing with a relatively small data set.
* Memcached comes into play when you have a read-heavy environment and need a quick volatile cache to avoid querying MySQL a dozen times per page.
* Lucene and ElasticSearch are your friends when you need fulltext search, or when your MySQL data set grows to a point where running the filters and joins in MySQL becomes slow like a snail.
* Cassandra is amazing when you have a write-heavy environment, need to be capable of scaling writes linearly and supporting a huge data set.
* redis works particularly well as a state machine, when you need counters with atomic increment/decrement. Typical uses: "how many users are on my website?" à la ChartBeat, "how many jobs are waiting to be processed?", etc.
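To see why atomic increment/decrement matters for the counter use case in the last bullet, here is a small in-process Python model. It is not redis client code: AtomicCounters and simulate_visits are hypothetical names, and a lock stands in for the guarantee that redis gives you on the server side.

```python
import threading


class AtomicCounters:
    """In-process model of named counters with atomic increment/decrement."""

    def __init__(self):
        self._values = {}
        self._lock = threading.Lock()

    def incr(self, key, amount=1):
        # The read-modify-write happens under one lock, so no update is lost.
        with self._lock:
            self._values[key] = self._values.get(key, 0) + amount
            return self._values[key]

    def decr(self, key, amount=1):
        return self.incr(key, -amount)

    def get(self, key):
        with self._lock:
            return self._values.get(key, 0)


def simulate_visits(counters, key, workers=8, visits_each=1000):
    """Have several threads bump the same counter concurrently."""

    def work():
        for _ in range(visits_each):
            counters.incr(key)

    threads = [threading.Thread(target=work) for _ in range(workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return counters.get(key)
```

With eight workers doing 1000 increments each, the counter always ends at exactly 8000. Doing the same with a separate "read the value, add one, write it back" from each client is where counts start to drift.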
Architecture
redis is written in ANSI C and runs as a single lightweight daemon on your machine. All updates are done in memory first and saved to disk later, asynchronously.
Client libraries are available for C, C#, Erlang, Java, JavaScript, Lua, Perl, PHP, Python, Ruby, Scala, Go, and Tcl.
As of 15 March 2010, development of redis is funded by VMware.
Installing redis
Step 1
Download the redis tarball and extract it
wget "http://redis.googlecode.com/files/redis-2.0.3.tar.gz"
tar xvzf redis-2.0.3.tar.gz
Step 2
Change into the extracted directory, then compile redis and install it
cd redis-2.0.3
gmake
gmake install
Step 3
Run redis
/usr/local/bin/redis-server
Once it is running, you can use redis-benchmark to run some benchmarks.
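If you just want to confirm the server from Step 3 is answering before you benchmark it, a quick check along these lines should do. It is a rough sketch that assumes redis is listening on its default port 6379 on localhost; the ping and is_pong helpers are illustrative names, not part of any redis client library.

```python
import socket


def is_pong(reply: bytes) -> bool:
    """A healthy server answers PING with the status reply +PONG."""
    return reply.startswith(b"+PONG")


def ping(host="127.0.0.1", port=6379, timeout=2.0) -> bool:
    """Send an inline PING and report whether the server replied +PONG."""
    try:
        with socket.create_connection((host, port), timeout=timeout) as conn:
            conn.sendall(b"PING\r\n")
            return is_pong(conn.recv(64))
    except OSError:
        # Connection refused, timeout, unreachable host and so on.
        return False
```

Calling ping() returns True when the server replies and False when nothing is listening on that port.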
Started this way, with no config file argument, redis runs on its built-in default settings. But you're going to want to study the config options and set them up.
Sample redis config file here
Configuration options
A few important redis.conf options you're going to want to set:
* First, if you will only be connecting to a local Redis instance, uncomment the bind configuration in the sample file:
bind 127.0.0.1
That tells Redis not to listen for external connections.
* redis supports multiple databases, but for most use cases, you're only going to need one. Change the default 16 to 1:
databases 1
* You can set the maximum number of bytes Redis can allocate, after which it will start purging volatile keys. If it cannot reclaim any more memory it will start refusing write commands. Here's a sample setting for a 100MB limit:
maxmemory 104857600
* The server will periodically fork and asynchronously dump the current contents of the database to disk. The dump is actually made to a temporary file and then moved to replace any older dump, so the operation is atomic and won't leave you with a partially dumped database. If Redis is eventually shut down and reloaded, it will restore from this dump file.
How often it dumps the keys is configurable by the amount of time that passes and the number of changes that have been made to the data. For example, the following settings tell Redis to dump the database after 60 seconds if 100 changes have been made or after five minutes if there has been at least 1 change:
save 300 1
save 60 100
* By default redis starts in foreground mode. To run it in the background instead, set the daemonize option in the redis.conf file to "yes":
daemonize yes
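The snapshot behavior from the list above (the save rules plus the dump-to-a-temporary-file-then-move trick) can be sketched in a few lines of Python. This only illustrates the logic under some loud assumptions: there is no fork, the dump is JSON and not the redis dump format, and should_snapshot, atomic_dump and RULES are hypothetical names.

```python
import json
import os
import tempfile

# Mirrors "save 300 1" and "save 60 100" as (seconds, changes) pairs.
RULES = [(300, 1), (60, 100)]


def should_snapshot(seconds_since_last_save, changes_since_last_save, rules=RULES):
    """Return True if any rule has both its time and its change threshold met."""
    return any(
        seconds_since_last_save >= seconds and changes_since_last_save >= changes
        for seconds, changes in rules
    )


def atomic_dump(data, path):
    """Write the dump to a temporary file, then move it over the old dump."""
    directory = os.path.dirname(os.path.abspath(path))
    # Same directory as the target, so the final move stays on one filesystem.
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix="temp-", suffix=".dump")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as f:
            json.dump(data, f)
            f.flush()
            os.fsync(f.fileno())
        # Readers see either the old dump or the new one, never a partial file.
        os.replace(tmp_path, path)
    except BaseException:
        os.unlink(tmp_path)
        raise
```

With these rules, 61 seconds and 100 changes triggers a dump, while 61 seconds and 99 changes does not until the five minute mark is reached.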
To Redis or not to Redis?
If you have a large data set that cannot comfortably fit into RAM, Redis is not the key value store for you to use, but if you have smaller sets, and if you can live with the asynchronous write behavior, then, for me, the answer is definitely "to Redis."
As an alternative, Tokyo Cabinet is very fast for a synchronous key value store, and it does support some features that Redis does not, such as tables. Redis permits a master/slave setup, which can alleviate fears of data loss from failure, but it's not as certain as something like Tokyo Cabinet, which will write the data as soon as it gets it. On the other hand, Redis is blazingly fast, incredibly easy to use, and will support just about anything you can think of doing with your data.
More resources:
* Try redis in your browser
* Download redis
* Retwis - a PHP twitter clone using redis
neo, 11-09-2010
I was looking for this on web for long time tanx
anehra63, 11-09-2010
The article was quite cool.
Paul, 11-10-2010
ElasticSearch is NOT recommended as a primary data store. It is a great search engine, but you need to ensure you have your data backed elsewhere.
Mike Peters, 11-10-2010
Paul - you're right, we never suggested using Elastic Search for primary data storage.
ES does a great job at doing the heavy-lifting (filters, facets, full text search), letting your storage engine do nothing more but store data and handle primary-key lookups.
Kundan Ray, 01-16-2015
Great Post.Thanks