Call us Toll-Free:
1-800-218-1525
Email us

 Sponsors

MySQL Storage Engines

Mike Peters, 01-20-2008
One of the greatest things about MySQL, other than being free, widely supported and fast, is the flexibility of choosing different storage engines for different tables.

Out of the box, MySQL comes with 7 storage engines, including an "example" stub storage engine that allows you to implement your own storage library.

What's the big deal about having all these options?

Every storage engine is completely different, designed to address a unique application need.

Not being locked down to a single storage engine (like Oracle), means you can optimize and choose the best tool for the job.

Tip: A well designed MySQL-powered application will use different storage engines for different tables. If you're still stuck with all MyISAM tables, now may be a good time to revisit.

MySQL Storage Engines Overview

MyISAM: The default engine. No transactions support, average data reliability. Offers great performance for read heavy applications. Most web services and data warehousing applications use MyISAM heavily.

HEAP: All in-memory. Very fast for data retrieval, however due to being stored only in memory - all data is lost on shutdown. Great for temporary tables.

Archive: Used for storing large amounts of data without indexes in a small footprint.

Merge: Collection of MyISAM tables logically merged together to provide a single view.

InnoDB: Transaction-safe storage engine, best suited for write heavy environments thanks to row-level locking. Offers good built-in recovery and solid data reliability. InnoDB engine was acquired by Oracle on 2005.

NDB: A clustered engine - data is automatically split and replicated across several machines, a.k.a data nodes. Best suited for applications that require high performance lookups with the highest possible degree of uptime and availability. Originally designed by Ericsson for the Telco market, NDB offers the highest levels of data reliability (99.999%). NDB works well in read heavy environments. For write heavy environments with multiple concurrent writes, consider InnoDB.

The biggest disadvantadge of NDB, is that by design your entire database must fit in memory. If your database size times 2 is too big to fit in memory, NDBCluster is not for you.

-

To make it easier to follow the unique characteristics of each storage engine, I created this magic quadrant diagram:

Image


Examples:

Below are some examples of using the best storage engine for different tasks:

Search Engine - NDBCluster

Web stats logging - Flat file for the logging with an offline processing demon processing and writing all stats into InnoDB tables.

Financial Transactions - InnoDB

Session data - MyISAM or NDBCluster

Localized calculations - HEAP

Dictionary - MyISAM

Important notes about MyISAM tables:

1. Your tables will get corrupted eventually! Plan accordingly.

Tar the entire database directory daily and setup MySQL replication to a slave for an up-to-the-minute live backup.

2. Turn on auto-repair by adding this flag to your my.cnf file:
myisam-recover=backup,force

Or consider running a check-all-tables-and-email-me cronjob daily: See our MySQL Table Maintenance automation.

3. Super fast for read (select) operations.

4. Concurrent writes lock the entire table. Switch everything to offline processing where you can, to serialize writes without taking the database down. (Offline processing is golden and applies to all table types)

Important notes about HEAP/Memory tables:

While this type of table offers super fast retrieval, it only works well for small temporary tables.

If you try to load too much data into a Memory table, MySQL will start swapping information to disk and then you lose the benefits of an all-memory storage.

Important notes about InnoDB tables:

1. ACID transactions support. Row-level locking (compared to table level locking with MyISAM) means faster concurrent writes.

2. Doing a "SELECT Count(*) FROM table" without specifying any indexes is very slow on InnoDB and requires a full table scan. (With MyIsam this operation doesn't cost anything because MyIsam stores an internal record counter with each table).

If you need to "SELECT COUNT(*)" often on InnoDB tables, create MySQL insert/delete triggers that will increment/decrement a counter whenever records are added or deleted from the table.

3. Backup:

Doing a tar/rsync backup where you simply copy all files is not possible with InnoDB.

MySQLDump backup is too slow with InnoDB. (If you insist on using it, turn on these flags: --opt --compress)

The only viable fast backup option, which can also be used to populate new slave machines, is InnoDB Hot Backup.

4. Recovery:

InnoDB has built-in recovery that works 99% of the times automatically. Never try to move .frm or .ibd files around as a way of "helping" the database to recover. If the built-in recovery doesn't work, switch to your slave server and restore the primary from backup.

5. LOAD DATA INFILE is too slow with InnoDB. Consider using MyIsam tables for LOAD DATA operations.

6. InnoDB is less forgiving than MyIsam when it comes to queries on non indexes. InnoDB is going to "School" you into ensuring every single query and update statement runs on an index. Issue no index queries and you'll pay dearly in execution time.

7. Never ever change my.cnf INnoDB log file size while the database is running. You'll corrupt the log sequence number beyond repair.

8. To maximize InnoDB MySQL database performance, start with these my.cnf settings:

innodb_open_files = 500
innodb_file_per_table
innodb_buffer_pool_size = 250M
innodb_flush_log_at_trx_commit = 2
innodb_thread_concurrency =8
innodb_lock_wait_timeout = 500
interactive_timeout = 20
back_log = 75
table_cache = 300
thread_cache = 32
thread_concurrency = 8
wait_timeout = 30
connect_timeout = 10

9. If InnoDB crashes and the built-in recovery mechanism is unable to roll-back transactions, your database will not start. This is very important to understand. With MyISAM, even if a table gets corrupted, you can still start the database and everything will work normally. InnoDB will simply refuse to start until you restore the entire database from backup. Make sure you understand this principle and backup religiously.

Scalability:

Every successful web application eventually outgrows the capacity and throughput of a single database machine.

At that point you typically have two options - Replication or NDBCluster.

As always the choice depends on the needs of your application.

For read-heavy environments, use an NDBCluster or setup replication for n MyISAM slave read-only machines.

For write-heavy environments, InnoDB on an active/passive replication setup is typically the best choice. You may also want to experiment with an NDBCluster. An NDBCluster is generally going to be slower than InnoDB in write-heavy environments, but it offers a higher level of availability.

Helpful Scripts:

* Convert all tables to InnoDB

* MySQLDump one database to another (designed for a case where you cannot use InnoDB Hot backup and you have a lot of tables/databases to import. Script automatically retries, reopens connection and imports one table at a time)

* Add triggers to all tables for fast InnoDB count(*)
You can only have one delete/insert trigger per each table so if you're already using triggers, modify the code accordingly.

Michael, 01-21-2008
Great article w. adequate length.

Thanks for the notes/tips #1 and #2 regarding MyISAM tables.

Justin Longo, 01-21-2008
This has got to be the best breakdown of MySQL storage engines I have ever seen.

Bookmarked

Lars Pohlmann, 01-21-2008
A bit short, but sufficient for the beginner.

Though this statement shows some lack of knowledge: "Not being locked down to a single storage engine (like Oracle), means you can optimize and choose the best tool for the job."

With Oracle you don't have to chose, because it delivers row-level-locking, transactions and good read-performance at the same time. With MySQL you "have" to chose, because no storage-engine is capable of that. Let's see how "Falcon" will turn out.

Mike Peters, 01-21-2008
Thank you Lars

Yes Oracle is definitely a more robust database with plenty of features that are lacking in MySQL.

However being able to select a storage engine means you can optimize the best tool for each job. This is why MySQL delivers far better performance for specialized tasks.

While we love Oracle 11g, arguments can be made in favor of MySQL in terms of performance, stability, ease of use, and cost.

Some things unique to MySQL that Oracle does not offer include: Fast connections, Easy replication, Overall ease of use.

If you don't have a dedicated DBA and approx $50k annually for Oracle Enterprise edition license fees, Oracle is not an option.

Shu, 01-21-2008
For InnoDB: remember that your datafiles will never ever shrink. If you store 10 GiB and then delete 9 GiB, datafiles will still take 10 GiB on disk.
So if you have a high write-read-delete ratio plan your deletes wisely. :)

The only way to get your storage back is to dump the whole database, delete InnoDB datafiles, then restore.

Mike Peters, 01-21-2008
Shu - Good point, but there's a trick around it.

First notice I suggested turning on innodb_file_per_table which tells InnoDB to use a separate file for each table.

Then, all you have to do to reclaim the space (similar to the MyISAM OPTIMIZE command), is to issue a single:

ALTER TABLE 'tbl_name' ENGINE=INNODB

For each of the tables you'd like to shrink.

Andrew, 09-25-2008
Mike,

Excellent article, very helpful.

I'm a novice and am looking to convert a Master-Slave MyISAM cluster to utilize InnoDB row locking and rollback features- so I have a few questions.

1. You say "Doing a "SELECT Count(*) FROM table" without specifying any indexes is very slow on InnoDB and requires a full table scan" - does this mean that all queries have to use 'USE INDEX' or 'FORCE INDEX'? OR does it just mean that searches on non-indexed rows are terribly slow because they require a full table scan?

2. How difficult is it to have an InnoDB Master replicating to MyISAM slaves (as well as InnoDB slaves)?

3. Are there any major issues with MyISAM -> InnoDB conversion (due to partitioning or table size)?

4. Would there be any advantage in waiting for Falcon? When do you think it could be generally usable?

Any help would be excellent - thanks so much.

Mike Peters, 10-07-2008
Hi Andrew,

Thank you for your comments. I will try to address all your questions -

"Does this mean that all queries have to use 'USE INDEX' or 'FORCE INDEX'? OR does it just mean that searches on non-indexed rows are terribly slow because they require a full table scan?"

It just means you should avoid count(*) on a full table. count(*) will always iterate through every single record in the resultset. If you have a large resultset, this operation will be expensive with InnoDB.

If count(*) is important in your application, one trick we use to get around it is implement MySQL triggers on insert/delete from your table, saving an external row counter that you can access anytime.

"How difficult is it to have an InnoDB Master replicating to MyISAM slaves (as well as InnoDB slaves)"

Replication works the same way, regardless of the target database storage engine. To avoid problems, your best bet is to keep master and slaves using the same MySQL database version and same MySQL storage engines.

While InnoDB <=> MyISAM replication does work, it is somewhat error prone.

"Are there any major issues with MyISAM -> InnoDB conversion"

There are no issues with the actual act of conversion. The main issues you will face are different query times due to a different implementation of each storage engine.

Following conversion, it is a good idea to go through all existing queries and run an EXPLAIN. Optimize as needed.

"Would there be any advantage in waiting for Falcon?"

We recommend waiting another year before using Falcon in a production environment.

sri, 12-19-2008
Hi Mike,
I have a very specific question. I am designing my application to heavily use surrogate keys.
I am not planning to add any indexes on my remaining business columns.

So, i will following queries perform on a huge table with 10+ million rows?

1) Select business_column1.....business_columnN from table bunsiness where id = 15000 and business_column4 = 'SomeString';

2) Select business_column1.....business_columnN from table bunsiness where id > 25000 and business_column4 = 'SomeString';

3) Select business_column1.....business_columnN from table bunsiness where id > 25000 and id<30,000 and business_column4 = 'SomeString';

Remeber, i don't have an index on any of these business_columns and 'id' is Primary Key.

Thank you
sri

Josh Prowse, 01-25-2009
Great article. I'm going to see how badly InnoDB whacks the performance of my next application-- I'd rather deal with optimizing performance than undoing data corruption!

shiva, 03-19-2009
Hi Mike
This is a very good article. and i felt its very informatic since i am a bigginer
once again thanks alot

sameer, 06-23-2009
Hi Mike

This is good article for beginners.
1. Why the InnoDB is slow when we are firing 'Select' as compare to MyIsam?
2. How the data actually get stored on Disk in MyIsam & InnoDB?

roberto.cr, 08-07-2009
Congratulations for this excellent article!

mike lee, 09-12-2009
great article, thanks a lot!

Gareth, 09-21-2009
Hi,

Great article thanks. Really nicely written with just enough detail for a quick overview. One thing though, in saying that, I kind of wish there had been a bit more info about this point in particular:

Important notes about MyISAM tables:

1. Your tables will get corrupted eventually! Plan accordingly.

I wouldn't mind knowing why this is. Presumably it's just a flaw in the design or something? Am I the only would who would be quite concerned about that?!

Thanks again,

Gareth.

varul, 12-03-2009
Selected set of instruction have been given for selecting a better Storage engines.
Turn on auto-repair, MySQL Table Maintenance automation and MySQL replication to a slave are very special to persons like me new to designing mysql database.

David Alimian, 01-25-2010
Hi Mike, You said MYISAM is super fast on reads, how is that? Is it faster than Inno DB and why? INNODB does support hash indexes and build an index out of the b-tree automatically, given that hash indexes are fastest way of looking up shit wouldnt this make innodb faster in some cases on reads than MYISAM?

One more question: My understanding is MyISAM locks the table only during a write and not during reads, updates, is this correct? What operarations put a lock on the table?

David Alimian, 01-25-2010
And thanks for your article, it was really something else. You took the breaking down of storage engines to a new level

Adnan, 03-08-2010
Hi,

I have configured MySql Clustered with 1 manager and two data nodes and 2 Mysqld nodes. All these servers are situated at the same place. I have another server placed out of this network which is also for fail-over. This server has simple Mysql installed that is not clustered version. and I want to replicate using binlog from one of data nodes to this out of box server.
Is it possible? if yes then how? currently I got this all running with Innodb data only. e.g. if I create InnoDB table, it gets replicated But when I create NDB table which is the requirement , does not..

Ayesh, 04-30-2011
Thank you very much for your great post - really useful.

Jatin Kansagara, 06-30-2011
I have don't know how to store gujarati Language in MYSql can any one help me for this issue...

David, 08-24-2011
Thanks

Gentjan, 09-10-2011
Great post. Thanks!

Sumit Neema, 01-31-2012
Great Article Also good for a initial level developers such as me
Thank You

jay, 10-21-2012
Thanks for sharing,interesting points but anyway its always a lingering thought before db creation ,MyISAm or ?

imran ahmad, 12-12-2013
I want to built a data warehouse using mySql, I have some knowledge about data warehouse using Sql server. Anyone, please help me, What should be my first step?

DWLearner, 04-22-2016
Hi Imran,

I also have the same challenge of creating a DW on MySQL and have worked earlier on SQL Server.
How did it go for you ?
Enjoyed this post?

Subscribe Now to receive new posts via Email as soon as they come out.

 Comments
Post your comments












Note: No link spamming! If your message contains link/s, it will NOT be published on the site before manually approved by one of our moderators.



About Us  |  Contact us  |  Privacy Policy  |  Terms & Conditions