Full-service Internet Marketing & Web Development
Recent Posts

Sponsors
![]() |
Never use VARCHAR in MySQLMike Peters, 12-15-2009 |
When using strings in MySQL, you have a choice between several field types:
* VARCHAR: String of 0..65,535 characters with an extra byte to designate the length.
* CHAR: String of a fixed length of 0 to 255 characters.
* TEXT / BIGTEXT: String of 1..4GB characters with an extra byte to designate the length. Unlike VARCHAR, text fields cannot be compared, can't "order by", can't use "like" on them etc.
* BLOB: For binary or unicode strings. Supports 8bits per character and unlike the previous types, the string is not converted to a character-set prior to insertion.
Quick tip: Never ever use varchar.
Replace VARCHAR with CHAR whenever possible, or use TEXT if your strings are longer than 255 characters.
While VARCHAR offers slightly improved storage space (it will only take up n+1 characters on disk), there are a few major pitfalls with this variable type:
#1. If you have any field in the table that is a CHAR (fixed size) field, all VARCHARs will be stored as n+1, so you don't get any reduced storage benefits.
#2. If you're using MySQL replication, you may end up with a few nasty "master/slave table column type mismatch" errors - this is a big one
#3. CHAR fields offer better performance over VARCHAR
-
We first learned about the issues with VARCHAR when replication began breaking with these cryptic messages:
"Table definition on master and slave does not match: Column 37 size mismatch - master has size 255, testtable on slave has size 256. Master's column size should be <= the slave's column size."
Both tables were identical and mysqldump'ing one on top of the other didn't help. It seems to be rooted to MySQL's internal implementation of VARCHAR.
To convert all table VARCHAR fields to CHAR, use this PHP snippet:
* VARCHAR: String of 0..65,535 characters with an extra byte to designate the length.
* CHAR: String of a fixed length of 0 to 255 characters.
* TEXT / BIGTEXT: String of 1..4GB characters with an extra byte to designate the length. Unlike VARCHAR, text fields cannot be compared, can't "order by", can't use "like" on them etc.
* BLOB: For binary or unicode strings. Supports 8bits per character and unlike the previous types, the string is not converted to a character-set prior to insertion.
Quick tip: Never ever use varchar.
Replace VARCHAR with CHAR whenever possible, or use TEXT if your strings are longer than 255 characters.
While VARCHAR offers slightly improved storage space (it will only take up n+1 characters on disk), there are a few major pitfalls with this variable type:
#1. If you have any field in the table that is a CHAR (fixed size) field, all VARCHARs will be stored as n+1, so you don't get any reduced storage benefits.
#2. If you're using MySQL replication, you may end up with a few nasty "master/slave table column type mismatch" errors - this is a big one
#3. CHAR fields offer better performance over VARCHAR
-
We first learned about the issues with VARCHAR when replication began breaking with these cryptic messages:
"Table definition on master and slave does not match: Column 37 size mismatch - master has size 255, testtable on slave has size 256. Master's column size should be <= the slave's column size."
Both tables were identical and mysqldump'ing one on top of the other didn't help. It seems to be rooted to MySQL's internal implementation of VARCHAR.
To convert all table VARCHAR fields to CHAR, use this PHP snippet:
@mysql_query("SET GLOBAL WAIT_TIMEOUT = 600000");
@mysql_Query("SET WAIT_TIMEOUT = 600000");
$Result1 = @mysql_query("SHOW TABLES FROM $DBNAME");
$cnt1 = @mysql_num_rows($Result1);
while ($cnt1)
{
$cnt1--;
if (!($Row1 = @mysql_fetch_array($Result1)))
{
continue;
}
$tablename = $Row1[0];
$Result3 = @mysql_query("DESCRIBE $tablename");
$cnt3 = @mysql_num_rows($Result3);
while ($cnt3)
{
$cnt3--;
if (!($Row3 = @mysql_fetch_array($Result3))) continue;
if (strpos($Row3['Type'],'varchar')!==false)
{
$pos = strpos($Row3['Type'],'(');
$length = substr($Row3['Type'],$pos+1);
$pos = strpos($length,')');
$length = substr($length, 0, $pos);
// If length is over 255, nothing we can do about it
if ($length>255) continue;
if ($length<1) continue;
// Length is below 255, we can convert
$sql = "alter table $tablename change ".$Row3['Field']." ".$Row3['Field']." char($length) default '' not null";
@mysql_query($sql);
}
}
}
@mysql_Query("SET WAIT_TIMEOUT = 600000");
$Result1 = @mysql_query("SHOW TABLES FROM $DBNAME");
$cnt1 = @mysql_num_rows($Result1);
while ($cnt1)
{
$cnt1--;
if (!($Row1 = @mysql_fetch_array($Result1)))
{
continue;
}
$tablename = $Row1[0];
$Result3 = @mysql_query("DESCRIBE $tablename");
$cnt3 = @mysql_num_rows($Result3);
while ($cnt3)
{
$cnt3--;
if (!($Row3 = @mysql_fetch_array($Result3))) continue;
if (strpos($Row3['Type'],'varchar')!==false)
{
$pos = strpos($Row3['Type'],'(');
$length = substr($Row3['Type'],$pos+1);
$pos = strpos($length,')');
$length = substr($length, 0, $pos);
// If length is over 255, nothing we can do about it
if ($length>255) continue;
if ($length<1) continue;
// Length is below 255, we can convert
$sql = "alter table $tablename change ".$Row3['Field']." ".$Row3['Field']." char($length) default '' not null";
@mysql_query($sql);
}
}
}
![]() |
hartmut, 10-01-2010 |
> #1. If you have any field in the table that is > a CHAR (fixed size) field, all VARCHARs will be
> stored as n+1, so you don't get any reduced
> storage benefits.
Where did you get that information from?
This is true for very short VARCHARS (i think up to 5 bytes was the limit there) in an otherwise fixed length row, in that case the VARCHAR is converted into a CHAR internally. I'm not aware of any other conversions of that kind though.
> #2. If you're using MySQL replication, you may
> end up with a few nasty "master/slave table
> column type mismatch" errors - this is a big one
> "Table definition on master and slave does not
> match: Column 37 size mismatch - master has
> size 255, testtable on slave has size 256.
> Master's column size should be <= the slave's
> column size."
never seen that before, would like to see a repeatable test case for that or at least the CREATE statement for that table
one thing that comes to mind though: different default character set settings on master and slave. If the master is using utf8 while the slave is using a single byte encoding like latin1
i could imagine something like this happening (the error message would be misleading then though)
> #3. CHAR fields offer better performance over VARCHAR
this may be true in some CPU bound scenarios, it definitely isn't if your workload is disk bound though (which is what database servers usually are, and CHAR instead of VARCHAR drives things into that direction even more as caches in RAM will be less effective due to the extra payload ...)
> stored as n+1, so you don't get any reduced
> storage benefits.
Where did you get that information from?
This is true for very short VARCHARS (i think up to 5 bytes was the limit there) in an otherwise fixed length row, in that case the VARCHAR is converted into a CHAR internally. I'm not aware of any other conversions of that kind though.
> #2. If you're using MySQL replication, you may
> end up with a few nasty "master/slave table
> column type mismatch" errors - this is a big one
> "Table definition on master and slave does not
> match: Column 37 size mismatch - master has
> size 255, testtable on slave has size 256.
> Master's column size should be <= the slave's
> column size."
never seen that before, would like to see a repeatable test case for that or at least the CREATE statement for that table
one thing that comes to mind though: different default character set settings on master and slave. If the master is using utf8 while the slave is using a single byte encoding like latin1
i could imagine something like this happening (the error message would be misleading then though)
> #3. CHAR fields offer better performance over VARCHAR
this may be true in some CPU bound scenarios, it definitely isn't if your workload is disk bound though (which is what database servers usually are, and CHAR instead of VARCHAR drives things into that direction even more as caches in RAM will be less effective due to the extra payload ...)
|
|
Subscribe Now to receive new posts via Email as soon as they come out.
Comments
Post your comments


