Data Types in MySQL
Data Types
Choosing the correct
data type can lead to a better performing database, for example comparing
numeric types takes less time than comparing character strings types because
character string types have character set and collation considerations, also
the smaller the data the faster it will be processed and less I/O is used
making queries perform even better.
MySQL includes many of
the ISO SQL 2003 standard data types and adds more data types of its own, the
data types can be categorized by the following
- Character string types
- National character string types
- Binary large object string types
- Numeric types
- Boolean types
- Datetime types
- Interval types
I am only going to give
a quick discussion on each as it is best to refer to the latest MySQL documentation
or to a specific version that you have, I will try and give examples but I do
not have examples for every possibility.
Before we begin a quick
review on how to create a table in MySQL, which can come in a number of flavors
Standard table
|
create table table_name ( ... );
|
Temporary table
|
create temporary table_name ( ... );
|
Memory table
|
create table table_name ( ... ) ENGINE = MEMORY;
|
You can also display how
the table was created
Display table construction
|
show create table table_name;
|
Just some things that
you should be aware of regarding tables
- they support automatic numbering and the sequence is a
property of the table
- you can define a table that is supported by any of the
storage engines
- there is no support for nested tables
- they only support scalar data types which include large
object types
- only two composite data types are supported enum and set
- partitioned tables are supported
- three tables types are supported: standard, temporary
and MEMORY
We start with character
types, I have already discussed data types in my Oracle data types section if you wish to compare.
String Type
|
Character string type
|
Size
|
Example
|
Fixed-width strings
|
character(length)
|
create table string_type (
fixwidth1 CHARACTER(10) NOT NULL DEFAULT '', fixwidth2 CHAR(10) NOT NULL DEFAULT'', fixwidth3 CHAR(10) CHARACTER SET utf16 COLLATE utf16_general_ci, varilength1 CHARACTER VARYING(10) NOT NULL DEFAULT '', varilength2 CHAR VARYING(10) NOT NULL DEFAULT '', varilength3 VARCHAR(10) NUL NULL DEFAULT '' ); |
|
char(length)
|
0-255
|
||
variable-length strings
|
character varying(length)
|
||
char varying(length)
|
|||
varchar(length)
|
0-65,535
|
||
tinytext
|
0-255
|
|
|
text
|
0-64Kb
|
|
|
meduimtext
|
0-16Mb
|
|
|
longtext
|
0-4Gb
|
|
|
Character objects
|
character largeobject
|
|
|
char large object
|
|
||
CLOB
|
|
Remember that the length
of the string does not mean the size of the string as this depends on the
character set used, for example latin1 character set uses one byte to store each character whereas the utf8 character set uses up to four bytes per
character, double check the MySQL version that you are using as the number of
bytes may vary and could have an impact on the size of your database.
The
system parameter pad_char_to_full_length changes the behavior of retrieving fixed-width
strings from a table, if set the string is retrieved is the full length of the
field.
find the length of a string and checking the padding
|
select concat('/',fixwidth1,'/'),
char_length(fixwidth1) from string_type;
|
National character sets
are pretty much the same as above except that the character set uses the utf8
String Type
|
Character string type
|
Size
|
Example
|
Fixed-width strings
|
national character(length)
|
create table string_type (
fixwidth1 NATIONAL CHARACTER(10) NOT NULL DEFAULT '', fixwidth2 NCHAR(10) NOT NULL DEFAULT'', varilength1 NCHAR(10) NOT NULL DEFAULT '', ); |
|
national char(length)
|
|
||
nchar(length)
|
0-255
|
||
variable-length strings
|
national character varying(length)
|
||
national char varying(length)
|
|||
nchar(length)
|
0-65,535
|
||
Character objects
|
national character largeobject
|
|
|
nchar large object
|
|
||
NCLOB
|
|
Binary large objects is
the least restrictive data type, they are byte strings and are ordered by each bytes
value.
Binary type
|
Size
|
Example
|
binary
|
0-255
|
create table string_type (
bin1 BINARY(100) NOT NULL DEFAULT '', blob1 BLOB(100) NOT NULL DEFAULT'', ); |
varbinary
|
0-65,532
|
|
tinyblob
|
0-255
|
|
blob
|
0-65,532
|
|
meduimblob
|
0-16Mb
|
|
longblob
|
0-4Gb
|
Numeric types is
probably the most vast, as there are many different types, these types all
affect data precision and depending on the application type this category is
the one that you will send more time on, I will categorize them in exact number
type and approximate numeric type
Numeric Type
|
Type
|
signed range
|
unsigned range
|
Size (bytes)
|
Example
|
||
Exact
|
numeric(g,f)
|
maximum limits depend on the hardware and O/S
max for g = 65, max for f = 30 |
depends
|
create table num_test (
bt1 BIT(10), fl1 FLOAT(10), dl DECIMAL(5,4), uty1 TINYINT(10) unsigned NOT NULL AUTO_INCREMENT ); |
|||
decimal(g,f)
|
maximum limits depend on the hardware and O/S
max for g = 65, max for f = 30 |
depends
|
|||||
smallint
|
-32,768 to 32,767
|
0 to 65,535
|
2
|
||||
integer
|
-2,147,483,648 to
2,147,483,647 |
0 to 4,294,967,295
|
4
|
||||
bigint
|
-9,223,372,036,854,775,808 to 9,223,372,036,854,775,807
|
0 to 18,446,744,073,709,551,615
|
8
|
||||
tinyint
|
-128 to 127
|
0 to 255
|
1
|
||||
meduimint
|
-8,388,608 to 8,388,607
|
0 to 16,777,215
|
3
|
||||
bit(x)
|
1 to 64
|
1-64
|
|||||
serial
|
alias for BIGINT
UNSIGNED NOT NULL AUTO_INCREMENT UNIQUE KEY
|
n/a
|
|||||
approximate
|
float(p)
|
maximum limits depend on the hardware and O/S
max for p = 24 |
4
|
||||
real
|
alias for DOUBLE or
FLOAT
|
n/a
|
|||||
double(g,f)
|
maximum limits depend on the hardware and O/S
max for g = 53 max for f = 30 |
8
|
|||||
Boolean data types have
two options the standard BOOLEAN and the MySQL version called BOOL which is an
alias for TINYINT(1);
Datetime types have the following, you can specify any of the valid dates as below
Datetime types have the following, you can specify any of the valid dates as below
- YYYY-mm-dd HH:ii:ss
- yy-mm-dd HH:ii:ss
- yyyymmdd
- yymmdd
- YYYYmmddHHiiss
- yymmddHHiiss
you can also put
microseconds in as well
- append .uuuuuu
Datetime type
|
Range
|
Size
(bytes)
|
Zero value
|
date
|
'1000-01-01' to '9999-12-31'
|
3
|
'0000-00-00'
|
datetime
|
'1000-01-01 00:00:00' to '9999-12-31 23:59:59'
|
8
|
'0000-00-00 00:00:00'
|
timestamp
|
'1970-01-01 00:00:00' to '2038-01-18 22:14:07'
|
4
|
'0000-00-00 00:00:00'
|
time
|
'-838:59:59' to '838:59:59'
|
3
|
'00:00:00'
|
year(2)
|
00 to 99
|
1
|
'00'
|
year(4)
|
1901 to 2155
|
1
|
'0000'
|
You need to be aware of
a number of date, time and numeric conversion issues, here I point you to the
MySQL documentation.
Lastly we come to the
interval data types, these strictly speaking are not data types that can be
stored in a table, they are used as part of an expression in a time based
calculation.
Interval expressions
|
# below would translate to '2012-04-08'
select date_add('2012-03-08', interval 1 year_month) as test;
# below would translate to 2017-07-08
select date_add('2012-03-08', interval '5-4' year_month) as test; |
|
Interval
data types
|
||
Name
|
Format
|
Example
|
microsecond
|
n
|
INTERVAL 5 MICROSECOND
|
second
|
n
|
INTERVAL 5 SECOND
|
minute
|
n
|
INTERVAL 5 MINUTE
|
hour
|
n
|
INTERVAL 5 HOUR
|
day
|
n
|
INTERVAL 5 DAY
|
week
|
n
|
INTERVAL 5 WEEK
|
month
|
n
|
INTERVAL 5 MONTH
|
quarter
|
n
|
INTERVAL 5 QUARTER
|
year
|
n
|
INTERVAL 5 YEAR
|
second_microsecond
|
'n.n'
|
INTERVAL '5.4' SECOND_MICROSECOND
|
minute_microsecond
|
'n.n'
|
INTERVAL '5.4' MINUTE_MICROSECOND
|
minute_second
|
'n:n'
|
INTERVAL '5:4' MINUTE_SECOND
|
hour_microsecond
|
'n.n'
|
INTERVAL '5.4' HOUR_MICROSECOND
|
hour_second
|
'n:n:n'
|
INTERVAL '5:4:3' HOUR_SECOND
|
hour_minute
|
'n:n'
|
INTERVAL '5:4' HOUR_MINUTE
|
day_microsecond
|
'n.n'
|
INTERVAL '5.4' DAY_MICROSECOND
|
day_second
|
'n n:n:n'
|
INTERVAL '5 4:3:2' DAY_SECOND
|
day_minute
|
'n n:n'
|
INTERVAL '5 4:3' DAY_MINUTE
|
day_hour
|
'n n'
|
INTERVAL '5 4' DAY_HOUR
|
year_month
|
'n-n'
|
INTERVAL '5-4' YEAR_MONTH
|
MySQL has two data types
that are very different to the above data types, they are similar to foreign
keys in that the values inserted must be one of the field values set
ENUM
|
this is a enumerated list of 1 to 65,535 strings which
indicate the allowed values for the field, only one of the values can be
stored in the list
|
SET
|
this is a enumerated list of 1 to 64 strings which indicate
the allowed values for the field, any combination of the strings in the
enumerated list can be stored as a comma-delimited list.
|
All trailing spaces will
be removed from enum and set data types, the system parameter sys_mode can affect the data types, see below for more
information on sql modes.
sql_mode system
parameter
|
This parameter affects both the enum and set data types, if the parameter is
set to any of the below then an error will occur if a duplicate entry during
a field creation, otherwise you will get a warning
|
here are some examples
on how to use these data types
ENUM example
|
Example One
===================================================== CREATE TABLE Countries ( name char(30), continent ENUM('Asia', 'Europe', 'North America', 'Africa', 'Oceania','Antarctica', 'South America') );
INSERT INTO Countries
(name,continent) VALUES('Kenya','Africa');
INSERT INTO Countries (name,continent) VALUES('England','Europe'); INSERT INTO Countries (name,continent) VALUES('Brazil','South America');
select
name,continent,continent+0 'Index position' from countries;
Example Two
===================================================== CREATE TABLE Bike ( ID SMALLINT UNSIGNED, Model VARCHAR(40), Color ENUM('red', 'blue', 'green', 'yellow'), Options SET('rack', 'light', 'helmet', 'lock') );
insert into bike
values ('0001', 'chopper', 'red', 'rack,light');
insert into bike values ('0002', 'tomahawk', 'blue', 'lock,helmet,rack'); insert into bike values ('0003', 'grifter', 'green', 'light,helmet,rack'); insert into bike values ('0004', 'chopper', 'red', 'rack,light'); insert into bike values ('0005', 'grifter', 'red', 'rack,light,helmet,lock');
select id, model,
color, options from bike;
|
SET example
|
CREATE TABLE allergy (symptom
SET('sneezing','runny nose','stuffy head','red eyes') );
INSERT INTO allergy
(symptom) VALUES('');
INSERT INTO allergy (symptom) VALUES('stuffy head'); INSERT INTO allergy (symptom) VALUES('sneezing,red eyes'); INSERT INTO allergy (symptom) VALUES('sneezing,stuffy head,red eyes');
select
symptom,symptom+0 'Index position' from allergy;
|
There are number of
things that you should understand about these data types
- they are defined with a limited set of strings
- the enum data type indexes its string values sequentially
- the set data type treats its string values as digits in a
64-byte value
- to compare set values
a bitwise comparison can be done
- you can use the following attributes to both enum and set - NULL, NOT NULL and DEFAULT
Not many developers use enum and set mainly because they are hard to alter and do
require table rebuilds if altered, most developers will use foreign keys but I
reference them here so that you know that you have other options.
Finally the below is a
quick table detailing what attributes a data type can use, also check the MySQL
documentation of the version you are using.
Data Type
|
Attributes that can be
used
|
Character Strings
|
NOT NULL, NULL, DEFAULT and BINARY
|
National Character
Strings
|
NOT NULL, NULL, DEFAULT and BINARY
|
Binary
|
NOT NULL, NULL and BINARY (only binary can use this)
|
Numeric data
|
NOT NULL, NULL, DEFAULT, AUTO_INCREMENT, SIGNED, UNSIGNED,
ZEROFILL, SERIAL DEFAULT VALUE
|
DateTime
|
NOT NULL, NULL and DEFAULT
|
Enum and Sets
|
NOT NULL, NULL and DEFAULT
|
MySQL has a system
parameter that allows you to change the default behavior of data validation and
allowed SQL syntax, it can be set on a global level or on a session level, by
default the sql_mode is set to '' (blank).
sql_mode
|
# You can set it at a global or sessions level
show global variables like 'sql_mode'; show session variables like 'sql_mode'; SET GLOBAL sql_mode=<value>,<value>,<value>.... SET SESSION sql_mode=<value>,<value>,<value>.... # There are sort of four groups to choose from
Getting
rid of silent features, silent conversions and silently allowing invalid data
|
By default MySQL will
allow you to enter invalid data and will change it to make it valid, this of
course means that is may invalidate data integrity, lets see an example
invalid data entry
|
create table
test_sql_mode (
sm1 tinyint not null ); ## Lets enter some valid data insert into test_sql_mode (sm1) values (126); insert into test_sql_mode (sm1) values (127); ## Lets enter some invalid data, remember a tinyint values are -128 to 127 insert into test_sql_mode (sm1) values (128); ## Did we get any warning show warnings; ## Lets see whats in the table, as you can see the last value we inserted was truncated select * from test_sql_mode |
As you can see in the
example above, values that are too large to fit in a field are truncated to the
closest value that does fit, similarly for string data if they are too large.
So what you need MySQL is to throw an error instead of a warning this were the sql_modecomes into play.
You have to be careful
when using NULL values in tables, something's to beware of are
- A field that allows a NULL uses more storage space and
more resources to process storage, retrieval, comparisons and calculations
- Most operators and functions are not NULL safe and may
return NULL if one side has a NULL value
- Aggregate functions sum(), count(), min(), max() and avg() ignore NULL
- NULL values make field and index statistics more time
consuming to calculate
- Only MyISAM, InnoDB and MEMORY storage engines support
NULL
- NULL appear first in ascending sorts and last in
descending sorts
Remember that NULL means
no value has been determined it does not mean zero or an empty field, it means unknown.
Using the PROCEDURE
ANALYSE statement you can obtain the table information regarding used space on
each field, this will be displayed in the optimal_fieldtype field of the output, you may then want to reduce
certain columns to optimize them, when analyzing tables make sure that you have
a good sample size.
optimal data types
|
# The PROCEDURE
ANALYSE takes two options
## Lets create a
simple table using a high value for varchar thus we will be wasting space
create table accounts ( accountid int, accountname varchar(255) );
## Now insert some
data, the data will be well within the limits of the varchar
insert into accounts values (1,'Company one LTD'); insert into accounts values (2,'Company two LTD'); insert into accounts values (3,'Company three LTD'); insert into accounts values (4,'Company four LTD'); insert into accounts values (5,'Company five LTD'); insert into accounts values (6,'Company six LTD'); insert into accounts values (7,'Company seven LTD'); insert into accounts values (8,'Company eight LTD'); insert into accounts values (9,'Company nine LTD'); insert into accounts values (10,'Company ten LTD');
insert into accounts
values (11,'top shop inc 11');
insert into accounts values (12,'top shop inc 12'); insert into accounts values (13,'top shop inc 13'); insert into accounts values (14,'top shop inc 14'); insert into accounts values (15,'top shop inc 15'); insert into accounts values (16,'top shop inc 16'); insert into accounts values (17,'top shop inc 17'); insert into accounts values (18,'top shop inc 18'); insert into accounts values (19,'top shop inc 19'); insert into accounts values (20,'top shop inc 20');
insert into accounts
values (21,'top shop ltd');
insert into accounts values (22,'top shop plc'); insert into accounts values (23,'top shop inc'); insert into accounts values (24,'top shop partners'); insert into accounts values (25,'top shop corp'); insert into accounts values (26,'bottom shop ltd'); insert into accounts values (27,'bottom shop plc'); insert into accounts values (28,'bottom shop partners');
## Lets insert some
NULL's
insert into accounts values (29,NULL); insert into accounts values (30,NULL);
## Lets insert some
empty fields
insert into accounts values (31,''); insert into accounts values (32,'');
## Now run the
procedure analyse and see what we get, from the output we can see the NULL,s
and the
## empties fields, also the analyzer has picked up that the size of the accountname fields is too large ## and recommended's we shrink the size (see the Optimal_fieldtype). Normally I would add on a few bytes ## to make sure just in case a longer company name comes along. Other information is the min and max ## values ## One note is that if you increase the first argument to the procedure analyze eventually MySQL ## recommended's that you use a enum data type, this is because you are asking for more values to ## consider select * from accounts procedure analyse(5,256) \G *************************** 1. row *************************** Field_name: datadisk.accounts.accountid Min_value: 1 Max_value: 32 Min_length: 1 Max_length: 2 Empties_or_zeros: 0 Nulls: 0 Avg_value_or_avg_length: 16.5000 Std: 9.2331 Optimal_fieldtype: TINYINT(2) UNSIGNED NOT NULL *************************** 2. row *************************** Field_name: datadisk.accounts.accountname Min_value: bottom shop ltd Max_value: top shop plc Min_length: 12 Max_length: 20 Empties_or_zeros: 2 Nulls: 2 Avg_value_or_avg_length: 14.1667 Std: NULL Optimal_fieldtype: VARCHAR(20) 2 rows in set (0.03 sec) |
This tool can be useful
if you are planning to migrate to a new system and want to reduce table column
sizes so that not to waste space, if you already have a system you can always
alter the table providing it is not to large or you have an enough time to
alter the table. Be careful when using the procedure analyse function it does not always get it right,
sometimes you are the best person to know what is right for your data
optimization as you would know the data.
There are a number of
tasks that you can perform on an tables and indexes which are available in MySQL
- check table - used to determine if a table is corrupted (options
are extended, meduim, changed, fast, quick, for upgrade)
- repair table - used to repair a table (options are quick, extended
use_frm)
- checksum table - obtain a checksum for the table (options are quick,
extended), see above for checksum's on tables
- analyze table - updates the statistics of a table to help the
optimizer
- optimize table - resolves data and index fragmentation
check table
|
check table tablename [option]
## You can use one of
the following options
|
repair table
|
repair table tablename [option]
## You can use one of
the following options
|
checksum table
|
checksum table tablename [option]
## You can use one of
the following options
|
analyze table
|
analyze table tablename
|
optimize table
|
optimize table tablename
|
Thanks for the information.
ReplyDeleteMysql DBA Course