Every InnoDB
table has a special index called
the clustered index
where the data for the rows is stored. Typically, the clustered
index is synonymous with the
primary key. To get the
best performance from queries, inserts, and other database
operations, you must understand how InnoDB uses the clustered
index to optimize the most common lookup and DML operations for
each table.
If you define a
PRIMARY KEY
on your table,InnoDB
uses it as the clustered index. Define a primary key for each table that you create. If there is no logical unique and non-null column or set of columns, add a new auto-increment column, whose values are filled in automatically.If you do not define a
PRIMARY KEY
for your table, MySQL locates the firstUNIQUE
index where all the key columns areNOT NULL
andInnoDB
uses it as the clustered index.If the table has no
PRIMARY KEY
or suitableUNIQUE
index,InnoDB
internally generates a hidden clustered index on a synthetic column containing row ID values. The rows are ordered by the ID thatInnoDB
assigns to the rows in such a table. The row ID is a 6-byte field that increases monotonically as new rows are inserted. Thus, the rows ordered by the row ID are physically in insertion order.
How the Clustered Index Speeds Up Queries
Accessing a row through the clustered index is fast because the
row data is on the same page where the index search leads. If a
table is large, the clustered index architecture often saves a
disk I/O operation when compared to storage organizations that
store row data using a different page from the index record.
(For example, MyISAM
uses one file for data
rows and another for index records.)
How Secondary Indexes Relate to the Clustered Index
All indexes other than the clustered index are known as
secondary indexes.
In InnoDB
, each record in a secondary index
contains the primary key columns for the row, as well as the
columns specified for the secondary index.
InnoDB
uses this primary key value to search
for the row in the clustered index.
If the primary key is long, the secondary indexes use more space, so it is advantageous to have a short primary key.