InnoDB
maintains a storage area
called the buffer pool
for caching data and indexes in memory. Knowing how the
InnoDB
buffer pool works, and taking
advantage of it to keep frequently accessed data in memory, is
an important aspect of MySQL tuning.
Guidelines
Ideally, you set the size of the buffer pool to as large a value
as practical, leaving enough memory for other processes on the
server to run without excessive paging. The larger the buffer
pool, the more InnoDB
acts like an in-memory
database, reading data from disk once and then accessing the
data from memory during subsequent reads. The buffer pool even
caches data changed by insert and update operations, so that
disk writes can be grouped together for better performance.
Depending on the typical workload on your system, you might adjust the proportions of the parts within the buffer pool. You can tune the way the buffer pool chooses which blocks to cache once it fills up, to keep frequently accessed data in memory despite sudden spikes of activity for operations such as backups or reporting.
With 64-bit systems with large memory sizes, you can split the buffer pool into multiple parts, to minimize contention for the memory structures among concurrent operations. For details, see Section 13.7.7.18, “Improvements to Performance from Multiple Buffer Pools”.
Internal Details
InnoDB
manages the pool as a list,
using a variation of the least recently used (LRU) algorithm.
midpoint insertion strategy. When room is needed to add a new
block to the pool, InnoDB
evicts
the least recently used block and adds the new block to the
middle of the list. This “midpoint insertion
strategy” treats the list as two sublists:
At the head, a sublist of “new” (or “young”) blocks that were accessed recently.
At the tail, a sublist of “old” blocks that were accessed less recently.
This algorithm keeps blocks that are heavily used by queries in the new sublist. The old sublist contains less-used blocks; these blocks are candidates for eviction.
The LRU algorithm operates as follows by default:
3/8 of the buffer pool is devoted to the old sublist.
The midpoint of the list is the boundary where the tail of the new sublist meets the head of the old sublist.
When
InnoDB
reads a block into the buffer pool, it initially inserts it at the midpoint (the head of the old sublist). A block can be read in because it is required for a user-specified operation such as a SQL query, or as part of a read-ahead operation performed automatically byInnoDB
.Accessing to a block in the old sublist makes it “young”, moving it to the head of the buffer pool (the head of the new sublist). If the block was read in because it was required, the first access occurs immediately and the block is made young. If the block was read in due to read-ahead, the first access does not occur immediately (and might not occur at all before the block is evicted).
As the database operates, blocks in the buffer pool that are not accessed “age” by moving toward the tail of the list. Blocks in both the new and old sublists age as other blocks are made new. Blocks in the old sublist also age as blocks are inserted at the midpoint. Eventually, a block that remains unused for long enough reaches the tail of the old sublist and is evicted.
By default, blocks read by queries immediately move into the new
sublist, meaning they will stay in the buffer pool for a long
time. A table scan (such as performed for a
mysqldump operation, or a
SELECT
statement with no
WHERE
clause) can bring a large amount of
data into the buffer pool and evict an equivalent amount of
older data, even if the new data is never used again. Similarly,
blocks that are loaded by the read-ahead background thread and
then accessed only once move to the head of the new list. These
situations can push frequently used blocks to the old sublist,
where they become subject to eviction.
Configuration Options
Several InnoDB
system variables
control the size of the buffer pool and let you tune the LRU
algorithm:
Specifies the size of the buffer pool. If your buffer pool is small and you have sufficient memory, making the pool larger can improve performance by reducing the amount of disk I/O needed as queries access
InnoDB
tables.innodb_buffer_pool_instances
Divides the buffer pool into a user-specified number of separate regions, each with its own LRU list and related data structures, to reduce contention during concurrent memory read and write operations. The size specified by
innodb_buffer_pool_size
is divided among all the buffer pool instances. For best efficiency, specify a combination ofinnodb_buffer_pool_instances
andinnodb_buffer_pool_size
so that each buffer pool instance is at least 1 gigabyte.Specifies the approximate percentage of the buffer pool that
InnoDB
uses for the old block sublist. The range of values is 5 to 95. The default value is 37 (that is, 3/8 of the pool).Specifies how long in milliseconds (ms) a block inserted into the old sublist must stay there after its first access before it can be moved to the new sublist. The default value is 0: A block inserted into the old sublist moves immediately to the new sublist the first time it is accessed, no matter how soon after insertion the access occurs. If the value is greater than 0, blocks remain in the old sublist until an access occurs at least that many ms after the first access. For example, a value of 1000 causes blocks to stay in the old sublist for 1 second after the first access before they become eligible to move to the new sublist.
Setting innodb_old_blocks_time
greater than 0 prevents one-time table scans from flooding the
new sublist with blocks used only for the scan. Rows in a block
read in for a scan are accessed many times in rapid succession,
but the block is unused after that. If
innodb_old_blocks_time
is set
to a value greater than time to process the block, the block
remains in the “old” sublist and ages to the tail
of the list to be evicted quickly. This way, blocks used only
for a one-time scan do not act to the detriment of heavily used
blocks in the new sublist.
innodb_old_blocks_time
can be
set at runtime, so you can change it temporarily while
performing operations such as table scans and dumps:
SET GLOBAL innodb_old_blocks_time = 1000;
... perform queries that scan tables ...
SET GLOBAL innodb_old_blocks_time = 0;
This strategy does not apply if your intent is to “warm
up” the buffer pool by filling it with a table's content.
For example, benchmark tests often perform a table or index scan
at server startup, because that data would normally be in the
buffer pool after a period of normal use. In this case, leave
innodb_old_blocks_time
set to
0, at least until the warmup phase is complete.
Monitoring the Buffer Pool
The output from the InnoDB Standard Monitor contains several new
fields in the BUFFER POOL AND MEMORY
section
that pertain to operation of the buffer pool LRU algorithm:
Old database pages
: The number of pages in the old sublist of the buffer pool.Pages made young, not young
: The number of old pages that were moved to the head of the buffer pool (the new sublist), and the number of pages that have remained in the old sublist without being made new.youngs/s non-youngs/s
: The number of accesses to old pages that have resulted in making them young or not. This metric differs from that of the previous item in two ways. First, it relates only to old pages. Second, it is based on number of accesses to pages and not the number of pages. (There can be multiple accesses to a given page, all of which are counted.)young-making rate
: Hits that cause blocks to move to the head of the buffer pool.not
: Hits that do not cause blocks to move to the head of the buffer pool (due to the delay not being met).
The young-making
rate and
not
rate will not normally add up to the
overall buffer pool hit rate. Hits for blocks in the old sublist
cause them to move to the new sublist, but hits to blocks in the
new sublist cause them to move to the head of the list only if
they are a certain distance from the head.
The preceding information from the Monitor can help you make LRU tuning decisions:
If you see very low
youngs/s
values when you do not have large scans going on, that indicates that you might need to either reduce the delay time, or increase the percentage of the buffer pool used for the old sublist. Increasing the percentage makes the old sublist larger, so blocks in that sublist take longer to move to the tail and be evicted. This increases the likelihood that they will be accessed again and be made young.If you do not see a lot of
non-youngs/s
when you are doing large table scans (and lots ofyoungs/s
), to tune your delay value to be larger.
For more information about InnoDB Monitors, see
Section 13.6.14.2, “SHOW ENGINE INNODB
STATUS
and the InnoDB
Monitors”.