Rather than using a strictly LRU algorithm, InnoDB uses a technique to minimize the amount of data that is brought into the buffer cache and never accessed again. The goal is to make sure that frequently accessed (“hot”) pages remain in the buffer cache, even as read-ahead and full table scans bring new blocks in that might or might not be accessed afterward.
Newly read blocks are inserted into the middle of the list
representing the buffer cache. of the LRU list. All newly read
pages are inserted at a location that by default is
3/8
from the tail of the LRU list. The pages
are moved to the front of the list (the most-recently used end)
when they are accessed in the buffer cache for the first time.
Thus pages that are never accessed never make it to the front
portion of the LRU list, and “age out” sooner than
with a strict LRU approach. This arrangement divides the LRU list
into two segments, where the pages downstream of the insertion
point are considered “old” and are desirable victims
for LRU eviction.
For an explanation of the inner workings of the InnoDB buffer pool
and the specifics of its LRU replacement algorithm, see
Section 7.9.1, “The InnoDB
Buffer Pool”.
Starting with InnoDB 1.0.5, you can control the insertion point in the LRU list, and choose whether InnoDB applies the same optimization to blocks brought into the buffer pool by table or index scans.
The configuration parameter
innodb_old_blocks_pct
controls the percentage of “old” blocks in the LRU
list. The default value of
innodb_old_blocks_pct
is
37
, corresponding to the original fixed ratio
of 3/8. The value range is 5
(new pages in the
buffer pool age out very quickly) to 95
(only
5% of the buffer pool reserved for hot pages, making the algorithm
close to the familiar LRU strategy).
The optimization that keeps the buffer cache from being churned by
read-ahead can avoid similar problems due to table or index scans.
In these scans, a data page is typically accessed a few times in
quick succession and is never touched again. The configuration
parameter innodb_old_blocks_time
specifies the time window (in milliseconds) after the first access
to a page during which it can be accessed without being moved to
the front (most-recently used end) of the LRU list. The default
value of innodb_old_blocks_time
is 0
, corresponding to the original behavior of
moving a page to the most-recently used end of the buffer pool
list when it is first accessed in the buffer pool. Increasing this
value makes more and more blocks likely to age out faster from the
buffer pool.
Both the new parameters
innodb_old_blocks_pct
and
innodb_old_blocks_time
are
dynamic, global and can be specified in the MySQL option file
(my.cnf
or my.ini
) or
changed at runtime with the SET GLOBAL
command.
Changing the setting requires the SUPER
privilege.
To help you gauge the effect of setting these parameters, some
additional statistics are reported by SHOW ENGINE INNODB
STATUS
command. The BUFFER POOL AND
MEMORY
section now looks like:
Total memory allocated 1107296256; in additional pool allocated 0 Dictionary memory allocated 80360 Buffer pool size 65535 Free buffers 0 Database pages 63920 Old database pages 23600 Modified db pages 34969 Pending reads 32 Pending writes: LRU 0, flush list 0, single page 0 Pages made young 414946, not young 2930673 1274.75 youngs/s, 16521.90 non-youngs/s Pages read 486005, created 3178, written 160585 2132.37 reads/s, 3.40 creates/s, 323.74 writes/s Buffer pool hit rate 950 / 1000, young-making rate 30 / 1000 not 392 / 1000 Pages read ahead 1510.10/s, evicted without access 0.00/s LRU len: 63920, unzip_LRU len: 0 I/O sum[43690]:cur[221], unzip sum[0]:cur[0]
Old database pages
is the number of pages in the “old” segment of the LRU list.Pages made young
andnot young
is the total number of “old” pages that have been made young or not respectively.youngs/s
andnon-young/s
is the rate at which page accesses to the “old” pages have resulted in making such pages young or otherwise respectively since the last invocation of the command.young-making rate
andnot
provides the same rate but in terms of overall buffer cache accesses instead of accesses just to the “old” pages.
Because the effects of these parameters can vary widely based on your hardware configuration, your data, and the details of your workload, always benchmark to verify the effectiveness before changing these settings in any performance-critical or production environment.
In mixed workloads where most of the activity is OLTP type with
periodic batch reporting queries which result in large scans,
setting the value of
innodb_old_blocks_time
during the batch runs can help keep the working set of the normal
workload in the buffer cache.
When scanning large tables that cannot fit entirely in the buffer
pool, setting
innodb_old_blocks_pct
to a
small value keeps the data that is only read once from consuming a
significant portion of the buffer pool. For example, setting
innodb_old_blocks_pct=5
restricts this data
that is only read once to 5% of the buffer pool.
When scanning small tables that do fit into memory, there is less
overhead for moving pages around within the buffer pool, so you
can leave innodb_old_blocks_pct
at its default value, or even higher, such as
innodb_old_blocks_pct=50
.
The effect of the
innodb_old_blocks_time
parameter is harder to predict than the
innodb_old_blocks_pct
parameter, is relatively small, and varies more with the workload.
To arrive at an optimal value, conduct your own benchmarks if the
performance improvement from adjusting
innodb_old_blocks_pct
is
not sufficient.
For more information about the InnoDB buffer pool, see
Section 7.9.1, “The InnoDB
Buffer Pool”.