7.2.1.4. How to Avoid Table Scans

Frequently, a table scan is a danger sign that a query can be speeded up significantly. For tables with more than a few rows, consider redesigning the query by adding an index for one or more of the columns tested in the WHERE clause. Put extra effort into avoiding table scans for queries that perform joins or reference foreign keys. If the nature of the data means there is no way to avoid reading all the rows, then it might not be practical to make the query faster, or making it faster might involve extensive restructuring of your tables that is beyond the scope of this section.

The output from EXPLAIN shows ALL in the type column when MySQL uses a table scan to resolve a query. This usually happens under the following conditions:

  • The ON or WHERE clauses do not reference any indexed columns that the query can use. Consider adding an index, or refining those clauses to refer to an indexed column.

  • The table is so small that it is faster to perform a table scan than to bother with a key lookup. This is common for tables with fewer than 10 rows and a short row length. Don't worry in this case.

  • You are comparing indexed columns with constant values and MySQL has calculated (based on the index tree) that the constants cover too large a part of the table and that a table scan would be faster. See Section 7.2.1.2, “How MySQL Optimizes WHERE Clauses”. For example, to query census data only for males or only for females, MySQL must read most of the data blocks in the table, so locating the rows through the index would add unnecessary overhead. Don't worry if you encounter this condition for occasional big reports. If these reports are frequent or truly time-critical, and the table is huge, you might partition, shard, or create dimension tables using the relevant column.

  • You are using a key with low cardinality (many rows match the key value) through another column. In this case, MySQL assumes that by using the key it probably will do many key lookups and that a table scan would be faster.

For small tables, a table scan often is appropriate and the performance impact is negligible. For large tables, try the following techniques to avoid having the optimizer incorrectly choose a table scan:

  • Minimize the OR keywords in your WHERE clauses. If there is no index that helps to locate the values on both sides of the OR, any row could potentially be part of the result set, so all rows must be tested, and that requires a full table scan. If you have one index that helps to optimize one side of an OR query, and a different index that helps to optimize the other side, use a UNION operator to run separate fast queries and merge the results afterward.

  • With tables that use the MEMORY storage engine, if you run queries that examine ranges of values (using operators such as >, <=, or BETWEEN on the indexed columns), create the index with the USING BTREE clause. The default (USING HASH) is fast for retrieving individual rows with an equality operator (= or <=>), but is much slower (requiring a full table scan) to examine a range of column values. A MEMORY table created with the USING BTREE clause is still fast for equality comparisons, so use that clause for your MEMORY tables that handle a variety of queries.

  • Use ANALYZE TABLE tbl_name to update the key distributions for the scanned table. See Section 12.4.2.1, “ANALYZE TABLE Syntax”.

  • Use FORCE INDEX for the scanned table to tell MySQL that table scans are very expensive compared to using the given index:

    SELECT * FROM t1, t2 FORCE INDEX (index_for_column)
      WHERE t1.col_name=t2.col_name;
    

    See Section 12.2.9.2, “Index Hint Syntax”.

  • Start mysqld with the --max-seeks-for-key=1000 option or use SET max_seeks_for_key=1000 to tell the optimizer to assume that no key scan causes more than 1,000 key seeks. See Section 5.1.4, “Server System Variables”.

Copyright © 2010-2024 Platon Technologies, s.r.o.           Home | Man pages | tLDP | Documents | Utilities | About
Design by styleshout