Updating InnoDB Table Statistics Manually

In this post, we will discuss how to fix cardinality for InnoDB tables manually.

As a support engineer, I often see situations when the cardinality of a table is not correct. When InnoDB calculates the cardinality of an index, it does not scan the full table by default. Instead, it looks at random pages, as determined by the options innodb_stats_sample_pages, innodb_stats_transient_sample_pages and innodb_stats_persistent_sample_pages, or by the CREATE TABLE option STATS_SAMPLE_PAGES. The default value for persistent statistics is 20. This approach works fine when the number of unique values in your secondary key grows in step with the size of the table. But what if you have a column that has a comparatively small number of unique values? This could be a common service, many-to-many relationship table, for example, or just a table containing a list of sell orders that belong to one of a dozen shops owned by the company. Such tables could grow up to billions of rows with a small (less than 100) number of unique shop IDs.

At some point, InnoDB will report the wrong values for such indexes. Really! If 20 pages have 100 unique shop IDs, how many unique shop IDs would 20000 pages have? 100 times 1000? This seems logical, and after a certain number of rows such indexes will have extraordinarily large cardinality values.
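The arithmetic behind that question can be written out as a short Python sketch. It is a deliberately simplified model of linear extrapolation from a page sample, not InnoDB's actual estimator, and the function name is hypothetical.

```python
def naive_extrapolated_cardinality(distinct_in_sample, sampled_pages, total_pages):
    """Scale the distinct count seen in a page sample linearly to the whole index.

    Simplified illustration only: it assumes every unsampled page contributes
    new distinct values at the same rate as the sampled ones.
    """
    return distinct_in_sample * total_pages // sampled_pages


# 20 sampled pages contain 100 distinct shop IDs; the index has 20000 pages.
estimate = naive_extrapolated_cardinality(100, 20, 20000)
print(estimate)
```

The real number of shop IDs stays at 100 no matter how many pages the index has, so the gap between the estimate and reality widens as the table grows.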

ANALYZE TABLE will not help, because it uses the same algorithm. Increasing the number of “stats” sample pages would help, but it has its own downside: the more pages you have to examine, the slower ANALYZE TABLE runs. While this command is not blocking, it still creates side effects as described in this blog post. And the longer it runs, the less control you have.

Another issue with InnoDB statistics: even if they are persistent and STATS_AUTO_RECALC is set to 0, InnoDB still adds values for secondary indexes, as shown in lp:1538765. Eventually, after you insert millions of rows, your statistics get corrupted. ANALYZE TABLE can fix it only if you specify a very large number of “stats” sample pages.

Can we do anything about it?

InnoDB stores statistics in the “mysql” database, in the tables innodb_table_stats and innodb_index_stats. Since they are regular MySQL tables, privileged users can access them. We can update them and modify statistics as we like. And these statistics are used by the Optimizer!

I created a small example showing how to do this trick. I used Percona Server for MySQL version 5.7.19, but the trick will work on any supported MySQL and Percona Server for MySQL version.

First, let’s create test tables. The first table has shops, with a few shop profiles with the shop ID and name:

create table shops(
  shop_id int not null auto_increment primary key,
  name varchar(32)
) engine=innodb;

The second table refers to the “shops” table:

create table goods(
  id int not null auto_increment primary key,
  shop_id int not null,
  name varchar(32),
  create_date datetime DEFAULT NULL,
  key (shop_id, create_date)
) engine=innodb;

Let’s check how many unique shops we have:

mysql> select count(distinct shop_id) from shops;
+-------------------------+
| count(distinct shop_id) |
+-------------------------+
| 100                     |
+-------------------------+
1 row in set (0.02 sec)

With 100 distinct shops, and a key on (shop_id, create_date), we expect cardinality in table goods to be not much different than this query result:
mysql> select count(distinct id) as `Cardinality for PRIMARY`,
    -> count(distinct shop_id) as `Cardinality for shop_id column in index shop_id`,
    -> count(distinct shop_id, create_date) as `Cardinality for create_date column in index shop_id`
    -> from goods\G
*************************** 1. row ***************************
Cardinality for PRIMARY: 8000000
Cardinality for shop_id column in index shop_id: 100
Cardinality for create_date column in index shop_id: 169861
1 row in set (2 min 8.74 sec)

However, SHOW INDEX returns dramatically different values for the column shop_id:
mysql> show index from goods;
+-------+------------+----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| Table | Non_unique | Key_name | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment | Index_comment |
+-------+------------+----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| goods |          0 |  PRIMARY |            1 |          id |         A |     7289724 |     NULL |   NULL |      |      BTREE |         |               |
| goods |          1 |  shop_id |            1 |     shop_id |         A |       13587 |     NULL |   NULL |      |      BTREE |         |               |
| goods |          1 |  shop_id |            2 | create_date |         A |      178787 |     NULL |   NULL |  YES |      BTREE |         |               |
+-------+------------+----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
3 rows in set (0.09 sec)

ANALYZE TABLE does not help:
mysql> analyze table goods;
+------------+---------+----------+----------+
|      Table |      Op | Msg_type | Msg_text |
+------------+---------+----------+----------+
| test.goods | analyze |   status |       OK |
+------------+---------+----------+----------+
1 row in set (0.88 sec)
mysql> show index from goods;
+-------+------------+----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| Table | Non_unique | Key_name | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment | Index_comment |
+-------+------------+----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| goods |          0 |  PRIMARY |            1 |          id |         A |     7765796 |     NULL |   NULL |      |      BTREE |         |               |
| goods |          1 |  shop_id |            1 |     shop_id |         A |       14523 |     NULL |   NULL |      |      BTREE |         |               |
| goods |          1 |  shop_id |            2 | create_date |         A |      168168 |     NULL |   NULL |  YES |      BTREE |         |               |
+-------+------------+----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
3 rows in set (0.00 sec)

As a result, if we join the two tables, the Optimizer chooses the wrong JOIN order and query execution plan:
mysql> explain select goods.* from goods join shops using(shop_id) where create_date BETWEEN CONVERT_TZ('2015-11-01 00:00:00', 'MET','GMT') AND CONVERT_TZ('2015-11-07 23:59:59', 'MET','GMT') and goods.shop_id in(4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31,32,33,34,35,36,37,38,39,40,41,42,43,44,45,46,47,48,49,50,51,52,53,54,55,56,57,58,59,60,61,62,63,64,65,66,67,68,69,70,71,72,73,74,75,76,77,78,79,80,81,82,83,84,85,86,87,88,89,90,91,92,93,94,95,96,97,98,99,100,101,102,103,104,105,106,107,108,109,110,111,112,113,114,115,116,117,401,402,403,404,405,406,407,408,409,410,411,412,413,414,415,417,418,419,420,421,422,423,424,425,426,427,428,429,430,431,432,433,434,435,436,437,438,439,440,441,442,443,444,445,446,447,448,449,450,451,452,453,454,455,456,457,458,459,460,461,462,463,464,465,466,467,468,470,471,472,473,474,475,476,477,478,479,480,481,482,483,484,485,486);
+----+-------------+-------+------------+-------+---------------+---------+---------+--------------------+------+----------+--------------------------+
| id | select_type | table | partitions |  type | possible_keys |     key | key_len |                ref | rows | filtered |                    Extra |
+----+-------------+-------+------------+-------+---------------+---------+---------+--------------------+------+----------+--------------------------+
|  1 |      SIMPLE | shops |       NULL | index |       PRIMARY | PRIMARY |       4 |               NULL |  100 |   100.00 | Using where; Using index |
|  1 |      SIMPLE | goods |       NULL |   ref |       shop_id | shop_id |       4 | test.shops.shop_id |  534 |    11.11 |    Using index condition |
+----+-------------+-------+------------+-------+---------------+---------+---------+--------------------+------+----------+--------------------------+
2 rows in set, 1 warning (0.13 sec)
mysql> \P md5sum
PAGER set to 'md5sum'
mysql> select goods.* from goods join shops using(shop_id) where create_date BETWEEN CONVERT_TZ('2015-11-01 00:00:00', 'MET','GMT') AND CONVERT_TZ('2015-11-07 23:59:59', 'MET','GMT') and goods.shop_id in(4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31,32,33,34,35,36,37,38,39,40,41,42,43,44,45,46,47,48,49,50,51,52,53,54,55,56,57,58,59,60,61,62,63,64,65,66,67,68,69,70,71,72,73,74,75,76,77,78,79,80,81,82,83,84,85,86,87,88,89,90,91,92,93,94,95,96,97,98,99,100,101,102,103,104,105,106,107,108,109,110,111,112,113,114,115,116,117,401,402,403,404,405,406,407,408,409,410,411,412,413,414,415,417,418,419,420,421,422,423,424,425,426,427,428,429,430,431,432,433,434,435,436,437,438,439,440,441,442,443,444,445,446,447,448,449,450,451,452,453,454,455,456,457,458,459,460,461,462,463,464,465,466,467,468,470,471,472,473,474,475,476,477,478,479,480,481,482,483,484,485,486);
4a94dabc4bfbfb7dd225bcb50278055b -
31896 rows in set (43.32 sec)
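The estimate of 534 rows per shop in the plan above is consistent with dividing the table's estimated row count by the reported cardinality of shop_id. The Python sketch below reproduces that arithmetic with the numbers from SHOW INDEX. It is a simplified model of a records-per-key estimate under that assumption, and the function name is hypothetical, not a server API.

```python
def rows_per_key(table_rows, index_cardinality):
    """Average number of rows expected for one value of the indexed column."""
    return table_rows / index_cardinality


# Sampled statistics after ANALYZE TABLE: 7765796 rows, 14523 "distinct" shop IDs.
reported = rows_per_key(7765796, 14523)

# Real values measured with COUNT(DISTINCT ...): 8000000 rows, 100 shop IDs.
actual = rows_per_key(8000000, 100)

print(int(reported), int(actual))
```

With the inflated cardinality, a lookup by shop_id looks cheap (hundreds of rows), while in reality each shop owns tens of thousands of rows. That makes driving the join from shops look far cheaper than it really is.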

Compare this to the STRAIGHT_JOIN order:
mysql> explain select goods.* from goods straight_join shops on(goods.shop_id = shops.shop_id) where create_date BETWEEN CONVERT_TZ('2015-11-01 00:00:00', 'MET','GMT') AND CONVERT_TZ('2015-11-07 23:59:59', 'MET','GMT') and goods.shop_id in(4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31,32,33,34,35,36,37,38,39,40,41,42,43,44,45,46,47,48,49,50,51,52,53,54,55,56,57,58,59,60,61,62,63,64,65,66,67,68,69,70,71,72,73,74,75,76,77,78,79,80,81,82,83,84,85,86,87,88,89,90,91,92,93,94,95,96,97,98,99,100,101,102,103,104,105,106,107,108,109,110,111,112,113,114,115,116,117,401,402,403,404,405,406,407,408,409,410,411,412,413,414,415,417,418,419,420,421,422,423,424,425,426,427,428,429,430,431,432,433,434,435,436,437,438,439,440,441,442,443,444,445,446,447,448,449,450,451,452,453,454,455,456,457,458,459,460,461,462,463,464,465,466,467,468,470,471,472,473,474,475,476,477,478,479,480,481,482,483,484,485,486);
+----+-------------+-------+------------+--------+---------------+---------+---------+--------------------+-------+----------+-----------------------+
| id | select_type | table | partitions |   type | possible_keys |     key | key_len |                ref |  rows | filtered |                 Extra |
+----+-------------+-------+------------+--------+---------------+---------+---------+--------------------+-------+----------+-----------------------+
|  1 |      SIMPLE | goods |       NULL |  range |       shop_id | shop_id |      10 |               NULL | 31997 |   100.00 | Using index condition |
|  1 |      SIMPLE | shops |       NULL | eq_ref |       PRIMARY | PRIMARY |       4 | test.goods.shop_id |     1 |   100.00 |           Using index |
+----+-------------+-------+------------+--------+---------------+---------+---------+--------------------+-------+----------+-----------------------+
2 rows in set, 1 warning (0.14 sec)
mysql> \P md5sum
PAGER set to 'md5sum'
mysql> select goods.* from goods straight_join shops on(goods.shop_id = shops.shop_id) where create_date BETWEEN CONVERT_TZ('2015-11-01 00:00:00', 'MET','GMT') AND CONVERT_TZ('2015-11-07 23:59:59', 'MET','GMT') and goods.shop_id in(4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31,32,33,34,35,36,37,38,39,40,41,42,43,44,45,46,47,48,49,50,51,52,53,54,55,56,57,58,59,60,61,62,63,64,65,66,67,68,69,70,71,72,73,74,75,76,77,78,79,80,81,82,83,84,85,86,87,88,89,90,91,92,93,94,95,96,97,98,99,100,101,102,103,104,105,106,107,108,109,110,111,112,113,114,115,116,117,401,402,403,404,405,406,407,408,409,410,411,412,413,414,415,417,418,419,420,421,422,423,424,425,426,427,428,429,430,431,432,433,434,435,436,437,438,439,440,441,442,443,444,445,446,447,448,449,450,451,452,453,454,455,456,457,458,459,460,461,462,463,464,465,466,467,468,470,471,472,473,474,475,476,477,478,479,480,481,482,483,484,485,486);
4a94dabc4bfbfb7dd225bcb50278055b -
31896 rows in set (7.94 sec)

Even for a small 8M-row table, the query with the wrong plan is about 5.5 times slower (43.32 seconds versus 7.94 seconds)! For a big table with many columns, the difference would be even larger.

Is STRAIGHT_JOIN the only solution for this case?

No! It’s also not a great solution because if the query is complicated and involves more than two tables, it may be affected by bug fixes and improvements in the Optimizer code. Then the query order might not be optimal for new versions and updates. Therefore, you’ll need to test such queries at each upgrade, including minor ones.

So why does ANALYZE TABLE not work? Because the default number of pages it uses to calculate statistics is too small for this data distribution. You can increase the table option STATS_SAMPLE_PAGES until you find a proper value. The drawback is that the greater you set STATS_SAMPLE_PAGES, the longer it takes for ANALYZE TABLE to finish. Also, if you update a large portion of the table, you are often affected by lp:1538765. At some point, the statistics will again be inaccurate.

Now let’s try our manual statistics update trick

InnoDB stores its persistent statistics in the tables mysql.innodb_table_stats and mysql.innodb_index_stats:
mysql> alter table goods stats_persistent=1, stats_auto_recalc=0;
Query OK, 0 rows affected (0.11 sec)
Records: 0 Duplicates: 0 Warnings: 0
mysql> select * from mysql.innodb_table_stats where table_name='goods';
+---------------+------------+---------------------+---------+----------------------+--------------------------+
| database_name | table_name |         last_update |  n_rows | clustered_index_size | sum_of_other_index_sizes |
+---------------+------------+---------------------+---------+----------------------+--------------------------+
|          test |      goods | 2017-09-05 00:21:12 | 7765796 |                34624 |                    17600 |
+---------------+------------+---------------------+---------+----------------------+--------------------------+
1 row in set (0.00 sec)
mysql> select * from mysql.innodb_index_stats where table_name='goods';
+---------------+------------+------------+---------------------+--------------+------------+-------------+-----------------------------------+
| database_name | table_name | index_name |         last_update |    stat_name | stat_value | sample_size |                  stat_description |
+---------------+------------+------------+---------------------+--------------+------------+-------------+-----------------------------------+
|          test |      goods |    PRIMARY | 2017-09-05 00:21:12 | n_diff_pfx01 |    7765796 |          20 |                                id |
|          test |      goods |    PRIMARY | 2017-09-05 00:21:12 | n_leaf_pages |      34484 |        NULL | Number of leaf pages in the index |
|          test |      goods |    PRIMARY | 2017-09-05 00:21:12 |         size |      34624 |        NULL |      Number of pages in the index |
|          test |      goods |    shop_id | 2017-09-05 00:21:12 | n_diff_pfx01 |      14523 |          20 |                           shop_id |
|          test |      goods |    shop_id | 2017-09-05 00:21:12 | n_diff_pfx02 |     168168 |          20 |               shop_id,create_date |
|          test |      goods |    shop_id | 2017-09-05 00:21:12 | n_diff_pfx03 |    8045310 |          20 |            shop_id,create_date,id |
|          test |      goods |    shop_id | 2017-09-05 00:21:12 | n_leaf_pages |      15288 |        NULL | Number of leaf pages in the index |
|          test |      goods |    shop_id | 2017-09-05 00:21:12 |         size |      17600 |        NULL |      Number of pages in the index |
+---------------+------------+------------+---------------------+--------------+------------+-------------+-----------------------------------+
8 rows in set (0.00 sec)

And we can update these tables directly:

mysql> update mysql.innodb_table_stats set n_rows=8000000 where table_name='goods';
Query OK, 1 row affected (0.18 sec)
Rows matched: 1 Changed: 1 Warnings: 0
mysql> update mysql.innodb_index_stats set stat_value=8000000 where stat_description in('id', 'shop_id,create_date,id') and table_name='goods';
Query OK, 2 rows affected (0.08 sec)
Rows matched: 2 Changed: 2 Warnings: 0
mysql> update mysql.innodb_index_stats set stat_value=100 where stat_description in('shop_id') and table_name='goods';
Query OK, 1 row affected (0.09 sec)
Rows matched: 1 Changed: 1 Warnings: 0
mysql> update mysql.innodb_index_stats set stat_value=169861 where stat_description in('shop_id,create_date') and table_name='goods';
Query OK, 1 row affected (0.08 sec)
Rows matched: 1 Changed: 1 Warnings: 0
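If you need to repeat these updates, they can be generated from the measured counts. The Python sketch below is one way to do that and is not part of the original procedure. The function name is hypothetical, and it makes two choices of its own: it matches rows by index_name and stat_name instead of stat_description, and it also filters by database_name so that a table with the same name in another schema is left alone. It only builds statements with %s placeholders, as accepted by common Python MySQL drivers, and does not connect to a server.

```python
def build_stats_updates(database, table, n_rows, n_diff):
    """Return (sql, params) pairs that overwrite persistent InnoDB statistics.

    n_diff maps (index_name, stat_name) to the measured distinct count,
    for example ("shop_id", "n_diff_pfx01") -> 100.
    """
    statements = [
        (
            "UPDATE mysql.innodb_table_stats SET n_rows = %s "
            "WHERE database_name = %s AND table_name = %s",
            (n_rows, database, table),
        )
    ]
    # Sort for a stable, reviewable order of statements.
    for (index_name, stat_name), value in sorted(n_diff.items()):
        statements.append(
            (
                "UPDATE mysql.innodb_index_stats SET stat_value = %s "
                "WHERE database_name = %s AND table_name = %s "
                "AND index_name = %s AND stat_name = %s",
                (value, database, table, index_name, stat_name),
            )
        )
    return statements


# Values measured in this post with COUNT(DISTINCT ...).
updates = build_stats_updates(
    "test",
    "goods",
    8000000,
    {
        ("PRIMARY", "n_diff_pfx01"): 8000000,
        ("shop_id", "n_diff_pfx01"): 100,
        ("shop_id", "n_diff_pfx02"): 169861,
        ("shop_id", "n_diff_pfx03"): 8000000,
    },
)
for sql, params in updates:
    print(sql, params)
```

Each pair can be passed to a driver's cursor.execute(sql, params). Printing the pairs first lets you review them before anything touches the mysql schema.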

I took index values from earlier, as calculated by this query:

select count(distinct id) as `Cardinality for PRIMARY`, count(distinct shop_id) as `Cardinality for shop_id column in index shop_id`, count(distinct shop_id, create_date) as `Cardinality for create_date column in index shop_id` from goods;

mysql> select * from mysql.innodb_table_stats where table_name='goods';
+---------------+------------+---------------------+---------+----------------------+--------------------------+
| database_name | table_name |         last_update |  n_rows | clustered_index_size | sum_of_other_index_sizes |
+---------------+------------+---------------------+---------+----------------------+--------------------------+
|          test |      goods | 2017-09-05 00:47:45 | 8000000 |                34624 |                    17600 |
+---------------+------------+---------------------+---------+----------------------+--------------------------+
1 row in set (0.00 sec)
mysql> select * from mysql.innodb_index_stats where table_name='goods';
+---------------+------------+------------+---------------------+--------------+------------+-------------+-----------------------------------+
| database_name | table_name | index_name |         last_update |    stat_name | stat_value | sample_size |                  stat_description |
+---------------+------------+------------+---------------------+--------------+------------+-------------+-----------------------------------+
|          test |      goods |    PRIMARY | 2017-09-05 00:48:32 | n_diff_pfx01 |    8000000 |          20 |                                id |
|          test |      goods |    PRIMARY | 2017-09-05 00:21:12 | n_leaf_pages |      34484 |        NULL | Number of leaf pages in the index |
|          test |      goods |    PRIMARY | 2017-09-05 00:21:12 |         size |      34624 |        NULL |      Number of pages in the index |
|          test |      goods |    shop_id | 2017-09-05 00:49:13 | n_diff_pfx01 |        100 |          20 |                           shop_id |
|          test |      goods |    shop_id | 2017-09-05 00:49:26 | n_diff_pfx02 |     169861 |          20 |               shop_id,create_date |
|          test |      goods |    shop_id | 2017-09-05 00:48:32 | n_diff_pfx03 |    8000000 |          20 |            shop_id,create_date,id |
|          test |      goods |    shop_id | 2017-09-05 00:21:12 | n_leaf_pages |      15288 |        NULL | Number of leaf pages in the index |
|          test |      goods |    shop_id | 2017-09-05 00:21:12 |         size |      17600 |        NULL |      Number of pages in the index |
+---------------+------------+------------+---------------------+--------------+------------+-------------+-----------------------------------+
8 rows in set (0.00 sec)

Now the statistics are up to date, but not used:

mysql> explain select goods.* from goods join shops using(shop_id) where create_date BETWEEN CONVERT_TZ('2015-11-01 00:00:00', 'MET','GMT') AND CONVERT_TZ('2015-11-07 23:59:59', 'MET','GMT') and goods.shop_id in(4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31,32,33,34,35,36,37,38,39,40,41,42,43,44,45,46,47,48,49,50,51,52,53,54,55,56,57,58,59,60,61,62,63,64,65,66,67,68,69,70,71,72,73,74,75,76,77,78,79,80,81,82,83,84,85,86,87,88,89,90,91,92,93,94,95,96,97,98,99,100,101,102,103,104,105,106,107,108,109,110,111,112,113,114,115,116,117,401,402,403,404,405,406,407,408,409,410,411,412,413,414,415,417,418,419,420,421,422,423,424,425,426,427,428,429,430,431,432,433,434,435,436,437,438,439,440,441,442,443,444,445,446,447,448,449,450,451,452,453,454,455,456,457,458,459,460,461,462,463,464,465,466,467,468,470,471,472,473,474,475,476,477,478,479,480,481,482,483,484,485,486);
+----+-------------+-------+------------+-------+---------------+---------+---------+--------------------+------+----------+--------------------------+
| id | select_type | table | partitions |  type | possible_keys |     key | key_len |                ref | rows | filtered |                    Extra |
+----+-------------+-------+------------+-------+---------------+---------+---------+--------------------+------+----------+--------------------------+
|  1 |      SIMPLE | shops |       NULL | index |       PRIMARY | PRIMARY |       4 |               NULL |  100 |   100.00 | Using where; Using index |
|  1 |      SIMPLE | goods |       NULL |   ref |       shop_id | shop_id |       4 | test.shops.shop_id |  534 |    11.11 |    Using index condition |
+----+-------------+-------+------------+-------+---------------+---------+---------+--------------------+------+----------+--------------------------+
2 rows in set, 1 warning (0.04 sec)

To finalize the changes, we need to run FLUSH TABLE goods:
mysql> FLUSH TABLE goods;
Query OK, 0 rows affected (0.00 sec)
mysql> explain select goods.* from goods join shops using(shop_id) where create_date BETWEEN CONVERT_TZ('2015-11-01 00:00:00', 'MET','GMT') AND CONVERT_TZ('2015-11-07 23:59:59', 'MET','GMT') and goods.shop_id in(4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31,32,33,34,35,36,37,38,39,40,41,42,43,44,45,46,47,48,49,50,51,52,53,54,55,56,57,58,59,60,61,62,63,64,65,66,67,68,69,70,71,72,73,74,75,76,77,78,79,80,81,82,83,84,85,86,87,88,89,90,91,92,93,94,95,96,97,98,99,100,101,102,103,104,105,106,107,108,109,110,111,112,113,114,115,116,117,401,402,403,404,405,406,407,408,409,410,411,412,413,414,415,417,418,419,420,421,422,423,424,425,426,427,428,429,430,431,432,433,434,435,436,437,438,439,440,441,442,443,444,445,446,447,448,449,450,451,452,453,454,455,456,457,458,459,460,461,462,463,464,465,466,467,468,470,471,472,473,474,475,476,477,478,479,480,481,482,483,484,485,486);
+----+-------------+-------+------------+--------+---------------+---------+---------+--------------------+-------+----------+-----------------------+
| id | select_type | table | partitions |   type | possible_keys |     key | key_len |                ref |  rows | filtered |                 Extra |
+----+-------------+-------+------------+--------+---------------+---------+---------+--------------------+-------+----------+-----------------------+
|  1 |      SIMPLE | goods |       NULL |  range |       shop_id | shop_id |      10 |               NULL | 31997 |   100.00 | Using index condition |
|  1 |      SIMPLE | shops |       NULL | eq_ref |       PRIMARY | PRIMARY |       4 | test.goods.shop_id |     1 |   100.00 |           Using index |
+----+-------------+-------+------------+--------+---------------+---------+---------+--------------------+-------+----------+-----------------------+
2 rows in set, 1 warning (0.28 sec)
mysql> \P md5sum
PAGER set to 'md5sum'
mysql> select goods.* from goods join shops using(shop_id) where create_date BETWEEN CONVERT_TZ('2015-11-01 00:00:00', 'MET','GMT') AND CONVERT_TZ('2015-11-07 23:59:59', 'MET','GMT') and goods.shop_id in(4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31,32,33,34,35,36,37,38,39,40,41,42,43,44,45,46,47,48,49,50,51,52,53,54,55,56,57,58,59,60,61,62,63,64,65,66,67,68,69,70,71,72,73,74,75,76,77,78,79,80,81,82,83,84,85,86,87,88,89,90,91,92,93,94,95,96,97,98,99,100,101,102,103,104,105,106,107,108,109,110,111,112,113,114,115,116,117,401,402,403,404,405,406,407,408,409,410,411,412,413,414,415,417,418,419,420,421,422,423,424,425,426,427,428,429,430,431,432,433,434,435,436,437,438,439,440,441,442,443,444,445,446,447,448,449,450,451,452,453,454,455,456,457,458,459,460,461,462,463,464,465,466,467,468,470,471,472,473,474,475,476,477,478,479,480,481,482,483,484,485,486);
4a94dabc4bfbfb7dd225bcb50278055b -
31896 rows in set (7.79 sec)

Now everything is good.

But FLUSH TABLE is a blocking operation, right? Won’t it block queries and create a worse scenario than described for ANALYZE TABLE in this post?

At first glance this is true. But we can use the same trick Percona Toolkit uses: set lock_wait_timeout to 1 and call FLUSH in a loop. To demonstrate how it works, I use a scenario similar to the one described in the ANALYZE TABLE blog post.

First, let’s reset the statistics to ensure our FLUSH works as expected:
mysql> analyze table goods;
+------------+---------+----------+----------+
|      Table |      Op | Msg_type | Msg_text |
+------------+---------+----------+----------+
| test.goods | analyze |   status |       OK |
+------------+---------+----------+----------+
1 row in set (0.38 sec)
mysql> show indexes from goods;
+-------+------------+----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| Table | Non_unique | Key_name | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment | Index_comment |
+-------+------------+----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| goods |          0 |  PRIMARY |            1 |          id |         A |     7765796 |     NULL |   NULL |      |      BTREE |         |               |
| goods |          1 |  shop_id |            1 |     shop_id |         A |       14523 |     NULL |   NULL |      |      BTREE |         |               |
| goods |          1 |  shop_id |            2 | create_date |         A |      168168 |     NULL |   NULL |  YES |      BTREE |         |               |
+-------+------------+----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
3 rows in set (0.00 sec)

And then update the mysql.innodb_*_stats tables manually. Then check that the Optimizer still sees outdated statistics:
mysql> update mysql.innodb_table_stats set n_rows=8000000 where table_name='goods';
Query OK, 1 row affected (0.09 sec)
Rows matched: 1 Changed: 1 Warnings: 0
mysql> update mysql.innodb_index_stats set stat_value=8000000 where stat_description in('id', 'shop_id,create_date,id') and table_name='goods';
Query OK, 2 rows affected (0.09 sec)
Rows matched: 2 Changed: 2 Warnings: 0
mysql> update mysql.innodb_index_stats set stat_value=100 where stat_description in('shop_id') and table_name='goods';
Query OK, 1 row affected (0.11 sec)
Rows matched: 1 Changed: 1 Warnings: 0
mysql> update mysql.innodb_index_stats set stat_value=169861 where stat_description in('shop_id,create_date') and table_name='goods';
Query OK, 1 row affected (0.10 sec)
Rows matched: 1 Changed: 1 Warnings: 0
mysql> show indexes from goods;
+-------+------------+----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| Table | Non_unique | Key_name | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment | Index_comment |
+-------+------------+----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| goods |          0 |  PRIMARY |            1 |          id |         A |     7765796 |     NULL |   NULL |      |      BTREE |         |               |
| goods |          1 |  shop_id |            1 |     shop_id |         A |       14523 |     NULL |   NULL |      |      BTREE |         |               |
| goods |          1 |  shop_id |            2 | create_date |         A |      168168 |     NULL |   NULL |  YES |      BTREE |         |               |
+-------+------------+----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
3 rows in set (0.00 sec)

Now let’s start a long-running query in one session that blocks our FLUSH TABLE command:
mysql> select sleep(1) from goods limit 1000, 300;

And let’s run FLUSH TABLE in a loop:
sveta@Thinkie:~/build/ps-5.7/mysql-test$ until (`mysqlmtr -P13001 -e "set lock_wait_timeout=1; flush table goods;" test`); do sleep 1; done
ERROR 1205 (HY000) at line 1: Lock wait timeout exceeded; try restarting transaction
ERROR 1205 (HY000) at line 1: Lock wait timeout exceeded; try restarting transaction
ERROR 1205 (HY000) at line 1: Lock wait timeout exceeded; try restarting transaction
...

Now let’s ensure we can access the table:

mysql> select * from goods order by id limit 10;
^C

We cannot! We cannot even connect to the database where the table is stored:

sveta@Thinkie:~/build/ps-5.7/mysql-test$ mysqlmtr -P13001 test
Reading table information for completion of table and column names
You can turn off this feature to get a quicker startup with -A
^C

The reason for this is that while the FLUSH TABLE command was killed due to the metadata lock wait timeout, it also requested a table lock for flushing and blocked other incoming queries.

But we can enclose FLUSH TABLE in LOCK TABLE ... WRITE; ... UNLOCK TABLES; operations. In this case, the LOCK TABLE command gets blocked until all queries release the metadata lock on the table. Then it exclusively locks the table, FLUSH TABLE runs, and then the script immediately unlocks the table. Since closing the session causes an implicit unlock, I used a PHP one-liner to have everything in a single session:
$ php -r '
> $link = new mysqli("127.0.0.1", "root", "", "test", 13001);
> $link->query("set lock_wait_timeout=1");
> while(!$link->query("lock table goods write")) {sleep(1);}
> $link->query("flush table goods");
> $link->query("unlock tables");'

We can confirm that a parallel session can access the table:

mysql> select * from goods order by id limit 10;
+----+---------+----------------------------------+---------------------+
| id | shop_id |                             name |         create_date |
+----+---------+----------------------------------+---------------------+
|  1 |      58 | 5K0z2sHTgjWKKdryTaniQdZmjGjA9wls | 2015-09-19 00:00:00 |
|  2 |      17 | xNll02kgUTWAFURj6j5lL1zXAubG0THG | 2013-10-19 00:00:00 |
|  3 |      30 | clHX7uQopKmoTtEFH5LYBgQncsxRtTIB | 2017-08-01 00:00:00 |
|  4 |      93 | bAzoQTN98AmFjPOZs7PGfbiGfaf9Ye4b | 2013-02-24 00:00:00 |
|  5 |      20 | rQuTO5GHjP60kDbN6WoPpE2S8TtMbrVL | 2017-08-05 00:00:00 |
|  6 |      37 | WxqxA5tBHxikaKbuvbIF84H9QuaCnqQ3 | 2013-10-18 00:00:00 |
|  7 |      13 | DoYnFpQZSVV8UswBsWklgGBUc8zW9mVW | 2017-02-06 00:00:00 |
|  8 |      81 | dkNxMQyZNZuTrONEX4gxRLa0DOedatIs | 2015-07-05 00:00:00 |
|  9 |      12 | Z0t2uQ9itexpPf01KUpa7qBWlT5fBmXR | 2014-06-25 00:00:00 |
| 10 |      90 | 6urABBQyaUVVyxljvd11D3kUxbdDRPRV | 2013-10-23 00:00:00 |
+----+---------+----------------------------------+---------------------+
10 rows in set (0.00 sec)
mysql> update goods set name='test' where id=100;
Query OK, 1 row affected (0.08 sec)
Rows matched: 1 Changed: 1 Warnings: 0

After the PHP script finishes its job, statistics are corrected:

mysql> show index from goods;
+-------+------------+----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| Table | Non_unique | Key_name | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment | Index_comment |
+-------+------------+----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| goods |          0 |  PRIMARY |            1 |          id |         A |     8000000 |     NULL |   NULL |      |      BTREE |         |               |
| goods |          1 |  shop_id |            1 |     shop_id |         A |         100 |     NULL |   NULL |      |      BTREE |         |               |
| goods |          1 |  shop_id |            2 | create_date |         A |      169861 |     NULL |   NULL |  YES |      BTREE |         |               |
+-------+------------+----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
3 rows in set (0.00 sec)
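The lock, flush, unlock sequence from the PHP one-liner can also be sketched in Python. This is an illustrative rewrite under stated assumptions, not the script used in the test above. The function and its execute parameter are hypothetical: execute is any callable that runs one SQL statement on a single open session and raises an exception on failure, for example a DB-API cursor's execute method. The table name is interpolated into the SQL, so it must come from a trusted source.

```python
import time


def flush_with_short_lock(execute, table, retry_delay=1.0, max_attempts=None,
                          sleep=time.sleep):
    """Run LOCK TABLE ... WRITE, FLUSH TABLE, UNLOCK TABLES in one session.

    Retries the lock after every failure, as the PHP loop does, and returns
    the number of attempts it took to acquire the lock.
    """
    # Give up quickly on each attempt so waiting queries are not queued behind us.
    execute("SET lock_wait_timeout = 1")
    attempts = 0
    while True:
        attempts += 1
        try:
            execute(f"LOCK TABLE {table} WRITE")
            break
        except Exception:
            if max_attempts is not None and attempts >= max_attempts:
                raise
            sleep(retry_delay)
    try:
        execute(f"FLUSH TABLE {table}")
    finally:
        # Release the lock even if the flush itself fails.
        execute("UNLOCK TABLES")
    return attempts


# Dry run: print the statements instead of sending them to a server.
flush_with_short_lock(print, "goods")
```

Passing print as execute shows the statement order without a database connection. In real use, all statements must go through the same connection, because closing the session releases the table lock.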

Conclusion

We can manually update persistent InnoDB statistics to fix Optimizer plans for our queries, with almost no impact on a live server.
