jeudi 15 décembre 2011

Back on disk temporary tables

We already blog a lot on temporary tables on disk, how this could be bad for your workload and should be avoid.  Each temporary table will open a file descriptor,  external kernel call and by nature a well know file system slow operation.

We can point this benchmark simulating a working mail server


Monitoring of such queries can be trace via the status of

created_tmp_disk_tables 

Evidence of what is happening watching many Aria or MyISAM tables created on disk :

/usr/sbin/lsof  | grep "mysql" | grep "#"
mysqld     1855     mysql 1658u      REG                8,1       8192    1505932 /tmp/#sql_73f_1.MAI
mysqld     1855     mysql 1659u      REG                8,1       8192    1505936 /tmp/#sql_73f_1.MAD
/usr/sbin/lsof  | grep "mysql" | grep "#"
mysqld     1855     mysql 1650u      REG                8,1       8192    1505946 /tmp/#sql_73f_1.MAI
mysqld     1855     mysql 1654u      REG                8,1       8192    1505951 /tmp/#sql_73f_1.MAD

Now in MariaDB and Percona Server you can found which queries  causing you trouble with the slow query log

set global log_slow_verbosity=2; 

And you can retieve this information in the slow log :

# User@Host: root[root] @ localhost []
# Thread_id: 93  Schema: test  QC_hit: No
# Query_time: 0.00157  Lock_time: 0.000034  Rows_sent: 4  Rows_examined: 12
# Full_scan: No  Full_join: No  Tmp_table: Yes  Tmp_table_on_disk: Yes
# Filesort: Yes  Filesort_on_disk: No  Merge_passes: 0
SET timestamp=1323937380;
select * from (select user_id,created,  (select user_email from  login2 b where b.user_id=a.user_id ) from login2 a where a.user_id IN (100005, 100002, 100004, 100003)) as a order by created;

Solutions to fixe the issue can be :
  • Add an index to cover the resultset order 
  • Increasing tmp_table_size 
  • Increasing max_heap_table_size 
  • Removing blobs from your queries if you need to sort them in memory 
But somtimes this is not enouth let's have a deeper look :

select * from (
  select 
    user_id,
    created,  
    (select user_email from  login2 b where b.user_id=a.user_id ) as mail  
  from 
    login2 a 
  where 
    a.user_id IN (100005, 100002, 100004, 100003)
) as a order by created;
+---------+---------------------+---------------------+
| user_id | created             | mail                |
+---------+---------------------+---------------------+
|  100002 | 2011-01-11 17:16:45 | stephane@skysql.com |
|  100003 | 2011-01-11 17:16:45 | stephane@skysql.com |
|  100004 | 2011-01-11 17:16:46 | stephane@skysql.com |
|  100005 | 2011-01-11 17:16:46 | stephane@skysql.com |
+---------+---------------------+---------------------+



And inside the table :

CREATE TABLE `login2` (
  `user_id` int(10) unsigned NOT NULL AUTO_INCREMENT,
  `user_name` varchar(50) DEFAULT NULL,
  `user_email` varchar(2000) CHARACTER SET utf8 DEFAULT NULL,
  `created` datetime DEFAULT NULL,
  PRIMARY KEY (`user_id`)
) ENGINE=InnoDB AUTO_INCREMENT=102048 DEFAULT CHARSET=latin1;


What you can note here is :

`user_email` varchar(2000) CHARACTER SET utf8 DEFAULT NULL

That mail column is embeded in the derivated query and sorted later on the upper query

Well for Monty it tooks few secondes to point me that this is comming from a performance mechanism hardcoded in a constant of the source code

CONVERT_IF_BIGGER_TO_BLOB

This set to 512B and the reason is that in most case sorting would be less performant starting in memory and then converted to disk later.  

It is use like this :

sql/item.cc:  If max_length > CONVERT_IF_BIGGER_TO_BLOB create a blob @n
sql/item.cc:  if (max_length/collation.collation->mbmaxlen > CONVERT_IF_BIGGER_TO_BLOB)
sql/item.cc:    if (fixed_length && max_length < CONVERT_IF_BIGGER_TO_BLOB)
sql/item_sum.h:    if (max_length/collation.collation->mbmaxlen > CONVERT_IF_BIGGER_TO_BLOB )
sql/sql_select.cc:      if ((*tmp->item)->max_length >= CONVERT_IF_BIGGER_TO_BLOB)
sql/unireg.h:#define CONVERT_IF_BIGGER_TO_BLOB 512   


So the max lenght of varchar for not  touching this case would be  512

Alter table login2  modify column  user_email varchar(512) charset utf8;
# Full_scan: Yes  Full_join: No  Tmp_table: Yes  Tmp_table_on_disk: No

Alter table login2  modify column  user_email varchar(513) charset utf8; 
# Full_scan: Yes  Full_join: No  Tmp_table: Yes  Tmp_table_on_disk: Yes

Now you have a solution to fixe this in MariaDB 5.3 with the work done on the optimizer :

just set :

set global optimizer_switch='derived_merge=on'
# Full_scan: Yes  Full_join: No  Tmp_table: Yes  Tmp_table_on_disk: No

Kudo to Sergei Petrunia but wait this is not all 

derived_merge=off
 
mysqlslap --create-schema=test -q testPV2.sql -c2 --number-of-queries=100000 -uroot -ptoto
Benchmark
    Average number of seconds to run all queries: 137.111 seconds
    Minimum number of seconds to run all queries: 137.111 seconds
    Maximum number of seconds to run all queries: 137.111 seconds
    Number of clients running queries: 2
    Average number of queries per client: 50000


derived_merge=on
 
mysqlslap --create-schema=test -q testPV2.sql -c2 --number-of-queries=100000 -uroot -ptoto
Benchmark
    Average number of seconds to run all queries: 14.540 seconds
    Minimum number of seconds to run all queries: 14.540 seconds
    Maximum number of seconds to run all queries: 14.540 seconds
    Number of clients running queries: 2
    Average number of queries per client: 50000
 More the is a bonus here optimizer take less cpu cycles compare to Vanilla MySQL Oracle tm

+----------------------+----------+----------+-----------+
| c2                   | ma       | my       | diff      |
+----------------------+----------+----------+-----------+
|  cleaning up         | 0.000004 | 0.000004 |  0.000000 |
|  closing tables      | 0.000008 | 0.000028 | -0.000020 |
|  executing           | 0.000034 | 0.000036 | -0.000002 |
|  freeing items       | 0.000030 | 0.000030 |  0.000000 |
|  init                | 0.000118 | 0.000019 |  0.000099 |
|  logging slow query  | 0.000003 | 0.000003 |  0.000000 |
|  Opening tables      | 0.000049 | 0.000154 | -0.000105 |
|  optimizing          | 0.000061 | 0.000042 |  0.000019 |
|  preparing           | 0.000085 | 0.000050 |  0.000035 |
|  query end           | 0.000003 | 0.000005 | -0.000002 |
|  removing tmp table  | 0.000081 | 0.000058 |  0.000023 |
|  Sending data        | 0.000681 | 0.000609 |  0.000072 |
|  Sorting result      | 0.000264 | 0.000031 |  0.000233 |
|  starting            | 0.000165 | 0.000148 |  0.000017 |
|  statistics          | 0.000365 | 0.000794 | -0.000429 |
|  System lock         | 0.000014 | 0.000008 |  0.000006 |
|  Table lock          | 0.000015 | 0.000308 | -0.000293 |
+----------------------+----------+----------+-----------+

Aucun commentaire:

Enregistrer un commentaire