mercredi 21 décembre 2011

Fine tuning read_rnd_buffer_size and read_buffer_size

Those variables are  easy to tune.

The evidence popup here between 128 and 256K for in memory workload.

My SQL SQL layer is optimistic and think that data will come from the storage engine at the speed of the memory bus , but in practice for IO bound workload and range scan in the table order it could not be the case , for MyISAM when reading from tables on disk, IOs are aligned on the read buffer so it may be of an intrest to increase it here.

In myisam.cc my_default_record_cache_size is set to  global_system_variables.read_buff_size;

To show IO size :
strace -p id_mysqld -e read -e write

If  monitoring smaller IO  request size from iostat  -xm, it may be that you are suffering fragmentation that does not help the FS cache to serve you efficiently. Optimize table ,Alter table order is a great help in MyISAM

In InnoDB the difference come with prefetch trigger when no fragmentation. It is the case when rows are inserted in increasing primary key order. So conservative read and rnd buffer values will speed up  the data move from the buffer_pool to the SQL thread with almost no impact on the disk speed rate .

Note that the read_rnd_buff will size the thread record cache when rr_sequential and rr_quick in case of full scan and some range scan

In record.cc

static int init_rr_cache(THD *thd, READ_RECORD *info)
 info->cache_records= (thd->variables.read_rnd_buff_size /
                        (info->reclength+info->struct_length));
 

lundi 19 décembre 2011

SUM for the 5 best models in 5 best brands

As we can see in the solution:

Expression are evaluated in the column order in the select clause.

This is the same for the where clause.  An advice is to test first the expression that get more chance to return false at the head of the where conditions :   



  1. SET @BRAND=0;
  2. SET @ct=0;
  3. SELECT
  4.  *
  5. FROM
  6.  (SELECT
  7.    IF(@BRAND<>id_BRAND,@ct:=0 ,@ct:=@ct ) ,
  8.    @ct:=@ct+1 AS ct,   
  9.    @BRAND:=id_BRAND,
  10.    t2.*
  11.   FROM
  12.    (SELECT
  13.      st.ID_BRAND,
  14.      st.ID_MODEL,
  15.      SUM(COMPTE) AS total,
  16.      AVG(totalm)
  17.     FROM
  18.      STAT st JOIN
  19.      (SELECT
  20.        ID_BRAND,
  21.        SUM(COMPTE) AS totalm
  22.       FROM
  23.        STAT
  24.       GROUP BY
  25.        ID_BRAND
  26.       ORDER BY
  27.        totalm DESC
  28.       LIMIT 5
  29.      ) as t USING( ID_BRAND)
  30.     GROUP BY
  31.      st.ID_BRAND,
  32.      st.ID_MODEL
  33.     ORDER BY
  34.      st.ID_BRAND,
  35.      total DESC
  36.    ) AS t2
  37.  ) AS t3
  38. WHERE
  39.  ct <=5

jeudi 15 décembre 2011

Back on disk temporary tables

We already blog a lot on temporary tables on disk, how this could be bad for your workload and should be avoid.  Each temporary table will open a file descriptor,  external kernel call and by nature a well know file system slow operation.

We can point this benchmark simulating a working mail server


Monitoring of such queries can be trace via the status of

created_tmp_disk_tables 

Evidence of what is happening watching many Aria or MyISAM tables created on disk :

/usr/sbin/lsof  | grep "mysql" | grep "#"
mysqld     1855     mysql 1658u      REG                8,1       8192    1505932 /tmp/#sql_73f_1.MAI
mysqld     1855     mysql 1659u      REG                8,1       8192    1505936 /tmp/#sql_73f_1.MAD
/usr/sbin/lsof  | grep "mysql" | grep "#"
mysqld     1855     mysql 1650u      REG                8,1       8192    1505946 /tmp/#sql_73f_1.MAI
mysqld     1855     mysql 1654u      REG                8,1       8192    1505951 /tmp/#sql_73f_1.MAD

Now in MariaDB and Percona Server you can found which queries  causing you trouble with the slow query log

set global log_slow_verbosity=2; 

And you can retieve this information in the slow log :

# User@Host: root[root] @ localhost []
# Thread_id: 93  Schema: test  QC_hit: No
# Query_time: 0.00157  Lock_time: 0.000034  Rows_sent: 4  Rows_examined: 12
# Full_scan: No  Full_join: No  Tmp_table: Yes  Tmp_table_on_disk: Yes
# Filesort: Yes  Filesort_on_disk: No  Merge_passes: 0
SET timestamp=1323937380;
select * from (select user_id,created,  (select user_email from  login2 b where b.user_id=a.user_id ) from login2 a where a.user_id IN (100005, 100002, 100004, 100003)) as a order by created;

Solutions to fixe the issue can be :
  • Add an index to cover the resultset order 
  • Increasing tmp_table_size 
  • Increasing max_heap_table_size 
  • Removing blobs from your queries if you need to sort them in memory 
But somtimes this is not enouth let's have a deeper look :

select * from (
  select 
    user_id,
    created,  
    (select user_email from  login2 b where b.user_id=a.user_id ) as mail  
  from 
    login2 a 
  where 
    a.user_id IN (100005, 100002, 100004, 100003)
) as a order by created;
+---------+---------------------+---------------------+
| user_id | created             | mail                |
+---------+---------------------+---------------------+
|  100002 | 2011-01-11 17:16:45 | stephane@skysql.com |
|  100003 | 2011-01-11 17:16:45 | stephane@skysql.com |
|  100004 | 2011-01-11 17:16:46 | stephane@skysql.com |
|  100005 | 2011-01-11 17:16:46 | stephane@skysql.com |
+---------+---------------------+---------------------+



And inside the table :

CREATE TABLE `login2` (
  `user_id` int(10) unsigned NOT NULL AUTO_INCREMENT,
  `user_name` varchar(50) DEFAULT NULL,
  `user_email` varchar(2000) CHARACTER SET utf8 DEFAULT NULL,
  `created` datetime DEFAULT NULL,
  PRIMARY KEY (`user_id`)
) ENGINE=InnoDB AUTO_INCREMENT=102048 DEFAULT CHARSET=latin1;


What you can note here is :

`user_email` varchar(2000) CHARACTER SET utf8 DEFAULT NULL

That mail column is embeded in the derivated query and sorted later on the upper query

Well for Monty it tooks few secondes to point me that this is comming from a performance mechanism hardcoded in a constant of the source code

CONVERT_IF_BIGGER_TO_BLOB

This set to 512B and the reason is that in most case sorting would be less performant starting in memory and then converted to disk later.  

It is use like this :

sql/item.cc:  If max_length > CONVERT_IF_BIGGER_TO_BLOB create a blob @n
sql/item.cc:  if (max_length/collation.collation->mbmaxlen > CONVERT_IF_BIGGER_TO_BLOB)
sql/item.cc:    if (fixed_length && max_length < CONVERT_IF_BIGGER_TO_BLOB)
sql/item_sum.h:    if (max_length/collation.collation->mbmaxlen > CONVERT_IF_BIGGER_TO_BLOB )
sql/sql_select.cc:      if ((*tmp->item)->max_length >= CONVERT_IF_BIGGER_TO_BLOB)
sql/unireg.h:#define CONVERT_IF_BIGGER_TO_BLOB 512   


So the max lenght of varchar for not  touching this case would be  512

Alter table login2  modify column  user_email varchar(512) charset utf8;
# Full_scan: Yes  Full_join: No  Tmp_table: Yes  Tmp_table_on_disk: No

Alter table login2  modify column  user_email varchar(513) charset utf8; 
# Full_scan: Yes  Full_join: No  Tmp_table: Yes  Tmp_table_on_disk: Yes

Now you have a solution to fixe this in MariaDB 5.3 with the work done on the optimizer :

just set :

set global optimizer_switch='derived_merge=on'
# Full_scan: Yes  Full_join: No  Tmp_table: Yes  Tmp_table_on_disk: No

Kudo to Sergei Petrunia but wait this is not all 

derived_merge=off
 
mysqlslap --create-schema=test -q testPV2.sql -c2 --number-of-queries=100000 -uroot -ptoto
Benchmark
    Average number of seconds to run all queries: 137.111 seconds
    Minimum number of seconds to run all queries: 137.111 seconds
    Maximum number of seconds to run all queries: 137.111 seconds
    Number of clients running queries: 2
    Average number of queries per client: 50000


derived_merge=on
 
mysqlslap --create-schema=test -q testPV2.sql -c2 --number-of-queries=100000 -uroot -ptoto
Benchmark
    Average number of seconds to run all queries: 14.540 seconds
    Minimum number of seconds to run all queries: 14.540 seconds
    Maximum number of seconds to run all queries: 14.540 seconds
    Number of clients running queries: 2
    Average number of queries per client: 50000
 More the is a bonus here optimizer take less cpu cycles compare to Vanilla MySQL Oracle tm

+----------------------+----------+----------+-----------+
| c2                   | ma       | my       | diff      |
+----------------------+----------+----------+-----------+
|  cleaning up         | 0.000004 | 0.000004 |  0.000000 |
|  closing tables      | 0.000008 | 0.000028 | -0.000020 |
|  executing           | 0.000034 | 0.000036 | -0.000002 |
|  freeing items       | 0.000030 | 0.000030 |  0.000000 |
|  init                | 0.000118 | 0.000019 |  0.000099 |
|  logging slow query  | 0.000003 | 0.000003 |  0.000000 |
|  Opening tables      | 0.000049 | 0.000154 | -0.000105 |
|  optimizing          | 0.000061 | 0.000042 |  0.000019 |
|  preparing           | 0.000085 | 0.000050 |  0.000035 |
|  query end           | 0.000003 | 0.000005 | -0.000002 |
|  removing tmp table  | 0.000081 | 0.000058 |  0.000023 |
|  Sending data        | 0.000681 | 0.000609 |  0.000072 |
|  Sorting result      | 0.000264 | 0.000031 |  0.000233 |
|  starting            | 0.000165 | 0.000148 |  0.000017 |
|  statistics          | 0.000365 | 0.000794 | -0.000429 |
|  System lock         | 0.000014 | 0.000008 |  0.000006 |
|  Table lock          | 0.000015 | 0.000308 | -0.000293 |
+----------------------+----------+----------+-----------+