jeudi 12 novembre 2015

Big Data: InfiniDB vs Spider: What else ?

Many of my recent engagements have been all around strategy to implement Real Time Big Data Analytics: Computing hardware cost of extending a single table collection with MariaDB and Parallel Query found in the Spider storage engine to offload columnar MPP storage like InfiniDB or Vertica.

As of today Parallel Query is only available from releases of MariaDB Spider supported by spiral arms. The more efficient way to use parallel query with Spider can be done on group by, and count queries that use a single spider table. In such case Spider Engine will execute query push down AKA map reduce.

Spider gets multiple levels of parallel execution for a single partitioned tables.

First level is per backend server:
The way to actually tell spider to scan different backends in concurrency is to set  spider_sts_bg_mode=1

Other level is per partition:
The way to actually tell spider to scan different partitions in concurrency is by set  spider_casual_read=1

Hey but that's sometimes not enough! Per partition parallel query can conflict with the connection recycling feature of Spider. Behind the scene Spider search for some already open connections via a hash table that contains every partition connections: Host, User, Password, Port or Socket, and will reuse the same connection if found one. To really enable concurrent scan inside the same backend or inside localhost you need to create different users to abuse spider on connection recycling and the server should use TCP connection string and not socket or name pipes.

In real life scenario, it's always good to create a different user per partition table and grant only SELECT privileges to the single table accessed attaching the Spider partition. in DDL you later map to partitions via creating a unique server per partition that map every user.

Here is a local test case attach to this scenario inside a single server to use multiple cores to produce an aggregate.

CREATE OR REPLACE TABLE `test` (
  `id` int(11) NOT NULL AUTO_INCREMENT,
  `a` varchar(1000) DEFAULT NULL,
  PRIMARY KEY (`id`)
) ENGINE=InnoDB;

insert into test select *,'test' from seq_1_to_4000000; 

create or replace view v1 as select * from test where id>=0000000 and id<1000000;
create or replace view v2 as select * from test where id>=1000000 and id<2000000;
create or replace view v3 as select * from test where id>=2000000 and id<3000000;
create or replace view v4 as select * from test where id>=3000000 and id<4000000;

grant all on *.* to root1@'127.0.0.1' identified by 'mariadb' ;
grant all on *.* to root2@'127.0.0.1' identified by 'mariadb' ;
grant all on *.* to root3@'127.0.0.1' identified by 'mariadb' ;
grant all on *.* to root4@'127.0.0.1' identified by 'mariadb' ;

create or replace server l1 FOREIGN DATA WRAPPER mysql OPTIONS (USER 'root1',PORT 3307,PASSWORD 'mariadb',HOST '127.0.0.1',DATABASE 'test' );
create or replace server l2 FOREIGN DATA WRAPPER mysql OPTIONS (USER 'root2',PORT 3307,PASSWORD 'mariadb',HOST '127.0.0.1',DATABASE 'test' );
create or replace server l3 FOREIGN DATA WRAPPER mysql OPTIONS (USER 'root3',PORT 3307,PASSWORD 'mariadb',HOST '127.0.0.1',DATABASE 'test' );
create or replace server l4 FOREIGN DATA WRAPPER mysql OPTIONS (USER 'root4',PORT 3307,PASSWORD 'mariadb',HOST '127.0.0.1',DATABASE 'test' );

create or replace table test_spd  ( 
  `id` int(11) NOT NULL AUTO_INCREMENT, 
  `a` varchar(1000) DEFAULT NULL,   
PRIMARY KEY (`id`) ) ENGINE=SPIDER PARTITION BY RANGE(id) (
PARTITION P1 VALUES LESS THAN (1000000) COMMENT='table "v1", srv "l1"',
PARTITION P2 VALUES LESS THAN (2000000) COMMENT='table "v2", srv "l2"',
PARTITION P3 VALUES LESS THAN (3000000) COMMENT='table "v3", srv "l3"',
PARTITION P4 VALUES LESS THAN (4000000) COMMENT='table "v4", srv "l4"');

In this scenario queries on my 40 Million record table will use up to 4 cores or the number of views   that materialized each partition used inside the Spider table.

Now from a real use case, let's ask David Chanial, devops on the Beleive Digital platform that demonstrate this on a 8 node 64 cores cluster.

SVA: How do you initialize the cluster?

DCH: To manage the farm we are deploying using the python fabric library! We had create some  deployment scripts that take as input the node list, the number for cores and the number of replicates.

SVA: What type of tables definitions and data sizing are you using :

DCH: Let's have a look at the spider 2,5 Billion record table ,

MariaDB [spider01_ro]> show table status LIKE 'Sales' \G

           Name: Sales 
         Engine: SPIDER
        Version: 10
     Row_format: Dynamic
           Rows: 2500657155
 Avg_row_length: 91
    Data_length: 228106478935
   Index_length: 40839657478
 Auto_increment: 2618963872
      Collation: utf8_general_ci 
 Create_options: partitioned

             
SVA: What type of of partitioning?

DCH: That's a mixture of technical per auto increment and per business segments. Indeed we are using sub partitioning with double spider tables that point to TokuDB or InnoDB tables in this case and reduce that to modulo of number of cores in the cluster.

SVA: What performance can you get?

MariaDB [spider01_ro]> 
select count(*) from spider01_ro.Sales t; 
select idGenre, count(*) from spider01_ro.Sales GROUP BY idGenre;
+------------+
| count(*)   |
+------------+
| 2506437338 |
+------------+
1 row in set (8.87 sec)

+---------+-----------+
| idGenre | count(*)  |
+---------+-----------+
|       0 |      8137 |
|       1 |  56044584 |
|       2 |  21179162 |
|       3 |  25446110 |
|       4 |  31829221 |
|     293 |   1386236 |
|     294 |     47109 |
|     295 |     50776 |
|     296 |       988 |

..........
|     297 |     47589 |
|     298 |      9610 |
|     299 |      5215 |
|     300 |       224 |
+---------+-----------+
295 rows in set (16.00 sec)

Indeed 149M records read per sec on 8 nodes and inside a single node all the cores are working hard :


| 2848 | tsrc_p15_c02 | 10.93.1.12:51388 | spider01 | Query   |     1 | Queried about 730000 rows                                                   | select count(0),`idGenre` from `spider01`.`tsrc_p15_c02` group by `idGenre` |    0.000 |
| 2849 | tsrc_p15_c10 | 10.93.1.12:51396 | spider01 | Query   |     1 | Queried about 720000 rows                                                   | select count(0),`idGenre` from `spider01`.`tsrc_p15_c10` group by `idGenre` |    0.000 |
| 2850 | tsrc_p15_c18 | 10.93.1.12:51404 | spider01 | Query   |     1 | Queried about 950000 rows                                                   | select count(0),`idGenre` from `spider01`.`tsrc_p15_c18` group by `idGenre` |    0.000 |
| 2851 | tsrc_p15_c26 | 10.93.1.12:51412 | spider01 | Query   |     1 | Queried about 740000 rows                                                   | select count(0),`idGenre` from `spider01`.`tsrc_p15_c26` group by `idGenre` |    0.000 |
| 2852 | tsrc_p15_c34 | 10.93.1.12:51420 | spider01 | Query   |     1 | Queried about 1060000 rows                                                  | select count(0),`idGenre` from `spider01`.`tsrc_p15_c34` group by `idGenre` |    0.000 |
| 2853 | tsrc_p15_c42 | 10.93.1.12:51428 | spider01 | Query   |     1 | Queried about 920000 rows                                                   | select count(0),`idGenre` from `spider01`.`tsrc_p15_c42` group by `idGenre` |    0.000 |
| 2854 | tsrc_p15_c50 | 10.93.1.12:51436 | spider01 | Query   |     1 | Queried about 530000 rows                                                   | select count(0),`idGenre` from `spider01`.`tsrc_p15_c50` group by `idGenre` |    0.000 |
| 2855 | tsrc_p15_c58 | 10.93.1.12:51444 | spider01 | Query   |     1 | Queried about 790000 rows                                                   | select count(0),`idGenre` from `spider01`.`tsrc_p15_c58` group by `idGenre` |    0.000 |
| 2856 | tsrc_p16_c02 | 10.93.1.12:51452 | spider01 | Query   |     1 | Queried about 760000 rows                                                   | select count(0),`idGenre` from `spider01`.`tsrc_p16_c02` group by `idGenre` |    0.000 |
| 2857 | tsrc_p16_c10 | 10.93.1.12:51460 | spider01 | Query   |     1 | Queried about 660000 rows                                                   | select count(0),`idGenre` from `spider01`.`tsrc_p16_c10` group by `idGenre` |    0.000 |
| 2858 | tsrc_p16_c18 | 10.93.1.12:51468 | spider01 | Query   |     1 | Queried about 940000 rows                                                   | select count(0),`idGenre` from `spider01`.`tsrc_p16_c18` group by `idGenre` |    0.000 |
| 2859 | tsrc_p16_c26 | 10.93.1.12:51476 | spider01 | Query   |     1 | Queried about 930000 rows                                                   | select count(0),`idGenre` from `spider01`.`tsrc_p16_c26` group by `idGenre` |    0.000 |
| 2860 | tsrc_p16_c34 | 10.93.1.12:51484 | spider01 | Query   |     1 | Queried about 910000 rows                                                   | select count(0),`idGenre` from `spider01`.`tsrc_p16_c34` group by `idGenre` |    0.000 |
| 2861 | tsrc_p16_c42 | 10.93.1.12:51492 | spider01 | Query   |     1 | Queried about 800000 rows                                                   | select count(0),`idGenre` from `spider01`.`tsrc_p16_c42` group by `idGenre` |    0.000 |
| 2862 | tsrc_p16_c50 | 10.93.1.12:51500 | spider01 | Query   |     1 | Queried about 770000 rows                                                   | select count(0),`idGenre` from `spider01`.`tsrc_p16_c50` group by `idGenre` |    0.000 |
| 2863 | tsrc_p16_c58 | 10.93.1.12:51508 | spider01 | Query   |     1 | Queried about 740000 rows                                                   | select count(0),`idGenre` from `spider01`.`tsrc_p16_c58` group by `idGenre` |    0.000 |
       
SVA: Thanks David

Take away:

We conclude that columnar model in such scenario with less than 20 columns is only 10 time more efficient for a given hardware cost. But RDBMS take the lead on small range indexed queries.

Parallel and distributed queries is never an easy task but we can make it shine just on regular good old well stable OLTP storage engine.

Stay tuned, thanks to Spiral Arms Open Source spirit, a special to us foundation sponsor, and more and more supported clients inside MariaDB, we will get more of that spider release inside official MariaDB 10.2. branch.

If you feel to help finance such move in getting support ask your MariaDB sales rep some specific spider support or directly to a Spiral Arms sales rep.