Wednesday, December 11, 2013

MariaDB world record: price per row of $0.0000005 on a single DELL R710

Don't look for an industry benchmark here; this is a real client story.

200 billion records in a month, and it should be transactional but not durable.

For a regular workload we use LOAD DATA INFILE into partitioned InnoDB, but here we estimated 15TB of RAID storage. That is a lot of disks, and it can no longer fit inside a single server's internal storage.

MariaDB 5.5 comes with the TokuDB storage engine for compression, but is it possible within the time frame imposed by the workload?

We started by benchmarking 380G of raw input data files, 6 billion rows.

First, let's check the compression on this dataset.

Great job, my TokuDB: 1/5, without tuning a single parameter other than durability! Well, I love you more every day, my TokuDB.

My ex, InnoDB: 30% compression missed at 8K, a very bad compression ratio and slow insertion time. Don't worry, InnoDB, I still love you in memory :)
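To get a feel for what a 1/5 ratio would mean at full scale, here is a rough back-of-the-envelope sketch. It assumes that "380G" means 380 * 10^9 bytes, that 1/5 is the on-disk size relative to the raw input files, and that the ratio and the average row size hold for the whole 200 billion rows. None of that was measured beyond the 6 billion row sample, so treat the result as an order of magnitude only.

```python
# Rough storage projection from the benchmark sample.
# All constants come from the post; how they are interpreted is an assumption.
RAW_BYTES = 380e9          # "380G" read as 380 * 10**9 bytes (assumption)
SAMPLE_ROWS = 6e9          # rows in the benchmark sample
TARGET_ROWS = 200e9        # rows to load in a month
COMPRESSION_RATIO = 1 / 5  # assumed: on-disk size / raw input size

# Average raw size of one row in the input files.
bytes_per_row = RAW_BYTES / SAMPLE_ROWS

# Raw input volume for the full month, in TB (10**12 bytes).
raw_total_tb = bytes_per_row * TARGET_ROWS / 1e12

# Projected on-disk size if the sample ratio holds at scale.
compressed_total_tb = raw_total_tb * COMPRESSION_RATIO

print(f"{bytes_per_row:.1f} bytes/row, "
      f"{raw_total_tb:.1f} TB raw, "
      f"{compressed_total_tb:.1f} TB compressed")
```

Under those assumptions the month of data comes out at a few TB on disk rather than something in the range of the 15TB estimate above.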

OK, every love affair has a dark side :)

So now you can see that it works for 200 billion rows: at 200K inserts/s, that gives 277 hours of processing time.

In a month, if we impose 12 hours a day, 6 days a week of processing at full capacity, that is 288 hours.
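The time budget is simple enough to check in a few lines. This is only a sketch of the arithmetic: it assumes the 200K inserts/s rate is sustained for the whole load and that a month counts as 4 weeks, which is what the 288 hours figure implies. The function names are mine, purely for illustration.

```python
def hours_needed(total_rows: int, rows_per_second: int) -> float:
    """Hours of continuous loading needed at a sustained insert rate."""
    return total_rows / rows_per_second / 3600


def hours_available(hours_per_day: int, days_per_week: int, weeks: int) -> int:
    """Processing window over the period, in hours."""
    return hours_per_day * days_per_week * weeks


needed = hours_needed(200_000_000_000, 200_000)              # 200 billion rows at 200K inserts/s
available = hours_available(12, 6, 4)                        # 12 h/day, 6 days/week, 4 weeks (assumption)
headroom = available - needed

print(f"needed: {needed:.1f} h, available: {available} h, headroom: {headroom:.1f} h")
# prints: needed: 277.8 h, available: 288 h, headroom: 10.2 h
```

About 10 hours of slack over a month is under 4% of the window, so any drop in insert rate eats it quickly.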

That is very tight; getting compression over 200 billion records without sharding will be hard.

Fortunately, MariaDB 10 has native network partitioning thanks to the Spider contribution. Don't miss that.