Monday, June 29, 2015

Slave Election is welcoming GTID

Slave election is a popular HA architecture. The first MySQL/MariaDB toolkit to manage switchover and failover correctly was MHA, introduced by Yoshinori Matsunobu.

Failover and switchover in asynchronous clusters require caution:

- The CAP theorem needs to be satisfied. Getting strong consistency requires the slave election to reject transactions that end up on the old master while the candidate master is being elected.

- Slave election needs to make sure that all events on the old master are applied to the candidate master before switching roles.

- It should be instrumented to find a good candidate master and make sure it is set up to take the master role.

- It needs topology detection: a master role can't be predefined, as the role moves around between nodes.

- It needs monitoring to escalate a switchover to a failover.
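
To make the "good candidate master" requirement concrete, here is a minimal sketch of one possible election rule: among the slaves whose replication delay is acceptable, keep the one that has applied the most events. This is an illustration only, not how MHA or any particular toolkit elects a master. The function name pick_candidate, the three-column input format and the sample values are assumptions of mine, and comparing bare GTID sequence numbers only makes sense inside a single replication domain.

```shell
#!/bin/sh
# pick_candidate MAXDELAY
# Hypothetical helper, not part of any toolkit.
# Reads one slave per line on stdin, in an assumed format:
#   <host:port> <gtid_sequence_number> <seconds_behind_master>
# Prints the slave with the highest sequence number among those whose
# delay is numeric and <= MAXDELAY. Returns 1 when no slave qualifies
# (for example when the delay is NULL because replication is stopped).
pick_candidate() {
    awk -v maxdelay="$1" '
        $3 ~ /^[0-9]+$/ && $3 + 0 <= maxdelay + 0 {
            if (best == "" || $2 + 0 > seq) { best = $1; seq = $2 + 0 }
        }
        END { if (best == "") exit 1; print best }
    '
}
```

Fed with "9.3.3.56:3306 1040 2" and "9.3.3.57:3306 1042 30" and a maximum delay of 15, it picks 9.3.3.56:3306: the other slave is further ahead but too far behind the master in time to be trusted.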

MHA was coded at a time when no unique event ID was possible in a cluster. Each event was tracked as an independent coordinate on each node, which forced the MHA architecture to have an internal way to rematch coordinates across all nodes.

With the introduction of GTID, MHA carries that heritage and looks unnecessarily complex, with an agent-based solution and a requirement for SSH connections to all nodes.

A lighter MHA was needed for MariaDB when replication uses GTID, and that is what my colleague Guillaume Lefranc has been addressing in a new MariaDB toolkit.

In MariaDB, GTID usage is as simple as:

#>stop slave;change master to master_use_gtid=current_pos;start slave; 
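
Before relying on GTID for an election, it is worth checking that each slave really switched. A small sketch, assuming you pipe the output of SHOW SLAVE STATUS\G into it: the helper name uses_gtid is mine, while the Using_Gtid field and its Current_Pos / Slave_Pos values are what MariaDB reports.

```shell
#!/bin/sh
# uses_gtid
# Hypothetical helper: reads the output of `SHOW SLAVE STATUS\G` on stdin
# and succeeds only if the Using_Gtid field is Current_Pos or Slave_Pos.
uses_gtid() {
    awk -F': *' '
        $1 ~ /^[ \t]*Using_Gtid$/ { mode = $2 }
        END { exit !(mode == "Current_Pos" || mode == "Slave_Pos") }
    '
}

# Possible use (connection options are up to you):
#   mysql -h 9.3.3.56 -e 'SHOW SLAVE STATUS\G' | uses_gtid && echo "GTID enabled"
```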

As a bonus, the code is in Go and does not require any external dependencies.
We can enjoy a single command-line procedure in interactive mode.

mariadb-repmgr -hosts=9.3.3.55:3306,9.3.3.56:3306,9.3.3.57:3306 -user=admin:xxxxx -rpluser=repl:xxxxxx -pre-failover-script="/root/pre-failover.sh" -post-failover-script="/root/post-failover.sh" -verbose -maxdelay 15    
Don't be afraid: the default is to run in interactive mode, and it does not launch anything yet.


In my post-failover script I usually update some HAProxy configuration stored on a NAS or a SAN, then reload or shoot in the head all proxies.

Note that the newly elected master is passed as the second argument of the script.
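
As an illustration, the core of such a post-failover script could look like the sketch below. Several things here are assumptions rather than facts about the toolkit: that the first argument is the old master, that arguments may or may not carry a :port suffix, that MySQL listens on 3306, and the function name switch_backend itself. The config path in the usage comment is made up as well.

```shell
#!/bin/sh
# switch_backend OLD_MASTER NEW_MASTER
# Hypothetical helper: filters an HAProxy configuration from stdin to
# stdout, repointing every "server" line that targets OLD_MASTER:3306
# to NEW_MASTER:3306. Other lines pass through unchanged.
switch_backend() {
    old_host=${1%%:*}    # drop an optional :port suffix
    new_host=${2%%:*}
    # Escape dots so the old address is matched literally by sed.
    old_re=$(printf '%s' "$old_host" | sed 's/\./\\./g')
    sed "/^[[:space:]]*server[[:space:]]/s/ ${old_re}:3306/ ${new_host}:3306/"
}

# Possible use inside /root/post-failover.sh (config path is an example):
#   switch_backend "$1" "$2" < /mnt/nas/haproxy.cfg > /mnt/nas/haproxy.cfg.new
#   mv /mnt/nas/haproxy.cfg.new /mnt/nas/haproxy.cfg
#   ... then reload every proxy that reads this file
```

Writing to a temporary file and renaming it keeps a proxy that reloads at the wrong moment from reading a half-written configuration.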

I strongly advise against auto-failover based on monitoring alone. Get a good replication monitoring tool and analyze all master status alerts, checking for false positives, before enjoying pre-coded failover.

Lossless semi-synchronous replication (MDEV-162) and multiple performance improvements to semi-synchronous replication (MDEV-7257) have made it into MariaDB 10.1. It can be used to get much closer to zero data loss in case of failure. Combined with parallel replication, it's now possible to have an HA architecture that is as robust as asynchronous replication can be and, as long as replication delay is kept under control, crash safe as well.

Galera, aka MariaDB Cluster, has a write speed limit bound to the network speed, with the advantage of always offering crash-safe consistency. Slave election HA is limited by the master's disk speed and does not suffer from a slower network, but it loses consistency in a failover when the slave can't catch up.

Interesting times to see how flash storage adoption favors one or the other architecture.