A Quick Intro
In my last post I discussed the need to deploy a popular PHP CMS into the cloud which leveraged the scaleability and multi-az / multi-region redundancy offered by the excellent AWS. One of the first stumbling blocks to spreading the database load for our CMS (Concrete5 – which sure does generate a lot of queries for a page load! – ) was the inability to elegantly split our read / writes and send them to different MySQL nodes. After sifting through a few inadequate solutions, we settled on Galera as our solution.
Galera is a true multi-master, synchronous replication cluster for MySQL based on the well used InnoDB storage engine. Users can deploy the Galera cluster locally in LAN environments, as geo-clusters over the WAN or as a virtual cluster in cloud hosting platforms. Even better, Galera is offered as an Open Source software solution. It can be downloaded freely from http://www.codership.com.
The marketing blurb states:
“Galera Cluster is deployed by users who need highly available MySQL database back-ends with very fast fail-over time, consistent databases with no data loss and reduced investment in high availability architectures. Galera Cluster is ideal for business critical applications like Social networks, Gaming platforms, Telecom/VoiP solutions, Media and entertainment sites, Business Software as a Service (SaaS), Platform as a Service(PaaS), ERP systems, web-shops, e-commerce solutions or similar critical applications.”
Multi-master replication means that all slave nodes can be used as the master at any given time, that is to say we can write to any node in the pool indiscriminately. Unlike the single threaded nature of MySQL native replication which is liable for bottlenecks, Galera runs on a parallel replication approach which improves slave lag significantly. Applications connecting to any server in the cluster can be confident they are reading and writing to the same stateful dataset.
As all nodes communicate to all other nodes, transmitting writes as appropriate, we can forget about the read / write splitting otherwise required by Concrete5 for MySQL replication.
Galera also supports WAN clustering and synchronous replication within that. Unavoidably, and understandably, there may be short periods of lag due to the network round-trip-time required by the synchronous state. Our testing shows that the round trip between Singapore, Ireland and East-Cost U.S is still well within the milliseconds rather than seconds.
For those of you running a slightly tighter applications, Galera also assigns a Global Transaction ID for all replicated transactions, therefore transactions can be uniquely referenced in any node.
Galera also facilitates automatic node joining. The cluster chooses ‘donor’ for the ‘joiner’.
The path has been well trodden on EC2, Galera has been well tested and deployed on the Amazon EC2 environment already.
To start the cluster, Galera must have a minimum of 3 nodes in the cluster to operate effectively. As Galera will try and promote any server to a master server, a cluster of less (e.g. 2) would suffer from a ‘split brain’ scenario wherein both servers are alive but unable to communicate to each other. With 3 servers this ‘split brain’ scenario cannot happen, so long as two servers continue to communicate. To extend this, an odd number of nodes is recommended as an application scales up.
The number of Galera nodes in the pool should not exceed 10, the mathematics are sound, if a little complicated, and can be seen on Codership here: http://www.codership.com/content/multi-master-arithmetics.
As mentioned, replication only works with the InnoDB storage engine, writes to other table types will simply not be replicated. We are using this “limitation” to easily remove replication for tables we know we don’t need to see replicated, e.g. non-critical logging tables. Rest assured however, DDL statements are replicated in statement level, and so any changes to mysql.* tables are replicated. We can safely issue:CREATE USER… or GRANT… but issuing: INSERT INTO mysql.user… would not be replicated. By their nature, non-transactional engines cannot be supported in multi-master replication.
Concrete5 requires some MyISAM tables for its text-search facility, specifically PageSearchIndex. it is relevant to note that MySQL 5.6 supports text searching within InnoDB tables (http://blogs.innodb.com/wp/2011/12/innodb-full-text-search-in-mysql-5-6-4), Galera will be releasing an updated version soon to incorporate this. In the interim, we’ve removed these indexes and moved to InnoDB regardless. In our case, we’re already deepening the functionality of the search engine by incorporating SOLR, and it has no impact on our existing project roadmap.
The next blog post will cover the various cluster configurations we have open to us on EC2 and, time permitting, some other general considerations for EC2, such as disk io, disaster recovery and more.