Tag Archives: Galera

PHP on EC2 (AWS) with Multi A-Z and Multi Region (Part 3)

Looking for Part 1? See: http://www.headenergy.co.uk/2012/10/php-on-ec2-aws-with-multi-a-z-and-multi-region
Looking for Part 2? See: http://www.headenergy.co.uk/2012/10/php-on-ec2-aws-with-multi-a-z-and-multi-region-part-2
Looking for Part 4? See: http://www.headenergy.co.uk/2012/10/php-on-ec2-aws-with-multi-a-z-and-multi-region-part-4
 

A Quick Intro

In this part of the mini series, we progress into the depths of cloud deployment strategies for the now selected MySQL multi-master replication offering from Codership, Galera. In my previous post I went into detail on why this choice was made, I’d recommend giving it a read if you haven’t!

It is not enough simply to install Galera in the cloud, we need to decide how to leverage the technology within the cloud infrastructure to the best effect for our use case. In this part we’ll discuss 4 of the deployment configurations that are open to us and work out which is best for our needs.

Stack Cluster
So, lets start simple, in this Stack Cluster we are scaffolding an entire dedicated collection of nodes. To scale we simply add another collection of servers from every layer and connect them together.

Pros:

1. It’s exceedingly easy to manage. You also have the option of placing an entire branch (/stack) of software onto one box, making a very elegant deployment strategy.

2. There’s a direct connection from the applications to their database nodes, minimising latency overheads.

Cons

1. Inefficient use of resources for several reasons:

  • Overuse: Database servers usually offer more spare capacity than their application layer counterparts, so dedicating a DBMS to a single branch may be overkill.
  • Bad resource consolidation: One server with a 7Gb buffer pool is much faster than two servers with 4Gb.
  • Increased unproductive overheads: Each server duplicates the work of the others.

2. If a DBMS (Galera) node fails, you’ve lost the entire branch.

3. Increased roll back rate, due to cluster-wide conflicts.

4. Inflexibility: There’s no way here to limit the number of master nodes or perform intelligent load balancing.

DBMS Tier Clustering
To address the short comings of the Stack Cluster, we could move the DBMS tier away from the application branches and present them as a single virtual server, through the use of a load balancer (Head Energy use HA Proxy, even in the EC2 environment at this stage).

Pros:

1. If one of the DBMS nodes fails, it fails in isolation, as we no longer lose application servers.

2. Improved resource consolidation.

3. Greatly improved flexibility, as we can now dedicate individual DBMS nodes to perform particular roles, and intelligently load balance when required.

Cons:

1. We now have a new single point of failure in the system, the load balancer. If you deploy this configuration, you’ll need to deploy two load balancer nodes, and arrange a failover between them.

2. Increasing management complexity, as we now have to manage the new load balancer, configuring it properly whenever a node fails or joins the cluster.

3. As we’ve introduced a new layer, the load balancers, we’ll have increased the latency for each query too. This can produce a bottleneck in some applications, and certainly the load balancers should be powerful, well equipped nodes.

4. Spreading this configuration over multiple availability zones (or data centres) may reverse all the benefits of resource consolidation, as each database centre will require at least 2 DBMS nodes.

DBMS Tier Clustering (with a distributed load balancer)
Pros:

1. A slight modification to the above configuration. In this modified configuration the load balancer is no longer a single point of failure, indeed it can now scale with the application layer and is unlikely to be a performance bottleneck in and of itself.

2. The latencies on communication between the application and server tiers will be lessened.

Cons:

There are now N load balancers to manage, and reconfigure when nodes leave and enter the pools. This can be somewhat mitigated with software deployment strategies such as Puppet, or using some more advanced load balanced with replication support.

Aggregated Stack Cluster
Last, but not least, the aggregated stack cluster is a hybrid approach of the configurations we’ve seen above. It’s tailored to smaller sites which might not need much more than replication across multiple zones / data centres. This is essentially what the previous configuration would look like if we leave one DBMS per stack.

Pros:

1. It improves resource utilization of an entire stack cluster.

2. Maintains the benefits offers in the Stack Cluster configuration of simplicity and direct DBMS connections.

Cons:

1. Not suitable for larger sites due to the single DBMS node for each stack and lack of load balancing.

Our recommendation

We’ve chosen to go with the DBMS tier cluster along with a distributed load balancer as we cross into different regions. This is primarily due to our platform being EC2, as it provides a resilient, reliable and cost effective load balancer, removing the worry of having a single point of failure. This also simplifies the deployment as more database servers are removed or added to the pool, as we only have to configure a single load balancer.

In my next post I’ll show some basic load testing results, and justify the number of Galera boxes I chose to deploy, along with some basic nodes about EC2 and general EC2 thoughts regarding MySQL deployment.

 

 

PHP on EC2 (AWS) with Multi A-Z and Multi Region (Part 2)

Looking for Part 1? See: http://www.headenergy.co.uk/2012/10/php-on-ec2-aws-with-multi-a-z-and-multi-region
Looking for Part 3? See: http://www.headenergy.co.uk/2012/11/php-on-ec2-aws-with-multi-a-z-and-multi-region-part-3
Looking for Part 4? See: http://www.headenergy.co.uk/2012/11/php-on-ec2-aws-with-multi-a-z-and-multi-region-part-4
 

A Quick Intro

In my last post I discussed the need to deploy a popular PHP CMS into the cloud which leveraged the scaleability and multi-az / multi-region redundancy offered by the excellent AWS. One of the first stumbling blocks to spreading the database load for our CMS (Concrete5 – which sure does generate a lot of queries for a page load! – ) was the inability to elegantly split our read / writes and send them to different MySQL nodes. After sifting through a few inadequate solutions, we settled on Galera as our solution.

Galera

Galera is a true multi-master, synchronous replication cluster for MySQL based on the well used InnoDB storage engine. Users can deploy the Galera cluster locally in LAN environments, as geo-clusters over the WAN or as a virtual cluster in cloud hosting platforms. Even better, Galera is offered as an Open Source software solution. It can be downloaded freely from http://www.codership.com.

The marketing blurb states:

“Galera Cluster is deployed by users who need highly available MySQL database back-ends with very fast fail-over time, consistent databases with no data loss and reduced investment in high availability architectures. Galera Cluster is ideal for business critical applications like Social networks, Gaming platforms, Telecom/VoiP solutions, Media and entertainment sites, Business Software as a Service (SaaS), Platform as a Service(PaaS), ERP systems, web-shops, e-commerce solutions or similar critical applications.”

Benefits

Multi-master replication means that all slave nodes can be used as the master at any given time, that is to say we can write to any node in the pool indiscriminately. Unlike the single threaded nature of MySQL native replication which is liable for bottlenecks, Galera runs on a parallel replication approach which improves slave lag significantly. Applications connecting to any server in the cluster can be confident they are reading and writing to the same stateful dataset.

As all nodes communicate to all other nodes, transmitting writes as appropriate, we can forget about the read / write splitting otherwise required by Concrete5 for MySQL replication.

Galera also supports WAN clustering and synchronous replication within that. Unavoidably, and understandably, there may be short periods of lag due to the network round-trip-time required by the synchronous state. Our testing shows that the round trip between Singapore, Ireland and East-Cost U.S is still well within the milliseconds rather than seconds.

For those of you running a slightly tighter applications, Galera also assigns a Global Transaction ID for all replicated transactions, therefore transactions can be uniquely referenced in any node.

Galera also facilitates automatic node joining. The cluster chooses ‘donor’ for the ‘joiner’.

The path has been well trodden on EC2, Galera has been well tested and deployed on the Amazon EC2 environment already.

Limitations

To start the cluster, Galera must have a minimum of 3 nodes in the cluster to operate effectively. As Galera will try and promote any server to a master server, a cluster of less (e.g. 2)  would suffer from a ‘split brain’  scenario wherein both servers are alive but unable to communicate to each other. With 3 servers this ‘split brain’ scenario cannot happen, so long as two servers continue to communicate. To extend this, an odd number of nodes is recommended as an application scales up.

The number of Galera nodes in the pool should not exceed 10, the mathematics are sound, if a little complicated, and can be seen on Codership here: http://www.codership.com/content/multi-master-arithmetics.

As mentioned, replication only works with the InnoDB storage engine, writes to other table types will simply not be replicated. We are using this “limitation” to easily remove replication for tables we know we don’t need to see replicated, e.g. non-critical logging tables. Rest assured however, DDL statements are replicated in statement level, and so any changes to mysql.* tables are replicated. We can safely issue:CREATE USER… or GRANT… but issuing: INSERT INTO mysql.user… would not be replicated. By their nature, non-transactional engines cannot be supported in multi-master replication.

Concrete5 requires some MyISAM tables for its text-search facility, specifically PageSearchIndex. it is relevant to note that MySQL 5.6 supports text searching within InnoDB tables (http://blogs.innodb.com/wp/2011/12/innodb-full-text-search-in-mysql-5-6-4), Galera will be releasing an updated version soon to incorporate this. In the interim, we’ve removed these indexes and moved to InnoDB regardless. In our case, we’re already deepening the functionality of the search engine by incorporating SOLR, and it has no impact on our existing project roadmap.

The next blog post will cover the various cluster configurations we have open to us on EC2 and, time permitting, some other general considerations for EC2, such as disk io, disaster recovery and more.

PHP on EC2 (AWS) with Multi A-Z and Multi Region (Part 1)

Looking for Part 2? http://www.headenergy.co.uk/2012/10/php-on-ec2-aws-with-multi-a-z-and-multi-region-part-2
Looking for Part 3? See: http://www.headenergy.co.uk/2012/11/php-on-ec2-aws-with-multi-a-z-and-multi-region-part-3
Looking for Part 4? See: http://www.headenergy.co.uk/2012/11/php-on-ec2-aws-with-multi-a-z-and-multi-region-part-4
 

As part of my work with a large client, we are finding ourselves implementing and adapting an open source PHP CMS as part of an ongoing project to migrate existing content systems onto a new platform. The platform (Concrete5) uses MySQL as it’s data store and so we know we require a MySQL database infrastructure that is compatible with Concrete5 but also meets their own requirements for a scalable, robust and resilient infrastructure which can support their expected load and traffic levels. Concrete5 does come with a robust caching system, but due to the shear dynamicism of the website coupled with the low cost approach to design, we’re unable to eagerly scaffold the cache in a similar way to Basecamp (interesting post about that here), at least not without a much larger project lifecycle.

So,  lets move on to what we know we want:

  • Multi A-Z, Multi-Region AWS (Amazon Web Services) deployment that will be resilient to entire region failure worldwide.
  • Horizontally scaleable, preferably taking advantage of the elastic properties of EC2. This system also needs to be scalable within the initially UK focused userbase, but also to allow geographic disparate environments (e.g. the Americas and/or Asia) which would still be updated from the UK administrators and be required to share data between all locales, preferably without the enormously costly (and not necessarily well fitting) Akamai family of services.
  • Be compatible with a PHP CMS that is unable to split read/write database queries (in this case, the rather aged and questionable PHP ADODB abstraction layer).
  • Provide a single synchronous file mount for all nodes in any region.
  • The system will support up to an anticipated concurrency level of 250 database connections at any one time.
  • Resiliant, the database layer needs to be robust, ensuring that in the event of a database server going offline that access to the data is not limited. The Web Servers need to automatically failover to a working node without users noticing.

There is some light in these demanding (to say the least) requirements, in that although all end users might require database writes for registration, form filling or survey completion in addition to the requirements of the site administrators to update content, the nature of use is predominately a read-heavy environment.

Analysis of Methods

There are a number of key problems in our requirements that are driving us away from native mysql replication. The following ideas were considered but discounted:

Native MySQL Replication

Native MySQL replication is very easy to set up and does offer a fast replication mechanism, it has two key drawbacks in our context:

  1. It is asyncronous
  2. With more than two servers in the environment, reads and writes from the application layer would need to be split such that writes only go to the master(s). It’s not practical to modify the core of an open-source CMS to carry this separation out. Furthermore, the often touted solution of MySQL Proxy by the community isn’t mature and production ready (- and doesn’t appear as though it ever will be thanks to Oracle politics) and would introduce a fairly inflexible single point of failure, thereby increasing an already slightly complex architecture a little too much for my inner calm. PHP’s own mysqlnd-ms seems to be another possibility with an added bonus of being local to the box, but again amounts to little more than a check for “SELECT” at the beginning of outgoing statements.

Native MySQL Cluster

MySQL cluster offers a clustering technology which can be described, simply, as RAID for a database. While it is a fast, synchronous clustering technology which supports writes on all nodes, it also has a number of limitations which excludes it for consideration.

  1. It requires the use of the NDB cluster engine for all tables in the cluster.
  2. The maximum size for a table row is 14 kilobytes, not counting BLOB values.
  3. The NDBCLUSTER engine does not support foreign key constraints. As with MyISAM tables, if these are specified in a CREATE TABLE or ALTER TABLE statement, they are ignored.
  4. Fulltext searching is not supported.
  5. Indexes exist in memory on each cluster node, requiring signifcantly more RAM than on a non-cluster server with the same dataset.

Tungsten Replicator

While Tungsten replicator can provide a synchronous and parallel replication environment, it still requires reads and writes to be split when the cluster has more than 2 servers.

The recommendation

The recommended technology which meets all requirements is Galera Cluser from Codership. A follow up blog post will be written to discuss why in the next few days, stay tuned!