Codership is pleased to announce the release of Galera Cluster 5.5.48 and 5.6.29 with Galera Replication library 3.15, implementing wsrep API version 25.

The library is now available as targeted packages and package repositories for a number of Linux distributions, including RHEL, Ubuntu, Debian, Fedora, CentOS, OpenSUSE and SLES. Obtaining packages using a package repository removes the need to download individual files and facilitates the deployment and upgrade of Galera nodes.

This and future releases will be available from http://www.galeracluster.com. The source repositories and bug tracking are now on http://www.github.com/codership.

This release incorporates all changes up to MySQL 5.5.48 and 5.6.29.

New features and notable fixes in Galera replication since the last binary release by Codership (3.14):

fixes for compiling on Alpha / HP PA / s390x architectures (codership/galera#389)
A Contribution agreement was added in order to facilitate future contributions (codership/galera#386)

New features and notable changes in MySQL-wsrep since the last binary release by Codership (5.6.28):

A new variable, wsrep_dirty_reads, can be used to enable reading from a non-primary node
A new variable, wsrep_reject_queries, can be used to instruct a node to reject incoming queries or terminate existing connections
Issuing FLUSH TABLES WITH READ LOCK will cause the node to stop participating in flow control so that the other nodes do not become blocked (MW-252)
The wsrep_sst_xtrabackup script has been updated from the upstream project
A Contribution agreement was added in order to facilitate future contributions to the project

Notable bug fixes in MySQL-wsrep:

If wsrep_desync is already set, running DDL under RSU could fail (MW-246)
Wrong auto_increment values could be generated if Galera was a slave to an asynchronous master that is using STATEMENT replication (MW-248)
If a prepared statement was a victim of a conflict and Galera attempted to rerun it, the slave could abort (MW-255)

New features, notable changes and bug fixes in MySQL 5.6.29:

yaSSL has been upgraded to version 2.3.9
A new session variable, innodb_tmpdir, can be used to specify a separate temporary directory for ALTER TABLE operations
DROP TABLE statements that contain non-regular characters could cause replication to break (MySQL Bug #77249)
Codership is pleased to announce the release of Galera Cluster 5.5.47 and 5.6.28 with Galera Replication library 3.14, implementing wsrep API version 25.

The library is now available as targeted packages and package repositories for a number of Linux distributions, including RHEL, Ubuntu, Debian, Fedora, CentOS, OpenSUSE and SLES. Obtaining packages using a package repository removes the need to download individual files and facilitates the deployment and upgrade of Galera nodes.

This and future releases will be available from http://www.galeracluster.com. The source repositories and bug tracking are now on http://www.github.com/codership.

This release incorporates all changes up to MySQL 5.5.47 and 5.6.28.

New features and notable fixes in Galera replication since the last binary release by Codership (3.13):

use the system ASIO library when compiling, if available (codership/galera#367)
improvements to Debian packages that allow a Galera library package downloaded from galeracluster.com to be used with MariaDB Enterprise Cluster and Percona XtraDB Cluster

New features and notable changes in MySQL-wsrep 5.6 since the last binary release by Codership (5.6.27):

If a query such as CREATE USER needs to be printed to the error log, any plaintext passwords will be obfuscated (codership/mysql-wsrep#216)
All SHOW CREATE and SHOW CODE commands now observe the wsrep_sync_wait variable (codership/mysql-wsrep#228)

Notable bug fixes in MySQL-wsrep 5.6.28:

FLUSH TABLES could cause the cluster to hang (codership/mysql-wsrep#237)
The following transaction could be needlessly aborted if the previous one was aborted (codership/mysql-wsrep#248)
A node could hang under a workload containing both DDL and DML statements (codership/mysql-wsrep#233)

New features and notable changes in MySQL 5.6.28:

Miscellaneous bug fixes in InnoDB
More error conditions when writing to the binary log are caught and handled based on the value of the binlog_error_action variable
The results in the survey were very favourable to Galera Cluster providing high availability for MySQL and OpenStack components. Galera Cluster with MariaDB received second place, Galera with MySQL fifth place and Galera with Percona Server (Percona XtraDB Cluster) sixth place. Altogether, Galera Cluster with three different MySQL variants is clearly the most popular database high availability solution for OpenStack users.

This survey report analyzes respondents who completed or updated the survey during a two-week window in September 2015, and its questions represent some modifications from prior surveys in keeping with the evolution of the OpenStack platform. The survey represents a snapshot of 1,315 users and 352 deployments, provided voluntarily. The User Survey is not a market survey and does not express all OpenStack deployments worldwide.

The full report can be read here.
Codership is pleased to announce the release of Galera Cluster 5.5.46 and 5.6.27 with Galera Replication library 3.13, implementing wsrep API version 25.

Galera Cluster is now available as targeted packages and package repositories for a number of Linux distributions, including Ubuntu, Red Hat, Debian, Fedora, CentOS, OpenSUSE and SLES. Obtaining packages using a package repository removes the need to download individual files and facilitates the deployment and upgrade of Galera nodes.

This and future releases will be available from http://www.galeracluster.com, while previous releases remain available on LaunchPad. The source repositories and bug tracking are now on http://www.github.com/codership.

New features and notable changes in Galera Cluster and the Galera library:

security fix for the LogJam issue: the key length used for creating Diffie-Hellman keys has been increased to 2,048 bits
a "compat" package is now provided to allow MySQL-wsrep to be installed without removing packages such as Postfix which depend on older MySQL versions
The MySQL-wsrep packages are now built with OpenSSL rather than YaSSL (codership/mysql-wsrep#121)
Galera error messages have been enhanced to contain the current schema name along with the query (codership/mysql-wsrep#202)
IB atomic builtins are no longer used when compiling as they may cause the server to hang (codership/mysql-wsrep#221)
the query cache is now compatible with wsrep_sync_wait (codership/mysql-wsrep#201)
a deadlock could occur between the applier thread and an aborted transaction (codership/mysql-wsrep#184)
a memory leak could occur when using SHOW STATUS (codership/galera#308)
DDL was not recorded in the InnoDB header, causing InnoDB to recover to an earlier position (codership/mysql-wsrep#31)
an assertion could happen with Prepared Statements (codership/mysql-wsrep#125, codership/mysql-wsrep#126)
the 'could not find key from cert index' warning will no longer be printed in certain situations (codership/galera#361)
fix compilation on the latest Debian release (codership/galera#321)
fix compilation on FreeBSD
several fixes to the build scripts to support various distros and architectures (codership/galera#321)
Investigation run by Marco Tusa.

CONCLUSIONS

MySQL/Galera was able to outperform Aurora in all tests: by execution time, number of transactions, and volumes of rows managed. Also, scaling up the Aurora instance did not have the impact I was expecting. It was still not able to match the performance of MySQL/Galera on EC2, which had less memory and fewer CPUs.

In light of the tests, the recommendations consider different factors to answer the question, "Which is the best tool for the job?" If HA and very low failover time are the major factors, MySQL with Galera is the right choice.

READ THE FULL STORY AND INVESTIGATION
Introduction

The purpose of this article is to describe how Galera Cluster multi-master replication provides high availability for MySQL beyond simply replicating all updates to multiple nodes.

High availability has multiple dimensions, such as being able to detect and tolerate failures in individual components and to recover quickly. We will discuss the different failure modes that can happen in a cluster and how Galera facilitates the detection of and recovery from each situation.

Your load balancer and application may be governed by different timeouts and recovery mechanisms, but an operational Galera Cluster will provide a stable foundation to recover the rest of your infrastructure in case of a widespread outage.

Failures of Individual Nodes

Synchronous replication requires the participation of all nodes, but a Galera Cluster will detect and automatically remove a node that has gone down within the default timeout of 5 seconds (configurable using the evs.suspect_timeout option). The timeout places an upper bound on the time a transaction attempting to commit at the moment of a node failure could be blocked.

After the node has been evicted from the cluster, if enough other nodes remain to form a majority, the cluster will continue to operate without further interruptions.

Recovery

The recovery procedure for a Galera Cluster node is streamlined so that there are as few surprises as possible during a time of high stress. The procedure is identical regardless of which node has failed, and the customary multi-step procedure of promoting a slave to master that is used in legacy MySQL replication is not needed in Galera.

If the node still has its data files, restarting the server via the init script will cause it to rejoin the cluster automatically. (Internally, mysqld will first be called with the --wsrep-recover option to initiate InnoDB recovery, followed by the actual server restart.)
If the downtime was short, Galera Cluster will bring the node up to speed by replaying any transactions that it missed while it was down, the so-called Incremental State Transfer (IST). If the downtime was longer, the node may need to be instantiated anew using a complete copy of the database from another node (State Snapshot Transfer, or SST).

If the node no longer has its data files, Galera will bring in a copy from another node automatically. While it is also possible to restore the node from a backup, it is not required.

In all cases the procedure happens internally and can proceed without administrator intervention.

Mitigation

A Galera cluster can tolerate the loss of any minority of its nodes (fewer than half) and remain operational. It is possible to create 7- and 9-node Galera clusters in order to increase the number of failed nodes that can be tolerated (3 and 4, respectively). On the smaller side of the scale, using the Galera Arbitrator allows a two-node cluster to survive the failure of a single node.

It is possible to avoid complete snapshot transfers when restarting after non-destructive outages by giving Galera nodes more disk space for storing recent database updates, which is done by increasing the value of the gcache.size variable.

Node Instability

A node which is behaving erratically or whose network connection is unstable can be detected and evicted from the cluster via the Auto Eviction mechanism. The administrator specifies the maximum number of incidents per unit of time that will be tolerated.
When the limit is reached, the node is instructed to shut down so as not to impact the operation of the cluster.

Network Issues

If there is a network issue that prevents just two nodes or groups of nodes from communicating directly with each other, while the rest of the cluster can communicate correctly, Galera will internally reroute the traffic to work around the network failure.

Whole Datacenter Failure

Galera can be used to build geo-distributed clusters that can handle the failure of an entire datacenter, even if multiple nodes were located in that datacenter.

Mitigation

Galera is not limited to two datacenters, so it is recommended that you use three or more datacenters for maximum reliability. In a three-datacenter cluster the majority of the nodes will remain running, avoiding the case where the cluster is split exactly in the middle.

In the case of a two-datacenter setup, one of the datacenters can be designated the main one by adding more nodes to it, installing the Galera Arbitrator in it, or giving its nodes a higher weight using the pc.weight option. In case of a network split that causes the datacenters to lose contact with one another, the datacenter having the bigger number of nodes by weight will remain running and continue to service requests.

Recovery

While it is possible to simply restart the failed nodes together, data transfers over the WAN may be minimized if one node is restarted and left to join the cluster first. When the other nodes in the datacenter are restarted, they may elect to come up to speed using that node as a donor.

Whole-cluster Failures

Prevention

Galera does not require that nodes are located on the same physical network or in the same physical location, or that they share the same storage, so a truly shared-nothing cluster can be built even without using geo-distributed replication.
It is possible to reduce the chance of a whole-cluster outage by having nodes in different availability zones or even different datacenters, providing the desired amount of isolation from common-cause failures.

Recovery

If the pc.recover option is used and all machines were successfully restarted after the outage is over, the only administrator action required is to restart the nodes in any order; they will find each other and recreate the cluster as it existed before the failure.

If that option is not used, or some machines have not survived, it is important to determine which node died last and restart it first. The rest of the nodes can be restarted in any order.

If all nodes lost their data files in a truly catastrophic outage, recovering Galera Cluster from a backup uses the same tools that are used to recover a stand-alone MySQL server. As soon as the first node of the cluster has been recovered from a backup, it can be made immediately available for servicing requests. More nodes can then be started, and they will automatically fetch a copy of the database.

Configuration and Procedural Errors

Galera Cluster has various characteristics that make it simple to configure and maintain. These same features are very useful in emergency situations when trying to restore service under a high level of stress. Fewer steps to perform and fewer files to take care of decrease the potential for mistakes and result in faster recovery times.

Limited External Dependencies

Galera requires only working TCP networking in order to run. There is no requirement for things such as functioning multicast or dedicated network interfaces.
Galera has no dependence on third-party services or libraries that are unlikely to be present on a fresh server or that will not be brought in by the package manager when Galera is installed on a replacement server.

Galera can even start up without a functioning DNS service if IP addresses are present in the wsrep_cluster_address variable. It does not require SANs, shared storage, or user or file permissions that are not present by default.

Single Configuration

The entire configuration for Galera Cluster is contained within MySQL's my.cnf file (or files included from it), which enables it to be quickly restored or recreated in case of an emergency. The configuration file is not required to contain any host-specific entries, so it can be reused across nodes.

No Requirement for SQL Commands

Galera Cluster does not require issuing commands such as CHANGE MASTER or RESET SLAVE via the command-line interface. There is no need to figure out where replication stopped in order to determine the proper parameters to pass to such commands.

Limited File "Sprawl"

The only Galera file that is located outside of the MySQL directory hierarchy is the libgalera_smm.so library. Galera stores its data files in MySQL's data directory, but does not require them to survive restarts or be restored from a backup. Galera prints its log messages to MySQL's error log.

Automatic Replication of Authentication

A newly-joining Galera node will obtain the authentication database for the SQL users from the cluster, so it will be able to accept incoming connections from the application without the need to set up users manually.

Conclusion

Galera Cluster is not merely a way to replicate data from one MySQL server to another. It is a complete clustering solution for MySQL high availability that is designed to handle all possible failure scenarios and allow for speedy recovery from each one. The database can be kept running and available under very challenging circumstances.
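To make the single-configuration point above concrete, here is a sketch of a minimal wsrep section in my.cnf. Every path, name, and address below is illustrative rather than prescriptive; the provider library path in particular varies by distribution.

```ini
[mysqld]
# Galera provider library: the one file outside the MySQL hierarchy.
# Path is an example; it differs between distributions.
wsrep_provider=/usr/lib/galera/libgalera_smm.so

# Plain IP addresses work even without functioning DNS.
wsrep_cluster_name=example_cluster
wsrep_cluster_address=gcomm://192.168.0.1,192.168.0.2,192.168.0.3

# A larger gcache lets a briefly-down node rejoin via IST
# rather than a full state snapshot transfer.
wsrep_provider_options="gcache.size=1G"

# Settings commonly required for Galera replication.
binlog_format=ROW
default_storage_engine=InnoDB
innodb_autoinc_lock_mode=2
```

Note that the fragment contains no host-specific entries, so the same file can be deployed to every node in the cluster.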
Codership is pleased to announce the release of Galera Cluster 5.5.42 and 5.6.25 with Galera Replication library 3.12, implementing wsrep API version 25.
Galera Cluster is now available as targeted packages and package repositories for a number of Linux distributions, including Ubuntu, Debian, Fedora, CentOS, OpenSUSE and SLES. Obtaining packages using a package repository removes the need to download individual files and facilitates the deployment and upgrade of Galera nodes.
This and future releases will be available from http://www.galeracluster.com, while previous releases remain available on LaunchPad. The source repositories and bug tracking are now on http://www.github.com/codership.
New features and notable changes in Galera Cluster and the Galera library:
Various forms of FLUSH that are replicated in traditional MySQL async replication are now also replicated in Galera under TOI (codership/mysql-wsrep#67)
The applier thread will now honor FLUSH TABLES WITH READ LOCK, FLUSH FOR EXPORT and will block until the lock is released (codership/mysql-wsrep#113)
Support for Debian Jessie (galera/mysql-wsrep#127,codership/galera#264)
The SST password is no longer passed via the command line or visible in the error log or ‘ps’ output (codership/mysql-wsrep#141)
The xtrabackup SST script has been updated from the upstream source (codership/mysql-wsrep#143)
Galera will abort gracefully if there is no disk space to write the required gcache files (codership/galera#324)
Gcache files are removed faster than before in order to reduce Galera disk usage (codership/galera#317)
Better error logging in case of SSL errors or misconfiguration (codership/galera#290)
The configuration in /etc/sysconfig/garb is now properly honored by the garbd systemd service (codership/galera#267)
Arbitrator service no longer starts automatically on package installation, giving the user the opportunity to configure it first (codership/galera#266)
Miscellaneous fixes in the garb startup script (codership/galera#186)
In this post, we will describe the Primary Component, a central concept in how Galera ensures that there is no opportunity for database inconsistency or divergence between the nodes in case of a network split.
What is the Primary Component?
The Primary Component is that set of Galera nodes that can communicate with each other over the network and contains the majority of the nodes. In case of a network partition, it is those nodes that can safely commit a transaction. A cluster can only have one such set of nodes, as there can only be one majority. No other set of nodes will commit transactions, thus removing the possibility of two parts of the cluster committing different transactions and thus diverging and becoming inconsistent.
The Healthy Cluster
In a healthy cluster, all nodes can communicate with each other, so they all belong to the Primary Component and can all receive updates. There are no network partitions and therefore there are no nodes which have become separated. The wsrep_cluster_status status variable reports Primary on all nodes.
MySQL [test]> show status like 'wsrep_cluster_status';
+----------------------+---------+
| Variable_name        | Value   |
+----------------------+---------+
| wsrep_cluster_status | Primary |
+----------------------+---------+
wsrep_cluster_status is a good variable to monitor on every node using your monitoring application or load balancer.
On any node that is in the Primary Component, the wsrep_cluster_size status variable shows the current number of nodes in the cluster:
MySQL [test]> show status like 'wsrep_cluster_size';
+--------------------+-------+
| Variable_name      | Value |
+--------------------+-------+
| wsrep_cluster_size | 3     |
+--------------------+-------+
1 row in set (0.00 sec)
If you have a need for the data to be replicated to N servers or locations for reliability reasons, configure your monitoring framework to alert you if the value of wsrep_cluster_size drops below N.
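As a sketch, a monitoring check along these lines could evaluate the two status variables together. The function check_node and its inputs are hypothetical; in practice the rows would be fetched with a MySQL client rather than hard-coded.

```python
# Hypothetical health check built on the two status variables discussed
# above. In practice, status_rows would be filled from the output of
# SHOW STATUS LIKE 'wsrep_%'; here it is a plain dict of name -> value.

def check_node(status_rows, min_cluster_size):
    """Return a list of problems; an empty list means the node looks healthy."""
    problems = []
    if status_rows.get("wsrep_cluster_status") != "Primary":
        problems.append("node is not in the Primary Component")
    size = int(status_rows.get("wsrep_cluster_size", "0"))
    if size < min_cluster_size:
        problems.append("cluster size %d is below the required %d"
                        % (size, min_cluster_size))
    return problems

# A healthy three-node cluster checked against N=3 raises no alerts:
print(check_node({"wsrep_cluster_status": "Primary",
                  "wsrep_cluster_size": "3"}, 3))  # -> []
```

A check like this can run on every node, since wsrep_cluster_status must be Primary on each node individually.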
Handling Network Partitions
If one or more nodes become separated from the cluster by a network partition, each node in the cluster will decide whether it is on the majority (primary) or the minority side of the partition.
The nodes that detect they are in the minority will transition to a state of Non-Primary and refuse further queries. Writes to those nodes will be prevented as they can no longer guarantee that a conflicting write is not being performed on the Primary Component at the same time.
Reading from the non-Primary nodes will also be disabled, as they are no longer up-to-date with respect to the authoritative data held on the majority portion of the cluster.
Any transactions that were being committed while the network outage was in the process of being detected will return an error and must be retried by the application.
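The application's retry obligation can be sketched as a small wrapper. TransientClusterError and run_with_retry are hypothetical names for illustration; a real MySQL driver would instead surface a transient error such as a deadlock, which the wrapper would catch.

```python
# Hypothetical retry wrapper: a transaction aborted while a partition is
# being detected (or on a certification conflict) returns an error that
# the application is expected to handle by retrying.

class TransientClusterError(Exception):
    """Stands in for a driver-level transient error (e.g. a deadlock)."""

def run_with_retry(txn, attempts=3):
    """Call txn() until it succeeds or the attempt budget is exhausted."""
    for attempt in range(1, attempts + 1):
        try:
            return txn()
        except TransientClusterError:
            if attempt == attempts:
                raise  # out of attempts: let the caller see the failure

# A transaction that fails twice still commits on the third attempt:
calls = {"n": 0}
def flaky_txn():
    calls["n"] += 1
    if calls["n"] < 3:
        raise TransientClusterError()
    return "committed"

print(run_with_retry(flaky_txn))  # -> committed
```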
The nodes that detect they are in the majority will remain in a state of Primary and will continue to process future transactions. The value of wsrep_cluster_size on those nodes will reflect the size of the now-reduced Primary Component of the cluster.
Recovery After A Network Partition
As soon as the network partition or the outage is healed, any nodes not in the Primary Component that have continued to run will synchronize with the nodes from the Primary Component and will rejoin the cluster. The value of wsrep_cluster_size will increase by the number of nodes that have rejoined.
Any nodes where the mysqld processes have terminated will need to be restarted in order to rejoin.
The Split Brain Problem
A problem that happens both in theory and in practice is the so-called split-brain situation, where the cluster gets split by a network outage into two exactly equal parts. A software system that is not prepared to handle that eventuality could allow conflicting transactions to be executed on the separate parts of the cluster while they are not coordinating. This would cause the databases on each side to diverge without the possibility of an automatic reconciliation later.
Galera Cluster safeguards against this particular problem. As no set of nodes will have the majority, no part of the cluster can be considered Primary, so all parts of the cluster will transition to a state of Non-Primary, all refusing further queries in order to protect the integrity of the database.
To prevent split-brain scenarios, never use an even number of nodes or data centers in your setup. If it is not practical to do so, Galera provides several alternatives:
Install a Galera Arbitrator process to break ties. Note that this process, even though not a fully-featured database, continues to receive all replication traffic, so it must be secured appropriately and provided with sufficient bandwidth.
Use the pc.weight setting of wsrep_provider_options to assign a weight greater than 1 to one of the nodes. This weight will then be considered in majority calculations, and ties may be avoided.
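For instance, the weight can be adjusted at runtime as sketched below (it can equally be set through wsrep_provider_options in my.cnf); the value 2 is only illustrative:

```sql
-- Give this node a weight of 2 in quorum calculations, so that the side
-- of an even split that contains this node retains the majority.
SET GLOBAL wsrep_provider_options = 'pc.weight=2';
```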