What Is MySQL High Availability?

MySQL High Availability is an option you can select to keep your MySQL database available in the event of a failure or disruption. The feature helps you meet higher uptime requirements and a zero data loss tolerance. In this article, we’ll take a look at what the general concept of high availability means, and how MySQL’s High Availability option works.

What Is High Availability?

High availability is the ability of a system or service to continue functioning and remain available when a failure or outage occurs. A highly available system ensures an organisation’s mission-critical systems and applications are always up and running. It’s especially important for organisations in sectors like healthcare, finance, and aviation where the failure of a mission-critical system could have serious consequences.

High availability is typically expressed as a percentage of uptime defined by service level agreements (SLAs), with 100% representing a system that never fails. Because this is virtually unachievable, most organisations aim for “five nines” or 99.999% availability, which allows for a little over five minutes of downtime per year.
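To make the percentages concrete, the short sketch below converts an availability target into a yearly downtime budget. It assumes a 365-day year, and the function name is purely illustrative rather than part of any MySQL tooling.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes in a non-leap year


def allowed_downtime_minutes(availability_percent: float) -> float:
    """Return the yearly downtime budget, in minutes, for an availability target."""
    if not 0 <= availability_percent <= 100:
        raise ValueError("availability must be between 0 and 100")
    return MINUTES_PER_YEAR * (1 - availability_percent / 100)


if __name__ == "__main__":
    # Compare common targets, from "two nines" to "five nines".
    for target in (99.0, 99.9, 99.99, 99.999):
        minutes = allowed_downtime_minutes(target)
        print(f"{target}% -> {minutes:.2f} minutes per year")
```

Each additional nine cuts the downtime budget by a factor of ten, which is why the step from 99.99% to 99.999% usually calls for automatic rather than manual failover.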

How MySQL Achieves High Availability

A highly available system must be able to recover quickly when a fault occurs. A high availability architecture requires at least three basic elements that work together to ensure recoverability: failure detection, failover, and a redirection mechanism.

Failure Detection

MySQL has a High Availability option that enables applications to meet requirements for higher uptime (and zero data loss tolerance). When the High Availability option is on, the MySQL system creates three instances across different fault domains or availability zones.

Data is replicated among the three instances using MySQL Group Replication, and the application connects to the primary instance for reading and writing data to and from the database. If a failure occurs, the system triggers an automatic failover to a secondary instance within minutes.

Failover

The failover mechanism transfers services to a replicated instance. If more than one backup instance is available, the failover mechanism chooses the best one to promote to the primary node.

A Redirection Mechanism

Once failover to a secondary instance occurs, the High Availability feature redirects all application and user connections to what is now the new primary node. It also redirects all queries from the old primary node to the new primary database.
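The three elements described above (failure detection, failover, and redirection) can be sketched as a toy model. This is not how MySQL implements them internally: the `Instance` and `Endpoint` classes, the `applied_transactions` field, and the "most up to date wins" rule are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Instance:
    name: str
    healthy: bool
    applied_transactions: int  # hypothetical measure of how up to date the copy is


def choose_new_primary(secondaries: List[Instance]) -> Optional[Instance]:
    """Pick the healthy secondary that has applied the most transactions."""
    candidates = [s for s in secondaries if s.healthy]
    if not candidates:
        return None
    return max(candidates, key=lambda s: s.applied_transactions)


class Endpoint:
    """Stand-in for the address that applications connect to."""

    def __init__(self, primary: Instance, secondaries: List[Instance]) -> None:
        self.primary = primary
        self.secondaries = secondaries

    def check_and_failover(self) -> str:
        """Detect a failed primary, promote a secondary, and redirect traffic.

        Returns the name of the instance that now receives connections.
        """
        if self.primary.healthy:  # failure detection
            return self.primary.name
        replacement = choose_new_primary(self.secondaries)  # failover
        if replacement is None:
            raise RuntimeError("no healthy secondary available")
        self.secondaries.remove(replacement)
        self.secondaries.append(self.primary)  # old primary is kept as a failed member
        self.primary = replacement  # redirection: the endpoint now points here
        return self.primary.name
```

Because applications only ever talk to the endpoint, they do not need to know which instance is currently the primary.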

MySQL High Availability: Uptime

Uptime is the time that a system is available and functioning correctly and is expressed as a percentage of the total time the system is expected to be operational. High uptime means that the system is available and functioning as expected most of the time.

The uptime you can expect with different levels of MySQL High Availability will depend on the specific high availability (HA) solution you implement.

MySQL Replication

MySQL Replication allows you to set up multiple servers to provide redundancy and failover, supporting higher uptimes than a MySQL server with no HA capability. A master-slave configuration (called source-replica in current MySQL documentation) uses a single master server that accepts reads and writes and one or more read-only slave servers. By default, data from the master server is replicated asynchronously to the slave servers.

To implement failover, you’ll need to set up one or more slave servers as standbys that can be promoted to master in the event of a failure. Failover is generally a manual process where you have to promote the slave node to the master node by changing the status of the promoted slave to read-write mode so that it can accept queries.

Because failover is done manually, it will take longer and could be prone to human error, leading to a longer outage. MySQL Replication also uses asynchronous replication, which means that if the master fails, transactions committed on the master may not yet have replicated to the slave servers. If there is critical data loss, data will need to be restored, adding to the time the system is down.
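The data loss risk of asynchronous replication can be illustrated with a small simulation. The class below is a toy model written for this article, not MySQL code: the master acknowledges a commit immediately, and the slave server catches up later.

```python
from typing import List


class AsyncReplicationModel:
    """Toy model of one master and one asynchronous slave server."""

    def __init__(self) -> None:
        self.committed: List[str] = []   # transactions committed on the master
        self.replicated: List[str] = []  # transactions the slave has applied

    def commit(self, txn: str) -> None:
        # The master acknowledges the commit without waiting for the slave.
        self.committed.append(txn)

    def replicate(self, max_txns: int) -> None:
        # The slave applies at most max_txns of the pending transactions.
        start = len(self.replicated)
        self.replicated.extend(self.committed[start:start + max_txns])

    def lost_if_master_fails_now(self) -> List[str]:
        # Anything the slave has not applied is missing after promotion.
        return self.committed[len(self.replicated):]


if __name__ == "__main__":
    model = AsyncReplicationModel()
    for txn in ("t1", "t2", "t3"):
        model.commit(txn)
    model.replicate(max_txns=2)
    print(model.lost_if_master_fails_now())
```

The larger the replication lag at the moment of failure, the more committed transactions have to be recovered by other means.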

MySQL Group Replication

MySQL Group Replication allows you to achieve higher uptimes than MySQL Replication. Using MySQL Group Replication, you set up multiple MySQL servers in a group with one server designated as the primary server and the other servers as secondary servers. Each server in the group maintains a copy of the data and uses replication to ensure that the copies remain in sync.

If the primary server goes down, the secondary servers in the group automatically detect the failure and begin the failover process. One of the secondary servers is automatically promoted to the new primary server and begins serving requests from clients. The other secondary members in the group now receive updates from the new primary server and continue processing client-read requests.

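When the failed server comes back online, it can rejoin the group as a secondary server and catch up on the transactions it missed. Whether this happens automatically depends on how the server is configured.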

Because failure detection and failover occur automatically with MySQL Group Replication, downtime is minimal, and users and applications are typically unaware that an outage has occurred.
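Automatic failover in Group Replication relies on a majority of members still being able to reach one another. The helper functions below sketch that arithmetic; the function names are illustrative and not part of MySQL.

```python
def majority(group_size: int) -> int:
    """Smallest number of members that forms a majority of the group."""
    if group_size < 1:
        raise ValueError("group_size must be at least 1")
    return group_size // 2 + 1


def tolerated_failures(group_size: int) -> int:
    """How many members can fail while a majority remains."""
    return group_size - majority(group_size)


def has_quorum(group_size: int, reachable_members: int) -> bool:
    """True if the reachable members still form a majority."""
    return reachable_members >= majority(group_size)


if __name__ == "__main__":
    for size in (3, 5, 7):
        print(f"{size} members tolerate {tolerated_failures(size)} failure(s)")
```

This is why a three-member group is a common minimum: it is the smallest group that can lose a member and still keep a majority, whereas a two-member group cannot tolerate any failure.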

MySQL Cluster

A MySQL Cluster HA solution offers the highest level of uptime of the options covered here. MySQL Cluster is a highly available, distributed database system with automatic failover and load balancing. It is designed to provide near-zero downtime along with high levels of performance and scalability.

MySQL Cluster uses three types of nodes that work together to store and manage data:

Data nodes: Store data and handle read and write queries.
MySQL server nodes: Receive queries from the client applications, process them on the data nodes, and then return the result to the clients.
Management nodes: Manage the configuration and operation of the cluster, such as starting and stopping nodes, and help coordinate recovery if a failure occurs.

If one or more nodes in a cluster fail, the cluster automatically detects the issue and triggers the failover process. The entire process typically happens within one second of a failure, without interrupting service to client applications. The cluster continues to operate as normal with virtually no downtime.
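How many node failures a cluster can absorb depends on how its data nodes are grouped. In MySQL Cluster, data nodes are organised into node groups that each hold replicas of the same data, and the cluster keeps running as long as every node group has at least one live data node. The sketch below checks that condition; the function name and the node labels are hypothetical.

```python
from typing import Dict, List, Set


def cluster_survives(node_groups: Dict[int, List[str]], failed: Set[str]) -> bool:
    """Return True if every node group still has at least one live data node."""
    return all(
        any(node not in failed for node in nodes)
        for nodes in node_groups.values()
    )


if __name__ == "__main__":
    # Four data nodes in two node groups, two replicas of each data fragment.
    groups = {0: ["A", "B"], 1: ["C", "D"]}
    print(cluster_survives(groups, failed={"A", "C"}))  # one node lost per group
    print(cluster_survives(groups, failed={"A", "B"}))  # a whole group lost
```

Losing one node from each group is survivable, while losing both nodes of the same group is not, because that group's share of the data is no longer available.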

MySQL High Availability: Recovery Time

Recovery time is the time it takes for a MySQL system to recover from an outage. A longer recovery time reduces availability and can directly affect the business’s ability to generate revenue, employee productivity, and customer satisfaction.
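The link between recovery time and availability can be expressed with a standard reliability formula: availability = MTBF / (MTBF + MTTR), where MTBF is the mean time between failures and MTTR is the mean time to recover. These terms do not appear elsewhere in this article and are introduced here only for illustration.

```python
def availability_percent(mtbf_hours: float, mttr_hours: float) -> float:
    """Steady-state availability from mean time between failures and mean time to recover."""
    if mtbf_hours <= 0:
        raise ValueError("mtbf_hours must be positive")
    if mttr_hours < 0:
        raise ValueError("mttr_hours must not be negative")
    return 100 * mtbf_hours / (mtbf_hours + mttr_hours)


if __name__ == "__main__":
    # Same failure rate, different recovery times (illustrative inputs only).
    for mttr in (2.0, 0.5, 0.01):
        print(f"MTTR {mttr} h -> {availability_percent(1000, mttr):.4f}% available")
```

With the failure rate held constant, shortening the recovery time is the only way to raise availability, which is what automatic failover aims to do.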

In MySQL, recovery times will vary depending on the replication type that you use:

MySQL Replication recovery times for master-slave replication will be affected by the manual failover process. After you promote a slave server to be the new master, you’ll need to repoint the remaining slave servers so that they replicate from it. Then, you’ll need to account for missing transactions and resolve any conflicts that might occur.
Group Replication uses an automatic failure detection and failover process that results in lower recovery times than master-slave replication. Conflict detection keeps the data consistent across all servers in the group: when transactions executed on different servers conflict, a certification process lets the transaction that was ordered first commit and rolls back the other. With Group Replication, the system can recover from a failure with very little downtime.
MySQL Cluster uses a “shared nothing” approach, where each node in the cluster is assigned its own memory and disk storage and communicates with other nodes using a high-speed connection. MySQL Cluster continues operating even if one or more nodes fail. The cluster automatically detects the issue and triggers the failover process to recover with virtually no downtime.

How to Determine Your MySQL HA Requirements

To determine your MySQL High Availability requirements, you’ll need to consider several factors, including:

Your current system architecture: What components does your current system contain, and how are they configured? Can they support MySQL High Availability?
Budget: How much will you need to invest in resources such as hardware, software, and personnel? Also, consider costs associated with training and ongoing maintenance.
Business needs: Consider your recovery time objectives (RTO) and recovery point objectives (RPO). How quickly do you need to recover from a failure, and how recent must the recovered data be? Also consider whether your organisation is subject to any regulatory or compliance requirements that mandate high availability.
Data criticality: How critical is your business data? How important is it to be up-to-date? How much data loss can you afford?
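Once you have an RTO and RPO, you can compare them against what each candidate solution is expected to deliver. The sketch below shows one way to do that. The `HaOption` class, its field names, and the figures in the usage example are placeholders, not measured values; substitute numbers from your own testing.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass(frozen=True)
class HaOption:
    name: str
    expected_recovery_seconds: float     # how long failover is expected to take
    worst_case_data_loss_seconds: float  # window of recent writes that could be lost


def options_meeting_objectives(
    options: Iterable[HaOption], rto_seconds: float, rpo_seconds: float
) -> List[str]:
    """Return the names of the options that satisfy both the RTO and the RPO."""
    return [
        option.name
        for option in options
        if option.expected_recovery_seconds <= rto_seconds
        and option.worst_case_data_loss_seconds <= rpo_seconds
    ]


if __name__ == "__main__":
    # Placeholder figures for illustration only; measure your own environment.
    candidates = [
        HaOption("manual-failover", 1800, 30),
        HaOption("automatic-failover", 60, 0),
    ]
    print(options_meeting_objectives(candidates, rto_seconds=300, rpo_seconds=5))
```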

When to Use MySQL High Availability

Let’s consider a couple of use cases that require MySQL High Availability solutions:

High-traffic Websites

High-traffic websites handle thousands of queries and transactions per second, not to mention thousands of concurrent users. High availability measures such as server redundancy and load balancing can ensure that the database remains available and can handle the load.

Redundant servers will ensure the website remains available even if a server fails, and load-balancing incoming requests across multiple servers will prevent a single server from overloading and going offline.
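As a rough illustration of how redundancy and load balancing work together, the sketch below rotates requests across a set of servers and skips any that are marked unhealthy. It is a simplified, hypothetical balancer, not a MySQL component, and the server names are placeholders.

```python
from itertools import cycle
from typing import Dict, Iterable


class RoundRobinBalancer:
    """Rotate through servers in order, skipping those marked unhealthy."""

    def __init__(self, servers: Iterable[str]) -> None:
        self.health: Dict[str, bool] = {server: True for server in servers}
        if not self.health:
            raise ValueError("at least one server is required")
        self._ring = cycle(list(self.health))

    def mark(self, server: str, healthy: bool) -> None:
        """Record the result of a health check for one server."""
        if server not in self.health:
            raise KeyError(server)
        self.health[server] = healthy

    def next_server(self) -> str:
        """Return the next healthy server in rotation."""
        # Try each server at most once per call before giving up.
        for _ in range(len(self.health)):
            candidate = next(self._ring)
            if self.health[candidate]:
                return candidate
        raise RuntimeError("no healthy servers available")


if __name__ == "__main__":
    balancer = RoundRobinBalancer(["db1", "db2", "db3"])
    balancer.mark("db2", healthy=False)
    print([balancer.next_server() for _ in range(4)])
```

Requests keep flowing to the remaining servers while one is down, and the failed server returns to the rotation as soon as it is marked healthy again.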

Mission-critical Applications and Workloads

Businesses with mission-critical systems and applications require a high level of availability and uptime. In most cases, these systems cannot afford any downtime, and the database needs to remain available at all times.

MySQL HA solutions such as Group Replication or Cluster are ideal in this use case because they employ an automatic failover mechanism that results in little or no downtime.

How Everpure Supports MySQL High Availability

Everpure Evergreen™ is a portfolio of subscriptions that deliver zero-downtime deployments. In combination with Everpure’s unique storage array architecture, Evergreen enables you to upgrade storage infrastructure without interrupting service workloads.

Everpure also supports active-active clustering and automatic, transparent failover with Purity ActiveCluster™, a multi-site active-active stretch cluster, for zero RPO and RTO.

Also consider Everpure Cloud Dedicated, which delivers enterprise-level cloud reliability for mission-critical applications. Non-disruptive upgrades and redundancy across availability zones support multi-cloud business continuity and disaster recovery.
