As more companies run their workloads in the cloud, cloud database services are increasingly used to manage data. One advantage of using a cloud database service instead of maintaining your own database is reduced management overhead. Database services from the leading cloud vendors share many similarities, but each has individual characteristics that may make it well or ill suited to your workload. Developers are always looking for convenient ways of running their databases, whether to gain deeper insight into database performance, to perform a migration efficiently, to simplify backup and restore processes, or to handle many other day-to-day tasks. Among the many available cloud services, it may not be easy to figure out which one is best for your use case. In this article, we’ll compare two of the most popular cloud database services on the market: Google Cloud SQL and Amazon RDS.
Amazon RDS provides a web interface through which you can deploy MySQL. The RDS service manages the provisioning and configuration of the instance. It also provides a console to monitor the database and perform basic administration tasks. Google Cloud SQL similarly provides a predefined MySQL setup that is managed automatically. Predefined services can be a comfortable way to manage your databases, but at the same time they can limit functionality. Let's take a closer look at these management features.
Database logs and metrics monitoring
Neither Amazon RDS nor Google Cloud SQL provides access to the shell, so your primary concern may be access to essential log files. Amazon CloudWatch is a monitoring service for cloud resources that you can use to solve this problem. It collects metrics, collects and monitors log files, and can automatically react to changes in your AWS resources. Using CloudWatch, you can gather the error log, audit log, and other logs from RDS and process them into metrics presented in the web console. These statistics are retained for 15 months, so you can maintain a history. CloudWatch can also take actions, such as sending a notification or triggering autoscaling policies, which in turn may automatically handle an increase in load by adding more resources.
Google Cloud also provides log processing functionality. You can view the Google Cloud SQL logs in the operations panel or through the Google console. The operations panel logs every operation performed on the instance with fairly basic information. It can be extended with manually added metrics based on data from a file source. Unfortunately, the operations log does not include activities performed using external management tools, such as the mysql client. To extend this basic functionality, Google offers another service, Stackdriver. Stackdriver can be used to create alerts for metrics defined in the operations panel. It covers not only Google Cloud Platform (GCP) but also AWS and local services, so you can use it for cross-cloud monitoring without additional agents. Accessing non-cloud metrics, however, requires installing an open-source agent based on collectd.
There are various ways to monitor MySQL instance metrics. You can poll the server continuously for metric values or rely on predefined services. With Enhanced Monitoring for Amazon RDS, you get more in-depth, real-time visibility into the health of your instances. It lets you monitor both DB instance metrics and operating system (OS) metrics for your DB instances and DB clusters.
Enhanced Monitoring provides a set of over 50 database instance metrics and aggregated process information for your instances, at a granularity of 1 second. You can visualize the metrics in the RDS console.
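If you go the polling route instead, most MySQL status values are cumulative counters, so a rate has to be derived from two samples. The sketch below assumes you have already fetched two snapshots of `SHOW GLOBAL STATUS` into dictionaries; `Questions` and `Uptime` are standard MySQL status variables, while the function name and the sample numbers are made up for illustration.

```python
def counter_rate(previous, current, counter="Questions"):
    """Return the per-second rate of a cumulative counter between two samples.

    Each sample is a dict of status values, e.g. parsed from the output of
    SHOW GLOBAL STATUS. "Uptime" is the server uptime in seconds.
    """
    elapsed = current["Uptime"] - previous["Uptime"]
    if elapsed <= 0:
        # Uptime went backwards or did not move: the server restarted
        # or the samples were passed in the wrong order.
        raise ValueError("samples must be ordered and from the same server run")
    return (current[counter] - previous[counter]) / elapsed


# Illustrative values only: two samples taken 60 seconds apart.
earlier = {"Uptime": 1000, "Questions": 50_000}
later = {"Uptime": 1060, "Questions": 53_000}
print(counter_rate(earlier, later))  # 50.0 queries per second
```

Checking `Uptime` rather than wall-clock time keeps the calculation honest across restarts, when the counters reset to zero.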
Both CloudWatch and Stackdriver provide functionality to create alarms based on metrics. Amazon delivers notifications through Amazon Simple Notification Service (SNS). In Stackdriver, notifications are handled directly within the service.
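To make the CloudWatch side more concrete, here is a minimal sketch of the settings for a CPU alarm on an RDS instance that notifies an SNS topic. It only builds the parameter dictionary; the keys follow the CloudWatch `PutMetricAlarm` API, and the dictionary could be passed to boto3 as `put_metric_alarm(**params)`. The helper name, instance identifier, topic ARN, threshold, and periods are all assumptions chosen for illustration.

```python
def rds_cpu_alarm_params(instance_id, sns_topic_arn, threshold=80.0):
    """Build PutMetricAlarm parameters for a high-CPU alarm on one RDS instance.

    Hypothetical helper: the threshold and timing values are examples,
    not recommendations.
    """
    return {
        "AlarmName": f"{instance_id}-high-cpu",
        "Namespace": "AWS/RDS",              # namespace used for RDS metrics
        "MetricName": "CPUUtilization",
        "Dimensions": [{"Name": "DBInstanceIdentifier", "Value": instance_id}],
        "Statistic": "Average",
        "Period": 300,                       # seconds per evaluation period
        "EvaluationPeriods": 2,              # must breach twice in a row
        "Threshold": threshold,
        "ComparisonOperator": "GreaterThanThreshold",
        "AlarmActions": [sns_topic_arn],     # SNS topic that gets notified
    }


# Placeholder identifiers; replace with your own instance and topic.
params = rds_cpu_alarm_params(
    "mydb-instance", "arn:aws:sns:us-east-1:123456789012:db-alerts"
)
print(params["AlarmName"])  # mydb-instance-high-cpu
# With boto3 installed and credentials configured, this could be applied via:
#   boto3.client("cloudwatch").put_metric_alarm(**params)
```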
Data Migration into Cloud
At the moment, backup-based migration to Google Cloud SQL is quite limited. You can only use a logical dump, which may be a problem for bigger databases. The SQL dump file must not include any triggers, views, or stored procedures. If your database needs these elements, you should recreate them after loading the data. If you have already created a dump file that contains these components, you need to edit the file manually. The database you are importing into must already exist. There is no option to migrate to Google Cloud from another RDBMS, and no option for real-time cross-platform migration (for example, from AWS RDS).
AWS Database Migration Service (DMS) supports homogeneous migrations such as MySQL to MySQL, as well as heterogeneous migrations between different database platforms. AWS DMS can help you plan and migrate on-premises relational data stored in Oracle, SQL Server, MySQL, MariaDB, or PostgreSQL databases. DMS works by setting up and then managing a replication instance on AWS. This instance dumps data from the source database and loads it into the target database.
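Before uploading a dump to Cloud SQL, it can help to check whether it contains any of the unsupported objects. The following is a rough sketch, not a SQL parser: it strips MySQL version comments (`/*!50003 ... */`) and looks for `CREATE ... TRIGGER/VIEW/PROCEDURE/FUNCTION` statements. The function name and the sample dump are invented for illustration, and a dump whose row data happens to contain such text would produce false positives.

```python
import re

# mysqldump wraps many statements in version comments such as /*!50003 ... */.
# Only spaces and tabs are consumed so that line numbers stay intact.
_VERSION_COMMENT = re.compile(r"/\*!\d+[ \t]*|\*/")

_UNSUPPORTED = re.compile(
    r"\bCREATE\s+"
    r"(?:OR\s+REPLACE\s+)?"
    r"(?:ALGORITHM\s*=\s*\w+\s+)?"
    r"(?:DEFINER\s*=\s*\S+\s+)?"
    r"(?:SQL\s+SECURITY\s+\w+\s+)?"
    r"(TRIGGER|VIEW|PROCEDURE|FUNCTION)\b",
    re.IGNORECASE,
)


def find_unsupported_objects(dump_text):
    """Return (line_number, object_type) for each trigger, view, or routine."""
    stripped = _VERSION_COMMENT.sub("", dump_text)
    found = []
    for match in _UNSUPPORTED.finditer(stripped):
        line_number = stripped.count("\n", 0, match.start()) + 1
        found.append((line_number, match.group(1).upper()))
    return found


sample = """\
CREATE TABLE orders (id INT PRIMARY KEY);
INSERT INTO orders VALUES (1);
/*!50003 CREATE*/ /*!50017 DEFINER=root@localhost*/ /*!50003 TRIGGER trg BEFORE INSERT ON orders FOR EACH ROW SET NEW.id = NEW.id */;
CREATE VIEW recent_orders AS SELECT * FROM orders;
"""
print(find_unsupported_objects(sample))  # [(3, 'TRIGGER'), (4, 'VIEW')]
```

An empty result does not prove the dump is importable, but a non-empty one tells you which lines to remove and recreate after the import.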
Achieving High Availability
Google uses semisynchronous replicas to make your database highly available. Cloud SQL can replicate a master instance to one or more read replicas. If the zone where the master is located experiences an outage and a failover replica is configured, Cloud SQL fails over to that replica.
The setup is straightforward, and with a couple of clicks you can have a working slave node. Nevertheless, configuration options are limited and may not fit your system requirements. You can choose from the following replica scenarios:
- read replica - a read replica is a one-to-one copy of the master. This is the base model, where you create a replica to offload read requests or analytics traffic from the master,
- external read replica - this option configures an instance that replicates to one or more replicas external to Cloud SQL,
- external master - set up replication to migrate to Google Cloud SQL.
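Offloading reads to a replica usually means the application has to decide where each statement goes. The sketch below shows one simple way to do that, assuming the application knows the host names of the master and its replicas; the class name and host names are hypothetical, and the statement check is deliberately naive.

```python
import itertools


class ReadWriteRouter:
    """Send read-only statements to replicas (round-robin), the rest to the master."""

    def __init__(self, master, replicas):
        self.master = master
        # cycle() repeats the replica list forever; None means "no replicas".
        self._replicas = itertools.cycle(replicas) if replicas else None

    def host_for(self, sql):
        # Naive classification: anything not starting with SELECT or SHOW is
        # treated as a write. Statements such as SELECT ... FOR UPDATE would
        # need to be routed to the master explicitly.
        is_read = sql.lstrip().upper().startswith(("SELECT", "SHOW"))
        if is_read and self._replicas is not None:
            return next(self._replicas)
        return self.master


router = ReadWriteRouter("master-host", ["replica-1", "replica-2"])
print(router.host_for("SELECT * FROM orders"))           # replica-1
print(router.host_for("SELECT * FROM customers"))        # replica-2
print(router.host_for("INSERT INTO orders VALUES (1)"))  # master-host
```

Because a replica can lag behind the master, reads that must see the latest write should still go to the master.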
Amazon RDS provides read replica services. Cross-region read replicas give you the ability to scale, as AWS offers its services in many regions around the world. RDS asynchronous replication is highly scalable. All read replicas are accessible and can be used for reads in up to five regions. These nodes are independent and can be used in your upgrade path or promoted to a standalone database.
In addition, Amazon offers Multi-AZ deployments based on DRBD, a synchronous disk replication technology. How is this different from read replicas? The main difference is that only the database engine on the primary instance is active, which leads to other architectural differences.
Automated backups are taken from the standby, which significantly reduces the chance of performance degradation during a backup.
Unlike with read replicas, database engine version upgrades happen on the primary. Another difference is that a Multi-AZ deployment fails over automatically, while read replicas require manual intervention from you.
Multi-AZ failover on RDS uses a DNS change to point to the standby instance. According to Amazon, this should result in 60-120 seconds of unavailability during the failover. Because the standby uses the same storage data as the primary, transaction log recovery will probably be needed. Bigger databases may spend a significant amount of time on InnoDB recovery, so consider that in your DR plan.
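Since the endpoint is unavailable while the DNS record changes and recovery runs, applications typically need to retry connections for a while rather than fail on the first error. Below is a minimal sketch of such a retry loop. The `connect` argument is any caller-supplied function that opens a connection; the 180-second default is an assumption that leaves some margin over the 60-120 seconds mentioned above, and catching `OSError` is also an assumption, since database drivers often raise their own exception types.

```python
import time


def connect_with_retry(connect, timeout=180.0, delay=2.0,
                       sleep=time.sleep, clock=time.monotonic):
    """Call connect() until it succeeds or the timeout would be exceeded.

    connect: zero-argument callable supplied by the caller (hypothetical here).
    """
    deadline = clock() + timeout
    while True:
        try:
            return connect()
        except OSError:
            # Give up if another wait would take us past the deadline.
            if clock() + delay > deadline:
                raise
            sleep(delay)


# Simulated endpoint that refuses the first two attempts.
attempts = []


def flaky_connect():
    attempts.append(1)
    if len(attempts) < 3:
        raise ConnectionRefusedError("standby not promoted yet")
    return "connected"


print(connect_with_retry(flaky_connect, delay=0.01))  # connected
```

Reconnecting through the endpoint host name, rather than a cached IP address, is what lets the retry pick up the new DNS target.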
Data encryption
Security compliance is one of the critical concerns for enterprises whose data is in the cloud. When dealing with production databases that hold sensitive and vital data, it is highly recommended to implement encryption to protect the data from unauthorized access.
In Google Cloud SQL, customer data is encrypted when stored in database tables, temporary files, and backups. External connections can be encrypted with SSL certificates (especially for intra-zone connections to Cloud SQL) or by using the Cloud SQL Proxy. Google encrypts and authenticates all data in transit and data at rest with AES-256.
With RDS encryption enabled, the data stored on the instance's underlying storage, as well as automated backups, read replicas, and snapshots, is encrypted. RDS encryption keys use the AES-256 algorithm. Keys are managed and protected by the AWS key management infrastructure through AWS Key Management Service (AWS KMS). You do not need to modify your code or operating model to benefit from this critical data protection feature. AWS CloudHSM is a service that helps meet stringent compliance requirements for cryptographic operations and storage of encryption keys by using single-tenant Hardware Security Module (HSM) appliances within the AWS cloud.
Pricing
Instance pricing for Google Cloud SQL is charged for every minute that the instance is running. The cost depends on the machine type you choose for the instance and the region where it's placed. Read replicas and failover replicas are charged at the same rate as stand-alone instances. Pricing starts at $0.0126 per hour for a micro instance and goes up to $8k for db-n1-highmem-64, with 64 vCPUs, 416 GB RAM, 10,230 GB disk, and a limit of 4,000 connections.
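As a quick worked example of per-minute billing, the sketch below estimates the instance charge from an hourly rate and the number of minutes an instance has been running, counting replicas at the same rate as the text describes. The function name is made up, only the $0.0126 hourly rate comes from the figures above, and storage, network, and other charges are ignored.

```python
from decimal import Decimal


def instance_cost(hourly_rate, minutes_running, instance_count=1):
    """Estimate the instance charge under per-minute billing.

    hourly_rate is passed as a string to avoid float rounding,
    e.g. "0.0126". Replicas are counted through instance_count
    because they are billed at the same rate as the master.
    """
    total = Decimal(hourly_rate) * minutes_running * instance_count / 60
    return total.quantize(Decimal("0.0001"))


# A micro instance running for 90 minutes.
print(instance_cost("0.0126", 90))                    # 0.0189
# The same instance plus one read replica.
print(instance_cost("0.0126", 90, instance_count=2))  # 0.0378
```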
Like other AWS products, RDS charges users for what they use. However, this pay-as-you-go model has a specific billing structure that, if left unchecked, can lead to questions or surprising items on the bill. Database options are priced from $0.175 per hour up to thousands of dollars paid upfront. Both platforms are quite flexible, but you will see more configuration options in AWS.
Scalability
As mentioned in the pricing section, Google Cloud SQL can be scaled up to 64 processor cores and more than 400 GB of RAM. The maximum disk size is 10 TB per instance, and you can configure your instance to increase storage automatically. That should be plenty for many projects. Nevertheless, compared with what Amazon offers, Google still has a long way to go. RDS offers not only powerful instances but also a long list of surrounding services.
RDS supports storage volume snapshots, which you can use for point-in-time recovery or share with other AWS accounts. You can also take advantage of its provisioned IOPS feature to increase I/O. RDS can also be launched in Amazon VPC, whereas Cloud SQL doesn’t yet support a virtual private network.
Backup and recovery
RDS generates automated backups of your DB instance. RDS creates a storage volume snapshot of your DB instance, backing up the entire DB instance and not individual databases. Automated backups occur daily during the preferred backup window. If a backup requires more time than the window allows, it continues after the window ends until it finishes. Read replicas don't have backups enabled by default.
When you want to restore, the only option is to create a new instance. It can be restored to the last backup or to a point in time. Binary logs are applied automatically, and you cannot access them directly. The RDS PITR option is quite limited, as it does not allow you to choose an exact time or transaction; you are limited to 5-minute intervals. In most scenarios this may be sufficient, but if you need to recover your database to a single transaction or an exact time, be prepared for manual steps.
Google Cloud SQL backup data is stored in separate regions for redundancy. With automatic backups enabled, a copy of the database is created every 4 hours. If needed, you can create on-demand backups (for any Second Generation instance), whether or not the instance has automatic backups enabled. Google's and Amazon's approaches to backups are quite similar; however, with Cloud SQL it is possible to perform point-in-time recovery to a specific binary log and position.