Comparing AWS Cloud Database Technologies: Relational Databases
After completing my AWS Database specialty certification, I felt it was a good time to provide my thoughts on the database services offered by AWS. This blog not only includes a listing of the core features for each AWS database but also, the pragmatic use case which may help on your next cloud IT initiative. These database technologies are divided into two categories Relational and Non-Relational Databases. In this blog I will go over the Relational Databases.
These Databases have predefined relationships among their tables and the tables store data in columns and rows with a key to uniquely identify each row in a table. Examples include PostgreSQL, MySQL, Microsoft SQL Server, IBM DB2, Oracle, and others.
These are the Relational databases available in AWS.
A) Amazon RDS (Relational Database Service)
Amazon RDS is a managed Database service available on AWS. It uses SQL (Structured Query Language) for querying the data and it's available for PostgreSQL, MySQL, MariaDB, Oracle, Microsoft SQL Server, and Aurora databases. These databases can also be hosted on a EC2 instance which means it will be managed by the Client and not AWS. These are the core features of Amazon RDS
- Launched within a VPC (virtual private cloud), usually in private subnet, control network access using security groups
- Storage by EBS (gp2 or io1)
- Supports Multi-AZ deployments
- Backup and restore with Point-in-time Recovery
- Manual Snapshots
- Notifications via SNS for events (RDS Events)
For a company where data availability or data management is a problem in their small/medium On-prem relational database environment (example: MySQL DB), the solution will be to migrate to an Amazon RDS for MySQL DB Instance which requires less management and provides data availability by the use of reading replicas. A transition can be easily done by having a backup of the MySQL DB in s3 using mysqldump and then restoring the DB backup into the new instance. This approach has some downtime so in cases where you want minimal downtime other approaches will be needed.
B) Amazon Aurora
This is another kind of Relational database that is only compatible with MySQL and PostgreSQL database engines. This means Amazon Aurora works just like a Postgres or MySQL Database. Some of the features include
- 5x faster than standard MySQL databases and 3x faster than standard PostgreSQL databases
- Can have Up to 15 read replicas (Multi AZ, Auto Scaling Read Replicas)
- Aurora Serverless option i.e Automatic start/stop, Autoscaling and Self-healing storage
- Aurora Global DB: Supports multi-region read replication
- Maintains 6 copies across 3 AZs
- Backups are stored on S3 and Fast backtracking option for PITR
A software company with a relational database setup (PostgreSQL or MySQL) is having issues with data storage and availability. This is because their development team of 7 developers need to run multiple tests on the database at the same time. The solution will be to have multiple copies of the database using either read replicas (up to 15 read replicas) or cloning the database (up to 15 clones) and once you are done, they can be deleted.
C) Amazon Redshift
Amazon Redshift is an OLAP database solution that is based on PostgreSQL. Redshift allows users to query petabytes of structured and semi-structured data using standard SQL. It uses a Leader Node for query planning, results aggregation, and multiple Compute nodes for performing the queries, after which it sends results to the Leader Node. These are some of the features:
- Highly performant analytics database technology using columnar storage (storage organized on rows rather than columns)
- It allows Massively Parallel Query Execution (MPP) and its highly available
- Allows querying of external file objects in amazon file object storage
For companies with a very large workload that need to perform analytical operations, this is the AWS service to use. It makes it easy to get real-time insights on petabytes of structured /semi-structured data by using business intelligence tools such as AWS quicksight, tableau etc.
I have highlighted the differences between the databases in this table:
|5 read Replicas
|Allows up to 15 read replicas
|Replication is not available. Snapshot and restore to new cluster
|Low TB range
|Mid TB range
|(OLTP) Transactional purpose and simple Analytical purpose
|(OLTP)Transactional purpose and simple Analytical purpose(Aurora Parallel queries for running faster analytical queries )
|(OLAP) Analytical purpose
|Mid-to-high throughput, low latency
|High throughput, low latency
|Mid to High Latency
In this blog post, I discussed the relational databases that are available on AWS, I also highlighted use cases for each of them. If you are looking to migrate into AWS cloud and want the best solution for your business use case, you can reach out to us directly.