The world of database management systems (DBMS) is constantly evolving, offering a wide range of options to meet the diverse needs of businesses. Database-as-a-Service (DBaaS) has become a common choice as more companies either move their technology infrastructure to public cloud platforms or start out there. This is not surprising, given the broad range of options that cloud providers offer.
In this blog, we’ll take a look at the relational database offerings for MySQL and PostgreSQL provided by Amazon Web Services (AWS) and Google Cloud Platform (GCP). We’ll then compare those offerings as product verticals and demonstrate how TiDB—an advanced open-source, distributed SQL database—fits into these verticals.
Why Companies Choose a DBaaS Solution
A company might choose a DBaaS solution for several reasons. However, at the top of the list for many are scalability, availability, and performance. It’s important to note that these three factors are present in all DBaaS product offerings, though each offering provides a different level of each factor. The level of any one of these is defined by the underlying technology that the DBaaS offering is based on. It’s helpful to think of these characteristics as tunables, with each on its own sliding scale. We’ll define the scale with the stop points of Basic, Advanced, and Extreme.
In the example below, we use this scale to represent defined product classifications or verticals with the AWS offerings and TiDB.
Figure 1. Product classifications between Amazon RDS, Amazon Aurora, and TiDB.
With the understanding that each of these products has measurable and distinct patterns of scalability, availability, and performance, we can take our labels of Basic, Advanced, and Extreme and use those as product verticals in the DBaaS market.
Let’s now take a look at the three characteristics of scalability, availability, and performance and how they map to the three verticals of Basic, Advanced, and Extreme. We’ll start with Scalability in the following diagram.
Figure 2. Scaling capabilities mapped across Amazon RDS, Amazon Aurora, and TiDB.
From the diagram above, we can see that the Basic Scaling vertical is primarily based on vertical scaling and the scaling of read traffic. Vertical scaling is the ability to move from smaller database instances to larger database instances. For many, this works until they outgrow the capacity of the largest available cloud database instance. They can squeeze out another level of scalability by separating out their read-only traffic and sending it to replicated databases known as read replicas. But this only helps them with read traffic. Their write workload is still bound by the capacity of the single node designated as the primary in their replication architecture.
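The read-replica pattern described above can be sketched in a few lines of Python. This is a minimal illustration, not any provider's API: the `ReadWriteRouter` class, the node names, and the rule that anything other than a `SELECT` counts as a write are all assumptions made for the example.

```python
from itertools import cycle


class ReadWriteRouter:
    """Toy router: writes go to the single primary, reads rotate over replicas."""

    def __init__(self, primary, replicas):
        self.primary = primary
        # Fall back to the primary when no replicas are configured.
        self._read_targets = cycle(replicas or [primary])

    def route(self, sql):
        # Simplifying assumption: anything that is not a plain SELECT is a write.
        is_read = sql.lstrip().upper().startswith("SELECT")
        return next(self._read_targets) if is_read else self.primary


# Hypothetical node names, used only for illustration.
router = ReadWriteRouter("primary", ["replica-1", "replica-2"])
targets = [
    router.route(query)
    for query in (
        "SELECT * FROM orders",
        "SELECT * FROM users",
        "INSERT INTO orders VALUES (1)",
        "UPDATE users SET name = 'a' WHERE id = 1",
    )
]
print(targets)  # ['replica-1', 'replica-2', 'primary', 'primary']
```

However many replicas are added to the list, every write still lands on the one primary, which is the bottleneck this section describes.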
At this point, growing further means they’ll need to move up to something with higher scalability and more advanced features. In the Advanced Scaling vertical, we have features such as the separation of compute and storage as well as the ability to scale data volumes horizontally. These capabilities significantly improve the ability to scale out data volume. However, the Advanced Scaling product class still cannot scale writes. Even at this level of product sophistication, the architecture of the systems in this class only allows for read workload scaling. This helps with read concurrency, but it does not help with write workload concurrency.
When considering availability, we notice that the same architectural characteristics that limit scaling also shape the availability of each product class.
Figure 3. Availability capabilities mapped across Amazon RDS, Amazon Aurora, and TiDB.
The primary/secondary replication architecture of the Basic and Advanced tiers defines the minimum failover time during an outage. Contrast this with the Extreme product class, where all database nodes are capable of reads and writes simultaneously, and we see a clear advantage from an availability perspective.
In keeping with the established theme, we again see how similar platform characteristics also directly impact performance.
Figure 4. Performance capabilities mapped across Amazon RDS, Amazon Aurora, and TiDB.
Only scaling read traffic means that there is a performance plateau measured in writes per second. This plateau will also create a defined limitation on the number of concurrent transactions that can be processed at any given time. Extreme Scaling DBaaS products do not experience this write or concurrency limitation. In the case of TiDB, all database-compute nodes service read and write traffic simultaneously. Plus, you have the ability to scale the number of database-compute servers up or down, whenever you choose.
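The write plateau can be illustrated with a toy capacity model. This is a sketch under stated assumptions, not a benchmark: the per-node write rate is a made-up number, and the multi-writer case assumes ideal linear scaling, which real systems only approximate.

```python
def write_capacity(write_nodes, writes_per_node):
    """Idealized aggregate write throughput: write-capable nodes x per-node rate."""
    return write_nodes * writes_per_node


PER_NODE_WPS = 10_000  # hypothetical per-node write rate, not a measured figure

# Primary/replica: adding read replicas never changes the number of write nodes,
# so the replica count does not appear in the calculation at all.
primary_replica = [write_capacity(1, PER_NODE_WPS) for _replicas in (0, 2, 8)]

# Multi-writer: every compute node accepts writes (assumes ideal linear scaling).
multi_writer = [write_capacity(nodes, PER_NODE_WPS) for nodes in (1, 3, 9)]

print(primary_replica)  # [10000, 10000, 10000]
print(multi_writer)     # [10000, 30000, 90000]
```

The first list stays flat no matter how many replicas are added, while the second grows with the node count.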
The final advantage that a platform in the Extreme category—such as TiDB—delivers is the ability to handle mixed workloads. TiDB allows for the storage of data in row and columnar formats, facilitating both transactional and analytical workloads on the same data.
Comparing AWS and GCP DBaaS Offerings with TiDB
Now that we understand the unique capabilities within each product vertical, let’s align the product offerings from both AWS and GCP to these verticals. While both cloud providers offer a breadth of database vendor options in the Basic DBaaS category, we are going to focus exclusively on their MySQL and PostgreSQL offerings, as illustrated in the diagram below.
Figure 5. Comparing AWS and GCP MySQL and PostgreSQL offerings with TiDB.
The first thing we notice in the chart above is that both cloud providers are pretty much equal in their offerings for a Basic DBaaS. AWS offers RDS in MySQL and PostgreSQL variants. Meanwhile, GCP offers Cloud SQL in MySQL and PostgreSQL variants. But we start to notice differences when we scale up to the middle tier, or Advanced levels of scaling, performance, and availability.
Amazon Aurora is available in both MySQL-compatible and PostgreSQL-compatible editions. However, on the GCP side, the only comparable option in this category is AlloyDB, which is a PostgreSQL-compatible DBaaS. There is no MySQL option offered in GCP for this product class. The gap becomes even more interesting when we look at the Extreme vertical. This is the vertical where distributed SQL databases shine. Across both vendors, the only option offered in this vertical is Spanner from GCP.
TiDB: Scalability, Availability, and Performance for Extreme Workloads
The areas where we have identified product gaps tell us a few things. First, we know that there are production workloads sitting in those product gaps. Those workloads are running on bespoke, self-managed solutions. Second, those solutions often rely on techniques like manual sharding to reach scale measured in tens to hundreds of terabytes of data. The costs and risk levels associated with these solutions are immense. This is where an option such as TiDB brings significant value.
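To make the cost of manual sharding concrete, here is a minimal sketch of application-side hash sharding. The `shard_for` helper, the key format, and the shard counts are all hypothetical; real deployments often use consistent hashing or lookup tables precisely to soften the problem shown here.

```python
import hashlib


def shard_for(key, shard_count):
    """Map a key to a shard with a stable hash (naive modulo placement)."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % shard_count


# Hypothetical customer keys, used only for illustration.
keys = [f"customer-{i}" for i in range(1000)]

before = {key: shard_for(key, 4) for key in keys}  # original 4-shard layout
after = {key: shard_for(key, 5) for key in keys}   # layout after adding a shard

# Count the keys whose data would have to be moved to a different shard.
moved = sum(1 for key in keys if before[key] != after[key])
print(f"{moved} of {len(keys)} keys change shards when going from 4 to 5 shards")
```

With naive modulo placement, adding a single shard relocates most keys, and the application has to own that migration along with any cross-shard queries and transactions.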
PingCAP, the company behind TiDB, offers a scalable, highly-available, and performant cloud DBaaS called TiDB Dedicated—and it’s deployable in both AWS and GCP. The TiDB database employs a distributed, horizontally scalable architecture that enables it to handle massive workloads across multiple nodes. Its use of distributed transactions and fully-automated horizontal sharding allows for linear scalability and high availability. This makes TiDB Dedicated suitable for applications with demanding performance requirements.
TiDB Dedicated handles operational tasks, such as database setup, configuration, monitoring, and maintenance, by default. This can relieve the burden of managing and maintaining the database infrastructure, allowing companies to focus on their core business activities.
TiDB Dedicated also incorporates real-time analytical capabilities, allowing companies to run complex analytical queries on their operational data without impacting transactional performance. This lets businesses gain insights from their data as it changes and supports data-driven decision-making.
Coming to grips with the idea that your company has to take decisive action to scale its data platform is not an easy task. It’s a mixed bag. On the one hand, growth is a positive thing. Hopefully it’s a reflection of the growing performance of your business as a whole. On the other hand, it can represent a huge effort to find, test, and migrate to a new category of DBaaS.
While this second aspect can sound unappealing, it can also be looked at as a big opportunity. When considering a migration, it’s important for your company to evaluate its specific requirements, workload characteristics, and long-term business goals. Conducting a thorough analysis and potentially performing a proof of concept (PoC) can help assess whether a DBaaS like TiDB Dedicated is the right fit for your organization’s needs, or whether a migration is justified at all.
Choosing the right option depends on many factors including workload requirements, scalability, budget, and management preferences. By understanding the similarities and differences outlined in this comparative analysis, you can make an informed decision that aligns with your specific business objectives.
If you want to learn more about TiDB or TiDB Dedicated, have questions about database migrations, or simply want to better understand your options as you plan ahead, please don’t hesitate to book a demo with one of our distributed SQL experts.