My Journey from Traditional Monolithic Architecture to Distributed SQL

Like many Data Management Engineers, I began my career many years ago focused on the Oracle database. At the time, it was a very pragmatic decision. Oracle was very mature, reliable, and feature-rich. I could build very fast and very large vertically scalable architectures. If I needed more resources for the database, I simply scaled up, deploying a larger machine, all the way up to Oracle’s flagship engineered system, Oracle Exadata.

But then everything changed with the advent of the cloud, and I found myself faced with moving very large on-premises applications and databases to various types of cloud infrastructure. Moving the applications to the cloud was relatively trivial: I could distribute them geographically where they were needed, and by that point, most of my clients were moving to microservices architectures that were designed to be cloud-native and horizontally distributed.

The Oracle database, as it turned out, was far more difficult to transition to the cloud. It was designed from the beginning to be vertically scalable. I could scale up, but not out to where I needed it in the cloud, and I resigned myself to the fact that Oracle, like other traditional RDBMSs, was not my choice for cloud-native architectures. Cloud-native applications require a database that can scale out horizontally and be accessed quickly by geographically dispersed applications, so that users are served no matter where they reside. Traditional databases were becoming the ultimate bottleneck for mission-critical applications in the cloud. I needed something radically different, so I pivoted in a big way.

The advent of Distributed SQL

As it turned out, many of us were facing the same challenges, and it sparked an effort to build a next-generation cloud-native database that solved them. At the beginning of 2014, I was sent a newly released paper published by Google Research called Spanner: Google’s Globally-Distributed Database. The paper is a very detailed description of a clustered, distributed database architecture that can span a single data center or multiple data centers. Data is automatically sharded, and multiple copies of the data are distributed and balanced across the cluster. Applications can connect to any node in the cluster and access any data residing in the distributed database. Nodes can also be added to scale the cluster out and removed to scale it back in.
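
To make the sharding idea concrete, here is a minimal Python sketch of range-based sharding with replicas spread across nodes. It is a toy model written for this article, not the algorithm used by Spanner or TiDB: the ToyCluster class, its method names, the round-robin placement, and the greedy rebalancing rule are all assumptions chosen for illustration.

```python
from bisect import bisect_right
from collections import Counter


class ToyCluster:
    """Toy model of range-sharded data with replicas spread across nodes."""

    def __init__(self, nodes, split_points, replicas=3):
        if len(nodes) < replicas:
            raise ValueError("need at least as many nodes as replicas")
        self.nodes = list(nodes)
        # Shard 0 holds keys below the first split point; shard i holds
        # keys in [split_points[i-1], split_points[i]).
        self.split_points = sorted(split_points)
        # shard index -> nodes holding a copy (simple round-robin start).
        self.placement = {
            shard: [self.nodes[(shard + r) % len(self.nodes)] for r in range(replicas)]
            for shard in range(len(self.split_points) + 1)
        }

    def shard_for(self, key):
        """Return the index of the shard whose key range contains `key`."""
        return bisect_right(self.split_points, key)

    def replicas_for(self, key):
        """Any node can answer 'who holds this key?' from the shared map."""
        return list(self.placement[self.shard_for(key)])

    def load(self):
        """Count how many shard replicas each node currently holds."""
        counts = Counter({node: 0 for node in self.nodes})
        for holders in self.placement.values():
            counts.update(holders)
        return counts

    def add_node(self, node):
        """Scale out: register the node, then even out the replica counts."""
        self.nodes.append(node)
        return self.rebalance()

    def rebalance(self):
        """Move one replica at a time from the busiest to the idlest node."""
        moves = 0
        while True:
            counts = self.load()
            busiest = max(self.nodes, key=lambda n: counts[n])
            idlest = min(self.nodes, key=lambda n: counts[n])
            if counts[busiest] - counts[idlest] <= 1:
                return moves
            # Pick a shard the idlest node does not hold yet, so every
            # shard keeps its replicas on distinct nodes.
            shard = next(s for s, holders in self.placement.items()
                         if busiest in holders and idlest not in holders)
            holders = self.placement[shard]
            holders[holders.index(busiest)] = idlest
            moves += 1


if __name__ == "__main__":
    cluster = ToyCluster(["n1", "n2", "n3"], split_points=[100, 200, 300, 400, 500])
    print(cluster.replicas_for(250))   # nodes holding the shard for key 250
    print(cluster.add_node("n4"), dict(cluster.load()))
```

The point of the sketch is that the application never chooses a shard: the cluster maps keys to ranges, keeps several copies of each range on different nodes, and shifts copies onto a new node when one joins. A production system would also have to copy the data, elect a leader per range, and keep the copies consistent, none of which is modeled here.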

Here at PingCAP, we took the core concept of Distributed SQL’s multi-node auto-sharding architecture and extended the architecture to allow for greater flexibility and functionality beyond Spanner and other Distributed SQL databases. We call this next-generation database TiDB.

TiDB design tenets

In 2015, we built TiDB from scratch, leveraging a very large and active GitHub open source community. Working in collaboration with the community, our founders focused on a set of design tenets that would allow for the creation of the next-generation, cloud-native, Distributed SQL architecture. These are our core tenets:

Scale without complexity
We built our database to scale horizontally without complexity. Nodes can easily be added to the cluster and the database will automatically rebalance the data across the new node(s). Scaling in is just as easy.
Resiliency always on
Our users require a database that is highly resilient and always on. We built TiDB to survive node failures while allowing applications to continue processing transactions on the surviving nodes. This drives the time to recovery toward zero.
Consistency with ACID transactions
For transactional databases, it is critical that you always retrieve the most recent and correct copy of the data. This is especially true for financial services and inventory systems. TiDB ensures this consistency by guaranteeing a high level of isolation between transactions.
Support for both OLTP and OLAP workloads
Traditional Distributed SQL databases can only support OLTP (transactional) workloads. At PingCAP, we developed capabilities in TiDB to handle OLTP (transactional), OLAP (analytical), and mixed workloads. We accomplished this by creating a column store alongside the row store to handle massive aggregations and OLAP functions. This column store is called TiFlash, and it implements the concept of Hybrid Transactional and Analytical Processing (HTAP). Our cost-based optimizer decides which store serves each query.
Common SQL interface
TiDB was built by developers, for developers, which meant making the database easy to access through a common SQL-based interface. We chose a robust, MySQL wire-compatible interface. This lets application developers and DBAs access the database using one of the most common and well-established SQL protocols in use today.
Install and run anywhere
For the database to be flexible, it needs to be deployable anywhere: on premises, in a cloud provider, or in a multi-cloud environment. You can run it on VMs or in containers managed by Kubernetes. This allows you to easily change your deployment architecture to meet the ever-changing needs of your organization.
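
As an illustration of the MySQL-compatible interface and transactional tenets above, here is a minimal Python sketch that talks to a cluster through PyMySQL, a stock third-party MySQL driver. The connection values are assumptions about a local test cluster (TiDB listens on port 4000 by default; the passwordless root user is a test-only setup), and the accounts table and the helper function names are hypothetical.

```python
def tidb_connection_settings(host="127.0.0.1", port=4000, user="root", password=""):
    """Build keyword arguments for pymysql.connect().

    Assumed values for a local test cluster; adjust them for a real one.
    """
    return {"host": host, "port": port, "user": user, "password": password}


def fetch_version(settings):
    """Run a plain MySQL-protocol query against the cluster."""
    import pymysql  # third-party driver: pip install pymysql

    conn = pymysql.connect(**settings)
    try:
        with conn.cursor() as cur:
            cur.execute("SELECT VERSION()")
            return cur.fetchone()[0]
    finally:
        conn.close()


def transfer(settings, src, dst, amount):
    """Move `amount` between two rows of a hypothetical `accounts` table.

    Both updates commit together or not at all.
    """
    import pymysql  # third-party driver: pip install pymysql

    conn = pymysql.connect(**settings)
    try:
        conn.begin()
        with conn.cursor() as cur:
            cur.execute(
                "UPDATE accounts SET balance = balance - %s WHERE id = %s",
                (amount, src),
            )
            cur.execute(
                "UPDATE accounts SET balance = balance + %s WHERE id = %s",
                (amount, dst),
            )
        conn.commit()
    except Exception:
        # Undo both updates if either one fails.
        conn.rollback()
        raise
    finally:
        conn.close()
```

Nothing in this code is specific to TiDB except the port number, which is the practical meaning of wire compatibility: existing MySQL drivers, ORMs, and tools can usually be pointed at the cluster without modification.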

TiDB operational features

All of the tenets above make for a robust database, but operational features are critical to complete the offering. TiDB provides rich metrics and monitoring, leveraging Prometheus and Grafana with an additional built-in TiDB dashboard. Role-based security and encryption, including TLS for connections, keep information secure in flight and at rest. Lastly, we provide built-in asynchronous CDC replication to easily move data into, and out of, the cluster databases.

Putting it all together

After working with Oracle for over twenty years, I can tell you Distributed SQL has changed everything. I now have opportunities to build solutions for organizations that were simply not possible before Distributed SQL. I have seen a burst of creativity in the community and watched teams build solutions that not only solve critical business problems, but also give organizations the opportunity to be more competitive, become more agile, and create new revenue streams. For those of you who are still struggling with the pains of traditional database architectures, I invite you to explore TiDB.

Full documentation, including a quick setup and install guide for your own TiDB cluster, can be found on our docs page. Please feel free to join our community Slack channel to see how others are using TiDB and to get your questions answered. Developers building data-intensive applications can open a TiDB Cloud account here and use our Developer Tier free trial for 1 year, or apply for a Proof of Concept (POC).
