OSS Insight's Journey to a Serverless Database

OSS Insight is a powerful tool that provides online data analysis for users based on nearly 6 billion rows of GitHub event data. However, because its analytical queries are complex (scanning large amounts of data and then aggregating it or applying window functions), its database workload was significantly affected by fluctuations in user traffic.

In particular, in January 2023, we launched Data Explorer (AI-powered ad hoc querying), which uses AI to convert users' questions into analytical SQL that is then executed on the database. To support higher QPS, we scaled up the underlying TiDB Dedicated cluster, but this came with a higher bill. As a result, reducing cost without compromising database performance became an urgent priority.

Recently, the OSS Insight development team migrated the database from a TiDB Dedicated cluster to a TiDB Serverless cluster. The migration has brought us significant benefits in terms of performance, scalability, and cost.

In this article, we will share how TiDB Serverless helps us handle massive and fluctuating data workloads with ease and efficiency while lowering costs.

No more challenges with selecting database instances

Previously, we had to make trade-offs between performance and cost when choosing a cloud database, and we were often torn over which instance specification best suited our application.

Small size vs large size instance

Now, with the TiDB Serverless database, we can create a database cluster instantly without having to choose an instance specification (CPUs, memory, disks, or the number of nodes) in advance. TiDB Serverless will automatically scale out/in to accommodate the application workload.

A screenshot of the page for creating a TiDB Serverless cluster

No more manual scaling for traffic spikes

Previously, when our application faced a potential peak in traffic, our engineers usually had to estimate the peak in advance and manually scale the database up or out to cope with it. After the peak was over, we also needed to scale the database back down to save costs. A sudden, unexpected traffic spike could catch us off guard, and we might not be able to scale the database in time.

Now, with TiDB Serverless, we can cope with sudden traffic spikes without doing anything.

Recently, OSS Insight unexpectedly made it to Hacker News’s Top 10, and the website’s traffic suddenly increased by 7x over the previous day. Database requests went through a sustained peak at the same time. As engineers, we didn’t need to intervene manually to handle the sudden increase in traffic or worry about performance degradation caused by resource constraints. TiDB Serverless automatically and seamlessly scales out or in according to the actual workload.

A screenshot of the Hacker News homepage on April 13th

User Traffic Panel on Google Analytics Dashboard

For more details about the auto-scaling feature, the TiDB Cloud Serverless team will provide answers in upcoming blog articles. If you are interested in this feature, please subscribe to the TiDB Cloud team’s official blog.

No more overpaying for idle resources

Previously, TiDB Dedicated clusters (as well as most cloud databases) were charged by the hour at a fixed price. This meant that even if the database utilization was low or even idle, we had to pay the full price for it.

Now, TiDB Serverless provides a pay-as-you-go billing model that charges precisely based on the actual resource consumption of each SQL execution. Similar to the concept of “tokens” in OpenAI API billing, TiDB Serverless uses “Request Units” to represent the resources consumed by each SQL query. You only pay for what you use ($1 = 18.18M Request Units/month). When the database is idle, there are no additional computing fees.

This billing model is especially suitable for databases used in development, staging environments, or online applications where the database workload varies with user traffic.
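To make the Request Unit pricing concrete, here is a minimal sketch that converts a Request Unit count into an approximate compute cost. It relies only on the conversion rate quoted above ($1 = 18.18M Request Units). The function name and the sample usage figures are hypothetical, and the sketch deliberately ignores storage charges and any free quota, so treat it as a back-of-the-envelope estimate rather than the actual TiDB Cloud billing logic.

```python
# Conversion rate quoted in this article: $1 buys 18.18 million Request Units.
REQUEST_UNITS_PER_DOLLAR = 18_180_000


def estimate_compute_cost(request_units: float) -> float:
    """Return the approximate compute cost in USD for a number of Request Units.

    Assumption: cost scales linearly with Request Units; storage fees and
    free quotas are not modeled.
    """
    if request_units < 0:
        raise ValueError("request_units must be non-negative")
    return request_units / REQUEST_UNITS_PER_DOLLAR


if __name__ == "__main__":
    # Hypothetical monthly consumption figures, for illustration only.
    samples = [("idle", 0), ("light", 50_000_000), ("busy", 2_000_000_000)]
    for label, units in samples:
        cost = estimate_compute_cost(units)
        print(f"{label:>5}: {units:>13,} RUs -> ${cost:,.2f}")
```

The point the sketch illustrates is that an idle database consumes zero Request Units and therefore incurs no compute cost, whereas an hourly-billed instance costs the same regardless of load.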

The monitoring chart shows OSS Insight’s request units per second

In OSS Insight’s case, database resource usage follows user traffic. The database workload is usually moderate, but it rises significantly when user traffic is high or when background tasks are running. This flexible billing model helps us reduce costs during off-peak periods.

Up to and including March, we spent $11,000 per month (an average of $15 per hour) on the TiDB Dedicated cluster.

A screenshot of the TiDB Dedicated bill for March 2023

After migrating to the TiDB Serverless cluster, the usage cost of the cluster has been reduced to $3,000 per month (an average of $4 per hour), which is a 72.7% decrease.

A screenshot of the TiDB Serverless bill on April 28th, 2023
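The figures above can be checked with a few lines of arithmetic. The sketch below recomputes the hourly averages and the percentage decrease from the two monthly totals. The 730-hour month is an assumption (365 days × 24 hours ÷ 12 months), and the function names are purely illustrative.

```python
# Assumption: an average month has 365 * 24 / 12 = 730 hours.
HOURS_PER_MONTH = 730


def hourly_average(monthly_cost: float) -> float:
    """Spread a monthly cost evenly across the hours of an average month."""
    return monthly_cost / HOURS_PER_MONTH


def percent_decrease(before: float, after: float) -> float:
    """Return the relative decrease from `before` to `after`, in percent."""
    if before <= 0:
        raise ValueError("before must be positive")
    return (before - after) / before * 100


if __name__ == "__main__":
    dedicated_monthly = 11_000  # USD per month, TiDB Dedicated (from the article)
    serverless_monthly = 3_000  # USD per month, TiDB Serverless (from the article)

    print(f"Dedicated:  ${hourly_average(dedicated_monthly):.2f} per hour")
    print(f"Serverless: ${hourly_average(serverless_monthly):.2f} per hour")
    print(f"Decrease:   {percent_decrease(dedicated_monthly, serverless_monthly):.1f}%")
```

Under that assumption, the hourly averages round to the $15 and $4 quoted above, and the decrease works out to 8,000 / 11,000, or about 72.7%.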

Conclusion

TiDB Serverless is a game-changer for OSS Insight and many other applications that need a scalable, reliable, and cost-effective cloud database.

It frees us from worrying about choosing specifications, scaling resources, or paying for idle time. It allows us to focus on our core business logic and deliver value to our users.

If you are facing similar challenges, take TiDB Serverless for a spin. This fully managed service includes 5 GB of permanently free data storage.
