Empower Web3 Business with a Scalable HTAP Database for Hybrid Workloads

NFTScan is a multi-chain NFT data infrastructure that supports access to leading blockchain networks such as Ethereum, Solana, and BNBChain. It provides efficient and concise NFT asset searching and querying services for Web3 users, an OpenAPI platform for Web3 developers, as well as NFT data analytical services for financial technology companies.

Challenges with the previous MySQL solution

Previously, NFTScan used MySQL and Elasticsearch on Amazon Web Services (AWS) as their core database solution. MySQL stored all application data, including that for analytics and processing, from enterprises and end consumers. The data related to NFT transactions and assets was synchronized to Elasticsearch in a fully-indexed way to respond to multi-dimensional queries.

This solution worked at first but could not keep up with NFTScan’s business growth. It had the following drawbacks:

Poor scalability and high storage and maintenance costs. The volume of new blockchain data increased sharply every day, but MySQL could not automatically scale out to cope with increasing workloads. Instead, NFTScan had to manually shard tables and add new MySQL clusters to share and balance the use of CPU and memory. This greatly increased storage and maintenance costs.
Declining utilization rate with increasing costs. Elasticsearch was deployed on AWS. Due to the limitations of the AWS native cluster configuration, NFTScan had to add more high-configuration data nodes of Elasticsearch to provide online query services. This led to rising costs and a lower utilization rate.
Recurrent precision errors. Elasticsearch is designed more for searching than for calculating, so aggregation calculations produced precision errors.
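The precision issue in the last point can be reproduced outside Elasticsearch. The following is a minimal sketch, assuming the aggregation accumulates values as 64-bit floating-point numbers; the amounts are made-up values chosen to sit at the edge of what a double can represent exactly.

```python
from decimal import Decimal

# Hypothetical amounts in the smallest token unit (illustrative values only).
amounts = [2**53, 1, 1]

# Accumulate as 64-bit floats: above 2**53 a double cannot represent
# every integer, so each "+ 1" is rounded away.
float_total = 0.0
for amount in amounts:
    float_total += float(amount)

# Accumulate exactly with Decimal, similar in spirit to a SQL DECIMAL column.
exact_total = sum(Decimal(amount) for amount in amounts)

# Difference between the exact sum and the floating-point sum.
lost = exact_total - Decimal(float_total)
print(f"float sum: {int(float_total)}, exact sum: {exact_total}, lost: {lost}")
```

The floating-point total stays at 2**53 while the exact total is two units higher, which is the kind of drift that matters when summing on-chain amounts.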

Why TiDB?

After nearly a month of research and testing, NFTScan chose TiDB to replace their legacy database system because TiDB:

Is highly MySQL compatible. NFTScan could easily migrate their data to TiDB. MySQL compatibility also greatly reduces the time and effort their R&D team needs to adopt a new database.
Is elastically scalable, which ensures that the server resources can be flexibly scaled according to real-time changes in read and write traffic. This maximizes resource efficiency.
Adopts a distributed architecture that separates computing and storage. NFTScan can scale in or out the compute and storage resources separately according to their changing business needs. This improves storage efficiency and decreases costs.
Has a simplified HTAP architecture that can handle both transactional and analytical workloads at the same time. This not only perfectly meets NFTScan’s growing business needs but also reduces their overall operational costs.
Is highly available thanks to its data replication mechanism and built-in disaster recovery solutions.

Migration to TiDB

NFTScan has already switched their underlying database system to TiDB. They deployed two TiDB servers, nine TiKV servers, and two TiFlash servers, spanning three availability zones in the same geographical region.
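To see why spreading nodes across three availability zones matters, consider the sketch below. It assumes the nine TiKV servers are divided evenly across the zones and that each piece of data keeps three replicas, one per zone; neither detail is stated in the talk, and the zone names are hypothetical.

```python
from collections import Counter

ZONES = ["az-1", "az-2", "az-3"]  # hypothetical zone names
TIKV_NODES = 9                    # node count from the article
REPLICAS = 3                      # assumption: three replicas, one per zone

# Round-robin placement of TiKV nodes over the zones (an assumed layout).
placement = Counter(ZONES[i % len(ZONES)] for i in range(TIKV_NODES))


def survives_zone_loss(replicas: int, zones_lost: int) -> bool:
    """With one replica per zone, data stays available while a strict
    majority of its replicas is still reachable."""
    remaining = replicas - zones_lost
    return remaining > replicas // 2


print(dict(placement))
print(survives_zone_loss(REPLICAS, zones_lost=1))
print(survives_zone_loss(REPLICAS, zones_lost=2))
```

Under these assumptions, losing one zone still leaves two of three replicas, so a majority remains; losing two zones does not.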

As of November 2022, their TiDB database stores about six terabytes of business data. Queries per second (QPS) have reached 5,000, and the average query duration is 40 ms. Various applications are running stably on TiDB.
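As a rough back-of-the-envelope reading of these figures, Little's law (average concurrency = throughput × average latency) estimates how many queries are in flight at any moment. The sketch below uses the QPS and latency quoted above; the function name is illustrative.

```python
def in_flight_queries(qps: int, avg_latency_ms: int) -> float:
    """Little's law: average concurrency = arrival rate x average time in system."""
    return qps * avg_latency_ms / 1000


# 5,000 QPS at an average of 40 ms per query.
concurrency = in_flight_queries(qps=5000, avg_latency_ms=40)
print(concurrency)  # 200.0
```

In other words, the cluster is serving on the order of 200 concurrent queries on average at the reported load.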

What impressed NFTScan during the migration

According to NFTScan, they are not only satisfied with TiDB’s performance but also impressed by the smoothness of the data migration.

TiDB provides a set of data synchronization tools, such as Dumpling and TiDB Data Migration (DM), to help customers migrate historical data from MySQL to TiDB. For example, some NFTScan business data could not be migrated to TiDB directly; its schema had to be adjusted first. In such cases, TiDB’s synchronization tools can write large amounts of data concurrently. When parsing and storing real-time NFT data, execution efficiency increased by about 30% compared with the previous storage plan.

TiDB’s online schema changes allow NFTScan to perform data definition language (DDL) operations, such as changing fields and adding indexes asynchronously, during migration without blocking reads and writes on the table. This greatly improves the flexibility of the data schema when the business logic is adjusted.
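The kind of DDL described here can be sketched as follows. This is an illustration only: the table, column, and index names are hypothetical, and the stand-in cursor simply records statements so the example runs without a database. In practice the same statements would be sent through a MySQL-compatible driver cursor.

```python
from typing import List, Protocol


class Cursor(Protocol):
    """Minimal shape of a DB-API style cursor."""

    def execute(self, sql: str) -> object: ...


# Hypothetical schema changes, not NFTScan's actual schema.
SCHEMA_CHANGES: List[str] = [
    "ALTER TABLE nft_asset ADD COLUMN rarity_score DECIMAL(20, 8) NULL",
    "ALTER TABLE nft_asset ADD INDEX idx_contract_token (contract_address, token_id)",
]


def apply_schema_changes(cursor: Cursor, statements: List[str]) -> None:
    """Run each DDL statement, then ask TiDB for its DDL job list."""
    for sql in statements:
        cursor.execute(sql)
    # ADMIN SHOW DDL JOBS is a TiDB statement that lists DDL jobs and their state.
    cursor.execute("ADMIN SHOW DDL JOBS")


class RecordingCursor:
    """Stand-in cursor that records SQL instead of sending it anywhere."""

    def __init__(self) -> None:
        self.executed: List[str] = []

    def execute(self, sql: str) -> int:
        self.executed.append(sql)
        return 0


cursor = RecordingCursor()
apply_schema_changes(cursor, SCHEMA_CHANGES)
for statement in cursor.executed:
    print(statement)
```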

What impressed NFTScan during their usage of TiDB

TiDB supports multi-dimensional real-time queries with short query time. This perfectly meets NFTScan’s core requirements of high throughput and low latency. Take the API service on the business side as an example. The average query time dropped from 10-100 ms to 10 ms or less. Such query speed stays stable even when they process 1,000 QPS.

TiFlash, the columnar storage server inside TiDB, handles analytical workloads efficiently. For example, you can perform a complex query on a table with hundreds of millions of rows and get your results in seconds.
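As an illustration of how such an analytical query might be directed to TiFlash, the sketch below builds two SQL strings: one that asks TiDB to keep a TiFlash replica of a table, and one aggregation that hints the optimizer to read from that replica. The table and column names are hypothetical, and the helper functions exist only for this example.

```python
def tiflash_replica_ddl(table: str, replicas: int = 1) -> str:
    """DDL that asks TiDB to maintain columnar TiFlash replicas of a table."""
    return f"ALTER TABLE {table} SET TIFLASH REPLICA {replicas}"


def daily_volume_query(table: str) -> str:
    """Aggregation over a large table, hinted to read from TiFlash.

    Column names (contract_address, block_time, price) are assumptions.
    """
    return (
        f"SELECT /*+ READ_FROM_STORAGE(TIFLASH[{table}]) */ "
        "contract_address, DATE(block_time) AS day, "
        "COUNT(*) AS transfers, SUM(price) AS volume "
        f"FROM {table} GROUP BY contract_address, day"
    )


print(tiflash_replica_ddl("nft_transfer"))
print(daily_volume_query("nft_transfer"))
```

Without the hint, the optimizer decides on its own whether to read from row storage or the columnar replica.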

TiDB’s smart SQL optimizer can select the most cost-effective data query execution plan according to the data distribution, and allows developers to flexibly adjust and optimize SQL execution plans.
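One way developers can adjust a plan is with optimizer hints, then check the result with EXPLAIN. The sketch below only assembles the SQL text; the table, index, and filter are hypothetical, and the helper names are made up for this example.

```python
def with_index_hint(table: str, index: str, where: str) -> str:
    """Build a SELECT that asks the optimizer to consider only the given index."""
    return f"SELECT /*+ USE_INDEX({table}, {index}) */ * FROM {table} WHERE {where}"


def explain(sql: str) -> str:
    """Wrap a statement in EXPLAIN to inspect the chosen execution plan."""
    return f"EXPLAIN {sql}"


# Hypothetical table, index, and parameterized filter.
hinted = with_index_hint("nft_asset", "idx_owner", "owner_address = %s")
print(explain(hinted))
```

Comparing the EXPLAIN output with and without the hint shows whether the override actually changes the plan.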

Wrapping up

By using TiDB, NFTScan expanded their data storage, processing, and analytical capabilities while lowering storage and maintenance costs and improving service performance. TiDB’s HTAP capability also meets their growing business needs. In the future, NFTScan expects more data services to run on TiDB and deeper cooperation with PingCAP.

This article is based on a talk given by Cathy Ray at the Virtual HTAP Summit.

Industry

Web3