
TiDB focuses on simplicity, transparency, and high availability, providing a robust foundation for distributed SQL workloads. Concerns about cross-shard queries and availability zone (AZ) outages are valid, but TiDB's architecture minimizes these risks, ensuring reliable and efficient application performance. When comparing TiDB with Vitess, an orchestrator for sharded MySQL instances, the architectural differences lead to distinct approaches and different risk profiles.

In this post, we'll examine the approaches each database platform takes across four key areas: sharding ranges and granularity, replication, cross-shard query performance, and deployment across cloud AZs.

Comparing Vitess and TiDB: Sharding Ranges and Granularity

Range in Vitess

In Vitess, a shard range refers to the range of keyspace ID values stored on an entire shard, which corresponds to a single MySQL instance.

A shard is a logical division of data based on a keyspace ID or sharding key. For example, a keyspace might be divided into shards covering keyspace ID ranges such as -80 and 80- (left-inclusive, right-exclusive), where each range maps to an entire MySQL server.

Granularity: Vitess operates at the shard level, where each shard is a full MySQL instance containing its own database, tables, and data for the specified keyspace ID range.
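
To make that concrete, here is a minimal Python sketch of range-based routing. It is illustrative only: the shard names, the one-byte keyspace ID space, and the hash function are stand-ins, not Vitess's actual vindex implementation.

```python
import hashlib

# Illustrative shard ranges over a one-byte keyspace ID space: left-inclusive,
# right-exclusive, each mapping to one MySQL instance (names are hypothetical).
SHARDS = [
    ("shard_-80", 0x00, 0x80),
    ("shard_80-", 0x80, 0x100),
]

def keyspace_id(sharding_key: str) -> int:
    """Stand-in for a hash vindex: derive a keyspace ID from the sharding key."""
    return hashlib.sha256(sharding_key.encode()).digest()[0]

def route(sharding_key: str) -> str:
    """Return the shard (i.e. the MySQL instance) that owns this key."""
    kid = keyspace_id(sharding_key)
    for name, lo, hi in SHARDS:
        if lo <= kid < hi:
            return name
    raise ValueError("keyspace ID outside all shard ranges")

# Every row for this customer lands on exactly one shard's MySQL server.
print(route("customer:42"))
```

Any query that cannot be pinned to a single keyspace ID range has to be sent to multiple shards, which is where scatter-gather comes in later in this post.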

Range in TiDB

In TiDB, a range contains an ordered collection of rows from a single table. We call this range a Region. The default size of a Region is 96 MB, and a single table is broken into many Regions. Each Region is replicated across the cluster using the Raft consensus protocol, and each Region together with its replicas forms a separate Raft group. This level of granularity means that a leader election in one Region doesn't impact the other Regions.

Granularity: TiDB operates at a much finer granularity, with Regions being small, manageable chunks of data distributed and replicated across the cluster, allowing for better balance and scalability.
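
As a rough mental model (a sketch, not TiKV's actual data structures), a table's ordered key space is cut into contiguous Regions, and a Region that grows past the size threshold is split in two, each half becoming its own Raft group:

```python
from dataclasses import dataclass

# Default split threshold; TiDB's default Region size is 96 MiB.
REGION_SPLIT_BYTES = 96 * 1024 * 1024

@dataclass
class Region:
    start_key: bytes   # inclusive
    end_key: bytes     # exclusive
    size_bytes: int

def maybe_split(region: Region, split_key: bytes) -> list:
    """Split a Region at split_key once it grows past the threshold.
    The split point and the halved sizes are illustrative; TiKV picks
    the actual split key and sizes internally."""
    if region.size_bytes < REGION_SPLIT_BYTES:
        return [region]
    left = Region(region.start_key, split_key, region.size_bytes // 2)
    right = Region(split_key, region.end_key, region.size_bytes - left.size_bytes)
    return [left, right]

# A Region covering part of one table's key range grows past 96 MiB and is cut
# in two; each half then carries on as an independent Raft group.
big = Region(b"t_100_r_000000", b"t_100_r_999999", 120 * 1024 * 1024)
for r in maybe_split(big, split_key=b"t_100_r_500000"):
    print(r.start_key, "->", r.end_key, r.size_bytes // (1024 * 1024), "MiB")
```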

Sharding Range and Granularity Implications

In Vitess, sharding follows an explicit model, requiring manual intervention to define sharding keys. This coarse approach divides tables across entire MySQL instances based on keyspace ID ranges. Scaling and resharding often necessitate manual adjustments.

With TiDB, database sharding is automated, granular, and dynamic. As tables grow, TiDB automatically splits and balances regions across TiKV nodes, ensuring smoother scalability and efficient resource utilization without the need for manual resharding or application changes. Cluster expansion triggers automatic data rebalancing, eliminating downtime during scaling. 
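Region scheduling is handled by TiDB's Placement Driver (PD); the following is only a simplified sketch of the rebalancing idea, moving Regions from the busiest node to the emptiest one until the cluster evens out. PD's real scheduler also weighs Region size, load, and placement rules.

```python
def rebalance(node_regions: dict) -> dict:
    """Greedy illustration of Region rebalancing: repeatedly move one Region
    from the busiest TiKV node to the emptiest one until counts are even."""
    nodes = {node: list(regions) for node, regions in node_regions.items()}
    while True:
        busiest = max(nodes, key=lambda n: len(nodes[n]))
        emptiest = min(nodes, key=lambda n: len(nodes[n]))
        if len(nodes[busiest]) - len(nodes[emptiest]) <= 1:
            return nodes
        nodes[emptiest].append(nodes[busiest].pop())

# A freshly added, empty TiKV node picks up Regions with no manual resharding.
cluster = {
    "tikv-1": ["r1", "r2", "r3", "r4"],
    "tikv-2": ["r5", "r6", "r7"],
    "tikv-3": [],  # new node just joined
}
print(rebalance(cluster))
```
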

Key Differences in Sharding Systems

| Aspect | Vitess (Traditional Sharding) | TiDB (Automatic Sharding) |
| --- | --- | --- |
| Sharding Key | User-defined sharding key or keyspace ID | No manual sharding key; uses TableID + RowID |
| Sharding Control | User must design and manage shards | TiDB automatically splits and balances data (Regions) |
| Granularity of Sharding | Shards correspond to entire MySQL instances | Fine-grained data sharding via Regions (~96 MB each) |
| Cross-Shard Querying | Requires scatter-gather queries across shards, limited to the READ COMMITTED isolation level | Transparent to the user; the distributed SQL engine handles it |
| Scaling | Manual resharding required as data grows, plus possible application changes for the new topology | Automatic Region splitting and rebalancing; transparent to the application, just add nodes |

Comparing Vitess and TiDB: Replication Strategies

In this section, we dive into the replication strategies used in both database platforms. We’ll focus on their strengths, trade-offs, and impact on distributed database performance.

Replication in Vitess

Vitess uses one MySQL primary server per shard and relies on MySQL’s native replication to maintain replicas for each shard. Vitess enhances this setup by managing replication topologies, automating failovers, and abstracting query routing complexities. However, it does not provide strong consistency between shards or replicas, as MySQL’s native replication is typically asynchronous. Applications needing stronger consistency guarantees between replicas must implement additional mechanisms or tolerate eventual consistency.
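
As a hedged illustration of what asynchronous replication implies (the class names and the lag model below are made up for this example), the replica applies the primary's binlog after the fact, so a read served by the replica can be stale:

```python
class Primary:
    def __init__(self):
        self.data = {}
        self.binlog = []          # ordered stream of committed writes

    def write(self, key, value):
        self.data[key] = value    # commit locally and ack the client immediately
        self.binlog.append((key, value))

class AsyncReplica:
    def __init__(self, primary, lag_events=1):
        self.primary = primary
        self.applied = 0          # how far into the binlog we have applied
        self.lag_events = lag_events
        self.data = {}

    def catch_up(self):
        # The replica pulls the binlog but stays a few events behind (the "lag").
        target = max(0, len(self.primary.binlog) - self.lag_events)
        while self.applied < target:
            key, value = self.primary.binlog[self.applied]
            self.data[key] = value
            self.applied += 1

primary = Primary()
replica = AsyncReplica(primary)
primary.write("balance:42", 100)
primary.write("balance:42", 250)        # already acknowledged to the client
replica.catch_up()
print(replica.data.get("balance:42"))   # 100: a stale read until lag clears
```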

Replication in TiDB

TiDB achieves strong consistency through the Raft consensus algorithm, ensuring all replicas remain consistent even during failures.

Raft in TiDB:

  • TiDB uses Raft for replication within the TiKV storage layer. By default, each Region is replicated to two additional TiKV nodes, for three copies in total. Unlike MySQL, there are no full-server replicas. All TiKV nodes hold a combination of their own primary Regions and replica Regions from other servers.
  • A transaction only commits after a majority of replicas (quorum) agree, ensuring strong consistency.
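
A minimal sketch of the quorum rule described above (not TiKV's Raft implementation):

```python
def quorum_commit(acks: int, replicas: int = 3) -> bool:
    """A write commits only once a majority of replicas have persisted it."""
    return acks >= replicas // 2 + 1

# With the default 3 replicas per Region, 2 acknowledgements are enough,
# which also means losing any single replica never loses committed data.
print(quorum_commit(acks=2))              # True  -> commit acknowledged
print(quorum_commit(acks=1))              # False -> client keeps waiting
print(quorum_commit(acks=3, replicas=5))  # True for a 5-replica Region
```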

Replication Strategy Implications

Vitess’s reliance on MySQL’s asynchronous binlog replication introduces potential replication lag, leading to eventual rather than strong consistency. While Vitess automates replication management and failovers, it doesn’t enhance MySQL’s consistency guarantees. Applications requiring strict consistency must implement additional safeguards or tolerate stale reads when reading from replicas.  

Additionally, replication at the shard level can lead to data silos, complicating cross-shard consistency. When a query needs data from more than one shard, costly scatter-gather queries become necessary: the query is distributed across shards and the results are aggregated. Scatter-gather is a workaround rather than a solution, introducing performance overhead and consistency challenges, especially in systems like Vitess that rely on asynchronous replication.
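
To make that overhead concrete, here is a simplified, hypothetical scatter-gather: the router fans the query out to every shard, each shard answers from its own (possibly lagging) copy of the data, and the partial results are merged afterwards.

```python
# Hypothetical per-shard order counts; in Vitess each dict would be a separate
# MySQL instance answering the same query independently.
SHARD_DATA = {
    "shard_-80": {"alice": 3, "bob": 7},
    "shard_80-": {"carol": 5, "alice": 2},   # "alice" has rows on both shards
}

def scatter_gather_count(customer: str) -> int:
    """Fan the query out to every shard, then aggregate the partial results.
    Every shard is queried even if only one holds matching rows, and each
    answer reflects that shard's own replication lag."""
    partials = [shard.get(customer, 0) for shard in SHARD_DATA.values()]
    return sum(partials)

print(scatter_gather_count("alice"))  # 5, assembled from two shards
```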

As mentioned above, TiDB’s replication model ensures strong consistency through Raft. Data is replicated at the region level, with transactions committed only after quorum agreement. This model improves fault tolerance, enables automatic load balancing, and allows dynamic scaling without manual intervention. TiDB’s architecture simplifies operations, offering a global, consistent SQL interface without explicit data distribution management.

Key Differences Between Vitess with MySQL Replication and TiDB Replication

| Feature | Vitess with MySQL Replication | TiDB Replication |
| --- | --- | --- |
| Replication Mechanism | Pull-based binlog replication | Push-based Raft consensus |
| Replication Consistency | Asynchronous by default; eventual consistency | Strong consistency through Raft consensus |
| Semi-Synchronous Option | Requires manual configuration; still not fully consistent | Not applicable; always uses Raft for consistency |
| Failover | Requires external tooling in plain MySQL; automated by Vitess orchestration | Automatic leader election and failover |
| Cross-Shard Consistency | Not fully supported; offers 2PC for atomicity but not isolation | Strong consistency across shards |
| Latency | Replication lag possible | Minimal due to Raft quorum requirements |
| Failure Handling | Data loss possible during failover | No data loss as long as a majority of replicas are available |
| Use of Consensus Algorithm | None; relies on the primary-replica model | Uses Raft consensus for data replication and consistency |
| Replication Level | Entire MySQL server replicated | Striped data ranges (Regions) replicated across nodes |

Comparing Vitess and TiDB: The Impact of Sharding Granularity and Replication Strategy on Cross-Shard Query Performance

TiDB and Vitess employ fundamentally different approaches to sharding and replication, significantly affecting cross-shard query performance. Understanding these influences is crucial for selecting the right platform for distributed SQL workloads.

Sharding Granularity and Cross-Shard Query Performance

In Vitess, coarse-grained sharding simplifies single-shard query routing but complicates cross-shard queries. Data distributed across discrete MySQL servers requires Vitess to coordinate among multiple independent databases, introducing overhead and increasing query latency. Poorly chosen sharding keys can exacerbate these inefficiencies. Additionally, each MySQL server must independently parse incoming queries, leading to performance variability due to differing server workloads, configurations, or hardware. Each MySQL instance's query optimizer can only optimize the portion of the query it sees, which increases performance tuning overhead and impairs optimization accuracy: a query that passes through multiple optimization layers with no visibility between them cannot be optimized to its full potential.

In contrast, TiDB employs fine-grained, dynamic sharding, dividing data into small regions dynamically distributed across TiKV nodes. Cross-region queries are handled transparently, with TiDB parsing queries once and generating targeted execution plans distributed to relevant nodes. This is possible because TiDB makes all of the metadata needed to optimize the query available to the TiDB server doing the query parsing and optimization. This centralized parsing reduces processing overhead and ensures consistent query performance across the cluster. TiDB’s dynamic sharding allows real-time data rebalancing, mitigating hotspots and maintaining consistent query performance as workloads evolve.
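
As an illustrative sketch (the Region map and node names below are hypothetical), a planner that can see global Region metadata only dispatches work to the nodes whose key ranges the query actually touches:

```python
# Hypothetical Region metadata as the planner sees it: (start_key, end_key, node).
REGION_MAP = [
    ("a", "g", "tikv-1"),
    ("g", "n", "tikv-2"),
    ("n", "z", "tikv-3"),
]

def plan_targets(scan_start: str, scan_end: str) -> set:
    """Return only the TiKV nodes whose Regions overlap the scan range,
    so the single parsed plan is dispatched to exactly those nodes."""
    return {
        node
        for start, end, node in REGION_MAP
        if start < scan_end and scan_start < end   # key-range overlap test
    }

# A scan over keys ["h", "m") touches one Region, so one node does the work.
print(plan_targets("h", "m"))   # {'tikv-2'}
```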

Replication Strategy and Cross-Shard Query Performance

Vitess’s reliance on MySQL’s asynchronous binlog replication introduces replication lag, affecting cross-shard query performance, especially when queries require up-to-date data from multiple shards. Ensuring consistency across shards is challenging and can degrade cross-shard transaction performance.

TiDB’s replication strategy, based on the Raft consensus algorithm, eliminates replication lag issues, ensuring up-to-date data without needing application-level consistency mechanisms. Replicating data at the region level allows even distribution of query workloads, enhancing cross-shard query performance.

Comparing Vitess and TiDB: Deployment Across Cloud Availability Zones

In Vitess, shards are distributed across different AZs to avoid single points of failure. For example, with three shards and three AZs, each shard’s primary resides in a different AZ, with replicas in separate AZs for failover capability.

TiDB distributes TiKV nodes across multiple AZs, ensuring high availability and fault tolerance through Raft. Data is inherently replicated across AZs, maintaining strong consistency and automatic failover if an AZ becomes unavailable. The TiDB stateless SQL layer and Placement Driver nodes are also deployed across AZs to ensure continuous query processing and cluster management during AZ failures.

AZ Outage Survivability and Considerations

Both Vitess and TiDB are designed to survive AZ outages, but their architectures affect downtime and recovery differently. In Vitess, an AZ outage triggers failover of the affected shard’s primary to a replica in another AZ. Since Vitess relies on MySQL’s asynchronous replication, brief downtime and potential data loss can occur if the replica lags behind the primary.

On the other hand, TiDB’s use of Raft ensures strong consistency across AZs. If an AZ goes down, TiDB can continue operating without downtime as long as a majority of replicas (quorum) remain available. This resilience minimizes data loss risk and enables faster recovery during AZ outages.
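
The arithmetic behind this is straightforward; here is a small sketch, assuming the default of three replicas spread across three AZs (the AZ names are placeholders):

```python
def survives_az_outage(replica_azs: list, failed_az: str) -> bool:
    """True if a Raft group keeps a majority of replicas after one AZ fails."""
    surviving = [az for az in replica_azs if az != failed_az]
    return len(surviving) >= len(replica_azs) // 2 + 1

# Default placement: 3 replicas across 3 AZs. Any single-AZ outage leaves 2/3.
print(survives_az_outage(["az-1", "az-2", "az-3"], failed_az="az-2"))  # True

# A skewed placement (two replicas in the same AZ) is what placement rules avoid:
print(survives_az_outage(["az-1", "az-1", "az-2"], failed_az="az-1"))  # False
```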

Conclusion

TiDB and Vitess represent two distinct philosophies in distributed SQL database design. Vitess offers a more traditional sharding approach, requiring manual intervention and careful planning for scalability and consistency. This can lead to complexities in cross-shard query performance and potential risks in high-availability scenarios.

TiDB's model provides fully automated, fine-grained sharding, strong consistency, and hot spot resolution. This means it can provide a seamless and resilient experience for distributed SQL workloads. Its architecture ensures reliable performance, minimal manual intervention, and robust fault tolerance, making it an ideal choice for applications demanding consistent, scalable, and highly available solutions across cloud environments.

Want to learn more about how TiDB compares to traditional MySQL? Download our popular comparison white paper to gain the knowledge you need to scale your growing application workloads with zero downtime.
