Distributed SQL Database: Architecture, Scale, and High Availability

A distributed database is any system that spreads data across multiple nodes. However, a distributed SQL database is a stricter subset: it keeps full SQL semantics and ACID transactions, automatically partitions data for horizontal scale, and uses consensus replication (e.g., Raft) so writes are consistent and failover is predictable. In short, distributed SQL gives you […]

How Atlassian Scaled to 3M+ Tables: Multi-Tenant Control with TiDB

Atlassian is an enterprise software company that runs one of the world’s largest SaaS platforms. Best known for Jira, Confluence, Trello, and Bitbucket, the company helps teams plan, build, and run software. As tenant counts and compliance demands grew, Atlassian hit the limits of shared and siloed multi-tenancy models on a massive sharded PostgreSQL estate. […]

Effective MySQL Online DDL: Making Critical Database Schema Changes with Zero Downtime

Online Data Definition Language (DDL) is a crucial feature for modern databases and a cornerstone of MySQL modernization strategies. It allows schema changes without significant downtime or locking that could disrupt database operations. This means these operations are carried out while the database continues to be available for reads and writes, minimizing downtime and avoiding disruption […]

Zero-Downtime Upgrades: How TiDB Powers Always-On Databases

In the vast landscape of databases, ensuring zero-downtime upgrades and operational continuity remains a challenge. Due to inherent design limitations, traditional databases often introduce significant downtime during upgrades – a problem that can spell operational chaos for businesses reliant on real-time data access. Enter TiDB, a cutting-edge distributed SQL database that offers a solution to […]

Supercharging Real-Time Applications with TiDB and DragonflyDB

Data-intensive applications demand scalability, low latency, and resilience. However, traditional databases often struggle to handle both transactional consistency and fast in-memory caching at scale. That’s where TiDB and DragonflyDB shine together. In this tutorial, we’ll walk through setting up a TiDB + Dragonfly stack, show how they complement each other, and build a hands-on […]

How to Scale TiDB Locally with Online DDL

Data-intensive applications outgrow single-node MySQL long before product-market fit is “done.” Hot partitions, schema change windows, and manual sharding slow teams down. TiDB addresses this with a MySQL-compatible, distributed SQL architecture that scales storage and compute independently and keeps applications online during change. In this quick tutorial, we’ll spin up TiDB locally with TiUP […]

How Distributed ACID Transactions Work in TiDB

Transactions—especially distributed ACID transactions—are ubiquitous. Protocols around transactions are equally ubiquitous, even if we don’t immediately realize it. Take, for example, a common marriage ceremony. It’s essentially a two-phase commit (2PC) protocol. The officiant is the transaction coordinator (TC), and the couple getting married are the active participants. In the first phase, the TC asks […]

Rethinking Scale: TiDB’s Evolution Into an AI Agent Database

Recently, I’ve been meeting customers across industries including AIaaS, Web3, and FinTech. No matter where I go, the conversation always turns to one thing: AI. Everyone’s asking the same question: “How will AI transform our business, and how does TiDB embrace this change?” So, I started sharing what had been on my mind lately. “The era […]

How to Stream Data from Kafka to TiDB

Modern applications generate enormous amounts of event data with user actions, transactions, logs, and metrics all happening in real time. To handle this scale, many teams rely on Apache Kafka, a distributed messaging system that decouples applications from their data pipelines and ensures reliable, high-throughput data delivery. On the storage side, TiDB provides a distributed SQL database that […]

Database Sharding Explained: Strategies for Scalable SQL Performance

Database sharding is a data architecture strategy that increases database performance by splitting data into chunks and spreading those chunks “intelligently” across multiple database servers (or database instances). These chunks are called shards, and each shard contains a subset of the data. Together, the shards represent the entire set of data, and […]

Change Data Capture (CDC): A Complete Guide for Modern Data Teams

In the areas of data management and real-time analytics, Change Data Capture (CDC) has become an indispensable tool. CDC is a software operation that enables you to monitor and record changes in your source database. From there, you can subsequently apply those changes to your target database. These changes could be new records, updates, or […]

Introducing TiDB X: A New Foundation for Distributed SQL in the Era of AI

Ask any engineering team managing a modern application today, and you’ll hear the same frustrations: traffic spikes that can’t be predicted, autoscalers that react too slowly, and workloads that look nothing like they did yesterday. One hour you’re processing transactions, the next you’re serving analytical or vector queries — all in real time, all under […]