
Key Takeaways

  • Off-the-shelf memory frameworks can silently discard the details that matter most.
  • A three-layer AI memory architecture delivers far better recall than any single abstraction.
  • TiDB’s native vector search eliminates the two-database overhead of a Postgres + Pinecone setup.
  • Model choice for synthesis tasks is a trust decision, not a cost decision.

I was talking to Claude the other day — not about code or some technical problem. I was venting about work, about life. And Claude responded with something so personal, so specific to my situation, that I stopped and stared at it. It referenced my daughter by name. It brought up something I’d been stressed about from a conversation weeks earlier. It connected dots between completely separate chats.

That feeling of being truly remembered by an AI? That’s a product.

So I built Speak2Me, a voice-first AI journal companion. You talk to it like a friend, and it actually remembers your story — not with generic responses like “that sounds frustrating,” but with real, personal context that references your life, your people, and your patterns.

The first version took about two hours to build. Making it actually work took the rest of the week. Because here’s the thing nobody tells you about AI memory: It’s really hard to get right.

AI Memory Architecture: The Promise vs. The Reality

The concept was straightforward: Open the app and it just gets you. It remembers your partner’s name, asks about that job stress you mentioned last week, and checks whether the baby is sleeping through the night yet.

I wired everything up — Hume EVI for voice, Mem0 for long-term memory, TiDB for the database (relational data and vector search in one), Claude as the reasoning layer, and Vercel for deployment. Sent the link to a few testers. Felt good about myself.

Then I used it for real. Told it personal details — my income, my family, my goals for the year. Opened it the next session expecting a deeply personal experience.

It had no idea who I was. Zero context. The entire product promise was broken.

When Your AI Memory Architecture Layer Forgets

I was using Mem0 for long-term memory. If you haven’t encountered it, Mem0 is an open-source memory framework with over 40,000 GitHub stars. The idea is compelling: Feed it conversations, it extracts important facts, and you recall those facts later.

During a test conversation, I provided exact financial details — my base salary and bonus, down to the dollar. I then checked what Mem0 actually stored.

It had extracted a vague sentence about “wanting to discuss income.” The actual numbers were gone.

This isn’t a bug in Mem0’s design — it’s a limitation of how memory extraction works. Mem0 uses a smaller language model internally (GPT-4o-mini) to decide what’s worth remembering, and smaller models are aggressive summarizers. They capture the gist and discard the specifics. For casual chatbot memory, that tradeoff might be acceptable. For a product where remembering exact life details is the value proposition, it’s a dealbreaker.

I ran more tests with family details, career plans, specific names and dates. Some things it captured. Others it mangled or skipped entirely. There was no way to predict what it would retain, because I didn’t control the extraction model.

If the memory layer is the product, you can’t outsource it to someone else’s black box.

The Hallucination Problem: Who Is Lily?

While debugging the Mem0 issue, I made another mistake that could have been far worse.

To save costs, I was using GPT-4o-mini to synthesize user profiles — taking all conversations and generating a summary document of who the user is, what they care about, and who’s important in their life. This profile gets injected into every future conversation as context.

I ran the synthesis on my test data and read the output. It said my daughter’s name was “Lily” and my partner was “Sarah.”

Neither name is correct. GPT-4o-mini fabricated plausible-sounding names when the real names simply hadn’t been mentioned yet. Instead of writing “not yet mentioned,” it invented details and presented them as fact.

Imagine opening your personal journal companion and hearing it say “How’s Lily doing?” when your daughter’s actual name is completely different. That’s not a bug — it’s a trust-destroying moment you can never recover from.

I switched immediately to Claude 3.5 Haiku for profile synthesis and added strict guardrails: Never invent, guess, or infer names, numbers, or details not explicitly stated in the conversations. If something hasn’t been mentioned, write “not yet mentioned.”
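The guardrails above boil down to a few explicit instructions prepended to the synthesis prompt. Here’s a minimal sketch of that idea — the wording and the `buildSynthesisPrompt` helper are illustrative, not the exact prompt Speak2Me uses:

```javascript
// Anti-hallucination guardrails for profile synthesis (illustrative wording).
const SYNTHESIS_GUARDRAILS = [
  "Never invent, guess, or infer names, numbers, or details that are not",
  "explicitly stated in the conversations.",
  "If something has not been mentioned, write exactly: not yet mentioned.",
].join(" ");

// Hypothetical helper: prepends the guardrails to the raw transcripts
// before they are sent to the synthesis model.
function buildSynthesisPrompt(transcripts) {
  return `${SYNTHESIS_GUARDRAILS}\n\nConversations:\n${transcripts.join("\n---\n")}`;
}
```

The key design choice is giving the model an explicit escape hatch (“not yet mentioned”) — without one, models under pressure to produce a complete profile will fill gaps with plausible inventions.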

Model choice for synthesis tasks isn’t a cost optimization. It’s a trust decision. One hallucinated family member name and your user is gone forever.

Building a Three-Layer AI Memory Architecture

After these failures, I rethought the entire memory architecture from scratch. The solution required three complementary layers.

Layer 1: The User Profile

After every conversation, Claude Haiku reads all past transcripts and generates a synthesized document — who the user is, their job, the important people in their life, their stressors, their goals. This document gets injected into the system prompt for every future session. It’s how the AI “knows” you before you say a word.
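The injection step itself is simple. A minimal sketch, assuming a tag-delimited profile section (the function and tag names are illustrative, not Speak2Me’s actual ones):

```javascript
// Layer 1 injection: the synthesized profile document is prepended to the
// system prompt before every session starts.
function buildSystemPrompt(basePrompt, profile) {
  if (!profile) return basePrompt; // brand-new user: no profile exists yet
  return `${basePrompt}\n\n<user_profile>\n${profile}\n</user_profile>`;
}
```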

Layer 2: Per-Exchange Vector Search

This is where the biggest improvement happened.

Originally, I was embedding entire conversation transcripts as single vectors. A 20-minute conversation covering salary, weekend plans, and a family wedding all became one vector — a single point in mathematical space representing the average of all those topics blended together.

When I searched for “salary,” it would find that conversation, but it also pulled up every other long conversation with similarly diluted vectors. The signal was buried.

The fix was chunking at the exchange level. One user message plus its AI response equals one chunk. Each chunk gets its own embedding vector. Now when I search for “salary,” it finds the exact exchange where salary was discussed — not the whole conversation, but the precise moment.

It’s the difference between searching a book by title versus having every individual page indexed. The recall quality improvement was dramatic. (For even better retrieval, TiDB also supports full-text search for hybrid retrieval — combining keyword matching with vector similarity — which I’m planning to integrate next.)
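Exchange-level chunking is only a few lines of code. A sketch, assuming messages arrive as `{ role, content }` objects (the shape Hume and most chat APIs use):

```javascript
// One user message plus its AI response = one chunk.
// Each chunk gets its own embedding vector downstream.
function chunkByExchange(messages) {
  const chunks = [];
  for (let i = 0; i < messages.length; i++) {
    if (messages[i].role !== "user") continue;
    const reply = messages[i + 1];
    const text =
      `User: ${messages[i].content}` +
      (reply && reply.role === "assistant" ? `\nAI: ${reply.content}` : "");
    chunks.push(text);
  }
  return chunks;
}
```

Including the AI’s reply in the chunk matters: the response often restates or enriches the user’s fact (“Congrats on the $150k offer!”), which makes the embedding more retrievable.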

I’m using OpenAI’s text-embedding-3-large model (3,072 dimensions) and storing the vectors in TiDB, which supports vector search natively. When the AI needs to recall something during a live conversation, it searches these chunks using cosine distance. The cost is negligible — less than ten cents per user per year for embeddings.
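For intuition, cosine distance is computed the same way TiDB’s VEC_COSINE_DISTANCE does it — 1 minus the cosine similarity of the two vectors. Identical directions score 0; orthogonal vectors score 1:

```javascript
// Cosine distance: 1 - (a·b) / (|a| * |b|).
// Smaller values mean the vectors (and therefore the texts) are closer.
function cosineDistance(a, b) {
  let dot = 0, normA = 0, normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return 1 - dot / (Math.sqrt(normA) * Math.sqrt(normB));
}
```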

Layer 3: Raw Transcripts

Every word, stored unmodified. This is the ground truth that never gets summarized, compressed, or distorted by a model. If the profile synthesis misses something or the vector search returns an unexpected result, the raw data is always there.

After validating this three-layer approach, I removed Mem0 entirely. Not because it’s bad software — but once the architecture was working, it wasn’t adding value. It was just another dependency sitting between me and my data.

Why I Chose TiDB Over Postgres + Pinecone

The database choice deserves its own section because it addresses one of the most common architectural patterns in RAG applications — and why I think that pattern is wrong for many use cases.

Every RAG tutorial prescribes the same stack: Postgres for your relational data, Pinecone for your vectors. Two databases. Two bills. Sync jobs between them.

Here’s the actual query that runs when the AI needs to recall a memory in Speak2Me:

SELECT
  e.title,
  e.top_emotions,
  c.chunk_text,
  VEC_COSINE_DISTANCE(c.embedding, ?) AS relevance
FROM s2m_transcript_chunks c
JOIN s2m_journal_entries e ON c.entry_id = e.id
WHERE c.user_id = ?
  AND e.created_at > DATE_SUB(NOW(), INTERVAL 30 DAY)
ORDER BY relevance
LIMIT 5

Vector search. Date filtering. User scoping. A JOIN to pull full context. One query. One network hop. (See the full list of vector functions and operators TiDB supports.)
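One practical detail when running that query from application code: TiDB accepts vector parameters as bracketed string literals (e.g. `"[0.1,0.2,0.3]"`). A small helper handles the serialization — a sketch, assuming a standard parameterized MySQL client like mysql2:

```javascript
// Serialize an embedding array into TiDB's vector literal format,
// suitable for binding to the `?` placeholder in VEC_COSINE_DISTANCE.
function toVectorLiteral(embedding) {
  return `[${embedding.join(",")}]`;
}
```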

With a Postgres + Pinecone setup, that same operation becomes: Call Pinecone with the vector, get back chunk IDs, call Postgres with those IDs, and join the results in your application code. Two round trips, two failure points, and the join logic lives in JavaScript instead of the database optimizer.

Pre-Filtering Changes Everything

Vector search is computationally expensive — comparing a query vector against millions of stored vectors takes real compute. TiDB filters by user_id and date range first using standard indexes. Fast and cheap. Then it runs the vector comparison on that much smaller subset.

Most dedicated vector databases do the opposite: They search all vectors first, then filter out non-matching metadata after the fact. At scale, that difference is significant.
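The filter-then-rank ordering can be sketched in plain JavaScript over an in-memory array — a conceptual illustration of what TiDB does with standard indexes, not how it is implemented internally. `distanceFn` stands in for any cosine-distance function:

```javascript
// Filter-then-rank: apply cheap metadata predicates first, then run the
// expensive vector comparison only on the surviving subset.
function recall(chunks, queryVec, userId, sinceMs, distanceFn) {
  return chunks
    .filter((c) => c.userId === userId && c.createdAt > sinceMs)
    .map((c) => ({ ...c, relevance: distanceFn(c.embedding, queryVec) }))
    .sort((a, b) => a.relevance - b.relevance) // smaller distance = closer match
    .slice(0, 5);
}
```

The post-filtering alternative inverts the first two steps: rank all vectors, then discard non-matching rows — which means most of the expensive distance computations are wasted on rows that were never eligible.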

Strong Consistency for Real-Time AI

During a conversation, the AI extracts a fact from something you just said, stores it, and may need to recall it 30 seconds later in the same session. With a Postgres + Pinecone architecture, you’re managing sync lag — write to Postgres, trigger a job to update Pinecone, hope it finishes before the next recall. Eventual consistency headaches.

With TiDB, I write the embedding and it’s immediately searchable. Same transaction. No lag, no sync jobs, no “read your own writes” issues.

One database. Vectors next to the data they describe. Ship faster, debug easier. (For a deeper look, see our architecture guide: Why unified data architectures matter for GenAI.)

AI Memory Architecture: Solving the Latency Problem

Even after fixing the memory architecture, there was a UX-breaking issue: Latency.

When the AI needed to recall something, it would start responding immediately — confidently, specifically, and often wrong. Then, 10–20 seconds later when the vector search results arrived, it would correct itself mid-sentence.

That moment destroys the product promise. You’re not talking to something that knows you — you’re watching a computer look you up.

The solution was to move memory retrieval from query time to session start. Now when a conversation ends, Claude Haiku extracts key facts synchronously in about 500ms. Not just names and dates, but the kind of details a friend would remember: Specific restaurants, upcoming interviews, goals mentioned in passing.

When you open the app next time, the dashboard prefetches your profile summary and the last 20 entries of quick facts in the background. By the time you speak, the AI has everything in context. No tool calls. No waiting.

|        | Session End | Session Start | Memory Recall     |
| ------ | ----------- | ------------- | ----------------- |
| Before | Instant     | ~2s           | 5–10s (tool call) |
| After  | +500ms      | Instant       | Rarely needed     |

The recall tool still exists for older memories — “What did I say three months ago about…” — but for anything recent, the AI just knows. It costs more tokens, but the first time the AI remembers something instantly, with no pause or correction, that’s the product.

The Voice Echo From Hell

Speak2Me is voice-first, powered by Hume EVI — which handles speech-to-text, emotion detection, LLM routing, and text-to-speech in a single WebSocket connection. When the AI speaks, Hume detects 48+ dimensions of vocal expression, so when you sound stressed, the AI adjusts its response accordingly.

But here’s a problem nobody documents: When the AI speaks through your phone’s speaker, the microphone picks up that audio, the AI transcribes its own speech, and responds to itself. An infinite feedback loop.

On a native iOS app, the OS provides hardware-level acoustic echo cancellation. On a web app running in a mobile browser, you’re at the mercy of whatever the browser implements — and mobile Safari is inconsistent at best.

After trying microphone muting (which kills the ability to interrupt naturally), I settled on the browser’s built-in audio constraints, passed to getUserMedia:

await navigator.mediaDevices.getUserMedia({
  audio: { echoCancellation: true, noiseSuppression: true, autoGainControl: true }
});

On desktop, this works well. On mobile, it’s acceptable at lower volumes. The real solution is a native iOS app with system-level echo cancellation — that’s coming.

If you’re building real-time voice AI on the web, budget significantly more time for audio engineering than you expect. This problem space is essentially uncharted.

What’s Next

Speak2Me is live. The immediate priority is encryption — users are sharing their most personal thoughts, and journal transcripts need to be encrypted at rest. After that, native iOS to solve the echo problem at the hardware level and add push notifications, background audio, and biometric authentication.

The memory system will keep improving, but only with real conversation data flowing through it. If you’re a developer building anything with AI memory, I hope the architectural failures I documented here save you some time. If you want to go deeper on choosing the right data infrastructure for AI applications, or see how I applied similar patterns in a privacy-first voice-to-text app and an AI-powered life simulator, those deep dives are worth a read.

And if you want to try Speak2Me, go talk to it. Tell it something real. Come back tomorrow and see if it remembers.

Start building with TiDB Cloud Starter — vector search, SQL joins, and strong consistency in one MySQL-compatible database.

