Where TiDB fits in the vibe coding tech stack
Each subsection below maps a specific TiDB capability to a specific vibe coding workflow need.
TiDB Cloud for builders who start fast and scale later
TiDB Cloud Serverless is the starting point. You connect with a standard MySQL-compatible driver, and your ORM (Prisma, Drizzle, TypeORM) connects through its existing MySQL connector. AI-generated code that targets MySQL generally runs against TiDB Cloud without changes to queries or migrations; the connection string only needs to point at the TiDB Cloud endpoint with TLS enabled. You ship the MVP on the free tier, and the database scales with you without a migration event.
New to TiDB? See What is TiDB for an overview of the architecture and the use cases it fits. A free tier is available at signup with no credit card required.
Distributed SQL with MySQL compatibility for AI code generation
MySQL compatibility matters for AI code generation because MySQL is one of the most-represented database dialects in AI training corpora, alongside PostgreSQL. When you use a MySQL-compatible database like TiDB, the queries the AI generates are more likely to be correct on the first attempt. Standard MySQL drivers, ORMs (Prisma, Drizzle, TypeORM), migration tools (Prisma Migrate, Flyway), and monitoring integrations all work with TiDB without additional configuration.
That ecosystem familiarity is what makes TiDB work well with the patterns the AI has been trained on. The AI understands MySQL's SQL dialect, the MySQL error format, and the MySQL wire protocol, all of which TiDB supports. Generated code, generated queries, and generated error-handling logic land closer to correct without requiring TiDB-specific prompting. Where TiDB's behavior differs from MySQL, such as around certain optimizer behaviors or distributed transaction semantics, the TiDB documentation and the tidbx-nextjs and tidbx-prisma skill files in pingcap/agent-rules cover the gap.
See MySQL compatible database alternatives for scaling teams for a deeper comparison.
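HTAP (hybrid transactional and analytical processing) means one database serves both the application's transactional writes and its analytical queries. The common alternative is building a data stack: replicate your app database to a data warehouse (Snowflake, BigQuery, Redshift), run analytics there, and build a sync pipeline to keep them in alignment. That is three systems instead of one, and three operational costs.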
TiDB's HTAP model lets a single TiDB Cloud cluster handle transactional writes from your Next.js app, real-time analytical queries for product dashboards, and the batch aggregations that feed AI feature context. You do not need to redesign the data layer as the product grows. You add TiFlash replicas (the columnar engine) to the tables that need analytical acceleration, and the rest of the cluster continues serving transactional workloads without interference.
TiDB Cloud includes a native vector data type and an approximate nearest neighbor (ANN) index backed by the HNSW (Hierarchical Navigable Small World) algorithm. See the TiDB vector search docs for the full reference. You define an embedding column in a standard CREATE TABLE statement, insert embedding vectors alongside the rest of your row data, and query by ordering on a distance function such as VEC_COSINE_DISTANCE.
The operational implication: your documents table, your embeddings column, and your relational metadata all live in the same database. You write one schema migration. You maintain one backup policy. Your AI feature does not require a separate operational runbook.
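To make the cosine distance query concrete, here is a minimal Python sketch of the arithmetic behind it, assuming the usual definition of cosine distance as 1 minus cosine similarity. The names `cosine_distance` and `top_k` are illustrative only. The exact scan below is the result an ANN index approximates; it is not how TiDB executes the query, and raising on zero vectors is a choice made for this sketch.

```python
import math
from typing import List, Sequence, Tuple


def cosine_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Return 1 - cosine similarity: 0.0 for same direction, 2.0 for opposite."""
    if len(a) != len(b):
        raise ValueError(f"dimension mismatch: {len(a)} vs {len(b)}")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        # Sketch-level choice: refuse zero vectors instead of guessing a value.
        raise ValueError("cosine distance is undefined for a zero vector")
    return 1.0 - dot / (norm_a * norm_b)


def top_k(query: Sequence[float],
          rows: List[Tuple[str, Sequence[float]]],
          k: int) -> List[str]:
    """Exact nearest neighbours over (row_id, embedding) pairs, closest first."""
    scored = [(cosine_distance(query, emb), row_id) for row_id, emb in rows]
    return [row_id for _, row_id in sorted(scored)[:k]]


# Tiny 2-dimensional example; real embeddings have hundreds of dimensions.
rows = [("a", [1.0, 0.0]), ("b", [0.0, 1.0]), ("c", [0.9, 0.1])]
nearest = top_k([1.0, 0.0], rows, k=2)
```

In SQL, the same ranking is expressed by ordering rows by the distance function and applying a LIMIT.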
Retrieval augmented generation with LangChain on TiDB
LangChain's TiDB integration is Python-based. Its TiDB vector store uses TiDB Cloud as the backend: LangChain stores document embeddings in TiDB Cloud and runs ANN retrieval queries against them as part of a RAG chain. The workflow:
- Embed source documents using an embedding model (OpenAI text-embedding-3-small is a common choice, producing 1536-dimensional vectors by default; the dimension must match the column definition).
- Store embeddings in TiDB using the LangChain TiDB vector store integration.
- At query time, embed the user's question and retrieve the top-k most similar document chunks using ANN search.
- Pass the retrieved chunks as context in the language model prompt.
- Return the grounded, context-aware completion to the user.
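The steps above can be sketched end to end without any external service. This is a toy illustration under loud assumptions: `toy_embed` is a bag-of-words stand-in for a real embedding model, `InMemoryStore` stands in for the TiDB vector table, and the final model call (the last step) is left out. None of these names come from LangChain or TiDB.

```python
import math
import re
from collections import Counter
from typing import List, Tuple


def toy_embed(text: str) -> Counter:
    """Stand-in for an embedding model: a sparse word-count vector."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine_distance(a: Counter, b: Counter) -> float:
    """1 - cosine similarity over sparse vectors; 1.0 if either is empty."""
    dot = sum(a[token] * b[token] for token in a)
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return 1.0 if norm == 0 else 1.0 - dot / norm


class InMemoryStore:
    """Hypothetical stand-in for the vector table."""

    def __init__(self) -> None:
        self.rows: List[Tuple[str, Counter]] = []

    def add(self, chunk: str) -> None:
        # Embed the chunk, then store text and embedding together.
        self.rows.append((chunk, toy_embed(chunk)))

    def search(self, question: str, k: int) -> List[str]:
        # Embed the question and return the k closest chunks.
        q = toy_embed(question)
        ranked = sorted(self.rows, key=lambda row: cosine_distance(q, row[1]))
        return [chunk for chunk, _ in ranked[:k]]


def build_prompt(question: str, chunks: List[str]) -> str:
    """Place the retrieved chunks in the prompt as grounding context."""
    context = "\n".join(f"- {chunk}" for chunk in chunks)
    return f"Answer using only this context:\n{context}\n\nQuestion: {question}"


store = InMemoryStore()
for chunk in [
    "Invoices are generated on the first day of each month.",
    "Password resets expire after 30 minutes.",
    "Refunds are processed within five business days.",
]:
    store.add(chunk)

question = "When do password resets expire?"
prompt = build_prompt(question, store.search(question, k=1))
```

Swapping the toy pieces for a real embedding model and the TiDB vector store changes the quality of retrieval, not the shape of the pipeline.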
The same TiDB cluster that handles your transactional data handles the retrieval step. There is no cross-system join, no sync lag, and no separate API call to a vector database service.
CDC patterns to keep AI context fresh
TiDB's CDC functionality uses the TiCDC component to stream row-level change events to downstream systems. In a vibe coding architecture the primary downstream consumer is the embedding pipeline.
A typical pattern: TiCDC streams changes from TiDB to a Kafka topic. An embedding worker subscribes to that topic, generates new embeddings for changed documents, and writes them back to TiDB's vector store table. The result is that your RAG pipeline always retrieves context that reflects the current state of your application data, not a stale snapshot from a batch job that ran six hours ago.
CDC also powers real-time UI features (activity feeds, live dashboards, notification triggers) without polling the application database. One stream, multiple consumers.
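The embedding worker in that pattern can be sketched as a small event handler. The event shape used here (`type`, `id`, `content`) is a hypothetical simplification, not the TiCDC message format, and the dictionaries stand in for the vector table. The content hash is one way to skip duplicate deliveries and updates that did not touch the embedded text, on the assumption that the stream may deliver an event more than once.

```python
import hashlib
from typing import Callable, Dict, Iterable, List


class EmbeddingWorker:
    """Consumes change events and keeps embeddings in step with row content."""

    def __init__(self, embed: Callable[[str], List[float]]) -> None:
        self.embed = embed
        self.vectors: Dict[str, List[float]] = {}   # stands in for the vector column
        self.content_hash: Dict[str, str] = {}      # last embedded content per row
        self.embed_calls = 0

    def handle(self, event: dict) -> None:
        doc_id = event["id"]
        if event["type"] == "DELETE":
            # Drop the embedding so deleted rows can no longer be retrieved.
            self.vectors.pop(doc_id, None)
            self.content_hash.pop(doc_id, None)
            return
        digest = hashlib.sha256(event["content"].encode("utf-8")).hexdigest()
        if self.content_hash.get(doc_id) == digest:
            return  # duplicate delivery, or an update that left the text unchanged
        self.vectors[doc_id] = self.embed(event["content"])
        self.content_hash[doc_id] = digest
        self.embed_calls += 1

    def run(self, events: Iterable[dict]) -> None:
        for event in events:
            self.handle(event)


def fake_embed(text: str) -> List[float]:
    """Placeholder embedding: (character count, space count)."""
    return [float(len(text)), float(text.count(" "))]


worker = EmbeddingWorker(fake_embed)
worker.run([
    {"type": "INSERT", "id": "d1", "content": "hello world"},
    {"type": "UPDATE", "id": "d1", "content": "hello world"},        # no text change
    {"type": "UPDATE", "id": "d1", "content": "hello there world"},  # re-embed
    {"type": "INSERT", "id": "d2", "content": "bye"},
    {"type": "DELETE", "id": "d2"},
])
```

Skipping unchanged content keeps embedding API costs proportional to real edits, which matters when the same stream also feeds other consumers.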
Practical build recipe you can copy
This section is structured as a step-by-step playbook. Each step includes the configuration decisions that matter and the reason they matter for an AI-assisted workflow.
Step 1: Scaffold a Next.js and TypeScript app for AI-first iteration
Start with create-next-app and enable TypeScript, ESLint, and the App Router. Then configure strict mode immediately, before any generated code lands in the repo.
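// tsconfig.json
{
  "compilerOptions": {
    // "strict" already turns on noImplicitAny and strictNullChecks; listing
    // them explicitly makes it obvious if generated code tries to relax them.
    "strict": true,
    "noImplicitAny": true,
    "strictNullChecks": true,
    "noUncheckedIndexedAccess": true
  }
}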
Establish the folder layout before you open Cursor or Copilot. The AI will replicate whatever structure it sees:
/app — Next.js App Router pages and layouts
/lib — Server-only utilities: db client, auth, external API wrappers
/components — UI components (never import server code here)
/types — Shared TypeScript interfaces and Zod schemas
/prisma or /drizzle — Database schema and migrations
Step 2: Use an AI coding assistant safely with repo rules and review gates
Create a .cursorrules file (or AGENTS.md for Copilot Workspace) in the project root before writing a single line of application code. For TiDB-specific patterns, copy the tidbx-nextjs and tidbx-prisma skill files from the pingcap/agent-rules repository into your project. These files pre-load the agent with correct TiDB Cloud conventions, including the right Prisma type mappings, connection string format, and vector column patterns.
# .cursorrules
## Stack
Next.js 15 App Router, TypeScript 5 strict, Prisma ORM, TiDB Cloud (MySQL compatible)
## Forbidden actions
- Never modify files in /prisma/migrations/ directly
- Never store secrets in code: use environment variables
- Never remove TypeScript strict mode flags
- Never call the database from /components/
## Required patterns
- Every new function gets a corresponding test in __tests__/
- All server actions use zod schema validation on input
- Database queries always go through /lib/db.ts
Set up a GitHub Actions CI workflow that runs type checking and tests on every PR. Do not merge AI-generated code that fails type checking. When the gate is enforced consistently, failed checks become feedback that the agent iterates against until its output passes.
Step 3: Add a database schema that won't collapse later
The schema is where the AI most often gets it wrong. It will generate schemas that work for a single tenant and fail for multiple tenants, or that use JSON columns where relational structure would be safer and more queryable.
Start with a multi-tenant-ready pattern from day one, even if you have one tenant at launch. Note that Prisma does not have a native VECTOR type, so the embedding column uses the Unsupported() escape hatch. The tidbx-prisma skill file in pingcap/agent-rules includes this pattern so the agent generates it correctly:
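// Prisma schema (TiDB Cloud compatible)
model Organization {
  id        String     @id @default(cuid())
  slug      String     @unique
  createdAt DateTime   @default(now())
  users     User[]
  documents Document[]
}

// Minimal placeholder so the users relation resolves; extend as needed.
model User {
  id             String       @id @default(cuid())
  organizationId String
  organization   Organization @relation(fields: [organizationId], references: [id])

  @@index([organizationId])
}

model Document {
  id             String                       @id @default(cuid())
  organizationId String
  title          String
  content        String                       @db.LongText
  embedding      Unsupported("VECTOR(1536)")? // dimension must match embedding model output
  updatedAt      DateTime                     @updatedAt
  organization   Organization                 @relation(fields: [organizationId], references: [id])

  @@index([organizationId])
  @@map("documents") // table name used by the raw SQL and the retrieval code
}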
TiDB Cloud accepts standard Prisma migrations without modification for all standard types. The Unsupported() vector field requires a raw SQL migration step for the HNSW index, shown in Step 4. Run migrations in CI before the deployment step, not after. A migration failure stops the deploy.
Step 4: Add RAG with embeddings, vector search, and LangChain
The minimal RAG data model requires two things: a table to store source documents with their embeddings, and a retrieval function that returns the top-k most similar chunks for a given query embedding.
In TiDB Cloud, you add a VECTOR column and create an HNSW index. The dimension in VECTOR(N) must match the output dimension of your embedding model. OpenAI text-embedding-3-small produces 1536 dimensions at its default setting.
-- SQL for TiDB Cloud vector search
-- Skip this ALTER if an earlier migration already created the embedding column.
ALTER TABLE documents
  ADD COLUMN embedding VECTOR(1536);

-- TiDB HNSW index syntax: the distance function goes in the index expression.
CREATE VECTOR INDEX idx_doc_embedding
  ON documents ((VEC_COSINE_DISTANCE(embedding)))
  USING HNSW;
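Because a dimension mismatch only surfaces when a write or query reaches the database, it can help to check it in application code first. The sketch below is one hedged way to do that: `to_vector_literal` and `build_search_query` are hypothetical helper names, the `[x,y,...]` string form follows the vector literal format shown in the TiDB docs, and the `%s` placeholders assume a driver with PyMySQL-style parameters. The tenant filter on `organizationId` reflects the multi-tenant schema from Step 3; a WHERE filter can change whether the vector index is used, so check the query plan.

```python
import math
from typing import Sequence, Tuple

EMBEDDING_DIM = 1536  # must equal N in the VECTOR(N) column definition


def to_vector_literal(values: Sequence[float], dim: int = EMBEDDING_DIM) -> str:
    """Validate an embedding and serialize it as a '[x,y,...]' string."""
    if len(values) != dim:
        raise ValueError(f"expected {dim} dimensions, got {len(values)}")
    if not all(math.isfinite(v) for v in values):
        raise ValueError("embedding contains NaN or infinity")
    return "[" + ",".join(repr(float(v)) for v in values) + "]"


def build_search_query(query_embedding: Sequence[float],
                       organization_id: str,
                       k: int = 5,
                       dim: int = EMBEDDING_DIM) -> Tuple[str, tuple]:
    """Return (sql, params) for a tenant-scoped top-k cosine search."""
    sql = (
        "SELECT id, title, VEC_COSINE_DISTANCE(embedding, %s) AS distance "
        "FROM documents WHERE organizationId = %s "
        "ORDER BY distance LIMIT %s"
    )
    params = (to_vector_literal(query_embedding, dim), organization_id, k)
    return sql, params


# Small dimension used only to keep the example readable.
literal = to_vector_literal([0.5, 1, -2.25], dim=3)
sql, params = build_search_query([0.5, 1, -2.25], "org_123", k=3, dim=3)
```

Passing the vector as a bound parameter rather than formatting it into the SQL string keeps generated query code away from string concatenation.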
The official LangChain TiDB vector store integration is Python-based. Use it from a Python embedding service or background worker that handles document ingestion and retrieval:
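# rag/store.py
# Requires: pip install langchain-community langchain-openai tidb-vector pymysql
import os

from langchain_community.vectorstores import TiDBVectorStore
from langchain_openai import OpenAIEmbeddings

# Must produce vectors whose dimension matches the VECTOR(1536) column.
embeddings = OpenAIEmbeddings(model="text-embedding-3-small")

# Attach to an existing vector table. The connection string uses SQLAlchemy
# format, for example mysql+pymysql://user:password@host:4000/db.
# Assumption to verify: the table follows the column layout that
# TiDBVectorStore itself creates (id, document, embedding, meta).
vector_store = TiDBVectorStore.from_existing_vector_table(
    embedding=embeddings,
    connection_string=os.environ["TIDB_DATABASE_URL"],
    table_name="documents",
    distance_strategy="cosine",
)


def retrieve(query: str, k: int = 5):
    """Return the k stored chunks most similar to the query."""
    return vector_store.similarity_search(query, k=k)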
Step 5: Deploy and observe, then iterate
Production readiness for a vibe-coded app is not a destination. It is an ongoing practice of observing what breaks and tightening the feedback loop.
The deployment checklist before shipping to production:
- Preview environment tested and approved by at least one reviewer.
- CI passes: type check, lint, unit tests, integration tests against a migrated schema.
- Migration applied to a TiDB Cloud branch and verified before promotion to production.
- Environment variables validated with a startup check (fail fast if DATABASE_URL is missing).
- OpenTelemetry tracing connected to a Grafana dashboard so you see latency and error rates from the first request.
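The startup check from the list above is small enough to sketch. This version is written in Python, as it might appear in the embedding worker; the Next.js app would need the equivalent in TypeScript. `require_env` and `MissingEnvError` are hypothetical names, and `OPENAI_API_KEY` is included only as an example of a second required variable.

```python
import os
from typing import Dict, Iterable, Mapping


class MissingEnvError(RuntimeError):
    """Raised at startup when required configuration is absent."""


def require_env(names: Iterable[str],
                environ: Mapping[str, str] = os.environ) -> Dict[str, str]:
    """Return the requested variables, or fail fast listing every missing one."""
    names = list(names)
    # Treat empty or whitespace-only values as missing.
    missing = [name for name in names if not environ.get(name, "").strip()]
    if missing:
        raise MissingEnvError(
            "missing required environment variables: " + ", ".join(missing)
        )
    return {name: environ[name] for name in names}


# Demonstration against a fake environment, not the real process environment.
fake_environ = {"DATABASE_URL": "mysql://user:password@host:4000/app"}
config = require_env(["DATABASE_URL"], fake_environ)

try:
    require_env(["DATABASE_URL", "OPENAI_API_KEY"], fake_environ)
    startup_error = ""
except MissingEnvError as exc:
    startup_error = str(exc)
```

Reporting all missing variables in one error, instead of stopping at the first, saves a redeploy cycle per variable when a new environment is being set up.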
For the Kubernetes deployment path: containerize the Next.js app with a multi-stage Docker build that produces a minimal image. Use a Helm chart to manage replicas, resource limits, and environment config. Run the migration as a Kubernetes Job in a pre-deploy hook so that a failed migration blocks the rollout. TiDB Cloud runs the database as a managed service, so your Kubernetes cluster only manages stateless application pods.