Tutorial: Vector Search with Real-time Data
This example demonstrates TiDB’s vector search capabilities with real-time data updates, simulating an e-commerce recommendation system. It showcases how TiDB automatically embeds product descriptions and uses vector similarity search to provide personalized recommendations based on user preferences.
Features
- 🔍 Vector Search: Semantic similarity search using auto-embedding
- 📱 Real-time Updates: Instant recommendation refresh when products are added, updated, or deleted
- 🛍️ Shopping Experience: Mobile app UI showing personalized product recommendations
- ⚙️ Admin Panel: Full CRUD operations on products
- 🎯 Smart Filtering: Adjustable similarity threshold for recommendation precision
- 🤖 Auto-Embedding: TiDB automatically generates vector embeddings for product descriptions
UI Layout
The application features a two-column layout:
Left Column: Shopping App (User View)
- Mobile phone mockup showing personalized recommendations
- Displays up to 5 products that match the user’s profile
- Results filtered by similarity threshold
- Clean, modern design with product cards showing name, description, price, and category
Right Column: Admin Panel
- Top Section: Product list with edit and delete functionality
  - Edit products inline with a form
  - Delete products with one click
- Bottom Section: Add new products form
  - Input fields for name, description, category, and price
  - Embeddings are generated automatically when products are added
How It Works
- User Profile: A text description of user preferences (e.g., “a user likes sports”)
- Auto-Embedding: Product descriptions are automatically converted to vector embeddings by TiDB
- Vector Search: The system searches for products similar to the user profile using vector similarity
- Threshold Filtering: Only products whose vector distance to the user profile falls within the threshold are shown (a smaller distance means a closer match)
- Real-time Updates: Any changes to products trigger automatic recommendation refresh
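The search and filtering steps above can be sketched in plain Python. This is only an illustration: it assumes cosine distance as the metric and an inclusive threshold, and it uses tiny hand-written 3-dimensional vectors in place of real embeddings. The `cosine_distance` and `recommend` helpers and the sample vectors are hypothetical and not part of the example app, where TiDB generates the embeddings and computes the distances.

```python
import math


def cosine_distance(a: list[float], b: list[float]) -> float:
    """Return 1 - cosine similarity; 0 means the vectors point the same way."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


def recommend(profile_vec, products, distance_threshold=0.5, limit=5):
    """Keep products within the distance threshold, nearest first."""
    scored = [
        (name, cosine_distance(profile_vec, vec)) for name, vec in products.items()
    ]
    within = [(name, d) for name, d in scored if d <= distance_threshold]
    within.sort(key=lambda item: item[1])
    return within[:limit]


# Toy vectors (assumed for illustration, not real embeddings):
# axis 0 ~ "sports", axis 1 ~ "kitchen", axis 2 ~ "garden".
PRODUCTS = {
    "Professional Basketball": [0.9, 0.1, 0.0],
    "Running Shoes": [0.8, 0.0, 0.2],
    "Yoga Mat": [0.7, 0.2, 0.1],
    "Cooking Pot Set": [0.1, 0.9, 0.0],
    "Gardening Tools Kit": [0.1, 0.1, 0.9],
}
SPORTS_PROFILE = [1.0, 0.0, 0.0]

for name, distance in recommend(SPORTS_PROFILE, PRODUCTS):
    print(f"{name}: distance={distance:.3f}")
```

With these toy vectors, only the three sports items fall within the 0.5 threshold, which mirrors the behavior described for the sample data.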
Initial State
The application starts with 5 sample products:
- 3 sports-related items (matching the default profile “a user likes sports”)
  - Professional Basketball
  - Running Shoes
  - Yoga Mat
- 2 unrelated items (filtered out by default threshold)
  - Cooking Pot Set
  - Gardening Tools Kit
With the default threshold (0.5), only the 3 sports items appear in recommendations.
Prerequisites
- Python 3.10+
- A TiDB Cloud Serverless cluster (create a free cluster at tidbcloud.com)
- OpenAI API Key: Required for embedding generation
Setup Instructions
Step 1: Clone the repository
git clone https://github.com/pingcap/pytidb.git
cd pytidb/examples/vector-search-with-realtime-data/
Step 2: Install dependencies
python -m venv .venv
source .venv/bin/activate  # On Windows: .venv\Scripts\activate
pip install -r requirements.txt
Step 3: Configure environment variables
The .env.example file is already in the directory. Create a .env file based on it:
cp .env.example .env
Then edit .env with your credentials:
# TiDB Connection (get from https://tidbcloud.com/clusters)
TIDB_HOST=gateway01.ap-southeast-1.prod.aws.tidbcloud.com
TIDB_PORT=4000
TIDB_USERNAME=your-username.root
TIDB_PASSWORD=your-password
TIDB_DATABASE=test

# OpenAI API Key (for embeddings)
OPENAI_API_KEY=your-openai-api-key

# User Profile (customize as needed)
USER_PROFILE=a user likes sports
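To make the role of each variable concrete, here is a minimal sketch of how these settings could be read and validated in Python. The `load_settings` helper, the `REQUIRED_KEYS` tuple, and the fallback defaults are assumptions for illustration; how `app.py` actually loads its configuration is not shown in this README.

```python
import os

# Settings without a sensible fallback (an assumption for this sketch).
REQUIRED_KEYS = ("TIDB_HOST", "TIDB_USERNAME", "TIDB_PASSWORD", "OPENAI_API_KEY")


def load_settings(env=None) -> dict:
    """Collect settings from a mapping of environment variables."""
    env = os.environ if env is None else env
    missing = [key for key in REQUIRED_KEYS if not env.get(key)]
    if missing:
        raise RuntimeError(f"Missing required settings: {', '.join(missing)}")
    return {
        "host": env["TIDB_HOST"],
        "port": int(env.get("TIDB_PORT", "4000")),
        "username": env["TIDB_USERNAME"],
        "password": env["TIDB_PASSWORD"],
        "database": env.get("TIDB_DATABASE", "test"),
        "openai_api_key": env["OPENAI_API_KEY"],
        "user_profile": env.get("USER_PROFILE", "a user likes sports"),
    }


# Example with placeholder values instead of the real process environment.
example = load_settings(
    {
        "TIDB_HOST": "gateway01.ap-southeast-1.prod.aws.tidbcloud.com",
        "TIDB_USERNAME": "your-username.root",
        "TIDB_PASSWORD": "your-password",
        "OPENAI_API_KEY": "your-openai-api-key",
    }
)
print(example["port"], example["database"], example["user_profile"])
```

Failing early on a missing value gives a clearer error than a connection or embedding failure later in the app.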
Step 4: Run the application
streamlit run app.py
Step 5: Open in browser
Open your browser and visit http://localhost:8501
Usage Guide
Viewing Recommendations
- The left side shows a mobile app interface with personalized recommendations
- Recommendations are based on the USER_PROFILE from your .env file
- Only products within the similarity threshold are displayed
Adjusting Settings
Click the “⚙️ Settings” expander at the top to:
- Change User Profile: Modify preferences to see different recommendations
  - Try: “a user likes cooking”, “someone interested in fitness”, “outdoor enthusiast”
- Adjust Threshold: Control how strictly products must match the profile
  - Lower values (0.3-0.4): Stricter matching, fewer results
  - Higher values (0.6-0.8): Looser matching, more results
Managing Products
Adding Products
- Scroll to the bottom of the right panel
- Fill in the “Add New Product” form:
  - Product Name
  - Description (this will be embedded automatically)
  - Category
  - Price
- Click “Add Product”
- Watch recommendations update automatically!
Editing Products
- Find the product in the list
- Click the ✏️ (edit) button
- Modify the fields in the form
- Click “💾 Save” to update, or “❌ Cancel” to discard changes
- Updated products will be re-embedded and recommendations will refresh
Deleting Products
- Find the product in the list
- Click the 🗑️ (delete) button
- Product is removed immediately
- Recommendations update automatically
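The "refresh after every change" behavior described in this section can be sketched with a small in-memory store that calls a callback after each add, update, or delete. The `ProductStore` class and the `on_change` hook are hypothetical stand-ins; the app's actual refresh mechanism is not described in this README.

```python
from typing import Callable


class ProductStore:
    """In-memory stand-in for the products table (hypothetical)."""

    def __init__(self, on_change: Callable[[], None]) -> None:
        self._products: dict[int, dict] = {}
        self._next_id = 1
        self._on_change = on_change

    def add(self, name: str, description: str) -> int:
        product_id = self._next_id
        self._next_id += 1
        self._products[product_id] = {"name": name, "description": description}
        self._on_change()  # recommendations are recomputed after an insert
        return product_id

    def update(self, product_id: int, **fields) -> None:
        self._products[product_id].update(fields)
        self._on_change()  # ...after an update

    def delete(self, product_id: int) -> None:
        del self._products[product_id]
        self._on_change()  # ...and after a delete

    def names(self) -> list[str]:
        return [p["name"] for p in self._products.values()]


refresh_log: list[str] = []
store = ProductStore(on_change=lambda: refresh_log.append("refresh"))
racket_id = store.add("Tennis Racket", "A lightweight racket for tennis players")
store.update(racket_id, description="A carbon-fiber tennis racket")
store.delete(racket_id)
print(f"refreshes triggered: {len(refresh_log)}")
```

In this sketch the callback only records that a refresh happened; in a real app it would rerun the vector search and redraw the recommendation list.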
Example Experiments
Experiment 1: Add a Sports Item
- Add a new product: “Tennis Racket” with a description about tennis and sports
- Observe that it appears in recommendations (if the profile is sports-related)
Experiment 2: Add an Unrelated Item
- Add a product: “Classical Music CD” with a description about music
- Observe that it doesn’t appear in recommendations (filtered out by the threshold)
Experiment 3: Change User Profile
- Open Settings and change profile to “a user likes cooking”
- Watch recommendations switch to show the Cooking Pot Set
- Sports items should no longer appear
Experiment 4: Adjust Threshold
- Lower threshold to 0.3 (stricter matching)
  - Fewer products appear in recommendations
- Raise threshold to 0.7 (looser matching)
  - More products appear, including less relevant ones
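The effect of moving the threshold can be previewed with a short sweep over made-up distances. The numbers below are hypothetical and chosen only to show the trend; real distances depend on the embedding model and the product descriptions, and the `products_within` helper is not part of the app.

```python
def products_within(distances: dict[str, float], threshold: float) -> list[str]:
    """Return product names at or below the threshold, nearest first."""
    return sorted(
        (name for name, d in distances.items() if d <= threshold),
        key=lambda name: distances[name],
    )


# Hypothetical distances to the profile "a user likes sports".
SAMPLE_DISTANCES = {
    "Professional Basketball": 0.28,
    "Running Shoes": 0.35,
    "Yoga Mat": 0.45,
    "Cooking Pot Set": 0.65,
    "Gardening Tools Kit": 0.75,
}

for threshold in (0.3, 0.5, 0.7):
    matches = products_within(SAMPLE_DISTANCES, threshold)
    print(f"threshold={threshold}: {len(matches)} match(es) -> {matches}")
```

Because the threshold is a maximum distance, raising it can only add products to the result and lowering it can only remove them.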
Technical Details
Database Schema
class Product(TableModel):
    id: int                       # Primary key (auto-increment)
    name: str                     # Product name
    description: str              # Product description
    description_vec: list[float]  # Auto-generated embedding vector
    category: str                 # Product category
    price: float                  # Product price
Vector Search Query
The application uses TiDB’s vector search with a distance threshold:
results = (
    table.search(user_profile)               # Search using user profile as query
    .distance_threshold(distance_threshold)  # Filter by similarity
    .limit(5)                                # Return top 5 results
    .to_list()
)
Embedding Model
- Model: OpenAI text-embedding-3-small
- Dimensions: 1536
- Provider: OpenAI API
- Auto-embedding: Triggered automatically on insert/update
License
This example is part of the PyTiDB project and follows the same license.