In the rapidly evolving landscape of machine learning and database technologies, combining the strengths of different tools can lead to innovative solutions. One such powerful combination is using Jina AI’s embedding capabilities with TiDB’s vector search functionality. This blog will guide you through building a semantic cache service using Jina AI Embeddings and TiDB Vector.

What is a Semantic Cache?

A semantic cache stores the results of expensive queries and reuses them when the same or similar queries are made. This type of cache uses semantic understanding rather than exact key matching, making it particularly useful in applications requiring natural language processing or similar complex data retrieval tasks.
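
For example, an exact-match cache treats "what is tidb" and "what's tidb" as different keys, while a semantic cache matches them by meaning. Below is a minimal, purely illustrative sketch of that idea; embed and cosine_distance are hypothetical stand-ins for the roles that Jina AI and TiDB Vector play later in this post.

def lookup(cache: dict, query: str, max_distance: float = 0.1):
    """Return a cached result whose key is semantically close to `query`, or None."""
    # `cache` maps each original query string to (embedding, cached result).
    # `embed` and `cosine_distance` are hypothetical stand-ins for an embedding
    # model and a vector similarity function.
    query_vec = embed(query)
    best_value, best_dist = None, float("inf")
    for key_vec, value in cache.values():
        dist = cosine_distance(query_vec, key_vec)
        if dist < best_dist:
            best_value, best_dist = value, dist
    # Reuse the cached result only if it is close enough in meaning
    if best_value is not None and best_dist <= max_distance:
        return best_value
    return None  # cache miss: run the expensive query instead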

Why Jina AI and TiDB?

  • Jina AI: Provides robust embedding capabilities, converting text into high-dimensional vectors that capture semantic meaning.
  • TiDB Vector: Extends the TiDB database to support efficient vector operations, enabling fast similarity searches on high-dimensional data.

Setting Up the Environment

Prerequisites

Before you begin, make sure you have:

  • Python 3.8 or higher
  • A TiDB Serverless cluster set up and running
  • An API key from Jina AI

Step-by-Step Implementation

1. Configuration

First, set up your environment configuration. Create a .env file to store your database URI and TTL (Time to Live) settings.

DATABASE_URI=mysql+pymysql://<username>:<password>@<host>:<port>/<database>?ssl_mode=VERIFY_IDENTITY&ssl_ca=/etc/ssl/cert.pem
TIME_TO_LIVE=604800  # Default is 1 week
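
The snippets in the following steps reference DATABASE_URI and TIME_TO_LIVE as module-level variables. One way to load them at startup (a minimal sketch using python-dotenv, which is installed in the next step):

import os
from dotenv import load_dotenv

# Read DATABASE_URI and TIME_TO_LIVE from the .env file
load_dotenv()

DATABASE_URI = os.getenv("DATABASE_URI")
TIME_TO_LIVE = int(os.getenv("TIME_TO_LIVE", "604800"))  # default: 1 week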

2. Install Required Libraries

Install the necessary Python packages:

pip install fastapi requests sqlmodel sqlalchemy python-dotenv tidb-vector

3. Define the Cache Model

Use SQLModel to define your cache model, incorporating vector fields and automatic timestamping.

from sqlmodel import SQLModel, Field, Column, DateTime, String, Text
from sqlalchemy import func
from tidb_vector.sqlalchemy import VectorType
from typing import List, Optional
from datetime import datetime

class Cache(SQLModel, table=True):
    __table_args__ = {
        # Set the TTL (Time to Live) so TiDB removes expired cache entries automatically
        'mysql_TTL': f'created_at + INTERVAL {TIME_TO_LIVE} SECOND',
    }

    id: Optional[int] = Field(default=None, primary_key=True)
    key: str = Field(sa_column=Column(String(255), unique=True, nullable=False))
    key_vec: Optional[List[float]] = Field(
        sa_column=Column(
            VectorType(768),  # Vector column with 768 dimensions, matching the Jina embedding size
            default=None,
            comment="hnsw(distance=cosine)",  # Build an HNSW (Hierarchical Navigable Small World) index using cosine distance, matching the cosine_distance query below
            nullable=False,
        )
    )
    value: Optional[str] = Field(sa_column=Column(Text))
    created_at: datetime = Field(
        sa_column=Column(DateTime, server_default=func.now(), nullable=False)
    )
    updated_at: datetime = Field(
        sa_column=Column(DateTime, server_default=func.now(), onupdate=func.now(), nullable=False)
    )

4. Create the Database Engine

Create the engine and the database schema.

from sqlmodel import create_engine

# Create the engine using the database URI
engine = create_engine(DATABASE_URI)
# Create all tables in the database
SQLModel.metadata.create_all(engine)

5. FastAPI Setup

Set up the FastAPI application and endpoints for setting and getting cache entries.

from fastapi import FastAPI, Depends, HTTPException
from fastapi.security import HTTPBearer, HTTPAuthorizationCredentials
from sqlmodel import Session, select

# Initialize FastAPI app
app = FastAPI()
security = HTTPBearer()

@app.post("/set")
def set_cache(
    cache: Cache,
    credentials: HTTPAuthorizationCredentials = Depends(security),
):
    # Generate the embedding for the given key using Jina AI
    cache.key_vec = generate_embeddings(credentials.credentials, cache.key)
    with Session(engine) as session:
        session.add(cache)
        session.commit()
    return {'message': 'Cache has been set'}

@app.get("/get/{key}")
def get_cache(
    key: str,
    credentials: HTTPAuthorizationCredentials = Depends(security),
    max_distance: Optional[float] = 0.1,
):
    # Generate the embedding for the given key using Jina AI
    key_vec = generate_embeddings(credentials.credentials, key)
    # Cap the allowed distance at 0.3 to avoid overly loose matches
    max_distance = min(max_distance, 0.3)

    with Session(engine) as session:
        # Fetch the nearest cached key by cosine distance
        result = session.exec(
            select(
                Cache,
                Cache.key_vec.cosine_distance(key_vec).label('distance')
            ).order_by(
                'distance'
            ).limit(1)
        ).first()

        if result is None:
            raise HTTPException(status_code=404, detail="Cache not found")

        cache, distance = result
        if distance > max_distance:
            raise HTTPException(status_code=404, detail="Cache not found")

        return {
            "key": cache.key,
            "value": cache.value,
            "distance": distance
        }

6. Generate Embeddings

Implement a function to get embeddings from Jina AI.

import requests
import os
from dotenv import load_dotenv

load_dotenv()

def generate_embeddings(jinaai_api_key: str, text: str):
    JINAAI_API_URL = 'https://api.jina.ai/v1/embeddings'
    JINAAI_HEADERS = {
        'Content-Type': 'application/json',
        'Authorization': f'Bearer {jinaai_api_key}'
    }
    JINAAI_REQUEST_DATA = {
        'input': [text],
        'model': 'jina-embeddings-v2-base-en'  # Use the Jina Embeddings model with 768 dimensions
    }
    response = requests.post(JINAAI_API_URL, headers=JINAAI_HEADERS, json=JINAAI_REQUEST_DATA)
    # Fail fast on authentication or quota errors
    response.raise_for_status()
    # Extract and return the embedding from the response
    return response.json()['data'][0]['embedding']
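
To sanity-check the function, you can call it directly with your Jina AI API key (a placeholder is shown below) and confirm that the returned vector has 768 dimensions, matching the VectorType(768) column defined earlier.

# Placeholder key shown; substitute your real Jina AI API key
embedding = generate_embeddings("<your jina token>", "what is tidb")
print(len(embedding))  # expected: 768 for jina-embeddings-v2-base-en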

How to Use This App

Prerequisites

  • A running TiDB Serverless cluster with vector search enabled
  • Python 3.8 or later
  • A Jina AI API key

Run the example

1. Clone this repo

git clone https://github.com/pingcap/tidb-vector-python.git

2. Create a virtual environment

cd tidb-vector-python/examples/semantic-cache
python3 -m venv .venv
source .venv/bin/activate

3. Install dependencies

pip install -r requirements.txt

4. Set the environment variables

Get the HOST, PORT, USERNAME, PASSWORD, and DATABASE from the TiDB Cloud console, as described in the [Prerequisites](../README.md#prerequisites) section. Then set the following environment variables:

export DATABASE_URI="mysql+pymysql://<USERNAME>:<PASSWORD>@<HOST>:<PORT>/<DATABASE>?ssl_ca=/etc/ssl/cert.pem&ssl_verify_cert=true&ssl_verify_identity=true"

Alternatively, create a .env file containing the same variables.

5. Run this example

Start the semantic cache server:

uvicorn cache:app --reload

6. Test the API

Get the Jina AI API key from the Jina AI Embedding API page, and save it somewhere safe for later use.

  • POST /set
curl --location ':8000/set' \
--header 'Content-Type: application/json' \
--header 'Authorization: Bearer <your jina token>' \
--data '{
    "key": "what is tidb",
    "value": "tidb is a mysql-compatible and htap database"
}'
  • GET /get/<key>
curl --location ':8000/get/what%27s%20tidb%20and%20tikv?max_distance=0.5' \
--header 'Content-Type: application/json' \
--header 'Authorization: Bearer <your jina token>'

Conclusion

By combining Jina AI’s powerful embedding capabilities with TiDB’s efficient vector operations, you can build a robust semantic cache service. This service is ideal for applications requiring fast, intelligent caching and retrieval of semantically similar data. Start experimenting with this setup to explore its full potential in your projects.

More Demos

The tidb-vector-python repository includes more examples that show how to interact with TiDB Vector in different scenarios.

Happy coding!

