
Welcome to this step-by-step tutorial on creating a robust Retrieval-Augmented Generation (RAG) system using Llama3, Ollama, LlamaIndex, and TiDB Serverless, a MySQL-compatible database with built-in vector storage. This guide is designed to help you integrate these technologies to bring AI-driven search and response generation to your applications.


Before you start, make sure you have:

  • Python 3.8 or later
  • A TiDB Cloud account with the Vector Storage feature enabled so you can access TiDB Serverless; see this guide for setup.
  • Llama3 running locally via Ollama. You can find the setup guide here.

Try TiDB Serverless with Vector Search

Join the waitlist for the private beta of built-in vector search.


Step 1: Setting Up Your Environment

First, we need to install the necessary Python packages and configure our environment to connect to TiDB Serverless.

Install Packages

Open your terminal or command prompt and run the following commands to install the required packages:

pip install llama-index-vector-stores-tidbvector
pip install llama-index

Configure Database Connection

Set up your TiDB connection URL using environment variables for security:

import getpass

tidb_connection_url = getpass.getpass("Enter your TiDB connection URL (format - mysql+pymysql://root@host:port/database): ")
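Before handing the URL to the vector store, a quick format check can catch typos early. The helper below is an illustrative addition, not part of the original guide; it only verifies the scheme, host, and database name, not the credentials, and the URL shown is a placeholder:

```python
from urllib.parse import urlsplit

def looks_like_tidb_url(url: str) -> bool:
    """Loosely check a mysql+pymysql:// connection URL before using it."""
    parts = urlsplit(url)
    return (
        parts.scheme == "mysql+pymysql"
        and bool(parts.hostname)          # host present
        and bool(parts.path.lstrip("/"))  # database name present
    )

# Placeholder URL for illustration only:
print(looks_like_tidb_url("mysql+pymysql://root@gateway01.example.com:4000/test"))  # prints True
```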

Step 2: Using Ollama as LLM

Set up Llama3, a model from the Llama family, as your LLM, served through Ollama, for processing queries and generating responses. Make sure you have run ollama run llama3 after starting Ollama.

Configure LLM Settings

from llama_index.llms.ollama import Ollama
from llama_index.core import Settings

Settings.llm = Ollama(model="llama3", request_timeout=60.0)
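Note that LlamaIndex falls back to OpenAI embeddings by default, which requires an OpenAI API key. If you want the pipeline to stay fully local, you can also serve the embedding model through Ollama. This is a sketch, assuming you have installed the llama-index-embeddings-ollama package and pulled an embedding model such as nomic-embed-text:

```python
# Optional: keep embeddings local as well (assumes
# `pip install llama-index-embeddings-ollama` and `ollama pull nomic-embed-text`).
from llama_index.embeddings.ollama import OllamaEmbedding
from llama_index.core import Settings

Settings.embed_model = OllamaEmbedding(model_name="nomic-embed-text")
```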

Step 3: Creating a Vector Store in TiDB

Now, let’s create a table in TiDB optimized for vector searching. This will store our data and make it searchable via semantic vectors.

from llama_index.vector_stores.tidbvector import TiDBVectorStore

VECTOR_TABLE_NAME = "your_vector_table_name"
tidbvec = TiDBVectorStore(
    connection_string=tidb_connection_url,
    table_name=VECTOR_TABLE_NAME,
    distance_strategy="cosine",
    drop_existing_table=False,
)

Step 4: Implementing RAG

Combine the retrieval capabilities of TiDB with the generation power of Ollama to answer queries.

Load Data and Create Index

Assuming you have some documents stored, load them and create an index:

from llama_index.core import SimpleDirectoryReader, StorageContext, VectorStoreIndex

documents = SimpleDirectoryReader("./your_data_directory").load_data()
storage_context = StorageContext.from_defaults(vector_store=tidbvec)
index = VectorStoreIndex.from_documents(documents, storage_context=storage_context)

Query and Generate Response

Use the created index to handle a query and generate a response using the Ollama model:

query_engine = index.as_query_engine()
response = query_engine.query("What did the author discuss?")
print(response)
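Beyond the final answer, it can be useful to see which chunks were retrieved for it. The sketch below continues from the query above and walks the response object's source_nodes, a standard LlamaIndex field; the scores you see will depend on your data and embedding model:

```python
# Inspect the retrieved chunks behind the answer (continues from the query above).
for node_with_score in response.source_nodes:
    # Similarity score of this chunk, then the first 200 characters of its text.
    print(node_with_score.score, node_with_score.node.get_content()[:200])
```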


Congratulations! You’ve just set up a complete RAG system using Llama3, Ollama, LlamaIndex, and TiDB Serverless. This setup allows you to leverage advanced AI capabilities for semantic search and response generation, making your applications smarter and more responsive. Continue to experiment with different configurations and datasets to fully explore the potential of these tools.

Start your journey with TiDB Serverless today and join the waitlist for TiDB Vector Search.


Last updated May 28, 2024
