{"id":17907,"date":"2024-06-26T08:09:47","date_gmt":"2024-06-26T15:09:47","guid":{"rendered":"https:\/\/www.pingcap.com\/?post_type=article&#038;p=17907"},"modified":"2024-06-26T08:09:50","modified_gmt":"2024-06-26T15:09:50","slug":"ai-powered-search-with-tidb-vector","status":"publish","type":"article","link":"https:\/\/www.pingcap.com\/ko\/article\/ai-powered-search-with-tidb-vector\/","title":{"rendered":"AI-Powered Search with TiDB Vector"},"content":{"rendered":"<h2 class=\"wp-block-heading\"><span class=\"ez-toc-section\" id=\"Introduction\"><\/span>Introduction<span class=\"ez-toc-section-end\"><\/span><\/h2>\n\n\n\n<p>The evolution of AI has brought significant advancements in search technologies. Traditional keyword-based search is being increasingly replaced by AI-powered search, which leverages machine learning models to understand the semantic meaning of queries and data. <a href=\"\/ko\/ai\/\">TiDB Vector<\/a>, a feature of TiDB, offers a robust solution for implementing AI-powered search, enabling semantic search and similarity search for various data types such as text, images, and more.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\"><span class=\"ez-toc-section\" id=\"What_is_TiDB_Vector_Search\"><\/span>What is TiDB Vector Search?<span class=\"ez-toc-section-end\"><\/span><\/h2>\n\n\n\n<p>TiDB Vector Search is a powerful tool that allows you to perform searches based on the semantic meaning of data rather than just keywords. 
This is achieved by converting data into vector embeddings, which are then used to measure the similarity between different pieces of data.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\">Key Features:<\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>Semantic Search:<\/strong> Find data that has a similar meaning to your query.<\/li>\n\n\n\n<li><strong>Similarity Search:<\/strong> Identify similar items in large datasets, useful for recommendations and content discovery.<\/li>\n\n\n\n<li><strong>Versatile Data Support:<\/strong> Works with text, images, video, audio, and more.<\/li>\n<\/ul>\n\n\n\n<h2 class=\"wp-block-heading\"><span class=\"ez-toc-section\" id=\"How_TiDB_Vector_Search_Works\"><\/span>How TiDB Vector Search Works<span class=\"ez-toc-section-end\"><\/span><\/h2>\n\n\n\n<p>TiDB Vector Search involves the following steps:<\/p>\n\n\n\n<ol class=\"wp-block-list\" start=\"1\">\n<li><strong>Embedding Generation:<\/strong> Convert data into vectors using an embedding model. Embeddings represent data points in a multi-dimensional space.<\/li>\n\n\n\n<li><strong>Vector Storage:<\/strong> Store these vectors in TiDB, allowing for efficient querying and retrieval.<\/li>\n\n\n\n<li><strong>Similarity Search:<\/strong> Use vector distance metrics to find the nearest neighbors (most similar data points) to a query vector.<\/li>\n<\/ol>\n\n\n\n<h2 class=\"wp-block-heading\"><span class=\"ez-toc-section\" id=\"Setting_Up_TiDB_Vector_Search\"><\/span>Setting Up TiDB Vector Search<span class=\"ez-toc-section-end\"><\/span><\/h2>\n\n\n\n<p>To get started with TiDB Vector Search, follow these steps:<\/p>\n\n\n\n<h3 class=\"wp-block-heading\">Create a TiDB Serverless Cluster<\/h3>\n\n\n\n<ol class=\"wp-block-list\" start=\"1\">\n<li><strong>Sign Up:<\/strong> <a href=\"https:\/\/tidbcloud.com\/free-trial\/\">Join TiDB Cloud<\/a>.<\/li>\n\n\n\n<li><strong>Select Region:<\/strong> Choose the <code>eu-central-1<\/code> region (currently supports vector 
search).<\/li>\n\n\n\n<li><strong>Create Cluster:<\/strong> Follow the <a href=\"https:\/\/docs.pingcap.com\/tidbcloud\/tidb-cloud-quickstart\">quickstart guide<\/a> to set up your cluster.<\/li>\n<\/ol>\n\n\n\n<h3 class=\"wp-block-heading\">Enable Vector Search<\/h3>\n\n\n\n<p>If the vector search feature is not visible, contact the support team at xin.shi@pingcap.com to enable it for your account.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\">Connect to TiDB<\/h3>\n\n\n\n<ol class=\"wp-block-list\" start=\"1\">\n<li><strong>Navigate to Clusters:<\/strong> Go to the Clusters page and select your cluster.<\/li>\n\n\n\n<li><strong>Connect:<\/strong> Click on &#8220;Connect&#8221; and select &#8220;General&#8221; from the dropdown. Keep the endpoint type as Public.<\/li>\n\n\n\n<li><strong>Set Password:<\/strong> If not already set, create a password for your cluster.<\/li>\n<\/ol>\n\n\n\n<h2 class=\"wp-block-heading\"><span class=\"ez-toc-section\" id=\"Example_Semantic_Search_with_OpenAI_and_TiDB\"><\/span>Example: Semantic Search with OpenAI and TiDB<span class=\"ez-toc-section-end\"><\/span><\/h2>\n\n\n\n<p>Here&#8217;s a practical example of using OpenAI&#8217;s embeddings for semantic search with TiDB Vector.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\">Prerequisites:<\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Python >= 3.6<\/li>\n\n\n\n<li>TiDB Serverless Cluster with vector support<\/li>\n<\/ul>\n\n\n\n<h3 class=\"wp-block-heading\">Setting Up Environment<\/h3>\n\n\n\n<pre class=\"wp-block-code\"><code># Create virtual environment\npython3 -m venv .venv\nsource .venv\/bin\/activate\n\n# Install dependencies\npip install openai peewee pymysql tidb_vector<\/code><\/pre>\n\n\n\n<h3 class=\"wp-block-heading\">Example Code<\/h3>\n\n\n\n<pre class=\"wp-block-code\"><code>import os\nfrom openai import OpenAI\nfrom peewee import Model, MySQLDatabase, SQL, TextField\nfrom tidb_vector.peewee import VectorField\n\n# Initialize OpenAI client\nclient = 
OpenAI(api_key=os.environ.get('OPENAI_API_KEY'))\nembedding_model = \"text-embedding-ada-002\"\nembedding_dimensions = 1536\n\n# Connect to TiDB\ndb = MySQLDatabase(\n    'test',\n    user=os.environ.get('TIDB_USERNAME'),\n    password=os.environ.get('TIDB_PASSWORD'),\n    host=os.environ.get('TIDB_HOST'),\n    port=4000,\n    ssl_verify_cert=True,\n    ssl_verify_identity=True\n)\n\n# Sample documents\ndocuments = &#91;\n    \"TiDB is an open-source distributed SQL database that supports Hybrid Transactional and Analytical Processing (HTAP) workloads.\",\n    \"TiFlash is the key component that makes TiDB essentially an HTAP database.\",\n    \"TiKV is a distributed and transactional key-value database, providing transactional APIs with ACID compliance.\"\n]\n\n# Define model\nclass DocModel(Model):\n    text = TextField()\n    embedding = VectorField(dimensions=embedding_dimensions)\n\n    class Meta:\n        database = db\n        table_name = \"doc_test\"\n\n# Setup database\ndb.connect()\ndb.drop_tables(&#91;DocModel])\ndb.create_tables(&#91;DocModel])\n\n# Generate embeddings\nembeddings = &#91;r.embedding for r in client.embeddings.create(input=documents, model=embedding_model).data]\ndata_source = &#91;{\"text\": doc, \"embedding\": emb} for doc, emb in zip(documents, embeddings)]\nDocModel.insert_many(data_source).execute()\n\n# Query example\nquestion = \"What is TiKV?\"\nquestion_embedding = client.embeddings.create(input=question, model=embedding_model).data&#91;0].embedding\nrelated_docs = DocModel.select(DocModel.text, DocModel.embedding.cosine_distance(question_embedding).alias(\"distance\")).order_by(SQL(\"distance\")).limit(3)\n\nprint(\"Question:\", question)\nprint(\"Related documents:\")\nfor doc in related_docs:\n    print(doc.distance, doc.text)\n\ndb.close()<\/code><\/pre>\n\n\n\n<h2 class=\"wp-block-heading\"><span class=\"ez-toc-section\" id=\"Conclusion\"><\/span>Conclusion<span class=\"ez-toc-section-end\"><\/span><\/h2>\n\n\n\n<p>TiDB 
Vector Search provides a powerful platform for building AI-powered search applications. By leveraging vector embeddings and similarity search, you can implement advanced search capabilities that go beyond traditional keyword-based methods. Whether you&#8217;re dealing with text, images, or other types of data, TiDB Vector Search can help you unlock new possibilities for your applications.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\">Call to Action<\/h3>\n\n\n\n<p>Ready to explore TiDB Serverless and build your own AI-powered search applications? <a href=\"https:\/\/tidbcloud.com\/free-trial\/\">Get started with TiDB<\/a> and discover the power of semantic search today!<\/p>","protected":false},"excerpt":{"rendered":"<p>Introduction The evolution of AI has brought significant advancements in search technologies. Traditional keyword-based search is being increasingly replaced by AI-powered search, which leverages machine learning models to understand the semantic meaning of queries and data. 
TiDB Vector, a feature of TiDB, offers a robust solution for implementing AI-powered search, enabling semantic search and similarity [&hellip;]<\/p>\n","protected":false},"author":8,"featured_media":0,"template":"","class_list":["post-17907","article","type-article","status-publish","hentry"],"acf":[],"yoast_head":"<!-- This site is optimized with the Yoast SEO plugin v26.9 - https:\/\/yoast.com\/product\/yoast-seo-wordpress\/ -->\n<title>AI-Powered Search with TiDB Vector<\/title>\n<meta name=\"description\" content=\"TiDB Vector Search is a powerful tool that allows you to perform AI searches based on the semantic meaning of data rather than just keywords.\" \/>\n<meta name=\"robots\" content=\"index, follow, max-snippet:-1, max-image-preview:large, max-video-preview:-1\" \/>\n<link rel=\"canonical\" href=\"https:\/\/www.pingcap.com\/ko\/article\/ai-powered-search-with-tidb-vector\/\" \/>\n<meta property=\"og:locale\" content=\"ko_KR\" \/>\n<meta property=\"og:type\" content=\"article\" \/>\n<meta property=\"og:title\" content=\"AI-Powered Search with TiDB Vector\" \/>\n<meta property=\"og:description\" content=\"TiDB Vector Search is a powerful tool that allows you to perform AI searches based on the semantic meaning of data rather than just keywords.\" \/>\n<meta property=\"og:url\" content=\"https:\/\/www.pingcap.com\/ko\/article\/ai-powered-search-with-tidb-vector\/\" \/>\n<meta property=\"og:site_name\" content=\"TiDB\" \/>\n<meta property=\"article:publisher\" content=\"https:\/\/facebook.com\/pingcap2015\" \/>\n<meta property=\"article:modified_time\" content=\"2024-06-26T15:09:50+00:00\" \/>\n<meta property=\"og:image\" content=\"https:\/\/static.pingcap.com\/files\/2024\/09\/11005522\/Homepage-Ad.png\" \/>\n\t<meta property=\"og:image:width\" content=\"1440\" \/>\n\t<meta property=\"og:image:height\" content=\"714\" \/>\n\t<meta property=\"og:image:type\" content=\"image\/png\" \/>\n<meta name=\"twitter:card\" content=\"summary_large_image\" \/>\n<meta 
name=\"twitter:site\" content=\"@PingCAP\" \/>\n<meta name=\"twitter:label1\" content=\"Est. reading time\" \/>\n\t<meta name=\"twitter:data1\" content=\"4\ubd84\" \/>\n<script type=\"application\/ld+json\" class=\"yoast-schema-graph\">{\"@context\":\"https:\/\/schema.org\",\"@graph\":[{\"@type\":\"WebPage\",\"@id\":\"https:\/\/www.pingcap.com\/article\/ai-powered-search-with-tidb-vector\/\",\"url\":\"https:\/\/www.pingcap.com\/article\/ai-powered-search-with-tidb-vector\/\",\"name\":\"AI-Powered Search with TiDB Vector\",\"isPartOf\":{\"@id\":\"https:\/\/www.pingcap.com\/#website\"},\"datePublished\":\"2024-06-26T15:09:47+00:00\",\"dateModified\":\"2024-06-26T15:09:50+00:00\",\"description\":\"TiDB Vector Search is a powerful tool that allows you to perform AI searches based on the semantic meaning of data rather than just keywords.\",\"breadcrumb\":{\"@id\":\"https:\/\/www.pingcap.com\/article\/ai-powered-search-with-tidb-vector\/#breadcrumb\"},\"inLanguage\":\"ko-KR\",\"potentialAction\":[{\"@type\":\"ReadAction\",\"target\":[\"https:\/\/www.pingcap.com\/article\/ai-powered-search-with-tidb-vector\/\"]}]},{\"@type\":\"BreadcrumbList\",\"@id\":\"https:\/\/www.pingcap.com\/article\/ai-powered-search-with-tidb-vector\/#breadcrumb\",\"itemListElement\":[{\"@type\":\"ListItem\",\"position\":1,\"name\":\"Home\",\"item\":\"https:\/\/www.pingcap.com\/\"},{\"@type\":\"ListItem\",\"position\":2,\"name\":\"Articles\",\"item\":\"https:\/\/www.pingcap.com\/article\/\"},{\"@type\":\"ListItem\",\"position\":3,\"name\":\"AI-Powered Search with TiDB Vector\"}]},{\"@type\":\"WebSite\",\"@id\":\"https:\/\/www.pingcap.com\/#website\",\"url\":\"https:\/\/www.pingcap.com\/\",\"name\":\"TiDB\",\"description\":\"TiDB | SQL at 
Scale\",\"publisher\":{\"@id\":\"https:\/\/www.pingcap.com\/#organization\"},\"potentialAction\":[{\"@type\":\"SearchAction\",\"target\":{\"@type\":\"EntryPoint\",\"urlTemplate\":\"https:\/\/www.pingcap.com\/?s={search_term_string}\"},\"query-input\":{\"@type\":\"PropertyValueSpecification\",\"valueRequired\":true,\"valueName\":\"search_term_string\"}}],\"inLanguage\":\"ko-KR\"},{\"@type\":\"Organization\",\"@id\":\"https:\/\/www.pingcap.com\/#organization\",\"name\":\"PingCAP\",\"url\":\"https:\/\/www.pingcap.com\/\",\"logo\":{\"@type\":\"ImageObject\",\"inLanguage\":\"ko-KR\",\"@id\":\"https:\/\/www.pingcap.com\/#\/schema\/logo\/image\/\",\"url\":\"https:\/\/static.pingcap.com\/files\/2021\/11\/pingcap-logo.png\",\"contentUrl\":\"https:\/\/static.pingcap.com\/files\/2021\/11\/pingcap-logo.png\",\"width\":811,\"height\":232,\"caption\":\"PingCAP\"},\"image\":{\"@id\":\"https:\/\/www.pingcap.com\/#\/schema\/logo\/image\/\"},\"sameAs\":[\"https:\/\/facebook.com\/pingcap2015\",\"https:\/\/x.com\/PingCAP\",\"https:\/\/linkedin.com\/company\/pingcap\",\"https:\/\/youtube.com\/channel\/UCuq4puT32DzHKT5rU1IZpIA\"]}]}<\/script>\n<!-- \/ Yoast SEO plugin. 
-->","yoast_head_json":{"title":"AI-Powered Search with TiDB Vector","description":"TiDB Vector Search is a powerful tool that allows you to perform AI searches based on the semantic meaning of data rather than just keywords.","robots":{"index":"index","follow":"follow","max-snippet":"max-snippet:-1","max-image-preview":"max-image-preview:large","max-video-preview":"max-video-preview:-1"},"canonical":"https:\/\/www.pingcap.com\/ko\/article\/ai-powered-search-with-tidb-vector\/","og_locale":"ko_KR","og_type":"article","og_title":"AI-Powered Search with TiDB Vector","og_description":"TiDB Vector Search is a powerful tool that allows you to perform AI searches based on the semantic meaning of data rather than just keywords.","og_url":"https:\/\/www.pingcap.com\/ko\/article\/ai-powered-search-with-tidb-vector\/","og_site_name":"TiDB","article_publisher":"https:\/\/facebook.com\/pingcap2015","article_modified_time":"2024-06-26T15:09:50+00:00","og_image":[{"width":1440,"height":714,"url":"https:\/\/static.pingcap.com\/files\/2024\/09\/11005522\/Homepage-Ad.png","type":"image\/png"}],"twitter_card":"summary_large_image","twitter_site":"@PingCAP","twitter_misc":{"Est. 
reading time":"4\ubd84"},"schema":{"@context":"https:\/\/schema.org","@graph":[{"@type":"WebPage","@id":"https:\/\/www.pingcap.com\/article\/ai-powered-search-with-tidb-vector\/","url":"https:\/\/www.pingcap.com\/article\/ai-powered-search-with-tidb-vector\/","name":"AI-Powered Search with TiDB Vector","isPartOf":{"@id":"https:\/\/www.pingcap.com\/#website"},"datePublished":"2024-06-26T15:09:47+00:00","dateModified":"2024-06-26T15:09:50+00:00","description":"TiDB Vector Search is a powerful tool that allows you to perform AI searches based on the semantic meaning of data rather than just keywords.","breadcrumb":{"@id":"https:\/\/www.pingcap.com\/article\/ai-powered-search-with-tidb-vector\/#breadcrumb"},"inLanguage":"ko-KR","potentialAction":[{"@type":"ReadAction","target":["https:\/\/www.pingcap.com\/article\/ai-powered-search-with-tidb-vector\/"]}]},{"@type":"BreadcrumbList","@id":"https:\/\/www.pingcap.com\/article\/ai-powered-search-with-tidb-vector\/#breadcrumb","itemListElement":[{"@type":"ListItem","position":1,"name":"Home","item":"https:\/\/www.pingcap.com\/"},{"@type":"ListItem","position":2,"name":"Articles","item":"https:\/\/www.pingcap.com\/article\/"},{"@type":"ListItem","position":3,"name":"AI-Powered Search with TiDB Vector"}]},{"@type":"WebSite","@id":"https:\/\/www.pingcap.com\/#website","url":"https:\/\/www.pingcap.com\/","name":"\ud2f0DB","description":"TiDB | SQL at 
Scale","publisher":{"@id":"https:\/\/www.pingcap.com\/#organization"},"potentialAction":[{"@type":"SearchAction","target":{"@type":"EntryPoint","urlTemplate":"https:\/\/www.pingcap.com\/?s={search_term_string}"},"query-input":{"@type":"PropertyValueSpecification","valueRequired":true,"valueName":"search_term_string"}}],"inLanguage":"ko-KR"},{"@type":"Organization","@id":"https:\/\/www.pingcap.com\/#organization","name":"PingCAP","url":"https:\/\/www.pingcap.com\/","logo":{"@type":"ImageObject","inLanguage":"ko-KR","@id":"https:\/\/www.pingcap.com\/#\/schema\/logo\/image\/","url":"https:\/\/static.pingcap.com\/files\/2021\/11\/pingcap-logo.png","contentUrl":"https:\/\/static.pingcap.com\/files\/2021\/11\/pingcap-logo.png","width":811,"height":232,"caption":"PingCAP"},"image":{"@id":"https:\/\/www.pingcap.com\/#\/schema\/logo\/image\/"},"sameAs":["https:\/\/facebook.com\/pingcap2015","https:\/\/x.com\/PingCAP","https:\/\/linkedin.com\/company\/pingcap","https:\/\/youtube.com\/channel\/UCuq4puT32DzHKT5rU1IZpIA"]}]}},"card_markup":"        <a class=\"card-article\" href=\"https:\/\/www.pingcap.com\/ko\/article\/ai-powered-search-with-tidb-vector\/\">            <h3>AI-Powered Search with TiDB Vector<\/h3>            <p>Introduction The evolution of AI has brought significant advancements in search technologies. Traditional keyword-based search is being increasingly replaced by AI-powered search, which leverages machine learning models to understand the semantic meaning of queries and data. 
TiDB Vector, a feature of TiDB, offers a robust solution for implementing AI-powered search, enabling semantic search and similarity [&hellip;]<\/p>        <\/a>","_links":{"self":[{"href":"https:\/\/www.pingcap.com\/ko\/wp-json\/wp\/v2\/article\/17907","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.pingcap.com\/ko\/wp-json\/wp\/v2\/article"}],"about":[{"href":"https:\/\/www.pingcap.com\/ko\/wp-json\/wp\/v2\/types\/article"}],"author":[{"embeddable":true,"href":"https:\/\/www.pingcap.com\/ko\/wp-json\/wp\/v2\/users\/8"}],"wp:attachment":[{"href":"https:\/\/www.pingcap.com\/ko\/wp-json\/wp\/v2\/media?parent=17907"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}