{"id":17460,"date":"2024-06-03T08:48:21","date_gmt":"2024-06-03T15:48:21","guid":{"rendered":"https:\/\/www.pingcap.com\/?post_type=article&#038;p=17460"},"modified":"2024-06-03T08:48:25","modified_gmt":"2024-06-03T15:48:25","slug":"introduce-vector-search-indexes-in-tidb","status":"publish","type":"article","link":"https:\/\/www.pingcap.com\/ko\/article\/introduce-vector-search-indexes-in-tidb\/","title":{"rendered":"Introduce Vector Search Indexes in TiDB &#8211; A MySQL-compatible database with built-in Vector Storage"},"content":{"rendered":"<p><a href=\"\/ko\/tidb\/\">TiDB<\/a>, a MySQL-compatible database, has introduced a powerful feature for handling high-dimensional data: Vector Search Indexes. This post explores how TiDB implements these indexes using the Hierarchical Navigable Small World (HNSW) method, and how they can be used for efficient nearest neighbor searches.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\"><span class=\"ez-toc-section\" id=\"What_are_Vector_Search_Indexes\"><\/span>What are Vector Search Indexes?<span class=\"ez-toc-section-end\"><\/span><\/h2>\n\n\n\n<p>Vector Search Indexes are designed to facilitate efficient approximate nearest neighbor (ANN) searches in a vector space. This is particularly useful for applications involving high-dimensional data like image recognition, recommendation systems, and natural language processing. 
TiDB&#8217;s implementation allows such queries to be completed in milliseconds, vastly improving performance over traditional brute-force methods.<\/p>\n\n\n\n<p class=\"has-medium-font-size\"><mark style=\"background-color:rgba(0, 0, 0, 0)\" class=\"has-inline-color has-vivid-cyan-blue-color\"><strong><em>Join the waitlist for the private beta of built-in vector search<\/em> <em>in TiDB Serverless.<\/em><\/strong><\/mark><\/p>\n\n\n\n<p><a href=\"https:\/\/tidb.cloud\/ai\/\" class=\"button\" target=\"_blank\" data-gtag=\"event:go_to_lead_form_page,product_type:serverless,button_name:Join Now,position:blog_middle\" rel=\"noopener\">Join Now<\/a><br><\/p>\n\n\n\n<h2 class=\"wp-block-heading\"><span class=\"ez-toc-section\" id=\"Creating_a_HNSW_Vector_Index_in_TiDB\"><\/span>Creating an HNSW Vector Index in TiDB<span class=\"ez-toc-section-end\"><\/span><\/h2>\n\n\n\n<p>TiDB supports the creation of HNSW Vector Indexes using the following SQL syntax:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>CREATE TABLE vector_table_with_index (\n    id INT PRIMARY KEY,\n    doc TEXT,\n    embedding VECTOR(3) COMMENT \"hnsw(distance=cosine)\"\n);<\/code><\/pre>\n\n\n\n<p>Note: The syntax for creating the HNSW Index may change in future releases. It is crucial to specify the distance metric (e.g., cosine or L2) when creating the vector index.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\">Limitations and Compatibility<\/h3>\n\n\n\n<p>Currently, TiDB supports creating vector indexes only with L2 or cosine distance, and only at table creation time. The ability to add or drop vector indexes using DDL commands after table creation is not available yet but is planned for future updates.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\">Utilizing Vector Indexes<\/h3>\n\n\n\n<p>Vector Indexes can be used in SQL queries to perform k-nearest neighbor searches. 
Here\u2019s an example:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>SELECT *\nFROM vector_table_with_index\nORDER BY Vec_Cosine_Distance(embedding, '&#91;1, 2, 3]')\nLIMIT 10;<\/code><\/pre>\n\n\n\n<p>It&#8217;s important to use the same distance metric defined when creating the index to leverage its benefits fully.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\">Integration with ORMs<\/h3>\n\n\n\n<p>TiDB provides support for various Python ORMs, enabling easier integration into applications:<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>TiDB Vector Client for Python<\/strong>: <a href=\"https:\/\/github.com\/pingcap\/tidb-vector-python\">GitHub Link<\/a><\/li>\n\n\n\n<li><strong>SQLAlchemy<\/strong>: <a href=\"https:\/\/github.com\/pingcap\/tidb-vector-python?tab=readme-ov-file#sqlalchemy\">GitHub Link<\/a><\/li>\n\n\n\n<li><strong>Peewee<\/strong>: <a href=\"https:\/\/github.com\/pingcap\/tidb-vector-python?tab=readme-ov-file#peewee\">GitHub Link<\/a><\/li>\n\n\n\n<li><strong>Django<\/strong>: <a href=\"https:\/\/github.com\/pingcap\/django-tidb?tab=readme\">GitHub Link<\/a><\/li>\n<\/ul>\n\n\n\n<h2 class=\"wp-block-heading\"><span class=\"ez-toc-section\" id=\"Performance_Analysis\"><\/span>Performance Analysis<span class=\"ez-toc-section-end\"><\/span><\/h2>\n\n\n\n<p>To analyze the performance and ensure the Vector Index is being used, you can use the EXPLAIN or EXPLAIN ANALYZE statements in TiDB:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>EXPLAIN SELECT * FROM vector_table_with_index\nORDER BY Vec_Cosine_Distance(embedding, '&#91;1, 2, 3]')\nLIMIT 10;<\/code><\/pre>\n\n\n\n<h2 class=\"wp-block-heading\"><span class=\"ez-toc-section\" id=\"Best_Practices\"><\/span>Best Practices<span class=\"ez-toc-section-end\"><\/span><\/h2>\n\n\n\n<p>To ensure optimal performance, especially when indexes are &#8220;cold&#8221; (not recently accessed), it&#8217;s recommended to &#8220;warm up&#8221; the index by running similar queries beforehand. 
Additionally, managing the data set size by using fewer dimensions or compression techniques can help maintain high performance.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\"><span class=\"ez-toc-section\" id=\"Conclusion\"><\/span>Conclusion<span class=\"ez-toc-section-end\"><\/span><\/h2>\n\n\n\n<p>Vector Search Indexes in TiDB offer a robust solution for efficiently handling complex queries involving high-dimensional data. By leveraging these indexes, developers can significantly enhance the performance of their applications, making real-time data interaction more feasible.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\">Real Demos of TiDB Vector Search<\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li><a href=\"https:\/\/github.com\/pingcap\/tidb-vector-python\/blob\/main\/examples\/openai_embedding\/README.md\">OpenAI Embedding<\/a>: use the OpenAI embedding model to generate vectors for text data.<\/li>\n\n\n\n<li><a href=\"https:\/\/github.com\/pingcap\/tidb-vector-python\/blob\/main\/examples\/image_search\/README.md\">Image Search<\/a>: use the OpenAI CLIP model to generate vectors for images and text.<\/li>\n\n\n\n<li><a href=\"https:\/\/github.com\/pingcap\/tidb-vector-python\/blob\/main\/examples\/llamaindex-tidb-vector-with-ui\/README.md\">LlamaIndex RAG with UI<\/a>: use LlamaIndex to build an <a href=\"https:\/\/docs.llamaindex.ai\/en\/latest\/getting_started\/concepts\/\">RAG (Retrieval-Augmented Generation)<\/a> application.<\/li>\n\n\n\n<li><a href=\"https:\/\/github.com\/pingcap\/tidb-vector-python\/blob\/main\/examples\/llamaindex-tidb-vector\/README.md\">Chat with URL<\/a>: use LlamaIndex to build an <a href=\"https:\/\/docs.llamaindex.ai\/en\/latest\/getting_started\/concepts\/\">RAG (Retrieval-Augmented Generation)<\/a> application that can chat with a URL.<\/li>\n\n\n\n<li><a href=\"https:\/\/github.com\/pingcap\/tidb-vector-python\/blob\/main\/examples\/graphrag-demo\/README.md\">GraphRAG<\/a>: 20 lines of code using TiDB Serverless to build a 
Knowledge Graph based RAG application.<\/li>\n\n\n\n<li><a href=\"https:\/\/github.com\/pingcap\/tidb-vector-python\/blob\/main\/examples\/graphrag-step-by-step-tutorial\/README.md\">GraphRAG Step by Step Tutorial<\/a>: Step by step tutorial to build a Knowledge Graph based RAG application with Colab notebook. In this tutorial, you will learn how to extract knowledge from a text corpus, build a Knowledge Graph, store the Knowledge Graph in TiDB Serverless, and search from the Knowledge Graph.<\/li>\n<\/ul>","protected":false},"excerpt":{"rendered":"<p>TiDB, a MySQL-compatible database, has introduced a powerful feature for handling high-dimensional data: Vector Search Indexes. This post will explore how TiDB implements these indexes using the Hierarchical Navigable Small World (HNSW) method, and how they can be utilized for efficient nearest neighbor searches. What are Vector Search Indexes? Vector Search Indexes are designed to [&hellip;]<\/p>\n","protected":false},"author":8,"featured_media":0,"template":"","class_list":["post-17460","article","type-article","status-publish","hentry"],"acf":[],"yoast_head":"<!-- This site is optimized with the Yoast SEO plugin v26.9 - https:\/\/yoast.com\/product\/yoast-seo-wordpress\/ -->\n<title>Introduce Vector Search Indexes in TiDB<\/title>\n<meta name=\"description\" content=\"Explore how TiDB implements vector search indexes using HNSW, and how they can be utilized for efficient nearest neighbor searches.\" \/>\n<meta name=\"robots\" content=\"index, follow, max-snippet:-1, max-image-preview:large, max-video-preview:-1\" \/>\n<link rel=\"canonical\" href=\"https:\/\/www.pingcap.com\/ko\/article\/introduce-vector-search-indexes-in-tidb\/\" \/>\n<meta property=\"og:locale\" content=\"ko_KR\" \/>\n<meta property=\"og:type\" content=\"article\" \/>\n<meta property=\"og:title\" content=\"Introduce Vector Search Indexes in TiDB\" \/>\n<meta property=\"og:description\" content=\"Explore how TiDB implements vector search indexes 
using HNSW, and how they can be utilized for efficient nearest neighbor searches.\" \/>\n<meta property=\"og:url\" content=\"https:\/\/www.pingcap.com\/ko\/article\/introduce-vector-search-indexes-in-tidb\/\" \/>\n<meta property=\"og:site_name\" content=\"TiDB\" \/>\n<meta property=\"article:publisher\" content=\"https:\/\/facebook.com\/pingcap2015\" \/>\n<meta property=\"article:modified_time\" content=\"2024-06-03T15:48:25+00:00\" \/>\n<meta property=\"og:image\" content=\"https:\/\/static.pingcap.com\/files\/2024\/09\/11005522\/Homepage-Ad.png\" \/>\n\t<meta property=\"og:image:width\" content=\"1440\" \/>\n\t<meta property=\"og:image:height\" content=\"714\" \/>\n\t<meta property=\"og:image:type\" content=\"image\/png\" \/>\n<meta name=\"twitter:card\" content=\"summary_large_image\" \/>\n<meta name=\"twitter:site\" content=\"@PingCAP\" \/>\n<meta name=\"twitter:label1\" content=\"Est. reading time\" \/>\n\t<meta name=\"twitter:data1\" content=\"3\ubd84\" \/>\n<script type=\"application\/ld+json\" class=\"yoast-schema-graph\">{\"@context\":\"https:\/\/schema.org\",\"@graph\":[{\"@type\":\"WebPage\",\"@id\":\"https:\/\/www.pingcap.com\/article\/introduce-vector-search-indexes-in-tidb\/\",\"url\":\"https:\/\/www.pingcap.com\/article\/introduce-vector-search-indexes-in-tidb\/\",\"name\":\"Introduce Vector Search Indexes in TiDB\",\"isPartOf\":{\"@id\":\"https:\/\/www.pingcap.com\/#website\"},\"datePublished\":\"2024-06-03T15:48:21+00:00\",\"dateModified\":\"2024-06-03T15:48:25+00:00\",\"description\":\"Explore how TiDB implements vector search indexes using HNSW, and how they can be utilized for efficient nearest neighbor 
searches.\",\"breadcrumb\":{\"@id\":\"https:\/\/www.pingcap.com\/article\/introduce-vector-search-indexes-in-tidb\/#breadcrumb\"},\"inLanguage\":\"ko-KR\",\"potentialAction\":[{\"@type\":\"ReadAction\",\"target\":[\"https:\/\/www.pingcap.com\/article\/introduce-vector-search-indexes-in-tidb\/\"]}]},{\"@type\":\"BreadcrumbList\",\"@id\":\"https:\/\/www.pingcap.com\/article\/introduce-vector-search-indexes-in-tidb\/#breadcrumb\",\"itemListElement\":[{\"@type\":\"ListItem\",\"position\":1,\"name\":\"Home\",\"item\":\"https:\/\/www.pingcap.com\/\"},{\"@type\":\"ListItem\",\"position\":2,\"name\":\"Articles\",\"item\":\"https:\/\/www.pingcap.com\/article\/\"},{\"@type\":\"ListItem\",\"position\":3,\"name\":\"Introduce Vector Search Indexes in TiDB &#8211; A MySQL-compatible database with built-in Vector Storage\"}]},{\"@type\":\"WebSite\",\"@id\":\"https:\/\/www.pingcap.com\/#website\",\"url\":\"https:\/\/www.pingcap.com\/\",\"name\":\"TiDB\",\"description\":\"TiDB | SQL at Scale\",\"publisher\":{\"@id\":\"https:\/\/www.pingcap.com\/#organization\"},\"potentialAction\":[{\"@type\":\"SearchAction\",\"target\":{\"@type\":\"EntryPoint\",\"urlTemplate\":\"https:\/\/www.pingcap.com\/?s={search_term_string}\"},\"query-input\":{\"@type\":\"PropertyValueSpecification\",\"valueRequired\":true,\"valueName\":\"search_term_string\"}}],\"inLanguage\":\"ko-KR\"},{\"@type\":\"Organization\",\"@id\":\"https:\/\/www.pingcap.com\/#organization\",\"name\":\"PingCAP\",\"url\":\"https:\/\/www.pingcap.com\/\",\"logo\":{\"@type\":\"ImageObject\",\"inLanguage\":\"ko-KR\",\"@id\":\"https:\/\/www.pingcap.com\/#\/schema\/logo\/image\/\",\"url\":\"https:\/\/static.pingcap.com\/files\/2021\/11\/pingcap-logo.png\",\"contentUrl\":\"https:\/\/static.pingcap.com\/files\/2021\/11\/pingcap-logo.png\",\"width\":811,\"height\":232,\"caption\":\"PingCAP\"},\"image\":{\"@id\":\"https:\/\/www.pingcap.com\/#\/schema\/logo\/image\/\"},\"sameAs\":[\"https:\/\/facebook.com\/pingcap2015\",\"https:\/\/x.com\/PingCAP
\",\"https:\/\/linkedin.com\/company\/pingcap\",\"https:\/\/youtube.com\/channel\/UCuq4puT32DzHKT5rU1IZpIA\"]}]}<\/script>\n<!-- \/ Yoast SEO plugin. -->","yoast_head_json":{"title":"Introduce Vector Search Indexes in TiDB","description":"Explore how TiDB implements vector search indexes using HNSW, and how they can be utilized for efficient nearest neighbor searches.","robots":{"index":"index","follow":"follow","max-snippet":"max-snippet:-1","max-image-preview":"max-image-preview:large","max-video-preview":"max-video-preview:-1"},"canonical":"https:\/\/www.pingcap.com\/ko\/article\/introduce-vector-search-indexes-in-tidb\/","og_locale":"ko_KR","og_type":"article","og_title":"Introduce Vector Search Indexes in TiDB","og_description":"Explore how TiDB implements vector search indexes using HNSW, and how they can be utilized for efficient nearest neighbor searches.","og_url":"https:\/\/www.pingcap.com\/ko\/article\/introduce-vector-search-indexes-in-tidb\/","og_site_name":"TiDB","article_publisher":"https:\/\/facebook.com\/pingcap2015","article_modified_time":"2024-06-03T15:48:25+00:00","og_image":[{"width":1440,"height":714,"url":"https:\/\/static.pingcap.com\/files\/2024\/09\/11005522\/Homepage-Ad.png","type":"image\/png"}],"twitter_card":"summary_large_image","twitter_site":"@PingCAP","twitter_misc":{"Est. 
reading time":"3\ubd84"},"schema":{"@context":"https:\/\/schema.org","@graph":[{"@type":"WebPage","@id":"https:\/\/www.pingcap.com\/article\/introduce-vector-search-indexes-in-tidb\/","url":"https:\/\/www.pingcap.com\/article\/introduce-vector-search-indexes-in-tidb\/","name":"Introduce Vector Search Indexes in TiDB","isPartOf":{"@id":"https:\/\/www.pingcap.com\/#website"},"datePublished":"2024-06-03T15:48:21+00:00","dateModified":"2024-06-03T15:48:25+00:00","description":"Explore how TiDB implements vector search indexes using HNSW, and how they can be utilized for efficient nearest neighbor searches.","breadcrumb":{"@id":"https:\/\/www.pingcap.com\/article\/introduce-vector-search-indexes-in-tidb\/#breadcrumb"},"inLanguage":"ko-KR","potentialAction":[{"@type":"ReadAction","target":["https:\/\/www.pingcap.com\/article\/introduce-vector-search-indexes-in-tidb\/"]}]},{"@type":"BreadcrumbList","@id":"https:\/\/www.pingcap.com\/article\/introduce-vector-search-indexes-in-tidb\/#breadcrumb","itemListElement":[{"@type":"ListItem","position":1,"name":"Home","item":"https:\/\/www.pingcap.com\/"},{"@type":"ListItem","position":2,"name":"Articles","item":"https:\/\/www.pingcap.com\/article\/"},{"@type":"ListItem","position":3,"name":"Introduce Vector Search Indexes in TiDB &#8211; A MySQL-compatible database with built-in Vector Storage"}]},{"@type":"WebSite","@id":"https:\/\/www.pingcap.com\/#website","url":"https:\/\/www.pingcap.com\/","name":"\ud2f0DB","description":"TiDB | SQL at 
Scale","publisher":{"@id":"https:\/\/www.pingcap.com\/#organization"},"potentialAction":[{"@type":"SearchAction","target":{"@type":"EntryPoint","urlTemplate":"https:\/\/www.pingcap.com\/?s={search_term_string}"},"query-input":{"@type":"PropertyValueSpecification","valueRequired":true,"valueName":"search_term_string"}}],"inLanguage":"ko-KR"},{"@type":"Organization","@id":"https:\/\/www.pingcap.com\/#organization","name":"PingCAP","url":"https:\/\/www.pingcap.com\/","logo":{"@type":"ImageObject","inLanguage":"ko-KR","@id":"https:\/\/www.pingcap.com\/#\/schema\/logo\/image\/","url":"https:\/\/static.pingcap.com\/files\/2021\/11\/pingcap-logo.png","contentUrl":"https:\/\/static.pingcap.com\/files\/2021\/11\/pingcap-logo.png","width":811,"height":232,"caption":"PingCAP"},"image":{"@id":"https:\/\/www.pingcap.com\/#\/schema\/logo\/image\/"},"sameAs":["https:\/\/facebook.com\/pingcap2015","https:\/\/x.com\/PingCAP","https:\/\/linkedin.com\/company\/pingcap","https:\/\/youtube.com\/channel\/UCuq4puT32DzHKT5rU1IZpIA"]}]}},"card_markup":"        <a class=\"card-article\" href=\"https:\/\/www.pingcap.com\/ko\/article\/introduce-vector-search-indexes-in-tidb\/\">            <h3>Introduce Vector Search Indexes in TiDB &#8211; A MySQL-compatible database with built-in Vector Storage<\/h3>            <p>TiDB, a MySQL-compatible database, has introduced a powerful feature for handling high-dimensional data: Vector Search Indexes. This post will explore how TiDB implements these indexes using the Hierarchical Navigable Small World (HNSW) method, and how they can be utilized for efficient nearest neighbor searches. What are Vector Search Indexes? 
Vector Search Indexes are designed to [&hellip;]<\/p>        <\/a>","_links":{"self":[{"href":"https:\/\/www.pingcap.com\/ko\/wp-json\/wp\/v2\/article\/17460","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.pingcap.com\/ko\/wp-json\/wp\/v2\/article"}],"about":[{"href":"https:\/\/www.pingcap.com\/ko\/wp-json\/wp\/v2\/types\/article"}],"author":[{"embeddable":true,"href":"https:\/\/www.pingcap.com\/ko\/wp-json\/wp\/v2\/users\/8"}],"wp:attachment":[{"href":"https:\/\/www.pingcap.com\/ko\/wp-json\/wp\/v2\/media?parent=17460"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}