{"id":17897,"date":"2024-06-26T05:22:55","date_gmt":"2024-06-26T12:22:55","guid":{"rendered":"https:\/\/www.pingcap.com\/?post_type=article&#038;p=17897"},"modified":"2024-06-26T05:22:58","modified_gmt":"2024-06-26T12:22:58","slug":"benchmarking-llama-3-with-tidb-vector-search","status":"publish","type":"article","link":"https:\/\/www.pingcap.com\/ko\/article\/benchmarking-llama-3-with-tidb-vector-search\/","title":{"rendered":"Benchmarking Llama 3 with TiDB Vector Search"},"content":{"rendered":"<p>As artificial intelligence models continue to evolve, evaluating their performance through rigorous benchmarking becomes crucial. Llama 3, a state-of-the-art language model, is no exception. This article explores the benchmarking of Llama 3, utilizing TiDB Vector Search to enhance the accuracy and efficiency of semantic search capabilities.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\"><span class=\"ez-toc-section\" id=\"Understanding_Llama_3\"><\/span>Understanding Llama 3<span class=\"ez-toc-section-end\"><\/span><\/h2>\n\n\n\n<p>Llama 3 is designed to excel in natural language understanding and generation tasks. Its architecture leverages advanced transformer models, enabling it to process and generate human-like text based on the context provided.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\"><span class=\"ez-toc-section\" id=\"The_Role_of_TiDB_Vector_Search_in_Benchmarking\"><\/span>The Role of TiDB Vector Search in Benchmarking<span class=\"ez-toc-section-end\"><\/span><\/h2>\n\n\n\n<p>Benchmarking Llama 3 requires a robust system to handle large volumes of data and perform high-speed searches. TiDB Vector Search provides an optimal solution with its ability to store and search vector embeddings. 
This capability ensures that semantic searches, crucial for benchmarking language models, are performed efficiently and accurately.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\"><span class=\"ez-toc-section\" id=\"Setting_Up_TiDB_Vector_Search\"><\/span>Setting Up TiDB Vector Search<span class=\"ez-toc-section-end\"><\/span><\/h2>\n\n\n\n<p>To benchmark Llama 3 with TiDB Vector Search, follow these steps:<\/p>\n\n\n\n<ol class=\"wp-block-list\" start=\"1\">\n<li><strong>Sign Up and Create a Cluster<\/strong>:\n<ul class=\"wp-block-list\">\n<li>Sign up on <a href=\"https:\/\/tidbcloud.com\/free-trial\/\">tidbcloud<\/a>.<\/li>\n\n\n\n<li>Select the EU-Central-1 region and create a TiDB Serverless cluster with vector support.<\/li>\n<\/ul>\n<\/li>\n\n\n\n<li><strong>Connect to Your Cluster<\/strong>:\n<ul class=\"wp-block-list\">\n<li>Navigate to the Clusters page, select your target cluster, and click &#8220;Connect&#8221;.<\/li>\n\n\n\n<li>Use the connection dialog to set up your connection parameters.<\/li>\n<\/ul>\n<\/li>\n\n\n\n<li><strong>Create Tables and Insert Data<\/strong>:\n<ul class=\"wp-block-list\">\n<li>Create a table with a vector field to store embeddings:<\/li>\n<\/ul>\n<\/li>\n<\/ol>\n\n\n\n<pre class=\"wp-block-code\"><code>CREATE TABLE benchmark_table (id INT PRIMARY KEY, text TEXT, embedding VECTOR(1536));\nINSERT INTO benchmark_table VALUES (1, 'Sample text 1', '&#91;0.1, 0.2, ...]'), (2, 'Sample text 2', '&#91;0.2, 0.1, ...]');<\/code><\/pre>\n\n\n\n<h3 class=\"wp-block-heading\">Benchmarking Process<\/h3>\n\n\n\n<ol class=\"wp-block-list\" start=\"1\">\n<li><strong>Generate Embeddings with Llama 3<\/strong>:\n<ul class=\"wp-block-list\">\n<li>Use Llama 3 to generate vector embeddings for your benchmark dataset. 
This dataset should include a variety of texts to comprehensively evaluate the model&#8217;s performance.<\/li>\n<\/ul>\n<\/li>\n\n\n\n<li><strong>Store Embeddings in TiDB<\/strong>:\n<ul class=\"wp-block-list\">\n<li>Insert the generated embeddings into the <code>benchmark_table<\/code> in your TiDB cluster.<\/li>\n<\/ul>\n<\/li>\n\n\n\n<li><strong>Perform Semantic Searches<\/strong>:\n<ul class=\"wp-block-list\">\n<li>Use TiDB Vector Search to perform semantic searches on the stored embeddings.<\/li>\n\n\n\n<li>Measure the response time and accuracy of the search results to evaluate Llama 3\u2019s performance.<\/li>\n<\/ul>\n<\/li>\n<\/ol>\n\n\n\n<pre class=\"wp-block-code\"><code>SELECT * FROM benchmark_table ORDER BY vec_cosine_distance(embedding, '&#91;query_embedding]') LIMIT 10;<\/code><\/pre>\n\n\n\n<h2 class=\"wp-block-heading\"><span class=\"ez-toc-section\" id=\"Performance_Metrics\"><\/span>Performance Metrics<span class=\"ez-toc-section-end\"><\/span><\/h2>\n\n\n\n<p>To effectively benchmark Llama 3, consider the following performance metrics:<\/p>\n\n\n\n<ol class=\"wp-block-list\" start=\"1\">\n<li><strong>Accuracy<\/strong>:\n<ul class=\"wp-block-list\">\n<li>Measure how well the search results match the expected outcomes. This can be evaluated using precision, recall, and F1 score metrics.<\/li>\n<\/ul>\n<\/li>\n\n\n\n<li><strong>Latency<\/strong>:\n<ul class=\"wp-block-list\">\n<li>Record the time taken to perform searches. Lower latency indicates better performance in real-time applications.<\/li>\n<\/ul>\n<\/li>\n\n\n\n<li><strong>Scalability<\/strong>:\n<ul class=\"wp-block-list\">\n<li>Assess how the system performs with increasing data volumes. 
TiDB\u2019s distributed architecture should maintain performance as data scales.<\/li>\n<\/ul>\n<\/li>\n<\/ol>\n\n\n\n<h2 class=\"wp-block-heading\"><span class=\"ez-toc-section\" id=\"Example_Benchmarking_Code\"><\/span>Example Benchmarking Code<span class=\"ez-toc-section-end\"><\/span><\/h2>\n\n\n\n<p>Here\u2019s a sample Python script to benchmark Llama 3 using TiDB Vector Search:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>import os\nfrom openai import OpenAI\nfrom peewee import Model, MySQLDatabase, TextField, SQL\nfrom tidb_vector.peewee import VectorField\n\n# Connect to TiDB\ndb = MySQLDatabase('benchmark', user=os.environ.get('TIDB_USERNAME'), password=os.environ.get('TIDB_PASSWORD'), host=os.environ.get('TIDB_HOST'), port=4000)\ndb.connect()\n\n# Define model\nclass BenchmarkModel(Model):\n    text = TextField()\n    embedding = VectorField(dimensions=1536)\n    class Meta:\n        database = db\n        table_name = \"benchmark_table\"\n\n# Create the table if it does not exist yet\ndb.create_tables(&#91;BenchmarkModel])\n\n# Generate embeddings. Note: this sample calls OpenAI's text-embedding-3-small (1536 dimensions), not Llama 3;\n# to benchmark Llama 3, substitute your own Llama 3 embedding step here and match the vector dimensions.\nclient = OpenAI(api_key=os.environ.get('OPENAI_API_KEY'))\ndocuments = &#91;\"Sample text 1\", \"Sample text 2\", \"Sample text 3\"]\nembeddings = &#91;r.embedding for r in client.embeddings.create(input=documents, model=\"text-embedding-3-small\").data]\n\n# Insert embeddings into TiDB\ndata_source = &#91;{\"text\": doc, \"embedding\": emb} for doc, emb in zip(documents, embeddings)]\nBenchmarkModel.insert_many(data_source).execute()\n\n# Embed the query text and fetch the 10 nearest rows by cosine distance\nquery_embedding = client.embeddings.create(input=\"Query text\", model=\"text-embedding-3-small\").data&#91;0].embedding\nresults = BenchmarkModel.select(BenchmarkModel.text, BenchmarkModel.embedding.cosine_distance(query_embedding).alias(\"distance\")).order_by(SQL(\"distance\")).limit(10)\n\n# Display results (smaller distance means more similar)\nfor result in results:\n    print(result.text, result.distance)\n\ndb.close()<\/code><\/pre>\n\n\n\n<h2 class=\"wp-block-heading\"><span 
class=\"ez-toc-section\" id=\"Conclusion\"><\/span>Conclusion<span class=\"ez-toc-section-end\"><\/span><\/h2>\n\n\n\n<p>Benchmarking Llama 3 with TiDB Vector Search provides valuable insights into the model&#8217;s performance in real-world scenarios. By leveraging the power of vector embeddings and efficient search capabilities, you can ensure that your AI applications are both accurate and responsive. Start your benchmarking journey with TiDB Vector Search today by visiting <a href=\"https:\/\/tidbcloud.com\/free-trial\/\">TiDB Cloud<\/a> and explore its potential for your AI projects.<\/p>","protected":false},"excerpt":{"rendered":"<p>As artificial intelligence models continue to evolve, evaluating their performance through rigorous benchmarking becomes crucial. Llama 3, a state-of-the-art language model, is no exception. This article explores the benchmarking of Llama 3, utilizing TiDB Vector Search to enhance the accuracy and efficiency of semantic search capabilities. Understanding Llama 3 Llama 3 is designed to excel [&hellip;]<\/p>\n","protected":false},"author":8,"featured_media":0,"template":"","class_list":["post-17897","article","type-article","status-publish","hentry"],"acf":[],"yoast_head":"<!-- This site is optimized with the Yoast SEO plugin v26.9 - https:\/\/yoast.com\/product\/yoast-seo-wordpress\/ -->\n<title>Benchmarking Llama 3 with TiDB Vector Search<\/title>\n<meta name=\"description\" content=\"Explore the benchmarking of Llama 3, utilizing TiDB Vector Search to enhance the accuracy and efficiency of semantic search capabilities.\" \/>\n<meta name=\"robots\" content=\"noindex, follow\" \/>\n<meta property=\"og:locale\" content=\"ko_KR\" \/>\n<meta property=\"og:type\" content=\"article\" \/>\n<meta property=\"og:title\" content=\"Benchmarking Llama 3 with TiDB Vector Search\" \/>\n<meta property=\"og:description\" content=\"Explore the benchmarking of Llama 3, utilizing TiDB Vector Search to enhance the accuracy and efficiency of semantic 
search capabilities.\" \/>\n<meta property=\"og:url\" content=\"https:\/\/www.pingcap.com\/ko\/article\/benchmarking-llama-3-with-tidb-vector-search\/\" \/>\n<meta property=\"og:site_name\" content=\"TiDB\" \/>\n<meta property=\"article:publisher\" content=\"https:\/\/facebook.com\/pingcap2015\" \/>\n<meta property=\"article:modified_time\" content=\"2024-06-26T12:22:58+00:00\" \/>\n<meta property=\"og:image\" content=\"https:\/\/static.pingcap.com\/files\/2024\/09\/11005522\/Homepage-Ad.png\" \/>\n\t<meta property=\"og:image:width\" content=\"1440\" \/>\n\t<meta property=\"og:image:height\" content=\"714\" \/>\n\t<meta property=\"og:image:type\" content=\"image\/png\" \/>\n<meta name=\"twitter:card\" content=\"summary_large_image\" \/>\n<meta name=\"twitter:site\" content=\"@PingCAP\" \/>\n<meta name=\"twitter:label1\" content=\"Est. reading time\" \/>\n\t<meta name=\"twitter:data1\" content=\"3\ubd84\" \/>\n<script type=\"application\/ld+json\" class=\"yoast-schema-graph\">{\"@context\":\"https:\/\/schema.org\",\"@graph\":[{\"@type\":\"WebPage\",\"@id\":\"https:\/\/www.pingcap.com\/article\/benchmarking-llama-3-with-tidb-vector-search\/\",\"url\":\"https:\/\/www.pingcap.com\/article\/benchmarking-llama-3-with-tidb-vector-search\/\",\"name\":\"Benchmarking Llama 3 with TiDB Vector Search\",\"isPartOf\":{\"@id\":\"https:\/\/www.pingcap.com\/#website\"},\"datePublished\":\"2024-06-26T12:22:55+00:00\",\"dateModified\":\"2024-06-26T12:22:58+00:00\",\"description\":\"Explore the benchmarking of Llama 3, utilizing TiDB Vector Search to enhance the accuracy and efficiency of semantic search 
capabilities.\",\"breadcrumb\":{\"@id\":\"https:\/\/www.pingcap.com\/article\/benchmarking-llama-3-with-tidb-vector-search\/#breadcrumb\"},\"inLanguage\":\"ko-KR\",\"potentialAction\":[{\"@type\":\"ReadAction\",\"target\":[\"https:\/\/www.pingcap.com\/article\/benchmarking-llama-3-with-tidb-vector-search\/\"]}]},{\"@type\":\"BreadcrumbList\",\"@id\":\"https:\/\/www.pingcap.com\/article\/benchmarking-llama-3-with-tidb-vector-search\/#breadcrumb\",\"itemListElement\":[{\"@type\":\"ListItem\",\"position\":1,\"name\":\"Home\",\"item\":\"https:\/\/www.pingcap.com\/\"},{\"@type\":\"ListItem\",\"position\":2,\"name\":\"Articles\",\"item\":\"https:\/\/www.pingcap.com\/article\/\"},{\"@type\":\"ListItem\",\"position\":3,\"name\":\"Benchmarking Llama 3 with TiDB Vector Search\"}]},{\"@type\":\"WebSite\",\"@id\":\"https:\/\/www.pingcap.com\/#website\",\"url\":\"https:\/\/www.pingcap.com\/\",\"name\":\"TiDB\",\"description\":\"TiDB | SQL at Scale\",\"publisher\":{\"@id\":\"https:\/\/www.pingcap.com\/#organization\"},\"potentialAction\":[{\"@type\":\"SearchAction\",\"target\":{\"@type\":\"EntryPoint\",\"urlTemplate\":\"https:\/\/www.pingcap.com\/?s={search_term_string}\"},\"query-input\":{\"@type\":\"PropertyValueSpecification\",\"valueRequired\":true,\"valueName\":\"search_term_string\"}}],\"inLanguage\":\"ko-KR\"},{\"@type\":\"Organization\",\"@id\":\"https:\/\/www.pingcap.com\/#organization\",\"name\":\"PingCAP\",\"url\":\"https:\/\/www.pingcap.com\/\",\"logo\":{\"@type\":\"ImageObject\",\"inLanguage\":\"ko-KR\",\"@id\":\"https:\/\/www.pingcap.com\/#\/schema\/logo\/image\/\",\"url\":\"https:\/\/static.pingcap.com\/files\/2021\/11\/pingcap-logo.png\",\"contentUrl\":\"https:\/\/static.pingcap.com\/files\/2021\/11\/pingcap-logo.png\",\"width\":811,\"height\":232,\"caption\":\"PingCAP\"},\"image\":{\"@id\":\"https:\/\/www.pingcap.com\/#\/schema\/logo\/image\/\"},\"sameAs\":[\"https:\/\/facebook.com\/pingcap2015\",\"https:\/\/x.com\/PingCAP\",\"https:\/\/linkedin.com\/company\/pin
gcap\",\"https:\/\/youtube.com\/channel\/UCuq4puT32DzHKT5rU1IZpIA\"]}]}<\/script>\n<!-- \/ Yoast SEO plugin. -->","yoast_head_json":{"title":"Benchmarking Llama 3 with TiDB Vector Search","description":"Explore the benchmarking of Llama 3, utilizing TiDB Vector Search to enhance the accuracy and efficiency of semantic search capabilities.","robots":{"index":"noindex","follow":"follow"},"og_locale":"ko_KR","og_type":"article","og_title":"Benchmarking Llama 3 with TiDB Vector Search","og_description":"Explore the benchmarking of Llama 3, utilizing TiDB Vector Search to enhance the accuracy and efficiency of semantic search capabilities.","og_url":"https:\/\/www.pingcap.com\/ko\/article\/benchmarking-llama-3-with-tidb-vector-search\/","og_site_name":"TiDB","article_publisher":"https:\/\/facebook.com\/pingcap2015","article_modified_time":"2024-06-26T12:22:58+00:00","og_image":[{"width":1440,"height":714,"url":"https:\/\/static.pingcap.com\/files\/2024\/09\/11005522\/Homepage-Ad.png","type":"image\/png"}],"twitter_card":"summary_large_image","twitter_site":"@PingCAP","twitter_misc":{"Est. 
reading time":"3\ubd84"},"schema":{"@context":"https:\/\/schema.org","@graph":[{"@type":"WebPage","@id":"https:\/\/www.pingcap.com\/article\/benchmarking-llama-3-with-tidb-vector-search\/","url":"https:\/\/www.pingcap.com\/article\/benchmarking-llama-3-with-tidb-vector-search\/","name":"Benchmarking Llama 3 with TiDB Vector Search","isPartOf":{"@id":"https:\/\/www.pingcap.com\/#website"},"datePublished":"2024-06-26T12:22:55+00:00","dateModified":"2024-06-26T12:22:58+00:00","description":"Explore the benchmarking of Llama 3, utilizing TiDB Vector Search to enhance the accuracy and efficiency of semantic search capabilities.","breadcrumb":{"@id":"https:\/\/www.pingcap.com\/article\/benchmarking-llama-3-with-tidb-vector-search\/#breadcrumb"},"inLanguage":"ko-KR","potentialAction":[{"@type":"ReadAction","target":["https:\/\/www.pingcap.com\/article\/benchmarking-llama-3-with-tidb-vector-search\/"]}]},{"@type":"BreadcrumbList","@id":"https:\/\/www.pingcap.com\/article\/benchmarking-llama-3-with-tidb-vector-search\/#breadcrumb","itemListElement":[{"@type":"ListItem","position":1,"name":"Home","item":"https:\/\/www.pingcap.com\/"},{"@type":"ListItem","position":2,"name":"Articles","item":"https:\/\/www.pingcap.com\/article\/"},{"@type":"ListItem","position":3,"name":"Benchmarking Llama 3 with TiDB Vector Search"}]},{"@type":"WebSite","@id":"https:\/\/www.pingcap.com\/#website","url":"https:\/\/www.pingcap.com\/","name":"\ud2f0DB","description":"TiDB | SQL at 
Scale","publisher":{"@id":"https:\/\/www.pingcap.com\/#organization"},"potentialAction":[{"@type":"SearchAction","target":{"@type":"EntryPoint","urlTemplate":"https:\/\/www.pingcap.com\/?s={search_term_string}"},"query-input":{"@type":"PropertyValueSpecification","valueRequired":true,"valueName":"search_term_string"}}],"inLanguage":"ko-KR"},{"@type":"Organization","@id":"https:\/\/www.pingcap.com\/#organization","name":"PingCAP","url":"https:\/\/www.pingcap.com\/","logo":{"@type":"ImageObject","inLanguage":"ko-KR","@id":"https:\/\/www.pingcap.com\/#\/schema\/logo\/image\/","url":"https:\/\/static.pingcap.com\/files\/2021\/11\/pingcap-logo.png","contentUrl":"https:\/\/static.pingcap.com\/files\/2021\/11\/pingcap-logo.png","width":811,"height":232,"caption":"PingCAP"},"image":{"@id":"https:\/\/www.pingcap.com\/#\/schema\/logo\/image\/"},"sameAs":["https:\/\/facebook.com\/pingcap2015","https:\/\/x.com\/PingCAP","https:\/\/linkedin.com\/company\/pingcap","https:\/\/youtube.com\/channel\/UCuq4puT32DzHKT5rU1IZpIA"]}]}},"card_markup":"        <a class=\"card-article\" href=\"https:\/\/www.pingcap.com\/ko\/article\/benchmarking-llama-3-with-tidb-vector-search\/\">            <h3>Benchmarking Llama 3 with TiDB Vector Search<\/h3>            <p>As artificial intelligence models continue to evolve, evaluating their performance through rigorous benchmarking becomes crucial. Llama 3, a state-of-the-art language model, is no exception. This article explores the benchmarking of Llama 3, utilizing TiDB Vector Search to enhance the accuracy and efficiency of semantic search capabilities. 
Understanding Llama 3 Llama 3 is designed to excel [&hellip;]<\/p>        <\/a>","_links":{"self":[{"href":"https:\/\/www.pingcap.com\/ko\/wp-json\/wp\/v2\/article\/17897","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.pingcap.com\/ko\/wp-json\/wp\/v2\/article"}],"about":[{"href":"https:\/\/www.pingcap.com\/ko\/wp-json\/wp\/v2\/types\/article"}],"author":[{"embeddable":true,"href":"https:\/\/www.pingcap.com\/ko\/wp-json\/wp\/v2\/users\/8"}],"wp:attachment":[{"href":"https:\/\/www.pingcap.com\/ko\/wp-json\/wp\/v2\/media?parent=17897"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}