{"id":23471,"date":"2024-11-22T13:00:19","date_gmt":"2024-11-22T21:00:19","guid":{"rendered":"https:\/\/www.pingcap.com\/?p=23471"},"modified":"2025-02-17T00:15:07","modified_gmt":"2025-02-17T08:15:07","slug":"managing-large-transactions-pipelined-dml-tidb","status":"publish","type":"post","link":"https:\/\/www.pingcap.com\/ko\/blog\/managing-large-transactions-pipelined-dml-tidb\/","title":{"rendered":"Pipelined DML in TiDB: A Breakthrough for Managing Large Transactions"},"content":{"rendered":"<p>Managing large transactions in distributed databases has always been a tough nut to crack. From processing millions of records during migrations to tackling complex workflows like ETL (Extract, Transform, Load), businesses need database solutions that are not only fast but also reliable.<\/p>\n\n\n\n<p><a href=\"https:\/\/www.pingcap.com\/ko\/tidb-self-managed\/\">TiDB<\/a> has long been at the forefront of <a href=\"https:\/\/www.pingcap.com\/ko\/blog\/why-distributed-sql-databases-elevate-modern-app-dev\/\">distributed SQL database<\/a> innovation, and with Pipelined DML in TiDB 8.1, we\u2019re raising the bar once again. This new feature, currently available as an experimental capability, redefines how large transactions are handled, introducing a seamless, memory-efficient approach that makes scaling up easier than ever.<\/p>\n\n\n\n<p>In this post, we\u2019ll dive into why we built Pipelined DML, how it works, and the transformational benefits it brings to managing modern data workloads.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\"><span class=\"ez-toc-section\" id=\"Tackling_Large_Transactions_Challenges_and_Opportunities\"><\/span>Tackling Large Transactions: Challenges and Opportunities<span class=\"ez-toc-section-end\"><\/span><\/h2>\n\n\n\n<p>Large-scale transactions are the backbone of many critical operations: think bulk data updates, system migrations, or ETL workflows where millions of rows need processing. 
While TiDB excels as a distributed SQL database, handling such transactions at scale brought two significant challenges:<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>Memory Limits:<\/strong> Before TiDB 8.1, all transaction mutations were held in memory throughout the transaction\u2019s lifecycle. For operations touching millions of rows, this could lead to high memory usage and, when resources ran short, Out of Memory (OOM) errors.<\/li>\n\n\n\n<li><strong>Performance Slowdowns:<\/strong> Managing large in-memory buffers relied on red-black trees, introducing computational overhead. As buffers grew, each operation slowed with the tree\u2019s O(log N) depth, for an O(N log N) total cost across N writes.<\/li>\n<\/ul>\n\n\n\n<p>Take a common example, archiving historical sales data into a separate table:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>INSERT INTO sales_archive \nSELECT * FROM sales \nWHERE sale_date &lt; '2023-01-01';<\/code><\/pre>\n\n\n\n<p>In such scenarios, holding millions of rows in memory until the transaction commits not only strains resources but also impacts speed. These challenges highlighted a clear opportunity to improve scalability, reduce complexity, and enhance reliability. With the rise of modern data workloads, the TiDB team developed a bold solution: Pipelined DML, designed to transform how large transactions are handled.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\">Existing Workarounds: Steps Toward a Complete Solution for Large Transactions<\/h3>\n\n\n\n<p>Before Pipelined DML was available, TiDB implemented several workarounds to bypass the large transaction problem. 
These interim solutions, while helpful in specific scenarios, came with trade-offs:<\/p>\n\n\n\n<ol start=\"1\" class=\"wp-block-list\">\n<li><strong>Batch<\/strong><strong>&#8211;<\/strong><strong>DML <\/strong>(now deprecated)<strong>:<\/strong> Introduced in TiDB v4.0, Batch-DML allowed splitting transactions into smaller parts for individual commits, enabling support for larger data operations. However, as TiDB has evolved with more robust solutions, Batch-DML has been deprecated to ensure higher reliability and maintain data integrity. It is no longer recommended for use.<\/li>\n\n\n\n<li><strong><a href=\"https:\/\/docs.pingcap.com\/tidb\/stable\/non-transactional-dml\">Non-Transactional DML<\/a><\/strong><strong>:<\/strong> The feature was first introduced in v6.1 and became GA in v6.5. TiDB splits a statement into multiple ones and executes them in sequence. It is safe for data integrity, but lacks atomicity and requires users to modify their statements, which often creates additional complexity for users.<\/li>\n<\/ol>\n\n\n\n<p>These methods showcased TiDB\u2019s flexibility and focus on user needs, even under challenging circumstances. However, they also underscored the need for a more seamless, built-in solution that is <strong>atomic, performant, and easy to use<\/strong>. This realization paved the way for Pipelined DML.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\"><span class=\"ez-toc-section\" id=\"Pipelined_DML_Revolutionizing_Large_Transactions\"><\/span>Pipelined DML: Revolutionizing Large Transactions<span class=\"ez-toc-section-end\"><\/span><\/h2>\n\n\n\n<p>To address the growing demands of large-scale data operations, the TiDB team developed Pipelined DML, a transformative enhancement to the original Percolator protocol. 
This feature changes the game by enabling continuous write flushing to <a href=\"https:\/\/docs.pingcap.com\/tidb\/stable\/tikv-overview\">TiKV<\/a>, TiDB\u2019s storage layer, rather than relying solely on in-memory buffers until the commit phase. This shift ensures efficient, scalable, and reliable transaction management. Here\u2019s what makes Pipelined DML a breakthrough:<\/p>\n\n\n\n<h3 class=\"wp-block-heading\">1. <strong>Continuous Flushing: Incremental Writes for Better Memory Management<\/strong><\/h3>\n\n\n\n<p>Pipelined DML writes data to TiKV in small, manageable batches, drastically reducing memory usage. This ensures smooth processing, even for large transactions, while eliminating the risk of out-of-memory (OOM) errors.<\/p>\n\n\n\n<p><em>Example:<\/em> Picture migrating a massive sales dataset to an archive table in TiDB. Previously, this would require storing the entire dataset in memory before committing, risking resource exhaustion. Now, Pipelined DML writes the data incrementally to TiKV as processed, keeping memory usage steady and workflows uninterrupted.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\">2. <strong>Asynchronous Buffer Management: Parallel Processing for Reduced Latency<\/strong><\/h3>\n\n\n\n<p>By decoupling transaction execution from storage writes, Pipelined DML allows TiDB to process and write data simultaneously. This parallelism reduces transaction latency and optimizes resource utilization.<\/p>\n\n\n\n<p><em>Example:<\/em> Imagine processing millions of real-time log entries for an analytics system. With Pipelined DML, the system can process new entries while simultaneously writing completed logs to storage, enabling faster throughput and seamless handling of high-volume workloads.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\">3. 
<strong>Smoothed CPU and I\/O Utilization: Consistent Resource Efficiency<\/strong><\/h3>\n\n\n\n<p>Traditional batch processing often leads to spikes in resource usage, which can slow down other operations. Pipelined DML spreads the workload evenly, ensuring TiKV processes data at a steady rate.<\/p>\n\n\n\n<p><em>Example:<\/em> Updating global pricing data for a retail platform\u2019s products can be resource-intensive. Pipelined DML ensures these updates happen gradually, avoiding sudden CPU or I\/O bottlenecks and maintaining system stability.<\/p>\n\n\n\n<p>These innovations make Pipelined DML a cornerstone of TiDB\u2019s ability to meet modern data workload demands. It allows businesses to manage massive transactions with confidence, delivering a scalable, efficient, and reliable solution.<\/p>\n\n\n\n<figure class=\"wp-block-image size-large\"><img loading=\"lazy\" decoding=\"async\" width=\"1024\" height=\"436\" src=\"https:\/\/static.pingcap.com\/files\/2024\/11\/22112643\/image-1024x436.png\" alt=\"Traditional 2PC processes store all writes in memory until the commit phase (left), while Pipelined DML incrementally flushes writes to TiKV as generated, ensuring minimal memory usage (right) for large transactions.\" class=\"wp-image-23520\" srcset=\"https:\/\/static.pingcap.com\/files\/2024\/11\/22112643\/image-1024x436.png 1024w, https:\/\/static.pingcap.com\/files\/2024\/11\/22112643\/image-300x128.png 300w, https:\/\/static.pingcap.com\/files\/2024\/11\/22112643\/image-768x327.png 768w, https:\/\/static.pingcap.com\/files\/2024\/11\/22112643\/image.png 1280w\" sizes=\"auto, (max-width: 1024px) 100vw, 1024px\" \/><\/figure>\n\n\n\n<p class=\"has-text-align-center\"><em>Figure 1: Traditional 2PC processes store all writes in memory until the commit phase (left), while Pipelined DML incrementally flushes writes to TiKV as generated, ensuring minimal memory usage (right).<\/em><\/p>\n\n\n\n<h2 class=\"wp-block-heading\"><span class=\"ez-toc-section\" 
id=\"How_Pipelined_DML_Works_A_Step-by-Step_Breakdown_for_Managing_Large_Transactions\"><\/span>How Pipelined DML Works: A Step-by-Step Breakdown for Managing Large Transactions<span class=\"ez-toc-section-end\"><\/span><\/h2>\n\n\n\n<p>Pipelined DML integrates seamlessly into TiDB\u2019s architecture, enhancing transaction processing without disrupting established workflows. Let\u2019s break down how it optimizes each phase of the transaction lifecycle:<\/p>\n\n\n\n<h3 class=\"wp-block-heading\">1. <strong>Execution Phase: Efficient Start to Every Transaction<\/strong><\/h3>\n\n\n\n<p>As with traditional Two-Phase Commit (2PC), the transaction begins with parsing, planning, and executing the SQL statement. However, Pipelined DML introduces a critical difference: instead of holding all writes in memory, it flushes them to TiKV as they are generated. This approach prevents memory buildup and keeps processing efficient, even for transactions involving millions of rows.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\">2. <strong>Flushing Mechanism: Incremental Writes for Stability<\/strong><\/h3>\n\n\n\n<p>Writes are sent to TiKV in small, manageable batches. This mechanism serves two vital functions:<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>Persistence: <\/strong>By progressively storing writes in TiKV, Pipelined DML significantly reduces memory usage, ensuring stable operations for large transactions.<\/li>\n\n\n\n<li><strong>Rate Limiting:<\/strong> If TiDB\u2019s executor generates writes faster than TiKV can process them, the system dynamically slows down the producer. This ensures smooth operation, avoiding memory overloads or processing bottlenecks.<\/li>\n<\/ul>\n\n\n\n<h3 class=\"wp-block-heading\">3. <strong>Commit Phase: Consistent and Reliable Finalization<\/strong><\/h3>\n\n\n\n<p>TiDB moves to the commit phase once all writes have been flushed. At this stage, it scans and commits the locks associated with the flushed data, ensuring transactional consistency. 
This approach maintains TiDB&#8217;s ACID guarantees, even with highly complex or large-scale workloads.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\">Memory Buffer Enhancements: Supporting Continuous Flushing<\/h3>\n\n\n\n<p>To make Pipelined DML possible, TiDB\u2019s in-memory database (MemDB)\u2014responsible for managing transaction writes\u2014was upgraded to support continuous flushing.<\/p>\n\n\n\n<p>The enhanced architecture consists of seven principles:<\/p>\n\n\n\n<ol start=\"1\" class=\"wp-block-list\">\n<li><strong>Dual MemDB States<\/strong>: Each transaction uses one mutable MemDB and at most one immutable MemDB.<\/li>\n\n\n\n<li><strong>Writes<\/strong>: All write operations are directed to the mutable MemDB.<\/li>\n\n\n\n<li><strong>Reads<\/strong>: Read operations aggregate data from all active MemDBs for accurate results.<\/li>\n\n\n\n<li><strong>Flushing<\/strong>: The immutable MemDB handles flushes exclusively.<\/li>\n\n\n\n<li><strong>Transition<\/strong>: When no immutable MemDB exists, the mutable MemDB becomes immutable, and a new mutable MemDB is created.<\/li>\n\n\n\n<li><strong>Memory release<\/strong>: The immutable MemDB is discarded after flushing completes.<\/li>\n\n\n\n<li><strong>Flow Control<\/strong>: Incoming writes are paused when the mutable MemDB grows too large to maintain stability.<\/li>\n<\/ol>\n\n\n\n<p>These enhancements enable TiDB to efficiently manage concurrent read, write, and flush operations while keeping memory usage under control.<\/p>\n\n\n<div class=\"wp-block-image\">\n<figure class=\"aligncenter size-full is-resized\"><img loading=\"lazy\" decoding=\"async\" width=\"1013\" height=\"938\" src=\"https:\/\/static.pingcap.com\/files\/2024\/11\/22112706\/image-1.png\" alt=\"The updated buffer structure supporting continuous flushing, designed to replace the original memory buffer.\" class=\"wp-image-23521\" style=\"width:612px;height:auto\" 
srcset=\"https:\/\/static.pingcap.com\/files\/2024\/11\/22112706\/image-1.png 1013w, https:\/\/static.pingcap.com\/files\/2024\/11\/22112706\/image-1-300x278.png 300w, https:\/\/static.pingcap.com\/files\/2024\/11\/22112706\/image-1-768x711.png 768w\" sizes=\"auto, (max-width: 1013px) 100vw, 1013px\" \/><\/figure>\n<\/div>\n\n\n<p class=\"has-text-align-center\"><em>Figure 2: The updated buffer structure supporting continuous flushing, designed to replace the original memory buffer<\/em>.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\"><span class=\"ez-toc-section\" id=\"Overcoming_Challenges_in_Pipelined_DML\"><\/span>Overcoming Challenges in Pipelined DML<span class=\"ez-toc-section-end\"><\/span><\/h2>\n\n\n\n<p>Introducing a feature as transformative as Pipelined DML requires tackling unique technical hurdles. The TiDB team identified and addressed three primary challenges to ensure seamless and reliable operation.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\">Challenge 1: Out-of-Order Flush Operations<\/h3>\n\n\n\n<p>In distributed systems, network instability can cause flush operations to be lost, delayed, or delivered out of order, risking data inconsistency.<\/p>\n\n\n\n<h4 class=\"wp-block-heading\"><strong>How <\/strong><strong>TiDB<\/strong><strong> Solves This: Generation Numbers<\/strong><\/h4>\n\n\n\n<p>Each flush operation is assigned a unique, incrementing generation number to ensure:<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>Correct Ordering:<\/strong> TiKV processes writes in sequential order based on their generation numbers.<\/li>\n\n\n\n<li><strong>Stale Write Prevention: <\/strong>If a delayed operation arrives after a more recent one, TiKV rejects it, preserving data integrity.<\/li>\n\n\n\n<li><em>Example: <\/em>Imagine Flush Operation 1 (Gen 1) and Flush Operation 2 (Gen 2) are sent in order but arrive out of sequence. 
TiKV processes Gen 2 and discards the outdated Gen 1, ensuring data consistency.<\/li>\n<\/ul>\n\n\n\n<h3 class=\"wp-block-heading\">Challenge 2: Slow Point-Read Operations<\/h3>\n\n\n\n<p>When a requested key-value pair has already been flushed to TiKV and is no longer in TiDB&#8217;s memory, retrieving it requires a remote procedure call (RPC) to TiKV, increasing latency.<\/p>\n\n\n\n<h4 class=\"wp-block-heading\"><strong>How Does TiDB Solve This?<\/strong><\/h4>\n\n\n\n<ol start=\"1\" class=\"wp-block-list\">\n<li><strong>Lazy Check:<\/strong> When TiDB needs to verify that a key does not exist, the check is deferred to the actual write in TiKV. The new <code>buffer.GetLocal(key)<\/code> method limits lookups to in-memory data, avoiding unnecessary RPCs.<\/li>\n\n\n\n<li><strong>Prefetching:<\/strong> For scenarios like bulk updates, TiDB preloads key-value pairs into a cache using <code>BatchGet<\/code>. Subsequent <code>buffer.Get(key)<\/code> operations then hit this in-memory cache rather than initiating costly RPCs.<\/li>\n<\/ol>\n\n\n\n<h3 class=\"wp-block-heading\">Challenge 3: Staging Across Nodes<\/h3>\n\n\n\n<p>TiDB\u2019s staging mechanism uses temporary buffers (stages) to hold changes that can later be committed or rolled back. 
While this is simple in a memory-only setup, extending it across TiDB and TiKV nodes adds complexity.<\/p>\n\n\n\n<h4 class=\"wp-block-heading\"><strong>How <\/strong>Does <strong>TiDB Simplify Staging<\/strong>?<\/h4>\n\n\n\n<p>By limiting staging operations to occur between flush points, the system ensures:<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li>All staged changes resolve (committed or rolled back) before TiKV flushes data.<\/li>\n\n\n\n<li>Memory-only staging is preserved, simplifying implementation while retaining essential features like <code>ON DUPLICATE KEY UPDATE<\/code>.<\/li>\n<\/ul>\n\n\n\n<h2 class=\"wp-block-heading\"><span class=\"ez-toc-section\" id=\"Performance_Enhancements_TiDB_Reaches_New_Heights_for_Large_Transactions\"><\/span>Performance Enhancements: TiDB Reaches New Heights for Large Transactions<span class=\"ez-toc-section-end\"><\/span><\/h2>\n\n\n\n<p>With Pipelined DML, TiDB sets a new standard for processing large transactions, significantly improving speed, memory efficiency, and throughput. 
Extensive benchmarks of TiDB 8.4 against TiDB 7.5 highlight these advancements in practical, real-world scenarios.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\">Highlights of the Testing Environment<\/h3>\n\n\n\n<p>All benchmarks use the following parameters:<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>Workload:<\/strong> YCSB table with 10 million rows.<\/li>\n\n\n\n<li><strong>Test Environment:<\/strong> GCP n2-standard-16 machines.<\/li>\n\n\n\n<li><strong>Cluster Size:<\/strong> 3 TiKV nodes.<\/li>\n<\/ul>\n\n\n\n<p>Results demonstrate the practical application of Pipelined DML for scaling operations with TiDB.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\">Key Improvements<\/h3>\n\n\n\n<figure class=\"wp-block-table\"><table class=\"has-fixed-layout\"><tbody><tr><td><strong>Cluster Size<\/strong><\/td><td><strong>Workload Type<\/strong><\/td><td><strong>Latency (TiDB 7.5)<\/strong><\/td><td><strong>Latency (TiDB 8.4 with Pipelined DML)<\/strong><\/td><td><strong>Data Throughput<\/strong><\/td><td><strong>Performance Gain<\/strong><\/td><\/tr><tr><td>3 TiKVs<\/td><td>YCSB-insert-10M<\/td><td>368s<\/td><td>159s<\/td><td>75.3 MiB\/s<\/td><td>2.31x<\/td><\/tr><tr><td>3 TiKVs<\/td><td>YCSB-update-10M<\/td><td>255s<\/td><td>131s<\/td><td>91.5 MiB\/s<\/td><td>1.95x<\/td><\/tr><tr><td>3 TiKVs<\/td><td>YCSB-delete-10M<\/td><td>136s<\/td><td>42s<\/td><td>285 MiB\/s<\/td><td>3.24x<\/td><\/tr><\/tbody><\/table><\/figure>\n\n\n\n<h3 class=\"wp-block-heading\">How These Gains Translate to Real-World Scenarios<\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>Insert Operations:<\/strong> Ideal for loading new datasets, such as importing sales records or customer profiles into TiDB. Continuous flushing ensures rapid, resource-efficient performance.<\/li>\n\n\n\n<li><strong>Update Operations:<\/strong> Useful for processing bulk adjustments, like updating pricing data across a large inventory. 
These updates are now faster and less memory-intensive.<\/li>\n\n\n\n<li><strong>Delete Operations:<\/strong> Perfect for archiving or clearing outdated records, such as cleaning up logs, where TiDB\u2019s new capabilities demonstrate incredible speed and efficiency.<\/li>\n<\/ul>\n\n\n\n<h3 class=\"wp-block-heading\">Steady Resource Utilization for Better Stability<\/h3>\n\n\n\n<p>One of the standout benefits of Pipelined DML is how it smooths CPU and I\/O utilization. Unlike traditional Two-Phase Commit (2PC), which often leads to resource spikes during commit phases, Pipelined DML maintains steady performance throughout. This consistency improves system reliability, especially during peak workloads.<\/p>\n\n\n\n<figure class=\"wp-block-image size-large is-resized\"><img loading=\"lazy\" decoding=\"async\" width=\"1024\" height=\"565\" src=\"https:\/\/static.pingcap.com\/files\/2024\/11\/22112736\/image-2-1024x565.png\" alt=\"Steady resource utilization for better stability managing large transactions.\" class=\"wp-image-23522\" style=\"width:710px;height:auto\" srcset=\"https:\/\/static.pingcap.com\/files\/2024\/11\/22112736\/image-2-1024x565.png 1024w, https:\/\/static.pingcap.com\/files\/2024\/11\/22112736\/image-2-300x165.png 300w, https:\/\/static.pingcap.com\/files\/2024\/11\/22112736\/image-2-768x424.png 768w, https:\/\/static.pingcap.com\/files\/2024\/11\/22112736\/image-2.png 1280w\" sizes=\"auto, (max-width: 1024px) 100vw, 1024px\" \/><\/figure>\n\n\n\n<h3 class=\"wp-block-heading\">Latency and Scalability: A Closer Look in Processing Large Transactions<\/h3>\n\n\n\n<p>To process large-scale transactions with minimal delays, Pipelined DML employs a producer-consumer model that ensures data flows efficiently between TiDB and TiKV:<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>Producer:<\/strong> The TiDB executor generates data for TiKV.<\/li>\n\n\n\n<li><strong>Consumer:<\/strong> TiKV processes the incoming write 
requests.<\/li>\n\n\n\n<li><strong>Channel:<\/strong> Flush operations act as the bridge, maintaining smooth communication between the producer and consumer.<\/li>\n<\/ul>\n\n\n\n<h4 class=\"wp-block-heading\">Managing Latency for Consistent Performance<\/h4>\n\n\n\n<p>In high-demand scenarios, the producer (TiDB executor) might generate data faster than the consumer (TiKV) can handle. Instead of overwhelming the system, Pipelined DML temporarily pauses the producer to ensure stability and prevent memory spikes. This process, called \u201cflush wait,\u201d is one of three factors influencing overall latency:<\/p>\n\n\n\n<p><em>Latency Formula:<\/em><\/p>\n\n\n\n<p>Overall latency = Execution Time + Flush Wait + Commit Primary Key Duration<\/p>\n\n\n\n<p>Here\u2019s how each factor contributes:<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>Execution Time:<\/strong> The time TiDB spends generating data for the transaction.<\/li>\n\n\n\n<li><strong>Flush Wait:<\/strong> The delay incurred when the producer must pause for the consumer to catch up.<\/li>\n\n\n\n<li><strong>Commit Primary Key Duration:<\/strong> The time required to commit the primary key once the transaction is ready, which is negligible relative to the rest of a large transaction.<\/li>\n<\/ul>\n\n\n\n<h4 class=\"wp-block-heading\">Scalability Made Simple<\/h4>\n\n\n\n<p>Scaling out TiKV nodes directly reduces flush wait times, enabling the system to process larger workloads more efficiently. Benchmarks show significant latency reductions when TiKV nodes are added to the cluster, demonstrating how easily TiDB adapts to increased demands.<\/p>\n\n\n\n<h5 class=\"wp-block-heading\">Example Use Case<\/h5>\n\n\n\n<p>Suppose an e-commerce platform is preparing for a major sale event and anticipates a surge in transactions, such as bulk order updates or inventory adjustments. 
By scaling the TiKV nodes beforehand, the platform can handle these operations seamlessly without increasing latency.<\/p>\n\n\n\n<figure class=\"wp-block-image size-large is-resized\"><img loading=\"lazy\" decoding=\"async\" width=\"1024\" height=\"568\" src=\"https:\/\/static.pingcap.com\/files\/2024\/11\/22112850\/image-4-1024x568.png\" alt=\"\" class=\"wp-image-23524\" style=\"width:737px;height:auto\" srcset=\"https:\/\/static.pingcap.com\/files\/2024\/11\/22112850\/image-4-1024x568.png 1024w, https:\/\/static.pingcap.com\/files\/2024\/11\/22112850\/image-4-300x166.png 300w, https:\/\/static.pingcap.com\/files\/2024\/11\/22112850\/image-4-768x426.png 768w, https:\/\/static.pingcap.com\/files\/2024\/11\/22112850\/image-4.png 1280w\" sizes=\"auto, (max-width: 1024px) 100vw, 1024px\" \/><\/figure>\n\n\n\n<h2 class=\"wp-block-heading\"><span class=\"ez-toc-section\" id=\"Conclusion_Unlocking_Large_Transactions_with_TiDB\"><\/span>Conclusion: Unlocking Large Transactions with TiDB<span class=\"ez-toc-section-end\"><\/span><\/h2>\n\n\n\n<p>Pipelined DML marks a transformative step forward in TiDB\u2019s journey to redefine large transaction management. By tackling longstanding challenges like memory constraints and transaction latency, this feature empowers organizations to process massive datasets with confidence and efficiency.<\/p>\n\n\n\n<p>With its ability to deliver faster execution times, dramatically reduced memory usage, and unmatched scalability, TiDB ensures businesses meet the demands of modern data workloads. 
Whether you\u2019re scaling your infrastructure, streamlining ETL pipelines, or optimizing high-traffic operations, Pipelined DML positions TiDB as a forward-looking, reliable solution for the future of data management.<\/p>\n\n\n\n<p>If you have any questions about Pipelined DML, please feel free to connect with us on&nbsp;<a href=\"https:\/\/twitter.com\/PingCAP\" target=\"_blank\" rel=\"noreferrer noopener\">Twitter<\/a>,&nbsp;<a href=\"https:\/\/www.linkedin.com\/company\/pingcap\/mycompany\/\" target=\"_blank\" rel=\"noreferrer noopener\">LinkedIn<\/a>, or through our&nbsp;<a href=\"https:\/\/slack.tidb.io\/invite?team=tidb-community&amp;channel=everyone&amp;ref=pingcap\" target=\"_blank\" rel=\"noreferrer noopener\">Slack Channel<\/a>.&nbsp;<\/p>","protected":false},"excerpt":{"rendered":"<p>Managing large transactions in distributed databases has always been a tough nut to crack. From processing millions of records during migrations to tackling complex workflows like ETL (Extract, Transform, Load), businesses need database solutions that are not only fast but also reliable. 
TiDB has long been at the forefront of distributed SQL database innovation, and [&hellip;]<\/p>\n","protected":false},"author":283,"featured_media":23537,"comment_status":"closed","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"ub_ctt_via":"","footnotes":""},"categories":[13],"tags":[147,333,332,9,111],"class_list":["post-23471","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-product","tag-distributed-sql","tag-large-transactions","tag-pipelined-dml","tag-scalability","tag-tidb"],"acf":[],"featured_image_src":"https:\/\/static.pingcap.com\/files\/2024\/11\/22140632\/tidb_feature_1800x600-1-5.png","author_info":{"display_name":"Ziqian Qin","author_link":"https:\/\/www.pingcap.com\/ko\/blog\/author\/zqin\/"},"yoast_head":"<!-- This site is optimized with the Yoast SEO plugin v26.9 - https:\/\/yoast.com\/product\/yoast-seo-wordpress\/ -->\n<title>Large Transactions: How Pipelined DML Works in TiDB<\/title>\n<meta name=\"description\" content=\"Explore what Pipelined DML is, how it works, and the transformational benefits it brings to managing large transactions.\" \/>\n<meta name=\"robots\" content=\"index, follow, max-snippet:-1, max-image-preview:large, max-video-preview:-1\" \/>\n<link rel=\"canonical\" href=\"https:\/\/www.pingcap.com\/ko\/blog\/managing-large-transactions-pipelined-dml-tidb\/\" \/>\n<meta property=\"og:locale\" content=\"ko_KR\" \/>\n<meta property=\"og:type\" content=\"article\" \/>\n<meta property=\"og:title\" content=\"Large Transactions: How Pipelined DML Works in TiDB\" \/>\n<meta property=\"og:description\" content=\"Explore what Pipelined DML is, how it works, and the transformational benefits it brings to managing large transactions.\" \/>\n<meta property=\"og:url\" content=\"https:\/\/www.pingcap.com\/ko\/blog\/managing-large-transactions-pipelined-dml-tidb\/\" \/>\n<meta property=\"og:site_name\" content=\"TiDB\" \/>\n<meta property=\"article:publisher\" 
content=\"https:\/\/facebook.com\/pingcap2015\" \/>\n<meta property=\"article:published_time\" content=\"2024-11-22T21:00:19+00:00\" \/>\n<meta property=\"article:modified_time\" content=\"2025-02-17T08:15:07+00:00\" \/>\n<meta property=\"og:image\" content=\"https:\/\/static.pingcap.com\/files\/2024\/11\/22140653\/tidb_1200x627-5.png\" \/>\n\t<meta property=\"og:image:width\" content=\"2400\" \/>\n\t<meta property=\"og:image:height\" content=\"1254\" \/>\n\t<meta property=\"og:image:type\" content=\"image\/png\" \/>\n<meta name=\"author\" content=\"Ziqian Qin\" \/>\n<meta name=\"twitter:card\" content=\"summary_large_image\" \/>\n<meta name=\"twitter:image\" content=\"https:\/\/static.pingcap.com\/files\/2024\/11\/22140709\/tidb_twitter_1600x900-5.png\" \/>\n<meta name=\"twitter:creator\" content=\"@PingCAP\" \/>\n<meta name=\"twitter:site\" content=\"@PingCAP\" \/>\n<meta name=\"twitter:label1\" content=\"Written by\" \/>\n\t<meta name=\"twitter:data1\" content=\"Ziqian Qin\" \/>\n\t<meta name=\"twitter:label2\" content=\"Est. 
reading time\" \/>\n\t<meta name=\"twitter:data2\" content=\"11\ubd84\" \/>\n<script type=\"application\/ld+json\" class=\"yoast-schema-graph\">{\"@context\":\"https:\/\/schema.org\",\"@graph\":[{\"@type\":\"Article\",\"@id\":\"https:\/\/www.pingcap.com\/blog\/managing-large-transactions-pipelined-dml-tidb\/#article\",\"isPartOf\":{\"@id\":\"https:\/\/www.pingcap.com\/blog\/managing-large-transactions-pipelined-dml-tidb\/\"},\"author\":{\"name\":\"Ziqian Qin\",\"@id\":\"https:\/\/www.pingcap.com\/#\/schema\/person\/24209e5293678f6a587055f7265c8756\"},\"headline\":\"Pipelined DML in TiDB: A Breakthrough for Managing Large Transactions\",\"datePublished\":\"2024-11-22T21:00:19+00:00\",\"dateModified\":\"2025-02-17T08:15:07+00:00\",\"mainEntityOfPage\":{\"@id\":\"https:\/\/www.pingcap.com\/blog\/managing-large-transactions-pipelined-dml-tidb\/\"},\"wordCount\":2185,\"publisher\":{\"@id\":\"https:\/\/www.pingcap.com\/#organization\"},\"image\":{\"@id\":\"https:\/\/www.pingcap.com\/blog\/managing-large-transactions-pipelined-dml-tidb\/#primaryimage\"},\"thumbnailUrl\":\"https:\/\/static.pingcap.com\/files\/2024\/11\/22140632\/tidb_feature_1800x600-1-5.png\",\"keywords\":[\"Distributed SQL\",\"Large Transactions\",\"Pipelined DML\",\"Scalability\",\"TiDB\"],\"articleSection\":[\"Product\"],\"inLanguage\":\"ko-KR\"},{\"@type\":\"WebPage\",\"@id\":\"https:\/\/www.pingcap.com\/blog\/managing-large-transactions-pipelined-dml-tidb\/\",\"url\":\"https:\/\/www.pingcap.com\/blog\/managing-large-transactions-pipelined-dml-tidb\/\",\"name\":\"Large Transactions: How Pipelined DML Works in 
TiDB\",\"isPartOf\":{\"@id\":\"https:\/\/www.pingcap.com\/#website\"},\"primaryImageOfPage\":{\"@id\":\"https:\/\/www.pingcap.com\/blog\/managing-large-transactions-pipelined-dml-tidb\/#primaryimage\"},\"image\":{\"@id\":\"https:\/\/www.pingcap.com\/blog\/managing-large-transactions-pipelined-dml-tidb\/#primaryimage\"},\"thumbnailUrl\":\"https:\/\/static.pingcap.com\/files\/2024\/11\/22140632\/tidb_feature_1800x600-1-5.png\",\"datePublished\":\"2024-11-22T21:00:19+00:00\",\"dateModified\":\"2025-02-17T08:15:07+00:00\",\"description\":\"Explore what Pipelined DML is, how it works, and the transformational benefits it brings to managing large transactions.\",\"breadcrumb\":{\"@id\":\"https:\/\/www.pingcap.com\/blog\/managing-large-transactions-pipelined-dml-tidb\/#breadcrumb\"},\"inLanguage\":\"ko-KR\",\"potentialAction\":[{\"@type\":\"ReadAction\",\"target\":[\"https:\/\/www.pingcap.com\/blog\/managing-large-transactions-pipelined-dml-tidb\/\"]}]},{\"@type\":\"ImageObject\",\"inLanguage\":\"ko-KR\",\"@id\":\"https:\/\/www.pingcap.com\/blog\/managing-large-transactions-pipelined-dml-tidb\/#primaryimage\",\"url\":\"https:\/\/static.pingcap.com\/files\/2024\/11\/22140632\/tidb_feature_1800x600-1-5.png\",\"contentUrl\":\"https:\/\/static.pingcap.com\/files\/2024\/11\/22140632\/tidb_feature_1800x600-1-5.png\",\"width\":3600,\"height\":1200},{\"@type\":\"BreadcrumbList\",\"@id\":\"https:\/\/www.pingcap.com\/blog\/managing-large-transactions-pipelined-dml-tidb\/#breadcrumb\",\"itemListElement\":[{\"@type\":\"ListItem\",\"position\":1,\"name\":\"Home\",\"item\":\"https:\/\/www.pingcap.com\/\"},{\"@type\":\"ListItem\",\"position\":2,\"name\":\"Pipelined DML in TiDB: A Breakthrough for Managing Large Transactions\"}]},{\"@type\":\"WebSite\",\"@id\":\"https:\/\/www.pingcap.com\/#website\",\"url\":\"https:\/\/www.pingcap.com\/\",\"name\":\"TiDB\",\"description\":\"TiDB | SQL at 
Scale\",\"publisher\":{\"@id\":\"https:\/\/www.pingcap.com\/#organization\"},\"potentialAction\":[{\"@type\":\"SearchAction\",\"target\":{\"@type\":\"EntryPoint\",\"urlTemplate\":\"https:\/\/www.pingcap.com\/?s={search_term_string}\"},\"query-input\":{\"@type\":\"PropertyValueSpecification\",\"valueRequired\":true,\"valueName\":\"search_term_string\"}}],\"inLanguage\":\"ko-KR\"},{\"@type\":\"Organization\",\"@id\":\"https:\/\/www.pingcap.com\/#organization\",\"name\":\"PingCAP\",\"url\":\"https:\/\/www.pingcap.com\/\",\"logo\":{\"@type\":\"ImageObject\",\"inLanguage\":\"ko-KR\",\"@id\":\"https:\/\/www.pingcap.com\/#\/schema\/logo\/image\/\",\"url\":\"https:\/\/static.pingcap.com\/files\/2021\/11\/pingcap-logo.png\",\"contentUrl\":\"https:\/\/static.pingcap.com\/files\/2021\/11\/pingcap-logo.png\",\"width\":811,\"height\":232,\"caption\":\"PingCAP\"},\"image\":{\"@id\":\"https:\/\/www.pingcap.com\/#\/schema\/logo\/image\/\"},\"sameAs\":[\"https:\/\/facebook.com\/pingcap2015\",\"https:\/\/x.com\/PingCAP\",\"https:\/\/linkedin.com\/company\/pingcap\",\"https:\/\/youtube.com\/channel\/UCuq4puT32DzHKT5rU1IZpIA\"]},{\"@type\":\"Person\",\"@id\":\"https:\/\/www.pingcap.com\/#\/schema\/person\/24209e5293678f6a587055f7265c8756\",\"name\":\"Ziqian Qin\",\"image\":{\"@type\":\"ImageObject\",\"inLanguage\":\"ko-KR\",\"@id\":\"https:\/\/www.pingcap.com\/#\/schema\/person\/image\/\",\"url\":\"https:\/\/static.pingcap.com\/files\/2022\/10\/17234942\/avatar.jpg\",\"contentUrl\":\"https:\/\/static.pingcap.com\/files\/2022\/10\/17234942\/avatar.jpg\",\"caption\":\"Ziqian Qin\"},\"url\":\"https:\/\/www.pingcap.com\/ko\/blog\/author\/zqin\/\"}]}<\/script>\n<!-- \/ Yoast SEO plugin. 
-->","yoast_head_json":{"title":"Large Transactions: How Pipelined DML Works in TiDB","description":"Explore what Pipelined DML is, how it works, and the transformational benefits it brings to managing large transactions.","robots":{"index":"index","follow":"follow","max-snippet":"max-snippet:-1","max-image-preview":"max-image-preview:large","max-video-preview":"max-video-preview:-1"},"canonical":"https:\/\/www.pingcap.com\/ko\/blog\/managing-large-transactions-pipelined-dml-tidb\/","og_locale":"ko_KR","og_type":"article","og_title":"Large Transactions: How Pipelined DML Works in TiDB","og_description":"Explore what Pipelined DML is, how it works, and the transformational benefits it brings to managing large transactions.","og_url":"https:\/\/www.pingcap.com\/ko\/blog\/managing-large-transactions-pipelined-dml-tidb\/","og_site_name":"TiDB","article_publisher":"https:\/\/facebook.com\/pingcap2015","article_published_time":"2024-11-22T21:00:19+00:00","article_modified_time":"2025-02-17T08:15:07+00:00","og_image":[{"width":2400,"height":1254,"url":"https:\/\/static.pingcap.com\/files\/2024\/11\/22140653\/tidb_1200x627-5.png","type":"image\/png"}],"author":"Ziqian Qin","twitter_card":"summary_large_image","twitter_image":"https:\/\/static.pingcap.com\/files\/2024\/11\/22140709\/tidb_twitter_1600x900-5.png","twitter_creator":"@PingCAP","twitter_site":"@PingCAP","twitter_misc":{"Written by":"Ziqian Qin","Est. 
reading time":"11\ubd84"},"schema":{"@context":"https:\/\/schema.org","@graph":[{"@type":"Article","@id":"https:\/\/www.pingcap.com\/blog\/managing-large-transactions-pipelined-dml-tidb\/#article","isPartOf":{"@id":"https:\/\/www.pingcap.com\/blog\/managing-large-transactions-pipelined-dml-tidb\/"},"author":{"name":"Ziqian Qin","@id":"https:\/\/www.pingcap.com\/#\/schema\/person\/24209e5293678f6a587055f7265c8756"},"headline":"Pipelined DML in TiDB: A Breakthrough for Managing Large Transactions","datePublished":"2024-11-22T21:00:19+00:00","dateModified":"2025-02-17T08:15:07+00:00","mainEntityOfPage":{"@id":"https:\/\/www.pingcap.com\/blog\/managing-large-transactions-pipelined-dml-tidb\/"},"wordCount":2185,"publisher":{"@id":"https:\/\/www.pingcap.com\/#organization"},"image":{"@id":"https:\/\/www.pingcap.com\/blog\/managing-large-transactions-pipelined-dml-tidb\/#primaryimage"},"thumbnailUrl":"https:\/\/static.pingcap.com\/files\/2024\/11\/22140632\/tidb_feature_1800x600-1-5.png","keywords":["Distributed SQL","Large Transactions","Pipelined DML","Scalability","TiDB"],"articleSection":["Product"],"inLanguage":"ko-KR"},{"@type":"WebPage","@id":"https:\/\/www.pingcap.com\/blog\/managing-large-transactions-pipelined-dml-tidb\/","url":"https:\/\/www.pingcap.com\/blog\/managing-large-transactions-pipelined-dml-tidb\/","name":"Large Transactions: How Pipelined DML Works in TiDB","isPartOf":{"@id":"https:\/\/www.pingcap.com\/#website"},"primaryImageOfPage":{"@id":"https:\/\/www.pingcap.com\/blog\/managing-large-transactions-pipelined-dml-tidb\/#primaryimage"},"image":{"@id":"https:\/\/www.pingcap.com\/blog\/managing-large-transactions-pipelined-dml-tidb\/#primaryimage"},"thumbnailUrl":"https:\/\/static.pingcap.com\/files\/2024\/11\/22140632\/tidb_feature_1800x600-1-5.png","datePublished":"2024-11-22T21:00:19+00:00","dateModified":"2025-02-17T08:15:07+00:00","description":"Explore what Pipelined DML is, how it works, and the transformational benefits it brings to managing 
large transactions.","breadcrumb":{"@id":"https:\/\/www.pingcap.com\/blog\/managing-large-transactions-pipelined-dml-tidb\/#breadcrumb"},"inLanguage":"ko-KR","potentialAction":[{"@type":"ReadAction","target":["https:\/\/www.pingcap.com\/blog\/managing-large-transactions-pipelined-dml-tidb\/"]}]},{"@type":"ImageObject","inLanguage":"ko-KR","@id":"https:\/\/www.pingcap.com\/blog\/managing-large-transactions-pipelined-dml-tidb\/#primaryimage","url":"https:\/\/static.pingcap.com\/files\/2024\/11\/22140632\/tidb_feature_1800x600-1-5.png","contentUrl":"https:\/\/static.pingcap.com\/files\/2024\/11\/22140632\/tidb_feature_1800x600-1-5.png","width":3600,"height":1200},{"@type":"BreadcrumbList","@id":"https:\/\/www.pingcap.com\/blog\/managing-large-transactions-pipelined-dml-tidb\/#breadcrumb","itemListElement":[{"@type":"ListItem","position":1,"name":"Home","item":"https:\/\/www.pingcap.com\/"},{"@type":"ListItem","position":2,"name":"Pipelined DML in TiDB: A Breakthrough for Managing Large Transactions"}]},{"@type":"WebSite","@id":"https:\/\/www.pingcap.com\/#website","url":"https:\/\/www.pingcap.com\/","name":"TiDB","description":"TiDB | SQL at 
Scale","publisher":{"@id":"https:\/\/www.pingcap.com\/#organization"},"potentialAction":[{"@type":"SearchAction","target":{"@type":"EntryPoint","urlTemplate":"https:\/\/www.pingcap.com\/?s={search_term_string}"},"query-input":{"@type":"PropertyValueSpecification","valueRequired":true,"valueName":"search_term_string"}}],"inLanguage":"ko-KR"},{"@type":"Organization","@id":"https:\/\/www.pingcap.com\/#organization","name":"PingCAP","url":"https:\/\/www.pingcap.com\/","logo":{"@type":"ImageObject","inLanguage":"ko-KR","@id":"https:\/\/www.pingcap.com\/#\/schema\/logo\/image\/","url":"https:\/\/static.pingcap.com\/files\/2021\/11\/pingcap-logo.png","contentUrl":"https:\/\/static.pingcap.com\/files\/2021\/11\/pingcap-logo.png","width":811,"height":232,"caption":"PingCAP"},"image":{"@id":"https:\/\/www.pingcap.com\/#\/schema\/logo\/image\/"},"sameAs":["https:\/\/facebook.com\/pingcap2015","https:\/\/x.com\/PingCAP","https:\/\/linkedin.com\/company\/pingcap","https:\/\/youtube.com\/channel\/UCuq4puT32DzHKT5rU1IZpIA"]},{"@type":"Person","@id":"https:\/\/www.pingcap.com\/#\/schema\/person\/24209e5293678f6a587055f7265c8756","name":"Ziqian Qin","image":{"@type":"ImageObject","inLanguage":"ko-KR","@id":"https:\/\/www.pingcap.com\/#\/schema\/person\/image\/","url":"https:\/\/static.pingcap.com\/files\/2022\/10\/17234942\/avatar.jpg","contentUrl":"https:\/\/static.pingcap.com\/files\/2022\/10\/17234942\/avatar.jpg","caption":"Ziqian Qin"},"url":"https:\/\/www.pingcap.com\/ko\/blog\/author\/zqin\/"}]}},"grav_blocks":false,"card_markup":"<a class=\"card-resource bg-white\" href=\"https:\/\/www.pingcap.com\/ko\/blog\/managing-large-transactions-pipelined-dml-tidb\/\"><div class=\"card-resource__image-container\"><img class=\"card-resource__image\" alt=\"tidb_feature_1800x600 (1)\" src=\"https:\/\/static.pingcap.com\/files\/2024\/11\/22140632\/tidb_feature_1800x600-1-5.png\" loading=\"lazy\" width=3600 height=1200 \/><\/div><div class=\"card-resource__content-container\"><div 
class=\"card-resource__content-head\"><div class=\"card-resource__category\">Product<\/div><\/div><h5 class=\"card-resource__title\">Pipelined DML in TiDB: A Breakthrough for Managing Large Transactions<\/h5><\/div><\/a>","_links":{"self":[{"href":"https:\/\/www.pingcap.com\/ko\/wp-json\/wp\/v2\/posts\/23471","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.pingcap.com\/ko\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.pingcap.com\/ko\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.pingcap.com\/ko\/wp-json\/wp\/v2\/users\/283"}],"replies":[{"embeddable":true,"href":"https:\/\/www.pingcap.com\/ko\/wp-json\/wp\/v2\/comments?post=23471"}],"version-history":[{"count":18,"href":"https:\/\/www.pingcap.com\/ko\/wp-json\/wp\/v2\/posts\/23471\/revisions"}],"predecessor-version":[{"id":25215,"href":"https:\/\/www.pingcap.com\/ko\/wp-json\/wp\/v2\/posts\/23471\/revisions\/25215"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/www.pingcap.com\/ko\/wp-json\/wp\/v2\/media\/23537"}],"wp:attachment":[{"href":"https:\/\/www.pingcap.com\/ko\/wp-json\/wp\/v2\/media?parent=23471"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.pingcap.com\/ko\/wp-json\/wp\/v2\/categories?post=23471"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.pingcap.com\/ko\/wp-json\/wp\/v2\/tags?post=23471"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}