{"id":28508,"date":"2025-07-25T10:49:19","date_gmt":"2025-07-25T17:49:19","guid":{"rendered":"https:\/\/www.pingcap.com\/?p=28508"},"modified":"2025-08-01T02:41:26","modified_gmt":"2025-08-01T09:41:26","slug":"exploring-tidb-observability-real-world-case-studies","status":"publish","type":"post","link":"https:\/\/www.pingcap.com\/ko\/blog\/exploring-tidb-observability-real-world-case-studies\/","title":{"rendered":"Exploring TiDB Observability: A Journey Through Real-World Case Studies"},"content":{"rendered":"\n<p>Have you ever seen two nearly identical SQL statements, differing only in date parameters or function variations, return similar results but with wildly different performance, sometimes by factors of 10 or 100? In real-world scenarios, we typically run the&nbsp;<code>EXPLAIN<\/code>&nbsp;statement to examine changes in the execution plan.<\/p>\n\n\n\n<p>But what if the execution plan doesn&#8217;t change? That\u2019s where&nbsp;<code>EXPLAIN ANALYZE<\/code>&nbsp;comes in. It reveals the real runtime behavior of each operator, not just the plan.<\/p>\n\n\n\n<p>In a <a href=\"https:\/\/www.pingcap.com\/blog\/tidb-index-optimization-best-practices-better-performance\/\">previous blog post<\/a>, we explored the observability tools needed to detect and eliminate unused or inefficient indexes in <a href=\"https:\/\/www.pingcap.com\/tidb-self-managed\/\">TiDB<\/a>, improving performance and stability. This post combines real-world cases and common issues to show how to use operator execution information for more precise analysis and diagnosis of SQL performance.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\"><span class=\"ez-toc-section\" id=\"TiDB_Observability_Operator_Execution_Information_Introduction\"><\/span>TiDB Observability: Operator Execution Information Introduction<span class=\"ez-toc-section-end\"><\/span><\/h2>\n\n\n\n<p>Usually, we can use the&nbsp;<code>EXPLAIN ANALYZE<\/code>&nbsp;statement to obtain operator execution information. 
<code>EXPLAIN ANALYZE<\/code>&nbsp;actually executes the corresponding SQL statement, captures its runtime execution information, and returns that information together with the execution plan. The recorded information includes&nbsp;<code>actRows<\/code>,&nbsp;<code>execution info<\/code>,&nbsp;<code>memory<\/code>, and&nbsp;<code>disk<\/code>.<\/p>\n\n\n\n<figure class=\"wp-block-table\"><table class=\"has-fixed-layout\"><tbody><tr><td><strong>Attribute Name<\/strong><\/td><td><strong>Meaning<\/strong><\/td><\/tr><tr><td>actRows<\/td><td>Number of rows output by the operator.<\/td><\/tr><tr><td>execution info<\/td><td>Execution information of the operator.&nbsp;<code>time<\/code>&nbsp;represents the total&nbsp;<code>wall time<\/code>&nbsp;from entering the operator to leaving the operator, including the total execution time of all sub-operators. If the operator is called many times by the parent operator (in loops), then the time refers to the accumulated time.&nbsp;<code>loops<\/code>&nbsp;is the number of times the current operator is called by the parent operator.<\/td><\/tr><tr><td>memory<\/td><td>Memory space occupied by the operator.<\/td><\/tr><tr><td>disk<\/td><td>Disk space occupied by the operator.<\/td><\/tr><\/tbody><\/table><\/figure>\n\n\n\n<p>For explanations of&nbsp;<code>execution info<\/code>&nbsp;across different operators, refer to the&nbsp;<a href=\"https:\/\/docs.pingcap.com\/zh\/tidb\/stable\/sql-statement-explain-analyze\">TiDB documentation<\/a>. Developers refined these metrics through extensive troubleshooting of performance issues, making them essential reading for anyone aiming to deeply understand TiDB SQL performance diagnostics.<\/p>\n\n\n\n<p>Sometimes SQL performance issues are intermittent, which makes it harder to use&nbsp;<code>EXPLAIN ANALYZE<\/code>&nbsp;directly. 
In such cases, you can quickly locate and retrieve detailed execution information for problematic SQL statements through the&nbsp;<a href=\"https:\/\/docs.pingcap.com\/zh\/tidb\/stable\/dashboard-overview#%E6%9C%80%E8%BF%91%E7%9A%84%E6%85%A2%E6%9F%A5%E8%AF%A2\">Slow Queries<\/a> page of TiDB Dashboard.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\"><span class=\"ez-toc-section\" id=\"TiDB_Observability_Case_Studies\"><\/span>TiDB Observability Case Studies<span class=\"ez-toc-section-end\"><\/span><\/h2>\n\n\n\n<p>Next, we will explore related problems through specific real-world examples. These examples include:<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Investigating query latency jitter<\/li>\n\n\n\n<li>Understanding operator concurrency<\/li>\n\n\n\n<li>Why did MAX() take 100ms while MIN() took 8 seconds?<\/li>\n<\/ul>\n\n\n\n<p>Note that most execution plans in the following cases come from production environments (anonymized for privacy).<\/p>\n\n\n\n<h3 class=\"wp-block-heading\">Query Latency Jitter<\/h3>\n\n\n\n<p>Intermittent query latency jitter is one of the most common performance issues. If the slow query log can pinpoint the specific SQL statements causing performance fluctuations, further analysis of their operator execution information often reveals the root cause and provides actionable optimization clues.<\/p>\n\n\n\n<h4 class=\"wp-block-heading\">Case<\/h4>\n\n\n\n<p>Consider a customer\u2019s point query latency jitter issue, where delays occasionally exceeded 2 seconds. 
Using the slow query log, we located the operator execution metrics for one such problematic query:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>mysql&gt; explain analyze select * from t0 where col0 = 100 and col1 = 'A';\n+---------------------+---------+---------+-----------+---------------------------------------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+---------------+---------+------+\n| id                  | estRows | actRows | task      | access object                               | execution info                                                                                                                                                                                                                                                                                                                                                                  | operator info | memory  | disk |\n+---------------------+---------+---------+-----------+---------------------------------------------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+--------------------------------------------------------------+---------------+---------+------+\n| Point_Get_1         | 1       | 1       | root      | table:t0, index:uniq_col0_col1              | time:2.52s, loops:2, ResolveLock:{num_rpc:1, total_time:2.52s}, Get:{num_rpc:3, total_time:2.2ms}, txnNotFound_backoff:{num:12, total_time:2.51s}, 
tikv_wall_time: 322.8us, scan_detail: {total_process_keys: 2, total_process_keys_size: 825, total_keys: 9, get_snapshot_time: 18us, rocksdb: {delete_skipped_count: 3, key_skipped_count: 14, block: {cache_hit_count: 16}}} | N\/A           | N\/A     | N\/A  |\n+---------------------+---------+---------+-----------+---------------------------------------------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+--------------------------------------------------------------+---------------+---------+------+<\/code><\/pre>\n\n\n\n<h4 class=\"wp-block-heading\">Analysis<\/h4>\n\n\n\n<p>First, we observe that the query involves only a&nbsp;<code>Point_Get_1&nbsp;<\/code>operator, with an execution time of&nbsp;<strong>2.52s<\/strong> recorded in&nbsp;<strong>execution info<\/strong>.<strong>&nbsp;<\/strong>This indicates that the entire execution duration is captured accurately.<\/p>\n\n\n\n<p>Upon closer inspection of the&nbsp;<strong>execution info<\/strong>, we note the presence of a&nbsp;<strong>ResolveLock&nbsp;<\/strong>entry. Details reveal that this operation consumed&nbsp;<strong>2.52s<\/strong>&nbsp;in total, meaning nearly all query time was spent resolving locks. In contrast, the actual&nbsp;<strong>Get&nbsp;<\/strong>operation took only&nbsp;<strong>2.2ms<\/strong>, confirming that data access was negligible.<\/p>\n\n\n\n<p>Additionally, a&nbsp;<strong>txnNotFound_backoff&nbsp;<\/strong>entry highlights retries triggered by stale transactions. 
Specifically, <strong>12 retries<\/strong> occurred, cumulatively lasting&nbsp;<strong>2.51s<\/strong>&nbsp;(aligning closely with the&nbsp;<code>ResolveLock<\/code>&nbsp;duration of <strong>2.52s<\/strong>).<\/p>\n\n\n\n<p>This leads us to hypothesize that the point query encountered locks left behind by stale transactions. During the <strong>ResolveLock<\/strong> phase, the system detected expired locks and initiated a cleanup process. This lock resolution overhead caused the high query latency.<\/p>\n\n\n\n<p>To validate this hypothesis, we can cross-reference monitoring data, as shown in the image below:<\/p>\n\n\n\n<figure class=\"wp-block-image size-full\"><img loading=\"lazy\" decoding=\"async\" width=\"830\" height=\"642\" src=\"https:\/\/static.pingcap.com\/files\/2025\/07\/25145715\/image-7.png\" alt=\"Figuring out a hypothesis for TiDB observability.\" class=\"wp-image-28541\" srcset=\"https:\/\/static.pingcap.com\/files\/2025\/07\/25145715\/image-7.png 830w, https:\/\/static.pingcap.com\/files\/2025\/07\/25145715\/image-7-300x232.png 300w, https:\/\/static.pingcap.com\/files\/2025\/07\/25145715\/image-7-768x594.png 768w\" sizes=\"auto, (max-width: 830px) 100vw, 830px\" \/><\/figure>\n\n\n\n<h3 class=\"wp-block-heading\">Operator Concurrency in TiDB<\/h3>\n\n\n\n<p>In TiDB, we can adjust the execution concurrency of operators through system variables to tune SQL performance. Operator concurrency significantly impacts execution efficiency. For example, with the same number of coprocessor tasks, increasing concurrency from 5 to 10 may nearly double performance, though at the cost of higher resource utilization. Execution information helps us understand the actual concurrency of operators, laying the groundwork for deeper performance diagnostics. 
Let\u2019s examine a real-world case.<\/p>\n\n\n\n<h4 class=\"wp-block-heading\">Case<\/h4>\n\n\n\n<p>The system was configured with:&nbsp;<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li><code>tidb_executor_concurrency<\/code>&nbsp;= 5&nbsp;<\/li>\n\n\n\n<li><code>tidb_distsql_scan_concurrency<\/code>&nbsp;= 15&nbsp;<\/li>\n<\/ul>\n\n\n\n<p>What are the actual execution concurrencies for the&nbsp;<code>cop_task<\/code>&nbsp;and&nbsp;<code>tikv_task<\/code>&nbsp;in the following execution plan?&nbsp;<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>mysql&gt; explain analyze select * from t0 where c like '2147%';\n+-------------------------------+---------+---------+-----------+-------------------------------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+-----------------------------------------+---------+------+\n| id                            | estRows | actRows | task      | access object                 | execution info                                                                                                                                                                                                                                                                                                                                                                                                                                                        | operator info                           | memory  | disk 
|\n+-------------------------------+---------+---------+-----------+-------------------------------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+-----------------------------------------+---------+------+\n| IndexLookUp_10                | 3.82    | 99      | root      |                               | time:2.83ms, loops:2, index_task: {total_time: 733.8\u00b5s, fetch_handle: 727.9\u00b5s, build: 806ns, wait: 5.14\u00b5s}, table_task: {total_time: 1.96ms, num: 1, concurrency: 5}, next: {wait_index: 821\u00b5s, wait_table_lookup_build: 108.2\u00b5s, wait_table_lookup_resp: 1.85ms}                                                                                                                                                                                                     |                                         | 41.0 KB | N\/A  |\n| \u251c\u2500IndexRangeScan_8(Build)     | 3.82    | 99      | cop&#91;tikv] | table:t0, index:idx_c(c)      | time:719.1\u00b5s, loops:3, cop_task: {num: 1, max: 650\u00b5s, proc_keys: 99, rpc_num: 1, rpc_time: 625.7\u00b5s, copr_cache_hit_ratio: 0.00, distsql_concurrency: 15}, tikv_task:{time:0s, loops:3}, scan_detail: {total_process_keys: 99, total_process_keys_size: 18810, total_keys: 100, get_snapshot_time: 102\u00b5s, rocksdb: {key_skipped_count: 99, block: {cache_hit_count: 3}}}                                                                                               | range:&#91;\"2147\",\"2148\"), keep order:false | N\/A     | N\/A  |\n| \u2514\u2500TableRowIDScan_9(Probe)     | 3.82    | 99      | cop&#91;tikv] | 
table:t0                      | time:1.83ms, loops:2, cop_task: {num: 4, max: 736.9\u00b5s, min: 532.6\u00b5s, avg: 599\u00b5s, p95: 736.9\u00b5s, max_proc_keys: 44, p95_proc_keys: 44, rpc_num: 4, rpc_time: 2.32ms, copr_cache_hit_ratio: 0.00, distsql_concurrency: 15}, tikv_task:{proc max:0s, min:0s, avg: 0s, p80:0s, p95:0s, iters:5, tasks:4}, scan_detail: {total_process_keys: 99, total_process_keys_size: 22176, total_keys: 99, get_snapshot_time: 217.3\u00b5s, rocksdb: {block: {cache_hit_count: 201}}}      | keep order:false                        | N\/A     | N\/A  |\n+-------------------------------+---------+---------+-----------+-------------------------------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+-----------------------------------------+---------+------+\n3 rows in set (0.00 sec)<\/code><\/pre>\n\n\n\n<h4 class=\"wp-block-heading\">Analysis<\/h4>\n\n\n\n<p>We just used this example to explain the relationship between cop_task and tikv_task items in the execution information and the actual execution concurrency of the cop task.<\/p>\n\n\n\n<h5 class=\"wp-block-heading\">Cop_task vs tikv_task<\/h5>\n\n\n\n<p>First, it\u2019s critical to clarify that the&nbsp;<strong>xxx_task<\/strong>&nbsp;entries in&nbsp;<strong>execution info<\/strong>&nbsp;are not equivalent to the &#8220;<strong>task<\/strong>&#8221; column in execution plans.<\/p>\n\n\n\n<p>For example, in the execution plan, the category of the&nbsp;<strong>task&nbsp;<\/strong>column is&nbsp;<strong>&#8220;root&#8221;&nbsp;<\/strong>,&nbsp;<strong>&#8220;cop [tikv]&#8221;&nbsp;<\/strong>, etc. 
It describes which component the operator runs in (such as TiDB, TiKV, or TiFlash) and the protocol used to communicate with the storage engine (such as Coprocessor, Batch Coprocessor, or MPP).<\/p>\n\n\n\n<p>In contrast, the various&nbsp;<strong>tasks<\/strong>&nbsp;in&nbsp;<strong>execution info<\/strong>&nbsp;break down operator execution information along different dimensions. This lets users quickly locate potential performance issues and cross-check the findings from one dimension against another. Specifically:<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>tikv_task<\/strong> describes the overall execution of a specific TiKV operator;<\/li>\n\n\n\n<li><strong>cop_task<\/strong> describes the execution of the entire RPC task, which includes&nbsp;<strong>tikv_task<\/strong>. For example, a&nbsp;<strong>cop_task<\/strong>&nbsp;may contain two operators,&nbsp;<strong>tableScan + Selection<\/strong>. Each operator has its own&nbsp;<strong>tikv_task<\/strong>&nbsp;information to describe its execution. 
<strong>cop_task&nbsp;<\/strong>describes the execution information of the entire RPC request, which covers the execution time of these two operators.<\/li>\n<\/ul>\n\n\n\n<p>Similarly, in an MPP query, the&nbsp;<strong>tiflash_task&nbsp;<\/strong>statistic in&nbsp;<strong>execution info&nbsp;<\/strong>describes the overall execution of a particular TiFlash operator:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>+------------------------------+-------------+----------+--------------+----------------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+----------------------------------------------------+-----------+---------+\n| id                           | estRows     | actRows  | task         | access object  | execution info                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                          | operator info                                      | memory    | disk    
|\n+------------------------------+-------------+----------+--------------+----------------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+----------------------------------------------------+-----------+---------+\n| HashAgg_22                   | 1.00        | 1        | root         |                | time:17ms, open:1.92ms, close:4.83\u00b5s, loops:2, RU:1832.08, partial_worker:{wall_time:15.055084ms, concurrency:5, task_num:1, tot_wait:15.017625ms, tot_exec:12.333\u00b5s, tot_time:75.203959ms, max:15.042667ms, p95:15.042667ms}, final_worker:{wall_time:15.079958ms, concurrency:5, task_num:5, tot_wait:1.414\u00b5s, tot_exec:41ns, tot_time:75.277708ms, max:15.060375ms, p95:15.060375ms}                                                                                                                                                                                                                                                 | funcs:count(Column#19)-&gt;Column#17                  | 6.23 KB   | 0 Bytes |\n| \u2514\u2500TableReader_24             | 1.00        | 1        | root         |                | time:16.9ms, open:1.9ms, close:3.46\u00b5s, loops:2, cop_task: {num: 2, max: 0s, min: 0s, avg: 0s, p95: 0s, copr_cache_hit_ratio: 0.00}                                                                                                                                                                                
                                                                                                                                                                                                                                                                                                                      | MppVersion: 3, data:ExchangeSender_23              | 673 Bytes | N\/A     |\n|   \u2514\u2500ExchangeSender_23        | 1.00        | 1        | mpp&#91;tiflash] |                | tiflash_task:{time:13.1ms, loops:1, threads:1}, tiflash_network: {inner_zone_send_bytes: 24}                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                            | ExchangeType: PassThrough                          | N\/A       | N\/A     |\n|     \u2514\u2500HashAgg_9              | 1.00        | 1        | mpp&#91;tiflash] |                | tiflash_task:{time:13.1ms, loops:1, threads:1}                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                          | funcs:count(test.lineitem.L_RETURNFLAG)-&gt;Column#19 | N\/A       | N\/A     |\n|       
\u2514\u2500TableFullScan_21     | 11997996.00 | 11997996 | mpp&#91;tiflash] | table:lineitem | tiflash_task:{time:12.8ms, loops:193, threads:12}, tiflash_scan:{mvcc_input_rows:0, mvcc_input_bytes:0, mvcc_output_rows:0, local_regions:10, remote_regions:0, tot_learner_read:0ms, region_balance:{instance_num: 1, max\/min: 10\/10=1.000000}, delta_rows:0, delta_bytes:0, segments:20, stale_read_regions:0, tot_build_snapshot:0ms, tot_build_bitmap:0ms, tot_build_inputstream:15ms, min_local_stream:10ms, max_local_stream:11ms, dtfile:{data_scanned_rows:11997996, data_skipped_rows:0, mvcc_scanned_rows:0, mvcc_skipped_rows:0, lm_filter_scanned_rows:0, lm_filter_skipped_rows:0, tot_rs_index_check:3ms, tot_read:53ms}} | keep order:false                                   | N\/A       | N\/A     |\n+------------------------------+-------------+----------+--------------+----------------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+----------------------------------------------------+-----------+---------+<\/code><\/pre>\n\n\n\n<h5 class=\"wp-block-heading\">Execution Concurrency of Cop Task<\/h5>\n\n\n\n<p>First, let&#8217;s examine the execution plan.&nbsp;<code>IndexLookUp_10&nbsp;<\/code>is a&nbsp;<strong>root operator&nbsp;<\/strong>. 
The <code>IndexLookUp<\/code>&nbsp;operator performs two main steps: it first obtains the&nbsp;<strong>row ID<\/strong>&nbsp;of each target row through the index, and then reads the required column data by row ID. In the&nbsp;<strong>execution info<\/strong>&nbsp;of&nbsp;<code>IndexLookUp_10<\/code>, the details of&nbsp;<code>index_task<\/code>&nbsp;and&nbsp;<code>table_task<\/code> are listed separately. Here, <code>index_task<\/code>&nbsp;corresponds to the&nbsp;<code>IndexRangeScan_8<\/code>&nbsp;operator, and&nbsp;<code>table_task<\/code>&nbsp;corresponds to the&nbsp;<code>TableRowIDScan_9<\/code>&nbsp;operator.<\/p>\n\n\n\n<p>In terms of concurrency,&nbsp;<code>index_task<\/code>&nbsp;does not display concurrency information, which means that the concurrency of the&nbsp;<code>IndexRangeScan_8<\/code>&nbsp;operator defaults to&nbsp;<strong>1<\/strong>. However, the <code>distsql_concurrency<\/code>&nbsp;in the&nbsp;<strong>cop_task<\/strong>&nbsp;of&nbsp;<code>IndexRangeScan_8<\/code>&nbsp;is&nbsp;<strong>15<\/strong>&nbsp;(determined by the&nbsp;<code>tidb_distsql_scan_concurrency<\/code> system variable), which means that in theory it can execute up to&nbsp;<strong>15 cop tasks<\/strong>&nbsp;concurrently to read data.<\/p>\n\n\n\n<p>For&nbsp;<code>table_task<\/code>, the concurrency is&nbsp;<strong>5<\/strong>&nbsp;(determined by the&nbsp;<code>tidb_executor_concurrency<\/code> system variable), which means that up to&nbsp;<strong>5&nbsp;<code>TableRowIDScan_9<\/code>&nbsp;operators<\/strong>&nbsp;can run simultaneously. The <code>distsql_concurrency<\/code>&nbsp;in the&nbsp;<strong>cop_task<\/strong>&nbsp;of&nbsp;<code>TableRowIDScan_9<\/code>&nbsp;is also&nbsp;<strong>15<\/strong>&nbsp;(determined by <code>tidb_distsql_scan_concurrency<\/code>). 
Therefore, the maximum concurrent read capacity of <code>table_task<\/code> is&nbsp;<strong>5 \u00d7 15 = 75 cop tasks<\/strong>.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\">Max =&gt; Min, 100ms =&gt; 8s<\/h3>\n\n\n\n<p>A SQL query calculating the max value of an indexed column took approximately 100 milliseconds. However, when modified to calculate the min value, execution time soared to 8+ seconds. Below is the `EXPLAIN ANALYZE` output for both scenarios:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code> mysql&gt; explain analyze select max(time_a) from t0 limit 1;\n+--------------------------------+---------+---------+-----------+----------------------------------------------------+-------------------------------------------------------------------------------------------------------------------------------------------+----------------------------------------------------------+-----------+------+\n| id                             | estRows | actRows | task      | access object                                      | execution info                                                                                                                            | operator info                                            | memory    | disk |\n+--------------------------------+---------+---------+-----------+----------------------------------------------------+-------------------------------------------------------------------------------------------------------------------------------------------+----------------------------------------------------------+-----------+------+\n| Limit_14                       | 1.00.   
| 1       | root      |                                                    | time:2.328901ms, loops:2                                                                                                                  | offset:0, count:1                                        | N\/A       | N\/A  |                       \n| \u2514\u2500StreamAgg_19                 | 1.00    | 1       | root      |                                                    | time:2.328897ms, loops:1                                                                                                                  | funcs:max(t0.time_a)-&gt;Column#18                          | 128 Bytes | N\/A  |\n|   \u2514\u2500Limit_39                   | 1.00    | 1       | root      |                                                    | time:2.324137ms, loops:2                                                                                                                  | offset:0, count:1                                        | N\/A       | N\/A  |\n|     \u2514\u2500IndexReader_45           | 1.00    | 1       | root      |                                                    | time:2.322215ms, loops:1, cop_task: {num: 1, max:2.231389ms, proc_keys: 32, rpc_num: 1, rpc_time: 2.221023ms, copr_cache_hit_ratio: 0.00} | index:Limit_26                                           | 461 Bytes | N\/A  |\n|       \u2514\u2500Limit_44               | 1.00    | 1       | cop&#91;tikv] |                                                    | time:0ns, loops:0, tikv_task:{time:2ms, loops:1}                                                                                          | offset:0, count:1                                        | N\/A       | N\/A  |\n|         \u2514\u2500IndexFullScan_31     | 1.00    | 32      | cop&#91;tikv] | table:t0, index:time_a(time_a)                     | time:0ns, loops:0, tikv_task:{time:2ms, loops:1}                                                                                        
  | keep order:true, desc                                    | N\/A       | N\/A  |\n+------------------------------+---------+---------+-----------+----------------------------------------------------+-------------------------------------------------------------------------------------------------------------------------------------------+----------------------------------------------------------+-----------+------+\n6 rows in set (0.12 sec)\n\nmysql&gt; explain analyze select min(time_a) from t0 limit 1;\n+--------------------------------+---------+---------+-----------+----------------------------------------------------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+----------------------------------------------------------+-----------------+------+\n| id                             | estRows | actRows | task      | access object                                      | execution info                                                                                                                                                                                                                                                                     | operator info                                            | memory          | disk |\n+--------------------------------+---------+---------+-----------+----------------------------------------------------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+----------------------------------------------------------+-----------------+------+\n| Limit_14                       | 1.00  
  | 1       | root      |                                                    | time:8.263857153s, loops:2                                                                                                                                                                                                                                                         | offset:0, count:1                                        | N\/A             | N\/A  |\n| \u2514\u2500StreamAgg_19                 | 1.00    | 1       | root      |                                                    | time:8.26385598s, loops:1                                                                                                                                                                                                                                                          | funcs:min(t0.time_a)-&gt;Column#18                          | 128 Bytes       | N\/A  |\n|   \u2514\u2500Limit_39                   | 1.00    | 1       | root      |                                                    | time:8.263848289s, loops:2                                                                                                                                                                                                                                                         | offset:0, count:1                                        | N\/A             | N\/A  |\n|     \u2514\u2500IndexReader_45           | 1.00    | 1       | root      |                                                    | time:8.26384652s, loops:1, cop_task: {num: 175, max: 1.955114915s, min: 737.989\u03bcs, avg: 603.631575ms, p95: 1.161411687s, max_proc_keys: 480000, p95_proc_keys: 480000, tot_proc: 1m44.809s, tot_wait: 361ms, rpc_num: 175, rpc_time: 1m45.632904647s, copr_cache_hit_ratio: 0.00}  | index:Limit_44                                           | 6.6025390625 KB | N\/A  |\n|       \u2514\u2500Limit_44               | 1.00    | 1       | 
cop&#91;tikv] |                                                    | time:0ns, loops:0, tikv_task:{proc max:1.955s, min:0s, p80:784ms, p95:1.118s, iters:175, tasks:175}                                                                                                                                                                                | offset:0, count:1                                        | N\/A             | N\/A  |\n|         \u2514\u2500IndexFullScan_31     | 1.00    | 32      | cop&#91;tikv] | table:t0, index:time_a(time_a)                     | time:0ns, loops:0, tikv_task:{proc max:1.955s, min:0s, p80:784ms, p95:1.118s, iters:175, tasks:175}                                                                                                                                                                                | keep order:true                                          | N\/A             | N\/A  |\n+--------------------------------+---------+---------+-----------+----------------------------------------------------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+----------------------------------------------------------+-----------------+------+\n6 rows in set (8.38 sec)<\/code><\/pre>\n\n\n\n<h4 class=\"wp-block-heading\">Analysis<\/h4>\n\n\n\n<p>When using operator execution information for performance diagnosis, we generally first look at the execution time of each operator itself (excluding the time spent waiting for operator data) from top to bottom, and then look for the operator that has the greatest impact on the overall query performance. 
The operator execution time is calculated as follows.<\/p>\n\n\n\n<h5 class=\"wp-block-heading\">Operator Execution Time of a Single Operator<\/h5>\n\n\n\n<p>The calculation depends on the operator&#8217;s task type:<\/p>\n\n\n\n<p>1.&nbsp;<strong>Operators of type &#8220;root&#8221;<\/strong><\/p>\n\n\n\n<p>Subtract the execution time of the operator\u2019s sub-operators from its total execution time to obtain the operator\u2019s own processing time. For example, in the plan above, <code>StreamAgg_19<\/code> took 8.26385598s in total and its sub-operator <code>Limit_39<\/code> took 8.263848289s, so the aggregation itself accounts for only about 7.7\u03bcs.<\/p>\n\n\n\n<p>2.&nbsp;<strong>Operators of type &#8220;cop[tikv]&#8221;<\/strong><\/p>\n\n\n\n<p>The execution information contains <code>tikv_task<\/code> statistics. The execution time, including the time spent waiting for data from sub-operators, can be estimated with the following formula: <code>Estimated execution time = avg \u00d7 tasks\/concurrency<\/code>. Then, subtract the execution time of the sub-operator from this estimate to obtain the operator&#8217;s own processing time.<\/p>\n\n\n\n<p>3.&nbsp;<strong>Operators of type &#8220;mpp[tiflash]&#8221;, &#8220;cop[tiflash]&#8221;, or &#8220;batchcop[tiflash]&#8221;<\/strong><\/p>\n\n\n\n<p>The execution information contains&nbsp;<code>tiflash_task<\/code>&nbsp;statistics. Usually, the&nbsp;<code>proc max<\/code>&nbsp;of the sub-operator can be subtracted from the operator&#8217;s own&nbsp;<code>proc max<\/code>&nbsp;to get the processing time of the operator. 
This is because all TiFlash tasks for the same query typically start executing at the same time.<\/p>\n\n\n\n<p><strong>Note<\/strong>: The execution time of the&nbsp;<code>ExchangeSender<\/code>&nbsp;operator includes the time it takes for the upper&nbsp;<code>ExchangeReceiver<\/code>&nbsp;operator to receive the data, so it is often longer than the time the upper&nbsp;<code>ExchangeReceiver<\/code>&nbsp;spends reading data from memory.<\/p>\n\n\n\n<h5 class=\"wp-block-heading\">Operator Execution Time of Composite Operators<\/h5>\n\n\n\n<p>For composite operators with multiple sub-operators, the&nbsp;<strong>execution info<\/strong>&nbsp;column usually lists the execution information of each sub-operator in detail. For example, the execution information of the <code>IndexLookUp<\/code>&nbsp;operator explicitly includes the execution details of&nbsp;<code>index_task<\/code>&nbsp;and&nbsp;<code>table_task<\/code>. Analyzing these details shows which sub-operator has the greater impact on overall performance, so that optimization can be targeted.<\/p>\n\n\n\n<p>In this example,&nbsp;<code>IndexReader_45<\/code>&nbsp;is the operator with the greatest impact on performance. Comparing its execution information across the two queries reveals a significant difference in the number of&nbsp;<code>cop tasks<\/code>: in the &#8220;max&#8221; scenario, there is only&nbsp;<strong>one<\/strong>&nbsp;<code>cop task<\/code>, while in the &#8220;min&#8221; scenario, there are&nbsp;<strong>175<\/strong>&nbsp;<code>cop tasks<\/code>. 
At the same time, the number of processed keys (<code>max_proc_keys<\/code>) has also increased from&nbsp;<strong>32<\/strong>&nbsp;to&nbsp;<strong>480,000<\/strong>.<\/p>\n\n\n\n<p>Judging from the &#8220;operator info&#8221; column, the index is read in descending order in the &#8220;max&#8221; scenario (<code>keep order:true, desc<\/code>), that is, from the largest key to the smallest, while in the &#8220;min&#8221; scenario it is read in the default ascending order (<code>keep order:true<\/code>). The optimizer chooses the read order of the index according to the aggregate function: descending for <code>max<\/code> and ascending for <code>min<\/code>. The intent of this strategy is to find the first row that meets the condition as early as possible.<\/p>\n\n\n\n<p>In the &#8220;min&#8221; scenario, however, a large number of keys near the smallest key have been deleted but not yet garbage-collected. The scan therefore has to read through a large amount of useless data before it finds the first valid row, which leads to significant performance overhead.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\"><span class=\"ez-toc-section\" id=\"TiDB_Observability_Future_Outlook\"><\/span>TiDB Observability: Future Outlook<span class=\"ez-toc-section-end\"><\/span><\/h2>\n\n\n\n<p>In TiDB 9.0 Beta 1, we will further enhance operator execution information and improve system observability. Key improvements include:<\/p>\n\n\n\n<ol start=\"1\" class=\"wp-block-list\">\n<li><strong>More Accurate Performance Attribution<\/strong>: This improvement will enhance the accuracy of operator timing by including both initialization (open) and finalization (close) times. Previously, execution metrics could underreport actual runtimes, making it harder to pinpoint performance bottlenecks. 
Now, the reported time reflects the full wall-clock duration (<code>time<\/code>) from start to finish, giving DBAs a more accurate picture of where time is spent during query execution.&nbsp;<br><strong>Why it matters<\/strong>:<br>You can now identify slow operators and distinguish slow execution from slow setup\/teardown more reliably, reducing false assumptions during tuning.<\/li>\n\n\n\n<li><strong>Clearer Time Accounting for Concurrent Execution<\/strong>: Previously, when an operator had multiple concurrent executions, the cumulative wall time could be misleading. In some cases, sub-operators could appear slower than their parent, leading to confusion. This improvement introduces <code>total_time<\/code>, <code>total_open<\/code>, and <code>total_close<\/code>, which separately report accumulated time across concurrent tasks, offering a clearer breakdown of time spent.<br><strong>Why it matters<\/strong>:<br>You can now distinguish real bottlenecks from artifacts of concurrent execution, making root cause analysis easier when investigating performance issues.<\/li>\n\n\n\n<li><strong>Better Diagnosis of TiFlash-Specific Delays<\/strong>: TiFlash execution info now includes new latency metrics like <code>minTSO_wait<\/code>, <code>pipeline_breaker_wait<\/code>, and <code>pipeline_queue_wait<\/code>. 
These metrics reveal hidden wait times during MPP task scheduling and pipeline execution.<br><strong>Why it matters<\/strong>:<br>You can now separate time spent waiting on MPP task scheduling and pipeline execution from actual processing time, making root cause analysis easier when investigating TiFlash performance issues.<\/li>\n<\/ol>\n\n\n\n<p>With these improvements, TiDB will provide users with more comprehensive and accurate execution information, helping to better diagnose and optimize query performance.<\/p>\n\n\n\n<p>If you have any questions about TiDB observability, please feel free to connect with us on&nbsp;<a href=\"https:\/\/twitter.com\/PingCAP\" target=\"_blank\" rel=\"noreferrer noopener\">Twitter<\/a>,&nbsp;<a href=\"https:\/\/www.linkedin.com\/company\/pingcap\/mycompany\/\" target=\"_blank\" rel=\"noreferrer noopener\">LinkedIn<\/a>, or through our&nbsp;<a href=\"https:\/\/slack.tidb.io\/invite?team=tidb-community&amp;channel=everyone&amp;ref=pingcap&amp;__hstc=86493575.a1fea8c8486aaa74956704cbebabab43.1749809256999.1753717060814.1753721870423.140&amp;__hssc=86493575.6.1753721870423&amp;__hsfp=3863828579\" target=\"_blank\" rel=\"noreferrer noopener\">Slack Channel<\/a>.&nbsp;<\/p>\n","protected":false},"excerpt":{"rendered":"<p>Have you ever seen two nearly identical SQL statements, differing only in date parameters or function variations, return similar results but with wildly different performance, sometimes by factors of 10 or 100? In real-world scenarios, we typically run the `EXPLAIN` statement to examine changes in the execution plan. 
But what if the execution plan doesn&#8217;t [&hellip;]<\/p>\n","protected":false},"author":313,"featured_media":28551,"comment_status":"closed","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"ub_ctt_via":"","footnotes":""},"categories":[6],"tags":[147,422,11,111,29],"class_list":["post-28508","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-engineering","tag-distributed-sql","tag-observability","tag-real-time-analytics","tag-tidb","tag-tutorial"],"acf":[],"featured_image_src":"https:\/\/static.pingcap.com\/files\/2025\/07\/28095103\/tidb_feature_1800x600-1-13.png","author_info":{"display_name":"Barry Hu","author_link":"https:\/\/www.pingcap.com\/ko\/blog\/author\/bhu\/"},"yoast_head":"<!-- This site is optimized with the Yoast SEO plugin v26.9 - https:\/\/yoast.com\/product\/yoast-seo-wordpress\/ -->\n<title>TiDB Observability: A Journey Through Real-World Case Studies<\/title>\n<meta name=\"description\" content=\"Explore how TiDB users analyze and optimize SQL performance using observability data and real-world case studies.\" \/>\n<meta name=\"robots\" content=\"index, follow, max-snippet:-1, max-image-preview:large, max-video-preview:-1\" \/>\n<link rel=\"canonical\" href=\"https:\/\/www.pingcap.com\/ko\/blog\/exploring-tidb-observability-real-world-case-studies\/\" \/>\n<meta property=\"og:locale\" content=\"ko_KR\" \/>\n<meta property=\"og:type\" content=\"article\" \/>\n<meta property=\"og:title\" content=\"TiDB Observability: A Journey Through Real-World Case Studies\" \/>\n<meta property=\"og:description\" content=\"Explore how TiDB users analyze and optimize SQL performance using observability data and real-world case studies.\" \/>\n<meta property=\"og:url\" content=\"https:\/\/www.pingcap.com\/ko\/blog\/exploring-tidb-observability-real-world-case-studies\/\" \/>\n<meta property=\"og:site_name\" content=\"TiDB\" \/>\n<meta property=\"article:publisher\" 
content=\"https:\/\/facebook.com\/pingcap2015\" \/>\n<meta property=\"article:published_time\" content=\"2025-07-25T17:49:19+00:00\" \/>\n<meta property=\"article:modified_time\" content=\"2025-08-01T09:41:26+00:00\" \/>\n<meta property=\"og:image\" content=\"https:\/\/static.pingcap.com\/files\/2025\/07\/28095124\/tidb_1200x627-2-4.png\" \/>\n\t<meta property=\"og:image:width\" content=\"2400\" \/>\n\t<meta property=\"og:image:height\" content=\"1254\" \/>\n\t<meta property=\"og:image:type\" content=\"image\/png\" \/>\n<meta name=\"author\" content=\"Barry Hu\" \/>\n<meta name=\"twitter:card\" content=\"summary_large_image\" \/>\n<meta name=\"twitter:image\" content=\"https:\/\/static.pingcap.com\/files\/2025\/07\/28095138\/tidb_twitter_1600x900-1-12.png\" \/>\n<meta name=\"twitter:creator\" content=\"@PingCAP\" \/>\n<meta name=\"twitter:site\" content=\"@PingCAP\" \/>\n<meta name=\"twitter:label1\" content=\"Written by\" \/>\n\t<meta name=\"twitter:data1\" content=\"Barry Hu\" \/>\n\t<meta name=\"twitter:label2\" content=\"Est. 
reading time\" \/>\n\t<meta name=\"twitter:data2\" content=\"16\ubd84\" \/>\n<script type=\"application\/ld+json\" class=\"yoast-schema-graph\">{\"@context\":\"https:\/\/schema.org\",\"@graph\":[{\"@type\":\"Article\",\"@id\":\"https:\/\/www.pingcap.com\/blog\/exploring-tidb-observability-real-world-case-studies\/#article\",\"isPartOf\":{\"@id\":\"https:\/\/www.pingcap.com\/blog\/exploring-tidb-observability-real-world-case-studies\/\"},\"author\":{\"name\":\"Barry Hu\",\"@id\":\"https:\/\/www.pingcap.com\/#\/schema\/person\/742a8051e7e1806b4477f9f38843320d\"},\"headline\":\"Exploring TiDB Observability: A Journey Through Real-World Case Studies\",\"datePublished\":\"2025-07-25T17:49:19+00:00\",\"dateModified\":\"2025-08-01T09:41:26+00:00\",\"mainEntityOfPage\":{\"@id\":\"https:\/\/www.pingcap.com\/blog\/exploring-tidb-observability-real-world-case-studies\/\"},\"wordCount\":2087,\"publisher\":{\"@id\":\"https:\/\/www.pingcap.com\/#organization\"},\"image\":{\"@id\":\"https:\/\/www.pingcap.com\/blog\/exploring-tidb-observability-real-world-case-studies\/#primaryimage\"},\"thumbnailUrl\":\"https:\/\/static.pingcap.com\/files\/2025\/07\/28095103\/tidb_feature_1800x600-1-13.png\",\"keywords\":[\"Distributed SQL\",\"Observability\",\"Real-time analytics\",\"TiDB\",\"Tutorial\"],\"articleSection\":[\"Engineering\"],\"inLanguage\":\"ko-KR\"},{\"@type\":\"WebPage\",\"@id\":\"https:\/\/www.pingcap.com\/blog\/exploring-tidb-observability-real-world-case-studies\/\",\"url\":\"https:\/\/www.pingcap.com\/blog\/exploring-tidb-observability-real-world-case-studies\/\",\"name\":\"TiDB Observability: A Journey Through Real-World Case 
Studies\",\"isPartOf\":{\"@id\":\"https:\/\/www.pingcap.com\/#website\"},\"primaryImageOfPage\":{\"@id\":\"https:\/\/www.pingcap.com\/blog\/exploring-tidb-observability-real-world-case-studies\/#primaryimage\"},\"image\":{\"@id\":\"https:\/\/www.pingcap.com\/blog\/exploring-tidb-observability-real-world-case-studies\/#primaryimage\"},\"thumbnailUrl\":\"https:\/\/static.pingcap.com\/files\/2025\/07\/28095103\/tidb_feature_1800x600-1-13.png\",\"datePublished\":\"2025-07-25T17:49:19+00:00\",\"dateModified\":\"2025-08-01T09:41:26+00:00\",\"description\":\"Explore how TiDB users analyze and optimize SQL performance using observability data and real-world case studies.\",\"breadcrumb\":{\"@id\":\"https:\/\/www.pingcap.com\/blog\/exploring-tidb-observability-real-world-case-studies\/#breadcrumb\"},\"inLanguage\":\"ko-KR\",\"potentialAction\":[{\"@type\":\"ReadAction\",\"target\":[\"https:\/\/www.pingcap.com\/blog\/exploring-tidb-observability-real-world-case-studies\/\"]}]},{\"@type\":\"ImageObject\",\"inLanguage\":\"ko-KR\",\"@id\":\"https:\/\/www.pingcap.com\/blog\/exploring-tidb-observability-real-world-case-studies\/#primaryimage\",\"url\":\"https:\/\/static.pingcap.com\/files\/2025\/07\/28095103\/tidb_feature_1800x600-1-13.png\",\"contentUrl\":\"https:\/\/static.pingcap.com\/files\/2025\/07\/28095103\/tidb_feature_1800x600-1-13.png\",\"width\":3600,\"height\":1200},{\"@type\":\"BreadcrumbList\",\"@id\":\"https:\/\/www.pingcap.com\/blog\/exploring-tidb-observability-real-world-case-studies\/#breadcrumb\",\"itemListElement\":[{\"@type\":\"ListItem\",\"position\":1,\"name\":\"Home\",\"item\":\"https:\/\/www.pingcap.com\/\"},{\"@type\":\"ListItem\",\"position\":2,\"name\":\"Exploring TiDB Observability: A Journey Through Real-World Case Studies\"}]},{\"@type\":\"WebSite\",\"@id\":\"https:\/\/www.pingcap.com\/#website\",\"url\":\"https:\/\/www.pingcap.com\/\",\"name\":\"TiDB\",\"description\":\"TiDB | SQL at 
Scale\",\"publisher\":{\"@id\":\"https:\/\/www.pingcap.com\/#organization\"},\"potentialAction\":[{\"@type\":\"SearchAction\",\"target\":{\"@type\":\"EntryPoint\",\"urlTemplate\":\"https:\/\/www.pingcap.com\/?s={search_term_string}\"},\"query-input\":{\"@type\":\"PropertyValueSpecification\",\"valueRequired\":true,\"valueName\":\"search_term_string\"}}],\"inLanguage\":\"ko-KR\"},{\"@type\":\"Organization\",\"@id\":\"https:\/\/www.pingcap.com\/#organization\",\"name\":\"PingCAP\",\"url\":\"https:\/\/www.pingcap.com\/\",\"logo\":{\"@type\":\"ImageObject\",\"inLanguage\":\"ko-KR\",\"@id\":\"https:\/\/www.pingcap.com\/#\/schema\/logo\/image\/\",\"url\":\"https:\/\/static.pingcap.com\/files\/2021\/11\/pingcap-logo.png\",\"contentUrl\":\"https:\/\/static.pingcap.com\/files\/2021\/11\/pingcap-logo.png\",\"width\":811,\"height\":232,\"caption\":\"PingCAP\"},\"image\":{\"@id\":\"https:\/\/www.pingcap.com\/#\/schema\/logo\/image\/\"},\"sameAs\":[\"https:\/\/facebook.com\/pingcap2015\",\"https:\/\/x.com\/PingCAP\",\"https:\/\/linkedin.com\/company\/pingcap\",\"https:\/\/youtube.com\/channel\/UCuq4puT32DzHKT5rU1IZpIA\"]},{\"@type\":\"Person\",\"@id\":\"https:\/\/www.pingcap.com\/#\/schema\/person\/742a8051e7e1806b4477f9f38843320d\",\"name\":\"Barry Hu\",\"image\":{\"@type\":\"ImageObject\",\"inLanguage\":\"ko-KR\",\"@id\":\"https:\/\/www.pingcap.com\/#\/schema\/person\/image\/\",\"url\":\"https:\/\/static.pingcap.com\/files\/2022\/10\/17234942\/avatar.jpg\",\"contentUrl\":\"https:\/\/static.pingcap.com\/files\/2022\/10\/17234942\/avatar.jpg\",\"caption\":\"Barry Hu\"},\"description\":\"Database Engineer\",\"url\":\"https:\/\/www.pingcap.com\/ko\/blog\/author\/bhu\/\"}]}<\/script>\n<!-- \/ Yoast SEO plugin. 
-->","yoast_head_json":{"title":"TiDB Observability: A Journey Through Real-World Case Studies","description":"Explore how TiDB users analyze and optimize SQL performance using observability data and real-world case studies.","robots":{"index":"index","follow":"follow","max-snippet":"max-snippet:-1","max-image-preview":"max-image-preview:large","max-video-preview":"max-video-preview:-1"},"canonical":"https:\/\/www.pingcap.com\/ko\/blog\/exploring-tidb-observability-real-world-case-studies\/","og_locale":"ko_KR","og_type":"article","og_title":"TiDB Observability: A Journey Through Real-World Case Studies","og_description":"Explore how TiDB users analyze and optimize SQL performance using observability data and real-world case studies.","og_url":"https:\/\/www.pingcap.com\/ko\/blog\/exploring-tidb-observability-real-world-case-studies\/","og_site_name":"TiDB","article_publisher":"https:\/\/facebook.com\/pingcap2015","article_published_time":"2025-07-25T17:49:19+00:00","article_modified_time":"2025-08-01T09:41:26+00:00","og_image":[{"width":2400,"height":1254,"url":"https:\/\/static.pingcap.com\/files\/2025\/07\/28095124\/tidb_1200x627-2-4.png","type":"image\/png"}],"author":"Barry Hu","twitter_card":"summary_large_image","twitter_image":"https:\/\/static.pingcap.com\/files\/2025\/07\/28095138\/tidb_twitter_1600x900-1-12.png","twitter_creator":"@PingCAP","twitter_site":"@PingCAP","twitter_misc":{"Written by":"Barry Hu","Est. 
reading time":"16\ubd84"},"schema":{"@context":"https:\/\/schema.org","@graph":[{"@type":"Article","@id":"https:\/\/www.pingcap.com\/blog\/exploring-tidb-observability-real-world-case-studies\/#article","isPartOf":{"@id":"https:\/\/www.pingcap.com\/blog\/exploring-tidb-observability-real-world-case-studies\/"},"author":{"name":"Barry Hu","@id":"https:\/\/www.pingcap.com\/#\/schema\/person\/742a8051e7e1806b4477f9f38843320d"},"headline":"Exploring TiDB Observability: A Journey Through Real-World Case Studies","datePublished":"2025-07-25T17:49:19+00:00","dateModified":"2025-08-01T09:41:26+00:00","mainEntityOfPage":{"@id":"https:\/\/www.pingcap.com\/blog\/exploring-tidb-observability-real-world-case-studies\/"},"wordCount":2087,"publisher":{"@id":"https:\/\/www.pingcap.com\/#organization"},"image":{"@id":"https:\/\/www.pingcap.com\/blog\/exploring-tidb-observability-real-world-case-studies\/#primaryimage"},"thumbnailUrl":"https:\/\/static.pingcap.com\/files\/2025\/07\/28095103\/tidb_feature_1800x600-1-13.png","keywords":["Distributed SQL","Observability","Real-time analytics","TiDB","Tutorial"],"articleSection":["Engineering"],"inLanguage":"ko-KR"},{"@type":"WebPage","@id":"https:\/\/www.pingcap.com\/blog\/exploring-tidb-observability-real-world-case-studies\/","url":"https:\/\/www.pingcap.com\/blog\/exploring-tidb-observability-real-world-case-studies\/","name":"TiDB Observability: A Journey Through Real-World Case Studies","isPartOf":{"@id":"https:\/\/www.pingcap.com\/#website"},"primaryImageOfPage":{"@id":"https:\/\/www.pingcap.com\/blog\/exploring-tidb-observability-real-world-case-studies\/#primaryimage"},"image":{"@id":"https:\/\/www.pingcap.com\/blog\/exploring-tidb-observability-real-world-case-studies\/#primaryimage"},"thumbnailUrl":"https:\/\/static.pingcap.com\/files\/2025\/07\/28095103\/tidb_feature_1800x600-1-13.png","datePublished":"2025-07-25T17:49:19+00:00","dateModified":"2025-08-01T09:41:26+00:00","description":"Explore how TiDB users analyze and 
optimize SQL performance using observability data and real-world case studies.","breadcrumb":{"@id":"https:\/\/www.pingcap.com\/blog\/exploring-tidb-observability-real-world-case-studies\/#breadcrumb"},"inLanguage":"ko-KR","potentialAction":[{"@type":"ReadAction","target":["https:\/\/www.pingcap.com\/blog\/exploring-tidb-observability-real-world-case-studies\/"]}]},{"@type":"ImageObject","inLanguage":"ko-KR","@id":"https:\/\/www.pingcap.com\/blog\/exploring-tidb-observability-real-world-case-studies\/#primaryimage","url":"https:\/\/static.pingcap.com\/files\/2025\/07\/28095103\/tidb_feature_1800x600-1-13.png","contentUrl":"https:\/\/static.pingcap.com\/files\/2025\/07\/28095103\/tidb_feature_1800x600-1-13.png","width":3600,"height":1200},{"@type":"BreadcrumbList","@id":"https:\/\/www.pingcap.com\/blog\/exploring-tidb-observability-real-world-case-studies\/#breadcrumb","itemListElement":[{"@type":"ListItem","position":1,"name":"Home","item":"https:\/\/www.pingcap.com\/"},{"@type":"ListItem","position":2,"name":"Exploring TiDB Observability: A Journey Through Real-World Case Studies"}]},{"@type":"WebSite","@id":"https:\/\/www.pingcap.com\/#website","url":"https:\/\/www.pingcap.com\/","name":"\ud2f0DB","description":"TiDB | SQL at 
Scale","publisher":{"@id":"https:\/\/www.pingcap.com\/#organization"},"potentialAction":[{"@type":"SearchAction","target":{"@type":"EntryPoint","urlTemplate":"https:\/\/www.pingcap.com\/?s={search_term_string}"},"query-input":{"@type":"PropertyValueSpecification","valueRequired":true,"valueName":"search_term_string"}}],"inLanguage":"ko-KR"},{"@type":"Organization","@id":"https:\/\/www.pingcap.com\/#organization","name":"PingCAP","url":"https:\/\/www.pingcap.com\/","logo":{"@type":"ImageObject","inLanguage":"ko-KR","@id":"https:\/\/www.pingcap.com\/#\/schema\/logo\/image\/","url":"https:\/\/static.pingcap.com\/files\/2021\/11\/pingcap-logo.png","contentUrl":"https:\/\/static.pingcap.com\/files\/2021\/11\/pingcap-logo.png","width":811,"height":232,"caption":"PingCAP"},"image":{"@id":"https:\/\/www.pingcap.com\/#\/schema\/logo\/image\/"},"sameAs":["https:\/\/facebook.com\/pingcap2015","https:\/\/x.com\/PingCAP","https:\/\/linkedin.com\/company\/pingcap","https:\/\/youtube.com\/channel\/UCuq4puT32DzHKT5rU1IZpIA"]},{"@type":"Person","@id":"https:\/\/www.pingcap.com\/#\/schema\/person\/742a8051e7e1806b4477f9f38843320d","name":"Barry Hu","image":{"@type":"ImageObject","inLanguage":"ko-KR","@id":"https:\/\/www.pingcap.com\/#\/schema\/person\/image\/","url":"https:\/\/static.pingcap.com\/files\/2022\/10\/17234942\/avatar.jpg","contentUrl":"https:\/\/static.pingcap.com\/files\/2022\/10\/17234942\/avatar.jpg","caption":"Barry Hu"},"description":"Database Engineer","url":"https:\/\/www.pingcap.com\/ko\/blog\/author\/bhu\/"}]}},"grav_blocks":false,"card_markup":"<a class=\"card-resource bg-white\" href=\"https:\/\/www.pingcap.com\/ko\/blog\/exploring-tidb-observability-real-world-case-studies\/\"><div class=\"card-resource__image-container\"><img class=\"card-resource__image\" alt=\"tidb_feature_1800x600 (1)\" src=\"https:\/\/static.pingcap.com\/files\/2025\/07\/28095103\/tidb_feature_1800x600-1-13.png\" loading=\"lazy\" width=3600 height=1200 \/><\/div><div 
class=\"card-resource__content-container\"><div class=\"card-resource__content-head\"><div class=\"card-resource__category\">Engineering<\/div><\/div><h5 class=\"card-resource__title\">Exploring TiDB Observability: A Journey Through Real-World Case Studies<\/h5><\/div><\/a>","_links":{"self":[{"href":"https:\/\/www.pingcap.com\/ko\/wp-json\/wp\/v2\/posts\/28508","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.pingcap.com\/ko\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.pingcap.com\/ko\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.pingcap.com\/ko\/wp-json\/wp\/v2\/users\/313"}],"replies":[{"embeddable":true,"href":"https:\/\/www.pingcap.com\/ko\/wp-json\/wp\/v2\/comments?post=28508"}],"version-history":[{"count":45,"href":"https:\/\/www.pingcap.com\/ko\/wp-json\/wp\/v2\/posts\/28508\/revisions"}],"predecessor-version":[{"id":28662,"href":"https:\/\/www.pingcap.com\/ko\/wp-json\/wp\/v2\/posts\/28508\/revisions\/28662"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/www.pingcap.com\/ko\/wp-json\/wp\/v2\/media\/28551"}],"wp:attachment":[{"href":"https:\/\/www.pingcap.com\/ko\/wp-json\/wp\/v2\/media?parent=28508"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.pingcap.com\/ko\/wp-json\/wp\/v2\/categories?post=28508"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.pingcap.com\/ko\/wp-json\/wp\/v2\/tags?post=28508"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}