
In complex business environments, resource isolation is critical to ensure stability, performance, and efficient resource utilization. TiDB offers robust features for resource control, allowing you to achieve efficient resource isolation across multiple levels:
- Physical Isolation: Through physical topology and data placement strategies.
- Soft Isolation (Flow-based Isolation): Through dynamic flow control and resource management.
This blog focuses on best practices for flow-based resource isolation, helping you understand how to control and isolate workloads at runtime without rigid physical separation.
What Kind of Resource Isolation Do You Need?
As of today, TiDB’s resource control capabilities have become increasingly sophisticated. They can achieve solid isolation effects across multiple levels, from data-based separation down to fine-grained SQL and background task control.
Data-Based Resource Isolation (Physical Isolation )
When it comes to resource isolation, people often think of how to limit the resources used by each application when multiple applications share a cluster.
TiDB 6.5 provides the ‘Placement Rule IN SQL’ feature, a data placement strategy based on SQL rules. This strategy can easily achieve physical separation at the data level, meeting the physical isolation requirements when multiple systems share a TiDB cluster.
Typical Scenario: A Fixed Set of Applications Share the Cluster and Fully Monopolize Resources

- Placement Rule IN SQL rules (suitable for a fixed set medium and large applications)
create placement policy policy_order constraints="[+zone=order]";
create placement policy policy_inv constraints="[+zone=inventory]";
alter database retail_order placement policy policy_order;
alter database retail_inventory placement policy policy_inv;
Flow Control-Based Resource Isolation (Soft Isolation)
While data isolation schemes can isolate resources like CPU, memory, and disk effectively, they offer limited flexibility.
- Medium and large systems benefit from binding databases and tables to specific storage nodes.
- Setting changes may involve the migration of underlying data, which takes some time to fully take effect.
- The operational complexity is relatively high.
TiDB 7.1 provides resource control functionality to simplify the complexity of resource control in a distributed architecture.
- Abstracting CPU/IO/network resources into RU (Resource Unit) uniformly.
- Provide isolation at three levels: database users, SQL statements, and background tasks
- With volume expansion and contraction, it can take effect in seconds without data transfer, providing ultimate flexibility
User(Tenants) Level Resource Limit
Small database systems generally have low resource utilization. When we apply for database resources, we often choose resources based on intuition or standardized configuration, and there may not be a clear basis behind it. In addition, multiple systems usually have different peak load times. With traditional solutions, teams must allocate resources based on the maximum peak plus redundancy, which inevitably leads to more idle capacity.
TiDB’s resource control solution enables us to achieve more efficient resource utilization in a smaller configuration. After integrating multiple small database systems into TiDB, teams can manage each system’s quota precisely based on its actual load peak, QPS, and other requirements. If any system consumes a high amount of resources, teams can assign it a dedicated TiDB computing instance.

Typical Scenario: Batch Load Isolation
In a system, there are often two types of loads: online and batch. For example, the peak of online operations is during the day, and it is basically the low peak period of business in the early morning. At this point, the system can safely run large-scale batch operations. Teams can perform high-density read and write tasks to complete batches quickly without disrupting daytime online services.
However, in some cases, the batch may run to the peak of business the next day. It is obviously not appropriate to stop the batch at this time. It is also not appropriate to let it continue to consume a lot of resources for execution.
In the past, there was actually a lack of control measures within the database, and limited control could only be carried out at the scheduling and application layers. TiDB’s resource control capabilities can easily manage this situation.
- Create separate users for batch jobs and bind resource groups.
CREATE USER 'USER_BATCH' IDENTIFIED BY '******';
CREATE RESOURCE GROUP rg_batch RU_PER_SEC = UNLIMITED PRIORITY = HIGH;
alter user 'USER_BATCH' RESOURCE GROUP rg_batch;
-- If the batch task runs overtime into the next day, simply execute the following SQL to immediately limit its rate
ALTER RESOURCE GROUP rg_batch RU_PER_SEC = 100000 PRIORITY = LOW;
Typical Scenario: SaaS

As shown in the third multi-tenant architecture pattern above, multiple tenants share a set of tables or databases. The data of different SaaS tenants is stored in the table through partitioning, or a single table is used directly (distinguished by the tenant_id fields in the table).
Faced with the scenario of thousands of tenants in SaaS, TiDB’s resource control functionality unifies and abstracts resources without dividing CPU/memory/node quantity. This makes it easy to achieve resource control at the database user level. It’s easy to change the limit. Its advantage is that it can dynamically adjust based on the actual situation of the cluster, offering great flexibility.

- Assuming tenant 1 has a very large volume, a separate database user (user1) can be created for it. Additionally, an independent resource group can be created for this user with high priority and permission for overuse.
CREATE RESOURCE GROUP rg_tenant1 RU_PER_SEC = 1000000 PRIORITY = HIGH BURSTABLE;
alter user 'user1' RESOURCE GROUP rg_tenant1;
- Other tenants share the same resource pool due to their smaller size.
CREATE RESOURCE GROUP rg_tenant_default RU_PER_SEC = 500000 PRIORITY = MEDIUM;
alter user 'user_default' RESOURCE GROUP rg_tenant_default;
- Expansion mode
ALTER RESOURCE GROUP rg_tenant_default RU_PER_SEC = 900000 PRIORITY = MEDIUM;
SQL Statement Level Isolation
Resource control in TiDB 7.2 introduced management capabilities for Runaway Queries, or queries that take longer to execute or consume more resources than expected.
- QUERY_WATCH (manual processing): Limits viewership of discovered SQL.
- QUERY_LIMIT (automatic processing): When unexpected SQL occurs, the database can recognize it and handle it with Self-Adaptation.
Typical Scenario: Limited Viewership of SQL and Circuit Breaker (QUERY_WATCH)
Some SQL statements may have this situation: Written improperly or in certain scenarios (such as drawing numbers) that require querying a large amount of data. They may consume a lot of resources during runtime. It’s okay if they run slower, but don’t consume too many resources and impact normal business requests. It’s easy to limit viewership of them through TiDB’s resource control capability.
-- When this SQL is executed, it is executed using only the rg1 resource group (rg1 is a resource of 100 RU).
SELECT /*+ RESOURCE_GROUP(rg1) */ * FROM t where is_delete=0 and create_time>='2023-01-01';
-- Lower the priority of this SQL in the default resource group to allow it to execute in a rate-limited manner.
QUERY WATCH ADD RESOURCE GROUP DEFAULT ACTION COOLDOWN SQL TEXT EXACT TO 'select * FROM t where is_delete=0 and create_time>='2023-01-01'
In addition, a situation could occur where a large SQL — never run before — suddenly appears in the system. Its origin is unknown, so we hope to directly block it and prevent it from executing.
-- Intercept and terminate based on the SQL DIGEST dimension.
QUERY WATCH ADD RESOURCE GROUP default ACTION KILL SQL DIGEST 'd08bc323a934c39dc41948b0a073725be3398479b6fa4f6dd1db2a9b115f7f57';
-- Additionally, you can use the SQL TEXT dimension (to individually intercept SQL with a specific condition value).
-- "as well as the PLAN DIGEST dimension (which can intercept the occurrence of a poor plan in SQL)"
Typical Scenario: Unexpected Query Self-Adaptation Management (QUERY_LIMIT)
QUERY_WATCH-based limited viewership and circuit breaker solutions are manual processing methods during or after the event. From their appearance and discovery to the final solution of the problem, this process often takes some time; even experienced administrators who are very familiar with the system often take several minutes or even longer to solve the problem. This means that the time that the business may be affected will also be related to it.
QUERY_LIMIT provides a more flexible Self-Adaptation processing method, allowing the database to automatically discover and handle exceptions. We only need to define the exception:
- Create a
rg_auto_cooldown
resource group with a limit of 100,000 RUs. We can define queries that take more than 60 seconds to execute in this resource group as Runaway Queries, and let the system automatically reduce the priority limit viewership of Runaway Queries.
CREATE RESOURCE GROUP rg_auto_cooldown RU_PER_SEC = 100000 QUERY_LIMIT=(EXEC_ELAPSED='60s', ACTION=COOLDOWN);
- Create a
rg_over_10000
resource group with a limit of 100,000 RU. We can define queries in this resource group that exceed 10,000 RU per second as Runaway Queries and let the system automatically reduce the priority limit viewership of them.
CREATE RESOURCE GROUP rg_colldown RU_PER_SEC = 100000 QUERY_LIMIT=(RU=10000, ACTION=COOLDOWN);
- You can also isolate large queries to other resource groups in the current resource group: Define the query that handles more than 1,000,000 rows of data in the
default
resource group as a Runaway Query. Then let the system automatically place it in therg_bigquery
resource group to avoid contention with requests in the current resource group.
CREATE RESOURCE GROUP rg_bigquery RU_PER_SEC = 10000 PRIORITY = LOW;
-- Assume the current resource group is default.
ALTER RESOURCE GROUP default QUERY_LIMIT=(PROCESSED_KEYS=1000000, ACTION=SWITCH_GROUP(rg_bigquery));
Background Task Level Isolation
Creating indexes in daily operation and maintenance is a common task. Although all DDLs are online in TiDB, we cannot ignore the risk of resource consumption when creating indexes. Resource control functionality in TiDB 7.4 introduced management of background tasks.
- Taking the DDL task of creating an index as an example, the DDL can be set as a background task and limited to a maximum of 30% of the total resources of the TiKV node. At this time, the system will dynamically limit the resource usage of this task to minimize the impact of such tasks on the performance of other foreground tasks during execution.
ALTER RESOURCE GROUP `default` BACKGROUND=(TASK_TYPES='ddl', UTILIZATION_LIMIT=30);
- Currently, TiDB supports the following types of background tasks: Lightning, br, ddl, and stats.
Conclusion
Throughout this post, we provided a more comprehensive understanding of TiDB’s resource isolation capabilities. In the future, our hope is that everyone can choose the appropriate resource isolation solution according to different scenario requirements to ensure system stability and continuously improve resource utilization.
Isolation scheme | Typical scenario | Technical implementation |
Data-based isolation scheme | Isolation between medium and large systems | Placement Rule IN SQL Placement Rule |
Different load isolation within the system (such as read-write separation) | ||
Flow control-based isolation scheme | Database user isolation (e.g. multiple small systems, SaaS scenarios) | Resource Control |
Manually isolate SQL statements (limit viewership of, circuit breaker) | ||
Automatically isolate abnormal SQL within the resource group | ||
Background task level isolation (DDL, backup, derivative, etc.) |
You can explore TiDB’s resource isolation capabilities in action by launching a free test cluster or connecting with one of our database experts.
Experience modern data infrastructure firsthand.
TiDB Cloud Dedicated
A fully-managed cloud DBaaS for predictable workloads
TiDB Cloud Serverless
A fully-managed cloud DBaaS for auto-scaling workloads