diff --git a/TOC.md b/TOC.md index dfa48c8fb4173..0bf1d8f3e668f 100644 --- a/TOC.md +++ b/TOC.md @@ -627,6 +627,7 @@ - Privileges - [Security Compatibility with MySQL](/security-compatibility-with-mysql.md) - [Privilege Management](/privilege-management.md) + - [Column-Level Privilege Management](/column-privilege-management.md) - [User Account Management](/user-account-management.md) - [TiDB Password Management](/password-management.md) - [Role-Based Access Control](/role-based-access-control.md) diff --git a/basic-features.md b/basic-features.md index 5d61dae69309c..7fc44b9b86898 100644 --- a/basic-features.md +++ b/basic-features.md @@ -260,7 +260,7 @@ You can try out TiDB features on [TiDB Playground](https://play.tidbcloud.com/?u | [Green GC](/system-variables.md#tidb_gc_scan_lock_mode-new-in-v50) | E | E | E | E | E | E | E | | [Resource control](/tidb-resource-control-ru-groups.md) | Y | Y | Y | Y | N | N | N | | [Runaway Queries management](/tidb-resource-control-runaway-queries.md) | Y | Y | E | N | N | N | N | -| [Background tasks management](/tidb-resource-control-background-tasks.md) | E | E | E | N | N | N | N | +| [Background tasks management](/tidb-resource-control-background-tasks.md) | Y | E | E | N | N | N | N | | [TiFlash Disaggregated Storage and Compute Architecture and S3 Support](/tiflash/tiflash-disaggregated-and-s3.md) | Y | Y | Y | E | N | N | N | | [Selecting TiDB nodes for the Distributed eXecution Framework (DXF) tasks](/system-variables.md#tidb_service_scope-new-in-v740) | Y | Y | Y | N | N | N | N | | PD Follower Proxy (controlled by [`tidb_enable_tso_follower_proxy`](/system-variables.md#tidb_enable_tso_follower_proxy-new-in-v530)) | Y | Y | Y | Y | Y | Y | Y | diff --git a/column-privilege-management.md b/column-privilege-management.md new file mode 100644 index 0000000000000..237c8e44e1f10 --- /dev/null +++ b/column-privilege-management.md @@ -0,0 +1,172 @@ +--- +title: Column-Level Privilege Management +summary: TiDB supports a 
MySQL-compatible column-level privilege management mechanism. You can grant or revoke `SELECT`, `INSERT`, `UPDATE`, and `REFERENCES` privileges on specific columns of a table using `GRANT` or `REVOKE`, achieving finer-grained access control.
+---
+
+# Column-Level Privilege Management
+
+Starting from v8.5.6, TiDB supports a MySQL-compatible column-level privilege management mechanism. With column-level privileges, you can grant or revoke `SELECT`, `INSERT`, `UPDATE`, and `REFERENCES` privileges on specific columns in a specified table, achieving finer-grained data access control.
+
+> **Note:**
+>
+> Although MySQL allows column-level syntax such as `REFERENCES(col_name)`, `REFERENCES` itself is a database-level or table-level privilege used for foreign key-related privilege checks. Therefore, column-level `REFERENCES` does not produce any actual column-level privilege effect in MySQL. TiDB's behavior is consistent with MySQL.
+
+## Syntax
+
+The syntax for granting and revoking column-level privileges is similar to that for table-level privileges, with the following differences:
+
+- Write the column name list after the **privilege type**, not after the **table name**.
+- Separate multiple column names with commas (`,`).
+
+```sql
+GRANT priv_type(col_name [, col_name] ...) [, priv_type(col_name [, col_name] ...)] ...
+    ON db_name.tbl_name
+    TO 'user'@'host';
+
+REVOKE priv_type(col_name [, col_name] ...) [, priv_type(col_name [, col_name] ...)] ...
+    ON db_name.tbl_name
+    FROM 'user'@'host';
+```
+
+Where:
+
+* `priv_type` supports `SELECT`, `INSERT`, `UPDATE`, and `REFERENCES`.
+* The `ON` clause must specify a table, for example, `test.tbl`.
+* A single `GRANT` or `REVOKE` statement can include multiple privilege items, and each privilege item can specify its own list of column names.
+ +For example, the following statement grants `SELECT` privileges on `col1` and `col2` and `UPDATE` privilege on `col3` to the user: + +```sql +GRANT SELECT(col1, col2), UPDATE(col3) ON test.tbl TO 'user'@'host'; +``` + +## Example: Grant column-level privileges + +The following example grants user `newuser` the `SELECT` privilege on `col1` and `col2` in table `test.tbl`, and grants the same user the `UPDATE` privilege on `col3`: + +```sql +CREATE DATABASE IF NOT EXISTS test; +USE test; + +DROP TABLE IF EXISTS tbl; +CREATE TABLE tbl (col1 INT, col2 INT, col3 INT); + +DROP USER IF EXISTS 'newuser'@'%'; +CREATE USER 'newuser'@'%'; + +GRANT SELECT(col1, col2), UPDATE(col3) ON test.tbl TO 'newuser'@'%'; +SHOW GRANTS FOR 'newuser'@'%'; +``` + +``` ++---------------------------------------------------------------------+ +| Grants for newuser@% | ++---------------------------------------------------------------------+ +| GRANT USAGE ON *.* TO 'newuser'@'%' | +| GRANT SELECT(col1, col2), UPDATE(col3) ON test.tbl TO 'newuser'@'%' | ++---------------------------------------------------------------------+ +``` + +In addition to using `SHOW GRANTS`, you can also view column-level privilege information by querying `INFORMATION_SCHEMA.COLUMN_PRIVILEGES`. 
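+
+For example, you can run a query similar to the following to list the column-level privileges granted on `test.tbl`. Each row in `INFORMATION_SCHEMA.COLUMN_PRIVILEGES` describes one privilege type on one column for one grantee:
+
+```sql
+SELECT GRANTEE, TABLE_SCHEMA, TABLE_NAME, COLUMN_NAME, PRIVILEGE_TYPE
+FROM INFORMATION_SCHEMA.COLUMN_PRIVILEGES
+WHERE TABLE_SCHEMA = 'test' AND TABLE_NAME = 'tbl';
+```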
+ +## Example: Revoke column-level privileges + +The following example revokes the `SELECT` privilege on column `col2` from user `newuser`: + +```sql +REVOKE SELECT(col2) ON test.tbl FROM 'newuser'@'%'; +SHOW GRANTS FOR 'newuser'@'%'; +``` + +``` ++---------------------------------------------------------------+ +| Grants for newuser@% | ++---------------------------------------------------------------+ +| GRANT USAGE ON *.* TO 'newuser'@'%' | +| GRANT SELECT(col1), UPDATE(col3) ON test.tbl TO 'newuser'@'%' | ++---------------------------------------------------------------+ +``` + +## Example: Column-level privilege access control + +After granting or revoking column-level privileges, TiDB performs privilege checks on columns referenced in SQL statements. For example: + +* `SELECT` statements: `SELECT` column privileges affect columns referenced in the `SELECT` list as well as `WHERE`, `ORDER BY`, and other clauses. +* `UPDATE` statements: columns being updated in the `SET` clause require `UPDATE` column privileges. Columns read in expressions or conditions usually also require `SELECT` column privileges. +* `INSERT` statements: columns being written to require `INSERT` column privileges. `INSERT INTO t VALUES (...)` is equivalent to writing values to all columns in table definition order. + +In the following example, user `newuser` can only query `col1` and update `col3`: + +```sql +-- Execute as newuser +SELECT col1 FROM tbl; +SELECT * FROM tbl; -- Error (missing SELECT column privilege for col2, col3) + +UPDATE tbl SET col3 = 1; +UPDATE tbl SET col1 = 2; -- Error (missing UPDATE column privilege for col1) + +UPDATE tbl SET col3 = col1; +UPDATE tbl SET col3 = col3 + 1; -- Error (missing SELECT column privilege for col3) +UPDATE tbl SET col3 = col1 WHERE col1 > 0; +``` + +## Compatibility differences with MySQL + +TiDB's column-level privileges are generally compatible with MySQL. 
However, there are differences in the following scenarios: + +| Scenario | TiDB | MySQL | +| :--------------------------------------------------- | :--------------------------------------------------------------------------------------------------------------------------------------------------------------------- | :------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | +| Revoking column-level privileges not granted to a user | `REVOKE` executes successfully. | When `IF EXISTS` is not used, `REVOKE` returns an error. | +| Execution order of column pruning and `SELECT` privilege check | `SELECT` column privileges are checked before column pruning. For example, executing `SELECT a FROM (SELECT a, b FROM t) s` requires `SELECT` column privileges on both `t.a` and `t.b`. | Column pruning is performed before `SELECT` column privileges are checked. For example, executing `SELECT a FROM (SELECT a, b FROM t) s` only requires the `SELECT` column privilege on `t.a`. | + +### Column pruning and privilege checks in view scenarios + +When performing `SELECT` privilege checks on views, MySQL and TiDB differ as follows: + +- MySQL first prunes columns in the view's internal query and then checks the column privileges of the internal tables, making the checks relatively lenient in some scenarios. +- TiDB does not perform column pruning before privilege checks, so additional column privileges might be required. 
+ +```sql +-- Prepare the environment by logging in as root +DROP USER IF EXISTS 'u'@'%'; +CREATE USER 'u'@'%'; + +DROP TABLE IF EXISTS t; +CREATE TABLE t (a INT, b INT, c INT, d INT); + +DROP VIEW IF EXISTS v; +CREATE SQL SECURITY INVOKER VIEW v AS SELECT a, b FROM t WHERE c = 0 ORDER BY d; + +GRANT SELECT ON v TO 'u'@'%'; + +-- Log in as u +SELECT a FROM v; +-- MySQL: Error, missing access privileges for t.a, t.c, t.d +-- TiDB: Error, missing access privileges for t.a, t.b, t.c, t.d + +-- Log in as root +GRANT SELECT(a, c, d) ON t TO 'u'@'%'; + +-- Log in as u +SELECT a FROM v; +-- MySQL: Success (internal query is pruned to `SELECT a FROM t WHERE c = 0 ORDER BY d`) +-- TiDB: Error, missing access privileges for t.b + +SELECT * FROM v; +-- MySQL: Error, missing access privileges for t.b +-- TiDB: Error, missing access privileges for t.b + +-- Log in as root +GRANT SELECT(b) ON t TO 'u'@'%'; + +-- Log in as u +SELECT * FROM v; +-- MySQL: Success +-- TiDB: Success +``` + +## See also + +* [Privilege Management](/privilege-management.md) +* [`GRANT `](/sql-statements/sql-statement-grant-privileges.md) +* [`REVOKE `](/sql-statements/sql-statement-revoke-privileges.md) diff --git a/dashboard/dashboard-slow-query.md b/dashboard/dashboard-slow-query.md index 40fb09c82adb4..133ac560f69b5 100644 --- a/dashboard/dashboard-slow-query.md +++ b/dashboard/dashboard-slow-query.md @@ -61,7 +61,8 @@ Click any item in the list to display detailed execution information of the slow > **Note:** > -> The maximum length of the query recorded in the `Query` column is limited by the [`tidb_stmt_summary_max_sql_length`](/system-variables.md#tidb_stmt_summary_max_sql_length-new-in-v40) system variable. +> - The maximum length of the query recorded in the `Query` column is limited by the [`tidb_stmt_summary_max_sql_length`](/system-variables.md#tidb_stmt_summary_max_sql_length-new-in-v40) system variable. 
+> - For prepared statements, arguments are listed at the end of the query, for example: `[arguments: "foo", 123]`. Non-printable arguments are displayed as hexadecimal literals, for example, `0x01`. Click the **Expand** button to view the detailed information of an item. Click the **Copy** button to copy the detailed information to the clipboard. diff --git a/dashboard/top-sql.md b/dashboard/top-sql.md index 73c3b5e0b9185..4d8d8cbe5eb8b 100644 --- a/dashboard/top-sql.md +++ b/dashboard/top-sql.md @@ -1,32 +1,37 @@ --- title: TiDB Dashboard Top SQL page -summary: TiDB Dashboard Top SQL allows real-time monitoring and visualization of CPU overhead for SQL statements in your database. It helps optimize performance by identifying high CPU load statements and provides detailed execution information. It's suitable for analyzing performance issues and can be accessed through TiDB Dashboard or a browser. The feature has a slight impact on cluster performance and is now generally available for production use. +summary: Use Top SQL to identify queries that consume the most CPU, network, and logical IO resources --- # TiDB Dashboard Top SQL Page -With Top SQL, you can monitor and visually explore the CPU overhead of each SQL statement in your database in real-time, which helps you optimize and resolve database performance issues. Top SQL continuously collects and stores CPU load data summarized by SQL statements at any seconds from all TiDB and TiKV instances. The collected data can be stored for up to 30 days. Top SQL presents you with visual charts and tables to quickly pinpoint which SQL statements are contributing the high CPU load of a TiDB or TiKV instance over a certain period of time. +On the Top SQL page of TiDB Dashboard, you can view and analyze the most resource-consuming SQL queries on a specified TiDB or TiKV node over a period of time. 
+
+- After you enable Top SQL, this feature continuously collects CPU workload data from existing TiDB and TiKV nodes and retains the data for up to 30 days.
+- Starting from v8.5.6, you can also enable **TiKV Network IO collection (multi-dimensional)** in the Top SQL settings to further view metrics such as `Network Bytes` and `Logical IO Bytes` for specified TiKV nodes, and perform aggregation analysis in the `By Query`, `By Table`, `By DB`, and `By Region` dimensions.
 
 Top SQL provides the following features:
 
-* Visualize the top 5 types of SQL statements with the highest CPU overhead through charts and tables.
-* Display detailed execution information such as queries per second, average latency, and query plan.
+* Visualize the top `5`, `20`, or `100` SQL queries with the most resource consumption in the current time range through charts and tables, with the remaining records automatically summarized as `Others`.
+* Display resource consumption hotspots sorted by CPU time or network bytes. When selecting a TiKV node, you can also sort by logical IO bytes.
+* Display SQL and execution plan details by query. When selecting a TiKV node, you can also perform aggregation analysis in the `By Table`, `By DB`, and `By Region` dimensions.
+* Zoom in on a selected time range in the chart, manually refresh data, enable auto refresh, and export table data to CSV.
 * Collect all SQL statements that are executed, including those that are still running.
-* Allow viewing data of a specific TiDB and TiKV instance.
+* Display data of a specific TiDB or TiKV node.
 
 ## Recommended scenarios
 
 Top SQL is suitable for analyzing performance issues. The following are some typical Top SQL scenarios:
 
-* You discovered that an individual TiKV instance in the cluster has a very high CPU usage through the Grafana charts. You want to know which SQL statements cause the CPU hotspots so that you can optimize them and better leverage all of your distributed resources.
-* You discovered that the cluster has a very high CPU usage overall and queries are slow. You want to quickly figure out which SQL statements are currently consuming the most CPU resources so that you can optimize them.
-* The CPU usage of the cluster has drastically changed and you want to know the major cause.
-* Analyze the most resource-intensive SQL statements in the cluster and optimize them to reduce hardware costs.
+* You discovered that an individual TiDB or TiKV node in the cluster has a very high CPU usage. You want to quickly locate which type of SQL is consuming a lot of CPU resources.
+* The overall cluster queries become slow. You want to find out which SQL is currently consuming the most resources, or compare the main query differences before and after the workload changes.
+* You need to locate hotspots from a higher dimension and want to aggregate and view resource consumption on the TiKV side by `Table`, `DB`, or `Region`.
+* You need to troubleshoot TiKV hotspots from the perspective of network traffic or logical IO, not just limited to the CPU dimension.
 
 Top SQL cannot be used in the following scenarios:
 
 - Top SQL cannot be used to pinpoint non-performance issues, such as incorrect data or abnormal crashes.
-- Top SQL does not support analyzing database performance issues that are not caused by high CPU load, such as transaction lock conflicts.
+- Top SQL is not suitable for directly analyzing lock conflicts, transaction semantic errors, or other issues not caused by resource consumption.
 
 ## Access the page
 
@@ -34,9 +39,9 @@ You can access the Top SQL page using either of the following methods:
 
 * After logging in to TiDB Dashboard, click **Top SQL** in the left navigation menu.
 
-    ![Top SQL](/media/dashboard/top-sql-access.png)
+    ![Top SQL](/media/dashboard/v8.5-top-sql-access.png)
 
-* Visit <http://127.0.0.1:2379/dashboard/#/topsql> in your browser. Replace `127.0.0.1:2379` with the actual PD instance address and port.
+* Visit <http://127.0.0.1:2379/dashboard/#/topsql> in your browser.
Replace `127.0.0.1:2379` with the actual PD node address and port. ## Enable Top SQL @@ -47,10 +52,10 @@ You can access the Top SQL page using either of the following methods: Top SQL is not enabled by default as it has a slight impact on cluster performance (within 3% on average) when enabled. You can enable Top SQL by the following steps: 1. Visit the [Top SQL page](#access-the-page). -2. Click **Open Settings**. On the right side of the **Settings** area, switch on **Enable Feature**. +2. Click **Open Settings**. In the **Settings** area on the right side of the page, enable the **Enable Feature** switch. 3. Click **Save**. -After enabling the feature, wait up to 1 minute for Top SQL to load the data. Then you can see the CPU load details. +After enabling Top SQL, you can only view data collected starting from this point in time, while historical data before enabling will not be backfilled. Data display usually has a delay of about 1 minute, so you need to wait a moment to see new data. After disabling Top SQL, if historical data has not expired, the Top SQL page still displays this historical data, but new data will no longer be collected or displayed. In addition to the UI, you can also enable the Top SQL feature by setting the TiDB system variable [`tidb_enable_top_sql`](/system-variables.md#tidb_enable_top_sql-new-in-v540): @@ -60,57 +65,104 @@ In addition to the UI, you can also enable the Top SQL feature by setting the Ti SET GLOBAL tidb_enable_top_sql = 1; ``` +### (Optional) Enable TiKV Network IO collection New in v8.5.6 + +To view Top SQL by `Order By Network` or `Order By Logical IO` for TiKV nodes, or to use the `By Region` aggregation, you can enable the **Enable TiKV Network IO collection (multi-dimensional)** switch in Top SQL settings and save the changes. + +- **Order By Network**: Sorts by the number of network bytes generated during TiKV request processing. 
+- **Order By Logical IO**: Sorts by the amount of logical data (in bytes) that TiKV processes at the storage layer when handling requests, such as the data scanned or processed during reads and the data written by write requests.
+
+As shown in the following screenshot, the **Settings** panel on the right displays both the **Enable Feature** and **Enable TiKV Network IO collection (multi-dimensional)** switches.
+
+![Enable TiKV Network IO collection](/media/dashboard/v8.5-top-sql-settings-enable-tikv-network-io.png)
+
+Enabling **TiKV Network IO collection (multi-dimensional)** increases storage and query overhead. After you enable it, the configuration is delivered to all current TiKV nodes; data display might also have a delay of about 1 minute. If some TiKV nodes fail to enable this feature, the page shows a warning, and new data might be incomplete.
+
+For newly added TiKV nodes, this switch does not take effect automatically. You need to enable the **Enable TiKV Network IO collection (multi-dimensional)** switch in the Top SQL settings panel and save the settings again, so that the configuration is delivered to all TiKV nodes. If you want newly added TiKV nodes to enable this feature automatically, add the following configuration under `server_configs.tikv` in the TiUP cluster topology file and use TiUP to re-deliver the TiKV configuration:
+
+```yaml
+server_configs:
+  tikv:
+    resource-metering.enable-network-io-collection: true
+```
+
+For more information about TiUP topology configuration, see [TiUP cluster topology file configuration](/tiup/tiup-cluster-topology-reference.md).
+
 ## Use Top SQL
 
 The following are the common steps to use Top SQL.
 
 1. Visit the [Top SQL page](#access-the-page).
-2. Select a particular TiDB or TiKV instance that you want to observe the load.
+2. Select a particular TiDB or TiKV node whose workload you want to observe.
+ + ![Select a TiDB or TiKV node](/media/dashboard/v8.5-top-sql-usage-select-instance.png) + + If you are not sure which node to observe, you can first locate the node with abnormal workload from Grafana or the [TiDB Dashboard Overview page](/dashboard/dashboard-overview.md), and then return to the Top SQL page for further analysis. + +3. Set the time range and refresh data as needed. + + You can adjust the time range in the time picker or zoom the observation window by selecting a time range in the chart. Setting a smaller time range displays more fine-grained data, with a precision of up to 1 second. - ![Select Instance](/media/dashboard/top-sql-usage-select-instance.png) + ![Change time range](/media/dashboard/v8.5-top-sql-usage-change-timerange.png) - If you are unsure of which TiDB or TiKV instance to observe, you can select an arbitrary instance. Also, when the cluster CPU load is extremely unbalanced, you can first use Grafana charts to determine the specific instance you want to observe. + If the chart is out of date, click **Refresh** to refresh once, or select the data auto-refresh frequency from the **Refresh** drop-down list. -3. Observe the charts and tables presented by Top SQL. + ![Refresh](/media/dashboard/v8.5-top-sql-usage-refresh.png) - ![Chart and Table](/media/dashboard/top-sql-usage-chart.png) +4. Select the observation mode. - The size of the bars in the bar chart represents the size of CPU resources consumed by the SQL statement at that moment. Different colors distinguish different types of SQL statements. In most cases, you only need to focus on the SQL statements that have a higher CPU resource overhead in the corresponding time range in the chart. + - Use `Limit` to display the Top `5`, `20`, or `100` SQL queries. + - The default aggregation dimension is `By Query`. If you select a TiKV node, you can also aggregate in dimensions of `By Table`, `By DB`, or `By Region`. -4. Click a SQL statement in the table to show more information. 
You can see detailed execution metrics of different plans of that statement, such as Call/sec (average queries per second) and Scan Indexes/sec (average number of index rows scanned per second). + ![Select aggregation dimension](/media/dashboard/v8.5-top-sql-usage-select-agg-by.png) - ![Details](/media/dashboard/top-sql-details.png) + - The default sort order is `Order By CPU` (sorted by CPU time). If you select a TiKV node and have [enabled TiKV Network IO collection (multi-dimensional)](#optional-enable-tikv-network-io-collection-new-in-v856), you can also select `Order By Network` (sorted by network bytes) or `Order By Logical IO` (sorted by logical IO bytes). -5. Based on these initial clues, you can further explore the [SQL Statement](/dashboard/dashboard-statement-list.md) or [Slow Queries](/dashboard/dashboard-slow-query.md) page to find the root cause of high CPU consumption or large data scans of the SQL statement. + ![Select order by](/media/dashboard/v8.5-top-sql-usage-select-order-by.png) - You can adjust the time range in the time picker or select a time range in the chart to get a more precise and detailed look at the problem. A smaller time range can provide more detailed data, with precision of up to 1 second. + > **Note** + > + > `By Region`, `Order By Network`, and `Order By Logical IO` are only available when [TiKV Network IO collection (multi-dimensional)](#optional-enable-tikv-network-io-collection-new-in-v856) is enabled. If this feature is not enabled but historical data still exists, the page continues to display historical data and prompt that new data cannot be fully collected. - ![Change time range](/media/dashboard/top-sql-usage-change-timerange.png) +5. Observe the resource consumption hotspot records in the chart and table. - If the chart is out of date, you can click the **Refresh** button or select Auto Refresh options from the **Refresh** drop-down list. 
+ ![Chart and Table](/media/dashboard/v8.5-top-sql-usage-chart.png) - ![Refresh](/media/dashboard/top-sql-usage-refresh.png) + The bar chart shows resource consumption under the current sort dimension, with different colors representing different records. The table displays cumulative values according to the current sort dimension, and provides an `Others` row at the end to summarize all non-Top N records. -6. View the CPU resource usage by table or database level to quickly identify resource usage at a higher level. Currently, only TiKV instances are supported. +6. In the `By Query` view, click a row in the table to view the execution plan details for that type of SQL. - Select a TiKV instance, and then select **By TABLE** or **By DB**: + ![Details](/media/dashboard/v8.5-top-sql-details.png) - ![Select aggregation dimension](/media/dashboard/top-sql-usage-select-agg-by.png) + In the SQL statement details, you can view the corresponding SQL template, Query template ID, Plan template ID, and execution plan text. The SQL statement details table displays different metrics depending on the node type: - View the aggregated results at a higher level: + - TiDB nodes usually show `Call/sec` and `Latency/call`. + - TiKV nodes usually show `Call/sec`, `Scan Rows/sec`, and `Scan Indexes/sec`. - ![Aggregated results at DB level](/media/dashboard/top-sql-usage-agg-by-db-detail.png) + > **Note** + > + > If you select the `By Table`, `By DB`, or `By Region` aggregation view, the page displays the aggregation results and does not show SQL statement details by SQL execution plan. + + In the `By Query` view, you can also click **Search in SQL Statements** in the Top SQL table to jump to the corresponding SQL Statement Analysis page. If you need to analyze the current table results offline, you can click **Download to CSV** above the table to export the current table data. + +7. 
On TiKV nodes, if you need to locate hotspots from a higher dimension, you can switch to `By Table`, `By DB`, or `By Region` to view the aggregated results. + + ![Aggregated results at DB level](/media/dashboard/v8.5-top-sql-usage-agg-by-db-detail.png) + +8. Based on these initial clues, you can further analyze the root cause using the [SQL Statement](/dashboard/dashboard-statement-list.md) or [Slow Queries](/dashboard/dashboard-slow-query.md) page. ## Disable Top SQL You can disable this feature by following these steps: -1. Visit [Top SQL page](#access-the-page). -2. Click the gear icon in the upper right corner to open the settings screen and switch off **Enable Feature**. +1. Visit the [Top SQL page](#access-the-page). +2. Click the gear icon in the upper right corner to open the settings pane and disable the **Enable Feature** switch. 3. Click **Save**. 4. In the popped-up dialog box, click **Disable**. +After you disable Top SQL, new Top SQL data collection will stop, but historical data can still be viewed before it expires. + In addition to the UI, you can also disable the Top SQL feature by setting the TiDB system variable [`tidb_enable_top_sql`](/system-variables.md#tidb_enable_top_sql-new-in-v540): {{< copyable "sql" >}} @@ -119,6 +171,15 @@ In addition to the UI, you can also disable the Top SQL feature by setting the T SET GLOBAL tidb_enable_top_sql = 0; ``` +### Disable TiKV Network IO collection + +If you only want to stop collecting multi-dimensional data such as `Network Bytes` and `Logical IO Bytes` for TiKV, while retaining the CPU dimension analysis capability of Top SQL, disable the **Enable TiKV Network IO collection (multi-dimensional)** switch in the Top SQL settings panel. + +After disabling: + +- The Top SQL page can still display previously collected, unexpired historical network IO and logical IO data. +- New network IO and logical IO data, as well as `By Region` data, will no longer be collected. + ## Frequently asked questions **1. 
Top SQL cannot be enabled and the UI displays "required component NgMonitoring is not started"**. @@ -127,24 +188,37 @@ See [TiDB Dashboard FAQ](/dashboard/dashboard-faq.md#a-required-component-ngmoni **2. Will performance be affected after enabling Top SQL?** -This feature has a slight impact on cluster performance. According to our benchmark, the average performance impact is usually less than 3% when the feature is enabled. +Enabling Top SQL has a slight impact on cluster performance. According to measurements, the average performance impact is less than 3%. If you also enable TiKV Network IO collection (multi-dimensional), there will be additional storage and query overhead. **3. What is the status of this feature?** It is now a generally available (GA) feature and can be used in production environments. -**4. What is the meaning of "Other Statements"?** +**4. What does `Others` mean in the UI?** -"Other Statement" counts the total CPU overhead of all non-Top 5 statements. With this information, you can learn the CPU overhead contributed by the Top 5 statements compared with the overall. +`Others` represents the summary result of all non-Top N records under the current sort dimension. You can use it to understand how much of the total workload comes from the Top N records. **5. What is the relationship between the CPU overhead displayed by Top SQL and the actual CPU usage of the process?** Their correlation is strong but they are not exactly the same thing. For example, the cost of writing multiple replicas is not counted in the TiKV CPU overhead displayed by Top SQL. In general, SQL statements with higher CPU usage result in higher CPU overhead displayed in Top SQL. -**6. What is the meaning of the Y-axis of the Top SQL chart?** +**6. What does the Y-axis of the Top SQL chart mean?** + +The Y-axis of the Top SQL chart represents the resource consumption under the current sort dimension. -It represents the size of CPU resources consumed. 
The more resources consumed by a SQL statement, the higher the value is. In most cases, you do not need to care about the meaning or unit of the specific value. +- When `Order By CPU` is selected, the Y-axis represents CPU time. +- When `Order By Network` is selected, the Y-axis represents network bytes. +- When `Order By Logical IO` is selected, the Y-axis represents logical IO bytes. **7. Does Top SQL collect running (unfinished) SQL statements?** -Yes. The bars displayed in the Top SQL chart at each moment indicate the CPU overhead of all running SQL statements at that moment. +Yes. After you enable Top SQL, TiDB Dashboard collects resource consumption for all running SQL statements, including unfinished ones. + +**8. Why is there no new data for `Order By Network`, `Order By Logical IO`, or `By Region`?** + +These views depend on TiKV Network IO collection (multi-dimensional). You can check the following items: + +- You have selected a TiKV node. +- The **Enable TiKV Network IO collection (multi-dimensional)** switch in the Top SQL settings panel is enabled. +- The relevant TiKV nodes in the cluster have all successfully enabled this configuration. If only some nodes enable this configuration, the Top SQL page prompts that new data might be incomplete. +- For newly added TiKV nodes, you need to manually enable the **Enable TiKV Network IO collection (multi-dimensional)** switch in the Top SQL settings panel and save the changes again. To make this setting automatically enabled for newly added nodes, also enable `resource-metering.enable-network-io-collection` in the TiKV default configuration of TiUP. diff --git a/dm/dm-compatibility-catalog.md b/dm/dm-compatibility-catalog.md index c7a28d31ed256..e4ebbb518606f 100644 --- a/dm/dm-compatibility-catalog.md +++ b/dm/dm-compatibility-catalog.md @@ -19,18 +19,35 @@ DM supports migrating data from different sources to TiDB clusters. 
Based on the | MySQL ≤ 5.5 | Not tested | | | MySQL 5.6 | GA | | | MySQL 5.7 | GA | | -| MySQL 8.0 | GA | Does not support binlog transaction compression [Transaction_payload_event](https://dev.mysql.com/doc/refman/8.0/en/binary-log-transaction-compression.html). | -| MySQL 8.1 ~ 8.3 | Not tested | | -| MySQL 8.4 | Incompatible | For more information, see [DM Issue #11020](https://github.com/pingcap/tiflow/issues/11020). | +| MySQL 8.0 | GA | Does not support [binlog transaction compression (`Transaction_payload_event`)](https://dev.mysql.com/doc/refman/8.0/en/binary-log-transaction-compression.html). | +| MySQL 8.1 ~ 8.3 | Not tested | Does not support [binlog transaction compression (`Transaction_payload_event`)](https://dev.mysql.com/doc/refman/8.0/en/binary-log-transaction-compression.html). | +| MySQL 8.4 | Experimental (supported starting from TiDB v8.5.6) | Does not support [binlog transaction compression (`Transaction_payload_event`)](https://dev.mysql.com/doc/refman/8.4/en/binary-log-transaction-compression.html). | | MySQL 9.x | Not tested | | | MariaDB < 10.1.2 | Incompatible | Incompatible with binlog of the time type. | | MariaDB 10.1.2 ~ 10.5.10 | Experimental | | | MariaDB > 10.5.10 | Not tested | Expected to work in most cases after bypassing the [precheck](/dm/dm-precheck.md). See [MariaDB notes](#mariadb-notes). | -### Incompatibility with foreign key CASCADE operations +### Foreign key `CASCADE` operations -- DM creates foreign key **constraints** on the target, but they are not enforced while applying transactions because DM sets the session variable [`foreign_key_checks=OFF`](/system-variables.md#foreign_key_checks). -- DM does **not** support `ON DELETE CASCADE` or `ON UPDATE CASCADE` behavior by default, and enabling `foreign_key_checks` via a DM task session variable is not recommended. If your workload relies on cascades, **do not assume** that cascade effects will be replicated. +> **Warning:** +> +> This feature is experimental. 
It is not recommended that you use it in the production environment. It might be changed or removed without prior notice. If you find a bug, you can report an [issue](https://github.com/pingcap/tiflow/issues) on GitHub. + +Starting from v8.5.6, DM provides **experimental** support for replicating tables that use foreign key constraints. This support includes the following improvements: + +- **Safe mode**: during safe mode execution, DM sets `foreign_key_checks=0` for each batch and skips the redundant `DELETE` step for `UPDATE` statements that do not modify primary key or unique key values. This prevents `REPLACE INTO` (which internally performs `DELETE` + `INSERT`) from triggering unintended `ON DELETE CASCADE` effects on child rows. For more information, see [DM safe mode](/dm/dm-safe-mode.md#foreign-key-handling-new-in-v856). +- **Multi-worker causality**: when `worker-count > 1`, DM reads foreign key relationships from the downstream schema at task start and injects causality keys. This ensures that DML operations on parent rows complete before operations on dependent child rows, preserving binlog order across workers. + +The following limitations apply to foreign key replication: + +- In safe mode, DM does not support `UPDATE` statements that modify primary key or unique key values. The task is paused with the error `safe-mode update with foreign_key_checks=1 and PK/UK changes is not supported`. To replicate such statements, set `safe-mode: false`. +- When `foreign_key_checks=1`, DM does not support DDL statements that create, modify, or drop foreign key constraints during replication. +- Table routing is not supported when `worker-count > 1`. If you use table routing with tables that include foreign keys, set `worker-count` to `1`. +- The block-allow list must include all ancestor tables in the foreign key dependency chain. If ancestor tables are filtered out, the task is paused with an error during incremental replication. 
+- Foreign key metadata must be consistent between the source and downstream. If inconsistencies are detected, run `binlog-schema update --from-target` to resynchronize metadata. +- `ON UPDATE CASCADE` is not correctly replicated in safe mode when an `UPDATE` modifies primary key or unique key values. DM rewrites such statements as `DELETE` + `REPLACE`, which triggers `ON DELETE` actions instead of `ON UPDATE` actions. In this case, DM rejects the statement and pauses the task. `UPDATE` statements that do not modify key values are replicated correctly. + +In versions earlier than v8.5.6, DM creates foreign key constraints in the downstream but does not enforce them because it sets the session variable [`foreign_key_checks=OFF`](/system-variables.md#foreign_key_checks). As a result, cascading operations are not replicated to the downstream. ### MariaDB notes diff --git a/dm/dm-precheck.md b/dm/dm-precheck.md index 274581f8caffd..569b5511b7efe 100644 --- a/dm/dm-precheck.md +++ b/dm/dm-precheck.md @@ -51,7 +51,7 @@ Regardless of the migration mode you choose, the precheck always includes the fo - Compatibility of the upstream MySQL table schema - - Check whether the upstream tables have foreign keys, which are not supported by TiDB. A warning is returned if a foreign key is found in the precheck. + - Check whether the upstream tables have foreign keys. TiDB supports foreign keys (GA since v8.5.0), and DM provides experimental support for replicating tables with foreign key constraints starting from v8.5.6. During the precheck, DM returns a warning if foreign keys are detected. For supported scenarios and limitations, see [DM Compatibility Catalog](/dm/dm-compatibility-catalog.md#foreign-key-cascade-operations). - Check whether the upstream tables use character sets that are incompatible with TiDB. For more information, see [TiDB Supported Character Sets](/character-set-and-collation.md). 
- Check whether the upstream tables have primary key constraints or unique key constraints (introduced from v1.0.7). diff --git a/dm/dm-safe-mode.md b/dm/dm-safe-mode.md index 52b7dd10c169d..c86344e3130f9 100644 --- a/dm/dm-safe-mode.md +++ b/dm/dm-safe-mode.md @@ -24,6 +24,8 @@ In safe mode, DM guarantees the idempotency of binlog events by rewriting SQL st * `INSERT` statements are rewritten to `REPLACE` statements. * `UPDATE` statements are analyzed to obtain the value of the primary key or the unique index of the row updated. `UPDATE` statements are then rewritten to `DELETE` + `REPLACE` statements in the following two steps: DM deletes the old record using the primary key or unique index, and inserts the new record using the `REPLACE` statement. + Starting from v8.5.6, when you set `foreign_key_checks=1` in the task session, DM skips the `DELETE` step for `UPDATE` statements that do not modify primary key or unique index values. For more information, see [Foreign key handling](#foreign-key-handling-new-in-v856). + `REPLACE` is a MySQL-specific syntax for inserting data. When you insert data using `REPLACE`, and the new data and existing data have a primary key or unique constraint conflict, MySQL deletes all the conflicting records and executes the insert operation, which is equivalent to "force insert". For details, see [`REPLACE` statement](https://dev.mysql.com/doc/refman/8.0/en/replace.html) in MySQL documentation. Assume that a `dummydb.dummytbl` table has a primary key `id`. Execute the following SQL statements repeatedly on this table: @@ -91,6 +93,53 @@ mysql-instances: syncer-config-name: "global" # Name of the syncers configuration. ``` +## Foreign key handling New in v8.5.6 + +> **Warning:** +> +> This feature is experimental. It is not recommended that you use it in the production environment. It might be changed or removed without prior notice. If you find a bug, you can report an [issue](https://github.com/pingcap/tiflow/issues) on GitHub. 
+ +When you enable safe mode and set `foreign_key_checks=1` in the downstream task session, the default `DELETE` + `REPLACE` rewrite for `UPDATE` statements can trigger unintended `ON DELETE CASCADE` effects on child rows. Starting from v8.5.6, DM introduces the following improvements to address this issue. + +### Non-key `UPDATE` optimization + +For `UPDATE` statements that do not modify primary key or unique key values, DM skips the `DELETE` step and executes only `REPLACE INTO`. Because the primary key remains unchanged, `REPLACE INTO` overwrites the existing row without triggering foreign key cascade deletes. This optimization is applied automatically in safe mode. + +Take the following upstream statement as an example, where `id` is the primary key: + +```sql +UPDATE dummydb.dummytbl SET int_value = 888999 WHERE id = 123; +``` + +In versions earlier than v8.5.6, safe mode rewrites this statement as follows: + +```sql +DELETE FROM dummydb.dummytbl WHERE id = 123; -- Triggers ON DELETE CASCADE +REPLACE INTO dummydb.dummytbl (id, int_value, ...) VALUES (123, 888999, ...); +``` + +Starting from v8.5.6, safe mode rewrites the statement as follows: + +```sql +REPLACE INTO dummydb.dummytbl (id, int_value, ...) VALUES (123, 888999, ...); -- No cascade +``` + +> **Warning:** +> +> When `foreign_key_checks=1`, DM does not support replicating `UPDATE` statements that modify primary key or unique key values. In this case, the replication task is paused with the error `safe-mode update with foreign_key_checks=1 and PK/UK changes is not supported`. To replicate such `UPDATE` statements on tables with foreign keys, set `safe-mode: false`. + +### Session-level `foreign_key_checks` + +During batch execution in safe mode, DM executes `SET SESSION foreign_key_checks=0` before executing `INSERT` and `UPDATE` batches, and restores the original value of `foreign_key_checks` afterward. 
This prevents `REPLACE INTO` (which internally performs `DELETE` + `INSERT`) from triggering foreign key cascade operations in the downstream. + +This session-level setting introduces a small overhead per batch (two `SET SESSION` round trips). In most workloads, this overhead is negligible. + +### Multi-worker foreign key causality + +When you set `worker-count` to a value greater than 1 and the replication task includes tables with foreign keys, DM reads foreign key relationships from the downstream `CREATE TABLE` schema when the task starts. For each DML operation, DM injects causality keys based on these relationships. This ensures that operations on parent rows and their dependent child rows are assigned to the same DML worker queue. + +For detailed constraints, see [DM Compatibility Catalog](/dm/dm-compatibility-catalog.md#foreign-key-cascade-operations). + ## Notes for safe mode If you want to enable safe mode during the entire replication process for safety reasons, be aware of the following: diff --git a/download-ecosystem-tools.md b/download-ecosystem-tools.md index f1d4e44648420..6908205a69271 100644 --- a/download-ecosystem-tools.md +++ b/download-ecosystem-tools.md @@ -7,11 +7,11 @@ summary: Download the most officially maintained versions of TiDB tools. This document describes how to download the TiDB Toolkit. -TiDB Toolkit contains frequently used TiDB tools, such as data export tool Dumpling, data import tool TiDB Lightning, and backup and restore tool BR. +TiDB Toolkit contains frequently used tools, such as Dumpling (data export), TiDB Lightning (data import), BR (backup and restore), and sync-diff-inspector (data consistency check). > **Tip:** > -> - If your deployment environment has internet access, you can deploy a TiDB tool using a single [TiUP command](/tiup/tiup-component-management.md), so there is no need to download the TiDB Toolkit separately. 
+> - For TiDB v8.5.6 and later, most tools, including sync-diff-inspector, are directly available through TiUP. If your deployment environment has internet access, you can deploy a tool using a single [TiUP command](/tiup/tiup-component-management.md) without downloading the TiDB Toolkit separately. > - If you need to deploy and maintain TiDB on Kubernetes, instead of downloading the TiDB Toolkit, follow the steps in [TiDB Operator offline installation](https://docs.pingcap.com/tidb-in-kubernetes/stable/deploy-tidb-operator#offline-installation). ## Environment requirements @@ -45,7 +45,7 @@ Depending on which tools you want to use, you can install the corresponding offl | [TiDB Data Migration (DM)](/dm/dm-overview.md) | `dm-worker-{version}-linux-{arch}.tar.gz`
`dm-master-{version}-linux-{arch}.tar.gz`
`dmctl-{version}-linux-{arch}.tar.gz` | | [TiCDC](/ticdc/ticdc-overview.md) | `cdc-{version}-linux-{arch}.tar.gz` | | [Backup & Restore (BR)](/br/backup-and-restore-overview.md) | `br-{version}-linux-{arch}.tar.gz` | -| [sync-diff-inspector](/sync-diff-inspector/sync-diff-inspector-overview.md) | `sync_diff_inspector` | +| [sync-diff-inspector](/sync-diff-inspector/sync-diff-inspector-overview.md) | For TiDB v8.5.6 and later: `tiflow-{version}-linux-{arch}.tar.gz`
For versions before v8.5.6: `sync_diff_inspector` | | [PD Recover](/pd-recover.md) | `pd-recover-{version}-linux-{arch}.tar` | > **Note:** diff --git a/foreign-key.md b/foreign-key.md index aea4246cc9cc8..003354246b154 100644 --- a/foreign-key.md +++ b/foreign-key.md @@ -177,9 +177,11 @@ When the foreign key constraint check is disabled, the foreign key constraint ch ## Locking -When `INSERT` or `UPDATE` a child table, the foreign key constraint checks whether the corresponding foreign key value exists in the parent table, and locks the row in the parent table to avoid the foreign key value being deleted by other operations violating the foreign key constraint. The locking behavior is equivalent to performing a `SELECT FOR UPDATE` operation on the row where the foreign key value is located in the parent table. +When you `INSERT` into or `UPDATE` a child table, the foreign key constraint checks whether the corresponding foreign key value exists in the parent table and locks the corresponding row in the parent table to prevent other operations from deleting the foreign key value and violating the foreign key constraint. -Because TiDB currently does not support `LOCK IN SHARE MODE`, if a child table accepts a large number of concurrent writes and most of the referenced foreign key values are the same, there might be serious locking conflicts. It is recommended to disable [`foreign_key_checks`](/system-variables.md#foreign_key_checks) when writing a large number of child table data. +By default, in pessimistic transactions, the locking behavior of foreign key checks on rows in the parent table is equivalent to performing a locking read using `SELECT ... FOR UPDATE` (that is, acquiring an exclusive lock) on the corresponding row. In high-concurrency write scenarios for a child table, if a large number of transactions repeatedly reference the same parent table row, serious lock conflicts might occur. 
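The default behavior can be sketched with a hypothetical `parent`/`child` pair, where `child.parent_id` references `parent(id)`:

```sql
-- Session 1: the foreign key check locks the parent row with id = 1,
-- roughly equivalent to running SELECT ... FOR UPDATE on that row.
BEGIN;
INSERT INTO child (id, parent_id) VALUES (100, 1);

-- Session 2: this insert references the same parent row, so its foreign key
-- check blocks until session 1 commits or rolls back.
BEGIN;
INSERT INTO child (id, parent_id) VALUES (101, 1);
```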
+ +You can enable the system variable [`tidb_foreign_key_check_in_shared_lock`](/system-variables.md#tidb_foreign_key_check_in_shared_lock-new-in-v856) to let foreign key checks use shared locks. Shared locks allow multiple transactions to perform foreign key checks on the same parent table row simultaneously, thereby reducing lock conflicts and improving the performance of concurrent writes to child tables. ## Definition and metadata of foreign keys @@ -303,7 +305,7 @@ Create Table | CREATE TABLE `child` ( -- [DM](/dm/dm-overview.md) does not support foreign keys. DM disables the [`foreign_key_checks`](/system-variables.md#foreign_key_checks) of the downstream TiDB when replicating data to TiDB. Therefore, the cascading operations caused by foreign keys are not replicated from the upstream to the downstream, which might cause data inconsistency. +- [DM](/dm/dm-overview.md): starting from v8.5.6, DM supports replicating tables that use foreign key constraints as an experimental feature. For supported scenarios and limitations, see [DM Compatibility Catalog](/dm/dm-compatibility-catalog.md#foreign-key-cascade-operations). In versions earlier than v8.5.6, DM disables the [`foreign_key_checks`](/system-variables.md#foreign_key_checks) system variable when replicating data to TiDB, so cascading operations are not replicated to the downstream cluster. - [TiCDC](/ticdc/ticdc-overview.md) v6.6.0 is compatible with foreign keys. The previous versions of TiCDC might report an error when replicating tables with foreign keys. It is recommended to disable the `foreign_key_checks` of the downstream TiDB cluster when using a TiCDC version earlier than v6.6.0. - [BR](/br/backup-and-restore-overview.md) v6.6.0 is compatible with foreign keys. The previous versions of BR might report an error when restoring tables with foreign keys to a v6.6.0 or later cluster. 
It is recommended to disable the `foreign_key_checks` of the downstream TiDB cluster before restoring the cluster when using a BR earlier than v6.6.0. - When you use [TiDB Lightning](/tidb-lightning/tidb-lightning-overview.md), if the target table uses a foreign key, it is recommended to disable the `foreign_key_checks` of the downstream TiDB cluster before importing data. For versions earlier than v6.6.0, disabling this system variable does not take effect, and you need to grant the `REFERENCES` privilege for the downstream database user, or manually create the target table in the downstream database in advance to ensure smooth data import. diff --git a/identify-slow-queries.md b/identify-slow-queries.md index ce2dc0e10778a..e84fb959e1e5f 100644 --- a/identify-slow-queries.md +++ b/identify-slow-queries.md @@ -172,12 +172,149 @@ Fields related to storage engines: - `Storage_from_kv`: introduced in v8.5.5, indicates whether this statement read data from TiKV. - `Storage_from_mpp`: introduced in v8.5.5, indicates whether this statement read data from TiFlash. +## Use `tidb_slow_log_rules` + +[`tidb_slow_log_rules`](/system-variables.md#tidb_slow_log_rules-new-in-v856) is used to define trigger rules for slow query logs, supporting multi-dimensional metric combinations. It is suitable for "targeted sampling" and "problem reproduction" of slow logs, enabling you to filter target statements based on specific metric combinations. + +The triggering behavior of slow query logs depends on the configuration of `tidb_slow_log_rules`: + +- If `tidb_slow_log_rules` is not set, slow query log triggering still relies on [`tidb_slow_log_threshold`](/system-variables.md#tidb_slow_log_threshold) (in milliseconds). +- If `tidb_slow_log_rules` is set, the configured rules take precedence, and [`tidb_slow_log_threshold`](/system-variables.md#tidb_slow_log_threshold) will be ignored. 
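For example, the following statements show the difference between the two triggering modes (the metric values are arbitrary):

```sql
-- Threshold-based triggering: log statements that run longer than 500 ms.
SET GLOBAL tidb_slow_log_threshold = 500;

-- Rule-based triggering: after this rule is set, the threshold above is ignored,
-- and only statements with Query_time >= 0.5 seconds AND Mem_max >= 100 MiB are logged.
SET GLOBAL tidb_slow_log_rules = 'Query_time: 0.5, Mem_max: 104857600';
```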
+ +For more information about meanings, diagnostic value, and background information of each field, see the [Fields description](#fields-description). + +### Unified rule syntax and type constraints + +- Rule capacity and separation: `SESSION` and `GLOBAL` each support a maximum of 10 rules. A single session can have up to 20 active rules. Rules are separated by `;`. +- Condition format: each condition uses the format `field_name:value`. Multiple conditions within a single rule are separated by `,`. +- Field and scope: field names are case-insensitive (underscores and other characters are preserved). `SESSION` rules do not support `Conn_ID`. Only `GLOBAL` rules support `Conn_ID`. +- Matching semantics: + - Numeric fields are matched using `>=`. String and boolean fields are matched using equality (`=`). + - Matching for `DB` and `Resource_group` is case-insensitive. + - Explicit operators such as `>`, `<`, and `!=` are not supported. + +Type constraints are as follows: + +- Numeric types (`int64`, `uint64`, `float64`) uniformly require `>= 0`. Negative values will result in a parsing error. + - `int64`: the maximum value is `2^63-1`. + - `uint64`: the maximum value is `2^64-1`. + - `float64`: the general upper limit is approximately `1.79e308`. Currently, parsing is done using Go's `ParseFloat`. While `NaN`/`Inf` can be parsed, they might lead to rules that are always true or always false. It is not recommended to use them. +- `bool`: supports `true`/`false`, `1`/`0`, and `t`/`f` (case-insensitive). +- `string`: currently does not support strings containing the separators `,` (condition separator) or `;` (rule separator), even with quotes (single or double). Escaping is not supported. +- Duplicate fields: if the same field is specified multiple times in a single rule, the last occurrence takes effect. 
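The separator and matching rules above can be sketched as follows (the metric values are arbitrary):

```sql
-- Two rules separated by ';'; conditions within a rule are separated by ','.
-- Rule 1: Query_time >= 0.5 seconds AND Process_keys >= 10000.
-- Rule 2: DB equals 'db1' (case-insensitive) AND Succ equals false.
SET GLOBAL tidb_slow_log_rules = 'Query_time: 0.5, Process_keys: 10000; DB: db1, Succ: false';

-- If the same field appears multiple times in one rule, the last occurrence
-- takes effect, so the following rule is equivalent to 'Query_time: 1.0'.
SET SESSION tidb_slow_log_rules = 'Query_time: 0.5, Query_time: 1.0';
```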
+ +### Supported fields + +For detailed field descriptions, diagnostic meanings, and background information, see the [field descriptions in `identify-slow-queries`](/identify-slow-queries.md#fields-description). + +Unless otherwise noted, the fields in the following table follow the general matching and type rules described in [Unified rule syntax and type constraints](#unified-rule-syntax-and-type-constraints). This table lists only the currently supported field names, types, units, and a few rule-specific notes. It does not repeat each field's semantic meaning. + +| Field name | Type | Unit | Notes | +| -------------------------------------- | -------- | ------ | ------------------------------ | +| `Conn_ID` | `uint` | count | Supported only in `GLOBAL` rules | +| `Session_alias` | `string` | none | - | +| `DB` | `string` | none | Case-insensitive when matched | +| `Exec_retry_count` | `uint` | count | - | +| `Query_time` | `float` | second | - | +| `Parse_time` | `float` | second | - | +| `Compile_time` | `float` | second | - | +| `Rewrite_time` | `float` | second | - | +| `Optimize_time` | `float` | second | - | +| `Wait_TS` | `float` | second | - | +| `Is_internal` | `bool` | none | - | +| `Digest` | `string` | none | - | +| `Plan_digest` | `string` | none | - | +| `Num_cop_tasks` | `int` | count | - | +| `Mem_max` | `int` | bytes | - | +| `Disk_max` | `int` | bytes | - | +| `Write_sql_response_total` | `float` | second | - | +| `Succ` | `bool` | none | - | +| `Resource_group` | `string` | none | Case-insensitive when matched | +| `KV_total` | `float` | second | - | +| `PD_total` | `float` | second | - | +| `Unpacked_bytes_sent_tikv_total` | `int` | bytes | - | +| `Unpacked_bytes_received_tikv_total` | `int` | bytes | - | +| `Unpacked_bytes_sent_tikv_cross_zone` | `int` | bytes | - | +| `Unpacked_bytes_received_tikv_cross_zone` | `int` | bytes | - | +| `Unpacked_bytes_sent_tiflash_total` | `int` | bytes | - | +| `Unpacked_bytes_received_tiflash_total` | `int` 
| bytes | - | +| `Unpacked_bytes_sent_tiflash_cross_zone` | `int` | bytes | - | +| `Unpacked_bytes_received_tiflash_cross_zone` | `int` | bytes | - | +| `Process_time` | `float` | second | - | +| `Backoff_time` | `float` | second | - | +| `Total_keys` | `uint` | count | - | +| `Process_keys` | `uint` | count | - | +| `cop_mvcc_read_amplification` | `float` | ratio | Ratio value (`Total_keys / Process_keys`) | +| `Prewrite_time` | `float` | second | - | +| `Commit_time` | `float` | second | - | +| `Write_keys` | `uint` | count | - | +| `Write_size` | `uint` | bytes | - | +| `Prewrite_region` | `uint` | count | - | + +### Effective behavior and matching order + +- Rule update behavior: every execution of `SET [SESSION|GLOBAL] tidb_slow_log_rules = '...'` overwrites the existing rules in that scope instead of appending to them. +- Rule clearing behavior: `SET [SESSION|GLOBAL] tidb_slow_log_rules = ''` clears the rules in the corresponding scope. +- If the current session has any applicable `tidb_slow_log_rules`, such as `SESSION` rules, `GLOBAL` rules for the current `Conn_ID`, or generic global rules without `Conn_ID`, the output of slow query logs is determined by rule matching results, and `tidb_slow_log_threshold` is no longer used. +- If the current session has no applicable rules, for example when both `SESSION` and `GLOBAL` rules are empty, or only `GLOBAL` rules that do not match the current `Conn_ID` are configured, slow query logging still depends on `tidb_slow_log_threshold`. Note that the unit is milliseconds. +- If you still want to use SQL execution time as a condition for writing slow logs, use `Query_time` in the rule and note that the unit is seconds. +- Rule matching logic: + - Multiple rules are combined with `OR`, while multiple field conditions within a single rule are combined with `AND`. + - `SESSION`-scope rules are matched first. 
If none matches, TiDB then matches `GLOBAL` rules for the current `Conn_ID`, followed by generic `GLOBAL` rules without `Conn_ID`. +- `SHOW VARIABLES LIKE 'tidb_slow_log_rules'` and `SELECT @@SESSION.tidb_slow_log_rules` return the `SESSION` rule text, or an empty string if unset. `SELECT @@GLOBAL.tidb_slow_log_rules` returns the `GLOBAL` rule text. + +### Examples + +- Standard format (`SESSION` scope): + + ```sql + SET SESSION tidb_slow_log_rules = 'Query_time: 0.5, Is_internal: false'; + ``` + +- Invalid format (`SESSION` scope does not support `Conn_ID`): + + ```sql + SET SESSION tidb_slow_log_rules = 'Conn_ID: 12, Query_time: 0.5, Is_internal: false'; + ``` + +- Global rule (applies to all connections): + + ```sql + SET GLOBAL tidb_slow_log_rules = 'Query_time: 0.5, Is_internal: false'; + ``` + +- Global rules for specific connections (applied separately to the two connections `Conn_ID:11` and `Conn_ID:12`): + + ```sql + SET GLOBAL tidb_slow_log_rules = 'Conn_ID: 11, Query_time: 0.5, Is_internal: false; Conn_ID: 12, Query_time: 0.6, Process_time: 0.3, DB: db1'; + ``` + +### Recommendations + +- `tidb_slow_log_rules` is designed to replace the single-threshold approach. It supports combinations of multi-dimensional metric conditions, enabling more flexible and fine-grained control over slow query logging. + +- In a well-provisioned test environment with 1 TiDB node (16 CPU cores, 48 GiB memory) and 3 TiKV nodes (each with 16 CPU cores and 48 GiB memory), repeated sysbench tests show that performance impact remains small when multi-dimensional slow query log rules generate millions of slow log entries within 30 minutes. However, when the log volume reaches tens of millions, TPS drops significantly and latency increases noticeably. Therefore, if business workload is high or CPU and memory resources are close to their limits, configure `tidb_slow_log_rules` carefully to avoid log flooding caused by overly broad rules. 
If you need to limit the log output rate, use [`tidb_slow_log_max_per_sec`](/system-variables.md#tidb_slow_log_max_per_sec-new-in-v856) to throttle it and reduce the impact on business performance. + ## Related system variables -* [`tidb_slow_log_threshold`](/system-variables.md#tidb_slow_log_threshold): Sets the threshold for the slow log. The SQL statement whose execution time exceeds this threshold is recorded in the slow log. The default value is 300 (ms). -* [`tidb_query_log_max_len`](/system-variables.md#tidb_query_log_max_len): Sets the maximum length of the SQL statement recorded in the slow log. The default value is 4096 (byte). -* [tidb_redact_log](/system-variables.md#tidb_redact_log): Determines whether to desensitize user data using `?` in the SQL statement recorded in the slow log. The default value is `0`, which means to disable the feature. -* [`tidb_enable_collect_execution_info`](/system-variables.md#tidb_enable_collect_execution_info): Determines whether to record the physical execution information of each operator in the execution plan. The default value is `1`. This feature impacts the performance by approximately 3%. After enabling this feature, you can view the `Plan` information as follows: +* [`tidb_slow_log_rules`](/system-variables.md#tidb_slow_log_rules-new-in-v856): see [Recommendations](#recommendations). + +* [`tidb_slow_log_threshold`](/system-variables.md#tidb_slow_log_threshold): sets the threshold for slow query logging. SQL statements whose execution time exceeds this threshold are recorded in the slow query log. The default value is `300` milliseconds. + + > **Tip:** + > + > Time-related fields in `tidb_slow_log_rules`, such as `Query_time` and `Process_time`, use seconds as the unit and can include decimals, while [`tidb_slow_log_threshold`](/system-variables.md#tidb_slow_log_threshold) uses milliseconds.
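    For example, the following two statements express the same 300-millisecond boundary in their respective units (note that once a rule is set, the threshold is no longer used):

    ```sql
    SET GLOBAL tidb_slow_log_threshold = 300;           -- unit: milliseconds
    SET GLOBAL tidb_slow_log_rules = 'Query_time: 0.3'; -- unit: seconds
    ```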
+ +* [`tidb_slow_log_max_per_sec`](/system-variables.md#tidb_slow_log_max_per_sec-new-in-v856): sets the maximum number of slow query log entries that can be written per second. The default value is `0`. This variable is introduced in v8.5.6. + * A value of `0` means there is no limit on the number of slow query log entries written per second. + * A value greater than `0` means TiDB writes at most the specified number of slow query log entries per second. Any excess log entries are discarded and not written to the slow query log file. + * It is recommended to set this variable after enabling `tidb_slow_log_rules` to prevent rule-based slow query logging from being triggered too frequently. + +* [`tidb_query_log_max_len`](/system-variables.md#tidb_query_log_max_len): sets the maximum length of the SQL statement recorded in the slow query log. The default value is 4096 (byte). + +* [`tidb_redact_log`](/system-variables.md#tidb_redact_log): controls whether user data in SQL statements recorded in the slow query log is redacted and replaced with `?`. The default value is `0`, which means this feature is disabled. + +* [`tidb_enable_collect_execution_info`](/system-variables.md#tidb_enable_collect_execution_info): controls whether to record the physical execution information of each operator in the execution plan. The default value is `1`. This feature impacts the performance by approximately 3%. 
After enabling this feature, you can view the `Plan` information as follows: ```sql > select tidb_decode_plan('jAOIMAk1XzE3CTAJMQlmdW5jczpjb3VudChDb2x1bW4jNyktPkMJC/BMNQkxCXRpbWU6MTAuOTMxNTA1bXMsIGxvb3BzOjIJMzcyIEJ5dGVzCU4vQQoxCTMyXzE4CTAJMQlpbmRleDpTdHJlYW1BZ2dfOQkxCXQRSAwyNzY4LkgALCwgcnBjIG51bTogMQkMEXMQODg0MzUFK0hwcm9jIGtleXM6MjUwMDcJMjA2HXsIMgk1BWM2zwAAMRnIADcVyAAxHcEQNQlOL0EBBPBbCjMJMTNfMTYJMQkzMTI4MS44NTc4MTk5MDUyMTcJdGFibGU6dCwgaW5kZXg6aWR4KGEpLCByYW5nZTpbLWluZiw1MDAwMCksIGtlZXAgb3JkZXI6ZmFsc2UJMjUBrgnQVnsA'); diff --git a/information-schema/information-schema-analyze-status.md b/information-schema/information-schema-analyze-status.md index 4a57f1433ed7d..837f319a2d408 100644 --- a/information-schema/information-schema-analyze-status.md +++ b/information-schema/information-schema-analyze-status.md @@ -79,4 +79,4 @@ Fields in the `ANALYZE_STATUS` table are described as follows: ## See also - [`ANALYZE TABLE`](/sql-statements/sql-statement-analyze-table.md) -- [`SHOW ANALYZE STATUS`](/sql-statements/sql-statement-show-analyze-status.md) \ No newline at end of file +- [`SHOW ANALYZE STATUS`](/sql-statements/sql-statement-show-analyze-status.md) diff --git a/information-schema/information-schema.md b/information-schema/information-schema.md index 9f949f0cef976..5222584760617 100644 --- a/information-schema/information-schema.md +++ b/information-schema/information-schema.md @@ -20,7 +20,7 @@ Many `INFORMATION_SCHEMA` tables have a corresponding `SHOW` statement. The bene | [`COLLATIONS`](/information-schema/information-schema-collations.md) | Provides a list of collations that the server supports. | | [`COLLATION_CHARACTER_SET_APPLICABILITY`](/information-schema/information-schema-collation-character-set-applicability.md) | Explains which collations apply to which character sets. | | [`COLUMNS`](/information-schema/information-schema-columns.md) | Provides a list of columns for all tables. | -| `COLUMN_PRIVILEGES` | Not implemented by TiDB. Returns zero rows. 
| +| `COLUMN_PRIVILEGES` | Summarizes the information about column privileges visible to the current user. | | `COLUMN_STATISTICS` | Not implemented by TiDB. Returns zero rows. | | [`ENGINES`](/information-schema/information-schema-engines.md) | Provides a list of supported storage engines. | | `EVENTS` | Not implemented by TiDB. Returns zero rows. | @@ -38,14 +38,14 @@ Many `INFORMATION_SCHEMA` tables have a corresponding `SHOW` statement. The bene | `REFERENTIAL_CONSTRAINTS` | Provides information on `FOREIGN KEY` constraints. | | `ROUTINES` | Not implemented by TiDB. Returns zero rows. | | [`SCHEMATA`](/information-schema/information-schema-schemata.md) | Provides similar information to `SHOW DATABASES`. | -| `SCHEMA_PRIVILEGES` | Not implemented by TiDB. Returns zero rows. | +| `SCHEMA_PRIVILEGES` | Summarizes the database privileges visible to the current user. | | `SESSION_STATUS` | Not implemented by TiDB. Returns zero rows. | | [`SESSION_VARIABLES`](/information-schema/information-schema-session-variables.md) | Provides similar functionality to the command `SHOW SESSION VARIABLES` | | [`STATISTICS`](/information-schema/information-schema-statistics.md) | Provides information on table indexes. | | [`TABLES`](/information-schema/information-schema-tables.md) | Provides a list of tables that the current user has visibility of. Similar to `SHOW TABLES`. | | `TABLESPACES` | Not implemented by TiDB. Returns zero rows. | | [`TABLE_CONSTRAINTS`](/information-schema/information-schema-table-constraints.md) | Provides information on primary keys, unique indexes and foreign keys. | -| `TABLE_PRIVILEGES` | Not implemented by TiDB. Returns zero rows. | +| `TABLE_PRIVILEGES` | Summarizes the table privileges visible to the current user. | | `TRIGGERS` | Not implemented by TiDB. Returns zero rows. | | [`USER_ATTRIBUTES`](/information-schema/information-schema-user-attributes.md) | Summarizes information about user comments and user attributes. 
| | [`USER_PRIVILEGES`](/information-schema/information-schema-user-privileges.md) | Summarizes the privileges associated with the current user. | diff --git a/media/dashboard/v8.5-top-sql-access.png b/media/dashboard/v8.5-top-sql-access.png new file mode 100644 index 0000000000000..3f7cf525df95c Binary files /dev/null and b/media/dashboard/v8.5-top-sql-access.png differ diff --git a/media/dashboard/v8.5-top-sql-details.png b/media/dashboard/v8.5-top-sql-details.png new file mode 100644 index 0000000000000..7357ba92c51cd Binary files /dev/null and b/media/dashboard/v8.5-top-sql-details.png differ diff --git a/media/dashboard/v8.5-top-sql-settings-enable-tikv-network-io.png b/media/dashboard/v8.5-top-sql-settings-enable-tikv-network-io.png new file mode 100644 index 0000000000000..0511908eb42e9 Binary files /dev/null and b/media/dashboard/v8.5-top-sql-settings-enable-tikv-network-io.png differ diff --git a/media/dashboard/v8.5-top-sql-usage-agg-by-db-detail.png b/media/dashboard/v8.5-top-sql-usage-agg-by-db-detail.png new file mode 100644 index 0000000000000..40959319968af Binary files /dev/null and b/media/dashboard/v8.5-top-sql-usage-agg-by-db-detail.png differ diff --git a/media/dashboard/v8.5-top-sql-usage-change-timerange.png b/media/dashboard/v8.5-top-sql-usage-change-timerange.png new file mode 100644 index 0000000000000..0dd5399c1b4fc Binary files /dev/null and b/media/dashboard/v8.5-top-sql-usage-change-timerange.png differ diff --git a/media/dashboard/v8.5-top-sql-usage-chart.png b/media/dashboard/v8.5-top-sql-usage-chart.png new file mode 100644 index 0000000000000..560812d79d6e9 Binary files /dev/null and b/media/dashboard/v8.5-top-sql-usage-chart.png differ diff --git a/media/dashboard/v8.5-top-sql-usage-refresh.png b/media/dashboard/v8.5-top-sql-usage-refresh.png new file mode 100644 index 0000000000000..a93449751eebb Binary files /dev/null and b/media/dashboard/v8.5-top-sql-usage-refresh.png differ diff --git 
a/media/dashboard/v8.5-top-sql-usage-select-agg-by.png b/media/dashboard/v8.5-top-sql-usage-select-agg-by.png new file mode 100644 index 0000000000000..a81848639ab39 Binary files /dev/null and b/media/dashboard/v8.5-top-sql-usage-select-agg-by.png differ diff --git a/media/dashboard/v8.5-top-sql-usage-select-instance.png b/media/dashboard/v8.5-top-sql-usage-select-instance.png new file mode 100644 index 0000000000000..995946eec11fe Binary files /dev/null and b/media/dashboard/v8.5-top-sql-usage-select-instance.png differ diff --git a/media/dashboard/v8.5-top-sql-usage-select-order-by.png b/media/dashboard/v8.5-top-sql-usage-select-order-by.png new file mode 100644 index 0000000000000..a643c494a93c0 Binary files /dev/null and b/media/dashboard/v8.5-top-sql-usage-select-order-by.png differ diff --git a/mysql-compatibility.md b/mysql-compatibility.md index bea14fd7213dd..28c3310d56018 100644 --- a/mysql-compatibility.md +++ b/mysql-compatibility.md @@ -55,7 +55,6 @@ You can try out TiDB features on [TiDB Playground](https://play.tidbcloud.com/?u + Optimizer trace + XML Functions + X-Protocol [#1109](https://github.com/pingcap/tidb/issues/1109) -+ Column-level privileges [#9766](https://github.com/pingcap/tidb/issues/9766) + `XA` syntax (TiDB uses a two-phase commit internally, but this is not exposed via an SQL interface) + `CREATE TABLE tblName AS SELECT stmt` syntax [#4754](https://github.com/pingcap/tidb/issues/4754) + `CHECK TABLE` syntax [#4673](https://github.com/pingcap/tidb/issues/4673) diff --git a/privilege-management.md b/privilege-management.md index 3d789cc54ec23..4771259195c24 100644 --- a/privilege-management.md +++ b/privilege-management.md @@ -30,6 +30,8 @@ Use the following statement to grant the `xxx` user all privileges on all databa GRANT ALL PRIVILEGES ON *.* TO 'xxx'@'%'; ``` +Starting from v8.5.6, TiDB supports a MySQL-compatible column-level privilege management mechanism. 
You can grant or revoke `SELECT`, `INSERT`, `UPDATE`, and `REFERENCES` privileges on specific columns in a specified table. For more information, see [Column-Level Privilege Management](/column-privilege-management.md). + By default, [`GRANT`](/sql-statements/sql-statement-grant-privileges.md) statements will return an error if the user specified does not exist. This behavior depends on whether the [SQL mode](/system-variables.md#sql_mode) `NO_AUTO_CREATE_USER` is specified: ```sql @@ -506,7 +508,7 @@ The following [`mysql` system tables](/mysql-schema/mysql-schema.md) are special - `mysql.user` (user account, global privilege) - `mysql.db` (database-level privilege) - `mysql.tables_priv` (table-level privilege) -- `mysql.columns_priv` (column-level privilege; not currently supported) +- `mysql.columns_priv` (column-level privilege; supported starting from v8.5.6) These tables contain the effective range and privilege information of the data. For example, in the `mysql.user` table: diff --git a/sql-statements/sql-statement-grant-privileges.md b/sql-statements/sql-statement-grant-privileges.md index 83b14b0f76581..5dfa733f90c4b 100644 --- a/sql-statements/sql-statement-grant-privileges.md +++ b/sql-statements/sql-statement-grant-privileges.md @@ -82,7 +82,7 @@ mysql> SHOW GRANTS FOR 'newuser'; ## MySQL compatibility * Similar to MySQL, the `USAGE` privilege denotes the ability to log into a TiDB server. -* Column level privileges are not currently supported. +* Starting from v8.5.6, TiDB supports a MySQL-compatible column-level privilege management mechanism. You can grant or revoke `SELECT`, `INSERT`, `UPDATE`, and `REFERENCES` privileges on specific columns in a specified table. For more information, see [Column-Level Privilege Management](/column-privilege-management.md). * Similar to MySQL, when the `NO_AUTO_CREATE_USER` SQL mode is not present, the `GRANT` statement will automatically create a new user with an empty password when a user does not exist.
Removing this SQL mode (it is enabled by default) presents a security risk. * In TiDB, after the `GRANT` statement is executed successfully, the execution result takes effect immediately on the current connection, whereas [in MySQL, for some privileges, the execution results take effect only on subsequent connections](https://dev.mysql.com/doc/refman/8.0/en/privilege-changes.html). See [TiDB #39356](https://github.com/pingcap/tidb/issues/39356) for details. diff --git a/sql-statements/sql-statement-revoke-privileges.md b/sql-statements/sql-statement-revoke-privileges.md index 9fb6f6681e9f5..0ab9b47106687 100644 --- a/sql-statements/sql-statement-revoke-privileges.md +++ b/sql-statements/sql-statement-revoke-privileges.md @@ -7,6 +7,8 @@ summary: An overview of the usage of REVOKE for the TiDB database. This statement removes privileges from an existing user. Executing this statement requires the `GRANT OPTION` privilege and all privileges you revoke. +Starting from v8.5.6, TiDB supports the MySQL-compatible column-level privilege management mechanism. You can specify a list of column names in `REVOKE`, for example, `REVOKE SELECT(col2) ON test.tbl FROM 'user'@'host';`. For more information, see [Column-Level Privilege Management](/column-privilege-management.md). + ## Synopsis ```ebnf+diagram diff --git a/sql-statements/sql-statement-select.md b/sql-statements/sql-statement-select.md index 70d06730753e5..d683c76dfd155 100644 --- a/sql-statements/sql-statement-select.md +++ b/sql-statements/sql-statement-select.md @@ -106,7 +106,8 @@ TableSample ::= > **Note:** > -> Starting from v6.6.0, TiDB supports [Resource Control](/tidb-resource-control-ru-groups.md). You can use this feature to execute SQL statements with different priorities in different resource groups. By configuring proper quotas and priorities for these resource groups, you can gain better scheduling control for SQL statements with different priorities.
When resource control is enabled, statement priority (`HIGH_PRIORITY`) will no longer take effect. It is recommended that you use [Resource Control](/tidb-resource-control-ru-groups.md) to manage resource usage for different SQL statements. +> - Starting from v8.5.6, TiDB supports using table aliases in the `FOR UPDATE OF` clause. To maintain backward compatibility, you can still reference the base table name when an alias is defined, but this triggers a warning that recommends using the explicit alias. When a query involves multiple tables with the same name across different databases (for example, `FROM db1.t, db2.t FOR UPDATE OF t`), TiDB now matches the target table from left to right based on the order in the `FROM` clause, rather than the current database context. To avoid ambiguity, it is recommended that you specify the database name or use aliases in the `FOR UPDATE OF` clause. +> - Starting from v6.6.0, TiDB supports [Resource Control](/tidb-resource-control-ru-groups.md). You can use this feature to execute SQL statements with different priorities in different resource groups. By configuring proper quotas and priorities for these resource groups, you can gain better scheduling control for SQL statements with different priorities. When resource control is enabled, statement priority (`HIGH_PRIORITY`) will no longer take effect. It is recommended that you use [Resource Control](/tidb-resource-control-ru-groups.md) to manage resource usage for different SQL statements. ## Examples diff --git a/statistics.md b/statistics.md index 39db445c5a660..fd4493b6eb18d 100644 --- a/statistics.md +++ b/statistics.md @@ -355,13 +355,17 @@ WHERE db_name = 'test' AND table_name = 't' AND last_analyzed_at IS NOT NULL; ## Versions of statistics -The [`tidb_analyze_version`](/system-variables.md#tidb_analyze_version-new-in-v510) variable controls the statistics collected by TiDB. 
Currently, two versions of statistics are supported: `tidb_analyze_version = 1` and `tidb_analyze_version = 2`. +> **Warning:** +> +> Starting from v8.5.6, Statistics Version 1 (`tidb_analyze_version = 1`) is deprecated and will be removed in a future release. It is recommended that you use Statistics Version 2 (`tidb_analyze_version = 2`) and [migrate existing objects that use Statistics Version 1 to Version 2](#switch-between-statistics-versions). + +The [`tidb_analyze_version`](/system-variables.md#tidb_analyze_version-new-in-v510) variable controls the statistics collected by TiDB. Currently, TiDB supports two statistics versions: `tidb_analyze_version = 1` and `tidb_analyze_version = 2`. - For TiDB Self-Managed, the default value of this variable changes from `1` to `2` starting from v5.3.0. - For TiDB Cloud, the default value of this variable changes from `1` to `2` starting from v6.5.0. - If your cluster is upgraded from an earlier version, the default value of `tidb_analyze_version` does not change after the upgrade. -Version 2 is preferred, and will continue to be enhanced to ultimately replace Version 1 completely. Compared to Version 1, Version 2 improves the accuracy of many of the statistics collected for larger data volumes. Version 2 also improves collection performance by removing the need to collect Count-Min sketch statistics for predicate selectivity estimation, and also supporting automated collection only on selected columns (see [Collecting statistics on some columns](#collect-statistics-on-some-columns)). +Version 2 is the recommended statistics version. Compared to Version 1, Version 2 improves the accuracy of many statistics for larger data volumes. Version 2 also improves collection performance by removing the need to collect Count-Min sketch statistics. 
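To check which statistics version a cluster currently collects and to switch to Version 2, you can run statements like the following (the table name `test.t` is only an illustration; after switching, rerun `ANALYZE` so that objects are re-collected with Version 2):

```sql
-- Check the current statistics version
SHOW VARIABLES LIKE 'tidb_analyze_version';

-- Collect Version 2 statistics for subsequent ANALYZE operations
SET GLOBAL tidb_analyze_version = 2;

-- Re-collect statistics for a table so that it uses Version 2
ANALYZE TABLE test.t;
```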
The following table lists the information collected by each version for usage in the optimizer estimates: @@ -376,11 +380,11 @@ The following table lists the information collected by each version for usage in ### Switch between statistics versions -It is recommended to ensure that all tables/indexes (and partitions) utilize statistics collection from the same version. Version 2 is recommended, however, it is not recommended to switch from one version to another without a justifiable reason such as an issue experienced with the version in use. A switch between versions might take a period of time when no statistics are available until all tables have been analyzed with the new version, which might negatively affect the optimizer plan choices if statistics are not available. +It is recommended that all tables, indexes, and partitions use the same statistics version. If your cluster still uses Statistics Version 1, migrate to Statistics Version 2 as soon as possible. Until Version 2 statistics are collected for an object (such as a table, an index, or a partition), TiDB continues to use the existing Version 1 statistics for that object. -Examples of justifications to switch might include - with Version 1, there could be inaccuracies in equal/IN predicate estimation due to hash collisions when collecting Count-Min sketch statistics. Solutions are listed in the [Count-Min Sketch](#count-min-sketch) section. Alternatively, setting `tidb_analyze_version = 2` and rerunning `ANALYZE` on all objects is also a solution. In the early release of Version 2, there was a risk of memory overflow after `ANALYZE`. This issue is resolved, but initially, one solution was to set `tidb_analyze_version = 1` and rerun `ANALYZE` on all objects. +One major reason to migrate is that Version 1 might produce inaccurate estimates for equal/IN predicates because the Count-Min Sketch can have hash collisions. For more information, see [Count-Min Sketch](#count-min-sketch). 
To avoid this issue, set `tidb_analyze_version = 2` and rerun `ANALYZE` on all objects. -To prepare `ANALYZE` for switching between versions: +To prepare `ANALYZE` for migrating from Statistics Version 1 to Statistics Version 2: - If the `ANALYZE` statement is executed manually, manually analyze every table to be analyzed. @@ -388,17 +392,10 @@ To prepare `ANALYZE` for switching between versions: SELECT DISTINCT(CONCAT('ANALYZE TABLE ', table_schema, '.', table_name, ';')) FROM information_schema.tables JOIN mysql.stats_histograms ON table_id = tidb_table_id - WHERE stats_ver = 2; + WHERE stats_ver = 1; ``` -- If TiDB automatically executes the `ANALYZE` statement because the auto-analysis has been enabled, execute the following statement that generates the [`DROP STATS`](/sql-statements/sql-statement-drop-stats.md) statement: - - ```sql - SELECT DISTINCT(CONCAT('DROP STATS ', table_schema, '.', table_name, ';')) - FROM information_schema.tables JOIN mysql.stats_histograms - ON table_id = tidb_table_id - WHERE stats_ver = 2; - ``` +- If TiDB automatically executes the `ANALYZE` statement because auto-analysis is enabled, after you set `tidb_analyze_version = 2`, TiDB gradually refreshes statistics to Version 2 through subsequent auto-analysis. Before Version 2 statistics are collected for an object, TiDB can continue to use its existing Version 1 statistics. To speed up the migration for important objects, run `ANALYZE` on them manually. 
- If the result of the preceding statement is too long to copy and paste, you can export the result to a temporary text file and then execute it from the file like this: diff --git a/sync-diff-inspector/sync-diff-inspector-overview.md b/sync-diff-inspector/sync-diff-inspector-overview.md index c32174029ffb5..713b167045ed7 100644 --- a/sync-diff-inspector/sync-diff-inspector-overview.md +++ b/sync-diff-inspector/sync-diff-inspector-overview.md @@ -5,18 +5,9 @@ summary: Use sync-diff-inspector to compare data and repair inconsistent data. # sync-diff-inspector User Guide -[sync-diff-inspector](https://github.com/pingcap/tidb-tools/tree/master/sync_diff_inspector) is a tool used to compare data stored in the databases with the MySQL protocol. For example, it can compare the data in MySQL with that in TiDB, the data in MySQL with that in MySQL, or the data in TiDB with that in TiDB. In addition, you can also use this tool to repair data in the scenario where a small amount of data is inconsistent. +[sync-diff-inspector](https://github.com/pingcap/tiflow/tree/master/sync_diff_inspector) is a tool used to compare data stored in databases that use the MySQL protocol. For example, it can compare the data in MySQL with that in TiDB, the data in MySQL with that in MySQL, or the data in TiDB with that in TiDB. In addition, you can also use this tool to repair data when a small amount of data is inconsistent. -This guide introduces the key features of sync-diff-inspector and describes how to configure and use this tool. To download sync-diff-inspector, use one of the following methods: - -+ Binary package. The sync-diff-inspector binary package is included in the TiDB Toolkit. To download the TiDB Toolkit, see [Download TiDB Tools](/download-ecosystem-tools.md). -+ Docker image.
Execute the following command to download: - - {{< copyable "shell-regular" >}} - - ```shell - docker pull pingcap/tidb-tools:latest - ``` +This guide introduces the key features of sync-diff-inspector and describes how to configure and use this tool. ## Key features @@ -27,6 +18,36 @@ This guide introduces the key features of sync-diff-inspector and describes how * Support [data check for TiDB upstream-downstream clusters](/ticdc/ticdc-upstream-downstream-check.md) * Support [data check in the DM replication scenario](/sync-diff-inspector/dm-diff.md) +## Install sync-diff-inspector + +The installation method varies depending on your TiDB version: + +For TiDB v8.5.6 and later: + ++ Install using TiUP: + + ```shell + tiup install sync-diff-inspector + ``` + ++ Binary package: included in the TiDB Toolkit. To download the toolkit, see [Download TiDB Tools](/download-ecosystem-tools.md). + ++ Docker image: execute the following command to download: + + ```shell + docker pull pingcap/sync-diff-inspector:latest + ``` + +For versions before v8.5.6: + ++ Binary package: included in the TiDB Toolkit (from the legacy [`tidb-tools`](https://github.com/pingcap/tidb-tools) repository). To download the toolkit, see [Download TiDB Tools](/download-ecosystem-tools.md). + ++ Docker image (legacy version): execute the following command to download: + + ```shell + docker pull pingcap/tidb-tools:latest + ``` + ## Restrictions of sync-diff-inspector * Online check is not supported for data migration between MySQL and TiDB. Ensure that no data is written into the upstream-downstream checklist, and that data in a certain range is not changed. You can check data in this range by setting `range`. diff --git a/system-variables.md b/system-variables.md index dd168da0d23e5..95eca6d2d053c 100644 --- a/system-variables.md +++ b/system-variables.md @@ -588,6 +588,10 @@ This variable is an alias for [`last_insert_id`](#last_insert_id). 
- Unit: Seconds - The lock wait timeout for pessimistic transactions (default). +### InPacketBytes New in v8.5.6 + +- This variable is used only for internal statistics and is not visible to users. + ### interactive_timeout > **Note:** @@ -657,6 +661,10 @@ This variable is an alias for [`last_insert_id`](#last_insert_id). - In the `SESSION` scope, this variable is read-only. - This variable is compatible with MySQL. +### OutPacketBytes New in v8.5.6 + +- This variable is used only for internal statistics and is not visible to users. + ### password_history New in v6.5.0 - Scope: GLOBAL @@ -1129,6 +1137,10 @@ MPP is a distributed computing framework provided by the TiFlash engine, which a ### tidb_analyze_version New in v5.1.0 +> **Warning:** +> +> Starting from v8.5.6, Statistics Version 1 (`tidb_analyze_version = 1`) is deprecated and will be removed in a future release. It is recommended that you use `tidb_analyze_version = 2`. + - Scope: SESSION | GLOBAL - Persists to cluster: Yes - Applies to hint [SET_VAR](/optimizer-hints.md#set_varvar_namevar_value): No @@ -3119,6 +3131,15 @@ For a system upgraded to v5.0 from an earlier version, if you have not modified > > Starting from v6.6.0, TiDB supports [Resource Control](/tidb-resource-control-ru-groups.md). You can use this feature to execute SQL statements with different priorities in different resource groups. By configuring proper quotas and priorities for these resource groups, you can gain better scheduling control for SQL statements with different priorities. When resource control is enabled, statement priority will no longer take effect. It is recommended that you use [Resource Control](/tidb-resource-control-ru-groups.md) to manage resource usage for different SQL statements. 
+### tidb_foreign_key_check_in_shared_lock New in v8.5.6 + +- Scope: SESSION | GLOBAL +- Persists to cluster: Yes +- Applies to hint [SET_VAR](/optimizer-hints.md#set_varvar_namevar_value): No +- Type: Boolean +- Default value: `OFF` +- This variable controls whether foreign key constraint checks use shared locks instead of exclusive locks when locking rows in the parent table in pessimistic transactions. When enabled, multiple concurrent transactions can perform foreign key checks on the same parent row without blocking each other, thereby reducing lock conflicts and improving the performance of concurrent writes to child tables. + ### tidb_gc_concurrency New in v5.0 > **Note:** @@ -3475,11 +3496,11 @@ For a system upgraded to v5.0 from an earlier version, if you have not modified - Persists to cluster: Yes - Applies to hint [SET_VAR](/optimizer-hints.md#set_varvar_namevar_value): No - Type: Boolean -- Default value: `OFF` +- Default value: `ON`. Before v8.5.6, the default value is `OFF`. - This variable controls whether TiDB ignores the element differences in the `IN` list across different queries when generating Plan Digests. - - When it is the default value `OFF`, TiDB does not ignore the element differences (including the difference in the number of elements) in the `IN` list when generating Plan Digests. The element differences in the `IN` list result in different Plan Digests. - - When it is set to `ON`, TiDB ignores the element differences (including the difference in the number of elements) in the `IN` list and uses `...` to replace elements in the `IN` list in Plan Digests. In this case, TiDB generates the same Plan Digests for `IN` queries of the same type. + - When it is the default value `ON`, TiDB ignores the element differences (including the difference in the number of elements) in the `IN` list and uses `...` to replace elements in the `IN` list in Plan Digests. In this case, TiDB generates the same Plan Digests for `IN` queries of the same type. 
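As an illustrative sketch (table and column names are examples, not from this document), the scenario this variable targets looks like the following: with shared-lock foreign key checks enabled, two pessimistic transactions that insert different child rows referencing the same parent row no longer block each other on the parent-row lock:

```sql
SET GLOBAL tidb_foreign_key_check_in_shared_lock = ON;

CREATE TABLE parent (id INT PRIMARY KEY);
CREATE TABLE child (
    id INT PRIMARY KEY,
    parent_id INT,
    FOREIGN KEY (parent_id) REFERENCES parent (id)
);
INSERT INTO parent VALUES (1);

-- Session A: BEGIN; INSERT INTO child VALUES (10, 1);
-- Session B: BEGIN; INSERT INTO child VALUES (11, 1);
-- With the variable ON, each foreign key check takes a shared lock on
-- parent row 1, so the two sessions do not block each other.
```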
+ - When it is set to `OFF`, TiDB does not ignore the element differences (including the difference in the number of elements) in the `IN` list when generating Plan Digests. The element differences in the `IN` list result in different Plan Digests. ### tidb_index_join_batch_size @@ -3888,6 +3909,22 @@ For a system upgraded to v5.0 from an earlier version, if you have not modified - Range: `[100, 16384]` - This variable is used to set the maximum number of schema versions (the table IDs modified for corresponding versions) allowed to be cached. The value range is 100 ~ 16384. +### tidb_max_dist_task_nodes New in v8.5.6 + +- Scope: SESSION | GLOBAL +- Persists to cluster: Yes +- Applies to hint [SET_VAR](/optimizer-hints.md#set_varvar_namevar_value): No +- Type: Integer +- Default value: `-1` +- Range: `-1` or `[1, 128]` +- This variable defines the maximum number of TiDB nodes that Distributed eXecution Framework (DXF) tasks can use. The default value `-1` indicates that automatic mode is enabled. In this mode, TiDB dynamically calculates the value as `min(3, tikv_nodes / 3)`, where `tikv_nodes` is the number of TiKV nodes in the cluster. + +> **Note:** +> +> If you explicitly set the [`tidb_service_scope`](#tidb_service_scope-new-in-v740) system variable for some TiDB nodes, DXF schedules tasks only to these nodes. In this case, even if you set `tidb_max_dist_task_nodes` to a larger value, DXF uses at most the number of nodes you explicitly configured with `tidb_service_scope`. +> +> For example, if the cluster has 10 TiDB nodes and 4 of them are configured with `tidb_service_scope = group1`, then even if you set `tidb_max_dist_task_nodes = 5`, only 4 nodes participate in the task execution. + ### tidb_max_paging_size New in v6.3.0 - Scope: SESSION | GLOBAL @@ -4498,6 +4535,17 @@ mysql> desc select count(distinct a) from test.t; - This variable is used to control the selection of the TiDB Join Reorder algorithm. 
When the number of nodes participating in Join Reorder is greater than this threshold, TiDB selects the greedy algorithm, and when it is less than this threshold, TiDB selects the dynamic programming algorithm. - Currently, for OLTP queries, it is recommended to keep the default value. For OLAP queries, it is recommended to set the variable value to 10~15 to get better join orders. +### tidb_opt_join_reorder_through_sel New in v8.5.6 + +- Scope: SESSION | GLOBAL +- Persists to cluster: Yes +- Applies to hint [SET_VAR](/optimizer-hints.md#set_varvar_namevar_value): Yes +- Type: Boolean +- Default value: `OFF` +- This variable improves Join Reorder optimization for certain multi-table join queries. If you set it to `ON`, the optimizer includes filter conditions (`Selection`) between multiple consecutive joins in the candidate range for Join Reorder optimization, provided safety conditions are met. When rebuilding the join tree, the optimizer pushes these conditions down to more suitable positions, which lets more tables participate in Join Reorder optimization. +- If you observe performance regressions or unstable execution plans after enabling this variable, set it to `OFF` to disable this feature. +- To keep the evaluation semantics of expressions unchanged, the optimizer does not push down filter conditions that contain non-deterministic functions or functions with side effects (such as `RAND()`), even when this variable is enabled.
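The following sketch shows the query shape that `tidb_opt_join_reorder_through_sel` targets (table names `t1`, `t2`, and `t3` are illustrative): a filter on the result of one join sits between two consecutive joins, and with the variable set to `ON` the optimizer can push the filter down and consider all three tables together during Join Reorder:

```sql
SET SESSION tidb_opt_join_reorder_through_sel = ON;

-- The filter on the t1-t2 join result forms a Selection between two
-- consecutive joins. With the variable enabled, t1, t2, and t3 can all
-- participate in Join Reorder.
EXPLAIN
SELECT *
FROM (SELECT t1.a, t2.b FROM t1 JOIN t2 ON t1.a = t2.a WHERE t1.c + t2.c > 10) AS j
JOIN t3 ON j.a = t3.a;
```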
+ ### tidb_opt_limit_push_down_threshold - Scope: SESSION | GLOBAL @@ -4695,6 +4743,73 @@ mysql> desc select count(distinct a) from test.t; +----------------------------------+---------+-----------+----------------------+-------------------------------------+ ``` +### tidb_opt_partial_ordered_index_for_topn New in v8.5.6 + +- Scope: SESSION | GLOBAL +- Persists to cluster: Yes +- Applies to hint [SET_VAR](/optimizer-hints.md#set_varvar_namevar_value): Yes +- Type: Enum +- Default value: `DISABLE` +- Possible values: `DISABLE`, `COST` +- Controls whether the optimizer can leverage the partial order of an index to optimize TopN computation when a query contains `ORDER BY ... LIMIT`. When the sort column matches the index order (for example, the sort column is an index column or has a prefix index), the data returned by the index scan is already partially ordered on that column. In this case, the optimizer can incrementally build the TopN result during the scan and stop early once the `LIMIT` is satisfied, thereby reducing sorting overhead. +- Usage scenarios: When the sort column in an `ORDER BY ... LIMIT` clause is a long string with only a prefix index, you can set this variable to `COST` and specify a `USE INDEX` or `FORCE INDEX` hint in the query to enable the partial order TopN optimization and reduce the sorting overhead. + + - The default value is `DISABLE`, which means the partial order TopN optimization is disabled. In this case, the optimizer uses the standard global sorting approach for TopN. + - To force the use of the partial order TopN optimization, set this variable to `COST` and specify a qualifying index in the query using `USE INDEX` or `FORCE INDEX`.
If the specified index does not meet the prerequisites for this optimization (for example, the `ORDER BY` clause does not match the index prefix, or the query contains unsupported ordering patterns), the optimization might not be applied even when the variable is set to `COST`, and the execution plan falls back to the standard TopN approach. + + > **Note:** + > + > Currently, the optimizer does not support dynamically deciding whether to apply the partial order TopN optimization based on the cost model. If you only set this variable to `COST` without specifying `USE INDEX` or `FORCE INDEX`, the optimizer might not apply this optimization. To ensure that the optimization is applied, use it together with `USE INDEX` or `FORCE INDEX`. + +
+View examples of partial order TopN optimization + +Create a table `t_varchar` and define a prefix index `idx_name_prefix(name(10))` on the string column `name`: + +```sql +CREATE TABLE t_varchar ( + id INT PRIMARY KEY, + name VARCHAR(255), + INDEX idx_name_prefix(name(10)) +); +``` + +- Force the partial order TopN optimization (`COST` + `USE INDEX`): + + ```sql + > SET SESSION tidb_opt_partial_ordered_index_for_topn = 'COST'; + + > EXPLAIN FORMAT='brief' SELECT /*+ use_index(t_varchar, idx_name_prefix) */ * + FROM t_varchar ORDER BY name LIMIT 5; + +-------------------------------------------+---------+-----------+------------------------------+----------------------------------------------------------------------------------------------+ + | id | estRows | task | access object | operator info | + +-------------------------------------------+---------+-----------+------------------------------+----------------------------------------------------------------------------------------------+ + | TopN | 5.00 | root | | planner__core__partial_order_topn.t_varchar.name, offset:0, count:5, prefix_col:planner__core__partial_order_topn.t_varchar.name, prefix_len:10 | + | └─IndexLookUp | 5.00 | root | | | + | ├─Limit(Build) | 5.00 | cop[tikv] | | offset:0, count:5, prefix_col:planner__core__partial_order_topn.t_varchar.name, prefix_len:10 | + | │ └─IndexFullScan | 10000.00| cop[tikv] | table:t_varchar, index:idx_name_prefix(name) | keep order:true, stats:pseudo | + | └─TableRowIDScan(Probe) | 5.00 | cop[tikv] | table:t_varchar | keep order:false, stats:pseudo | + +-------------------------------------------+---------+-----------+------------------------------+----------------------------------------------------------------------------------------------+ + ``` + +- Disable the partial order TopN optimization (`DISABLE`): + + ```sql + > SET SESSION tidb_opt_partial_ordered_index_for_topn = 'DISABLE'; + + > EXPLAIN FORMAT='brief' SELECT * FROM t_varchar ORDER BY name LIMIT 5; 
+ +---------------------------+---------+-----------+---------------------+----------------------------------------------------+ + | id | estRows | task | access object | operator info | + +---------------------------+---------+-----------+---------------------+----------------------------------------------------+ + | TopN | 5.00 | root | | planner__core__partial_order_topn.t_varchar.name, offset:0, count:5 | + | └─TableReader | 5.00 | root | data:TopN | | + | └─TopN | 5.00 | cop[tikv] | | planner__core__partial_order_topn.t_varchar.name, offset:0, count:5 | + | └─TableFullScan | 10000.00| cop[tikv] | table:t_varchar | keep order:false, stats:pseudo | + +---------------------------+---------+-----------+---------------------+----------------------------------------------------+ + ``` + +
+ ### tidb_opt_prefer_range_scan New in v5.0 > **Note:** @@ -5696,7 +5811,7 @@ SHOW WARNINGS; - Applies to hint [SET_VAR](/optimizer-hints.md#set_varvar_namevar_value): No - Type: String - Default value: "" -- Optional value: a string with a length of up to 64 characters. Valid characters include digits `0-9`, letters `a-zA-Z`, underscores `_`, and hyphens `-`. +- Optional value: a string with a length of up to 64 characters. Valid characters include digits `0-9`, letters `a-zA-Z`, underscores `_`, and hyphens `-`. Starting from v8.5.6, the value of this variable is case-insensitive. TiDB converts the input value to lowercase for storage and comparison. - This variable is an instance-level system variable. You can use it to control the service scope of each TiDB node under the [TiDB Distributed eXecution Framework (DXF)](/tidb-distributed-execution-framework.md). The DXF determines which TiDB nodes can be scheduled to execute distributed tasks based on the value of this variable. For specific rules, see [Task scheduling](/tidb-distributed-execution-framework.md#task-scheduling). ### tidb_session_alias New in v7.4.0 @@ -5806,6 +5921,34 @@ Query OK, 0 rows affected, 1 warning (0.00 sec) > > If the character check is skipped, TiDB might fail to detect invalid UTF-8 characters written by the application, cause decoding errors when `ANALYZE` is executed, and introduce other unknown encoding issues. If your application cannot guarantee the validity of the written string, it is not recommended to skip the character check. +### tidb_slow_log_max_per_sec New in v8.5.6 + +- Scope: GLOBAL +- Persists to cluster: Yes +- Applies to hint [SET_VAR](/optimizer-hints.md#set_varvar_namevar_value): No +- Default value: `0` +- Type: Integer +- Range: `[0, 1000000]` +- This variable controls the maximum number of slow query log entries that can be written per TiDB node per second. + - A value of `0` means there is no limit on the number of slow query log entries written per second. 
+ - A value greater than `0` means TiDB writes at most the specified number of slow query log entries per second. Any excess log entries are discarded and not written to the slow query log file. +- This variable is often used with [`tidb_slow_log_rules`](#tidb_slow_log_rules-new-in-v856) to prevent excessive slow query logs from being generated under high-workload conditions. + +### tidb_slow_log_rules New in v8.5.6 + +- Scope: SESSION | GLOBAL +- Persists to cluster: Yes +- Applies to hint [SET_VAR](/optimizer-hints.md#set_varvar_namevar_value): No +- Default value: "" +- Type: String +- This variable defines the triggering rules for slow query logs. It supports combining multi-dimensional metrics to provide more flexible and fine-grained logging. +- For more information about how to use this system variable, see [Use `tidb_slow_log_rules`](/identify-slow-queries.md#use-tidb_slow_log_rules). + +> **Tip:** +> +> - When enabling `tidb_slow_log_rules` in a production environment, it is recommended to also configure [`tidb_slow_log_max_per_sec`](#tidb_slow_log_max_per_sec-new-in-v856) to avoid excessively frequent slow query log printing. +> - It is recommended to start with stricter conditions and gradually relax them based on troubleshooting needs. For more information on performance impact, see [Recommendations](/identify-slow-queries.md#recommendations). + ### tidb_slow_log_threshold > **Note:** diff --git a/ticdc/ticdc-architecture.md b/ticdc/ticdc-architecture.md index a3c9cd37f8634..f51fb8ebda0b9 100644 --- a/ticdc/ticdc-architecture.md +++ b/ticdc/ticdc-architecture.md @@ -64,7 +64,17 @@ When this feature is enabled, TiCDC automatically splits and distributes tables > **Note:** > -> For MySQL sink changefeeds, only tables that meet one of the preceding conditions and have **exactly one primary key or non-null unique key** can be split and distributed by TiCDC, to ensure the correctness of data replication in table split mode. 
+> For MySQL sink changefeeds, only tables that meet one of the preceding conditions and have **exactly one primary key or non-null unique key** can be split and distributed by TiCDC, to ensure the correctness of data replication in table-level task splitting mode.
+
+### Recommended configurations for table-level task splitting
+
+After switching to the new TiCDC architecture, do not reuse the table-splitting configurations from the classic architecture. In most scenarios, use the default configuration of the new architecture. Make incremental adjustments from the default values only in special scenarios, such as when replication performance bottlenecks or scheduling imbalance occur.
+
+In table-level task splitting mode, pay attention to the following settings:
+
+- [`scheduler.region-threshold`](/ticdc/ticdc-changefeed-config.md#region-threshold): the default value is `10000`. When the number of Regions in a table exceeds this threshold, TiCDC splits the table. For tables with relatively few Regions but high overall write throughput, you can reduce this value appropriately. This parameter must be greater than or equal to `scheduler.region-count-per-span`. Otherwise, tasks might be rescheduled repeatedly, which increases replication latency.
+- [`scheduler.region-count-per-span`](/ticdc/ticdc-changefeed-config.md#region-count-per-span-new-in-v854): the default value is `100`. During changefeed initialization, TiCDC splits tables that meet the split conditions according to this parameter. After splitting, each sub-table contains at most `region-count-per-span` Regions.
+- [`scheduler.write-key-threshold`](/ticdc/ticdc-changefeed-config.md#write-key-threshold): the default value is `0` (disabled). When the sink write throughput of a table exceeds this threshold, TiCDC triggers table splitting. In most cases, keep this parameter at `0`.
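+
+For reference, these scheduler settings can be sketched in a changefeed configuration file as follows. This is a minimal example that only shows the default values described in this section; treat any deviation from the defaults as a workload-specific tuning decision:
+
+```toml
+# Changefeed configuration sketch for table-level task splitting
+# (TiCDC new architecture). The values below are the defaults.
+[scheduler]
+# Split a table once its Region count exceeds this threshold.
+# Keep this value greater than or equal to region-count-per-span;
+# otherwise, tasks might be rescheduled repeatedly.
+region-threshold = 10000
+# After splitting, each sub-table contains at most this many Regions.
+region-count-per-span = 100
+# 0 disables throughput-based splitting; keep 0 in most cases.
+write-key-threshold = 0
+```
+
+You typically supply such a file through the `--config` option when creating the changefeed with `cdc cli changefeed create`.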
## Compatibility diff --git a/ticdc/ticdc-changefeed-config.md b/ticdc/ticdc-changefeed-config.md index f71fa7a1950ee..fc8337de54d0f 100644 --- a/ticdc/ticdc-changefeed-config.md +++ b/ticdc/ticdc-changefeed-config.md @@ -163,6 +163,11 @@ For more information, see [Event filter rules](/ticdc/ticdc-filter.md#event-filt - The value is `false` by default. Set it to `true` to enable this feature. - Default value: `false` +#### `region-count-per-span` New in v8.5.4 + +- Introduced in the [TiCDC new architecture](/ticdc/ticdc-architecture.md). During changefeed initialization, TiCDC splits tables that meet the split conditions according to this parameter. After splitting, each sub-table contains at most `region-count-per-span` Regions. +- Default value: `100` + #### `region-threshold` - Default value: for the [TiCDC new architecture](/ticdc/ticdc-architecture.md), the default value is `10000`; for the [TiCDC classic architecture](/ticdc/ticdc-classic-architecture.md), the default value is `100000`. diff --git a/ticdc/ticdc-csv.md b/ticdc/ticdc-csv.md index 40d6de9161fb8..25e26b355aa3b 100644 --- a/ticdc/ticdc-csv.md +++ b/ticdc/ticdc-csv.md @@ -28,6 +28,7 @@ quote = '"' null = '\N' include-commit-ts = true output-old-value = false +output-field-header = false # New in v8.5.6 (only available in the TiCDC new architecture) ``` ## Transactional constraints @@ -51,6 +52,12 @@ In the CSV file, each column is defined as follows: - Column 5: The `is-update` column only exists when the value of `output-old-value` is true, which is used to identify whether the row data change comes from the UPDATE event (the value of the column is true) or the INSERT/DELETE event (the value is false). - Column 6 to the last column: One or more columns with data changes. +For the [TiCDC new architecture](/ticdc/ticdc-architecture.md), when `output-field-header = true`, the CSV file includes a header row. 
The column names in the header row are as follows: + +| Column 1 | Column 2 | Column 3 | Column 4 (optional) | Column 5 (optional) | Column 6 | ... | Last column | +| --- | --- | --- | --- | --- | --- | --- | --- | +| `ticdc-meta$operation` | `ticdc-meta$table` | `ticdc-meta$schema` | `ticdc-meta$commit-ts` | `ticdc-meta$is-update` | The first column with data changes | ... | The last column with data changes | + Assume that table `hr.employee` is defined as follows: ```sql @@ -85,6 +92,19 @@ When `include-commit-ts = true` and `output-old-value = true`, the DML events of "I","employee","hr",433305438660591630,true,102,"Alex","Alice","2018-06-15","Beijing" ``` +When `include-commit-ts = true`, `output-old-value = true`, and `output-field-header = true`, the DML events of this table are stored in the CSV format as follows: + +```csv +ticdc-meta$operation,ticdc-meta$table,ticdc-meta$schema,ticdc-meta$commit-ts,ticdc-meta$is-update,Id,LastName,FirstName,HireDate,OfficeLocation +"I","employee","hr",433305438660591626,false,101,"Smith","Bob","2014-06-04","New York" +"D","employee","hr",433305438660591627,true,101,"Smith","Bob","2015-10-08","Shanghai" +"I","employee","hr",433305438660591627,true,101,"Smith","Bob","2015-10-08","Los Angeles" +"D","employee","hr",433305438660591629,false,101,"Smith","Bob","2017-03-13","Dallas" +"I","employee","hr",433305438660591630,false,102,"Alex","Alice","2017-03-14","Shanghai" +"D","employee","hr",433305438660591630,true,102,"Alex","Alice","2017-03-14","Beijing" +"I","employee","hr",433305438660591630,true,102,"Alex","Alice","2018-06-15","Beijing" +``` + ## Data type mapping | MySQL type | CSV type | Example | Description | diff --git a/tidb-cloud/tidb-cloud-clinic.md b/tidb-cloud/tidb-cloud-clinic.md index 12885991e8381..62f98130728e6 100644 --- a/tidb-cloud/tidb-cloud-clinic.md +++ b/tidb-cloud/tidb-cloud-clinic.md @@ -29,7 +29,7 @@ To view the **Cluster** page, take the following steps: - Advanced Metrics - Top Slow Queries (only 
supported when the TiDB version of the cluster is v8.1.1 or later, v7.5.4 or later) - - TopSQL (only supported when the TiDB version of the cluster is v8.1.1 or later, v7.5.4 or later) + - Top SQL (only supported when the TiDB version of the cluster is v8.1.1 or later, v7.5.4 or later) - Benchmark Report ## Monitor advanced metrics @@ -91,21 +91,21 @@ The retention policy for slow queries is 7 days. For more information, see [Slow Queries in TiDB Dashboard](https://docs.pingcap.com/tidb/stable/dashboard-slow-query). -## Monitor TopSQL +## Monitor Top SQL -TiDB Cloud Clinic provides TopSQL information, enabling you to monitor and visually explore the CPU overhead of each SQL statement in your database in real time. This helps you optimize and resolve database performance issues. +TiDB Cloud Clinic provides Top SQL information to help you visually analyze the most resource-intensive queries on a specific TiDB or TiKV node over a period of time. By default, Top SQL continuously collects CPU load data. For TiKV nodes, if TiKV network IO collection is enabled, you can also inspect `Network Bytes` and `Logical IO Bytes`, and analyze hotspots by `Query`, `Table`, `DB`, or `Region`. This helps you identify and troubleshoot performance issues across multiple resource dimensions, not just CPU. -To view TopSQL, take the following steps: +To view Top SQL, take the following steps: 1. In the [TiDB Cloud Clinic console](https://clinic.pingcap.com/), navigate to the **Cluster** page of a cluster. -2. Click **TopSQL**. +2. Click **Top SQL**. -3. Select a specific TiDB or TiKV instance to observe its load. You can use the time picker or select a time range in the chart to refine your analysis. +3. Select a specific TiDB or TiKV node to observe its workload. You can use the time picker or select a time range in the chart to refine your analysis. -4. Analyze the charts and tables displayed by TopSQL. +4. Analyze the charts and tables displayed by Top SQL. 
Depending on the selected node and enabled metrics, you can use `Order By` and the available aggregation dimensions to inspect CPU, network, or logical I/O hotspots. -For more information, see [TopSQL in TiDB Dashboard](https://docs.pingcap.com/tidb/stable/top-sql). +For more information, see [Top SQL in TiDB Dashboard](https://docs.pingcap.com/tidb/stable/top-sql). ## Generate benchmark reports diff --git a/tidb-resource-control-background-tasks.md b/tidb-resource-control-background-tasks.md index b7abbb156072e..bb48d9274adf7 100644 --- a/tidb-resource-control-background-tasks.md +++ b/tidb-resource-control-background-tasks.md @@ -5,12 +5,6 @@ summary: Introduces how to control background tasks through Resource Control. # Use Resource Control to Manage Background Tasks -> **Warning:** -> -> This feature is experimental. It is not recommended that you use it in the production environment. This feature might be changed or removed without prior notice. If you find a bug, you can report an [issue](https://docs.pingcap.com/tidb/stable/support) on GitHub. -> -> The background task management in resource control is based on TiKV's dynamic adjustment of resource quotas for CPU/IO utilization. Therefore, it relies on the available resource quota of each instance. If multiple components or instances are deployed on a single server, it is mandatory to set the appropriate resource quota for each instance through `cgroup`. It is difficult to achieve the expected effect in deployment with shared resources such as TiUP Playground. - > **Note:** > > This feature is not available on [{{{ .starter }}}](https://docs.pingcap.com/tidbcloud/select-cluster-tier#starter) and [{{{ .essential }}}](https://docs.pingcap.com/tidbcloud/select-cluster-tier#essential) clusters. 
@@ -19,6 +13,10 @@ Background tasks, such as data backup and automatic statistics collection, are l Starting from v7.4.0, the [TiDB resource control](/tidb-resource-control-ru-groups.md) feature supports managing background tasks. When a task is marked as a background task, TiKV dynamically limits the resources used by this type of task to avoid the impact on the performance of other foreground tasks. TiKV monitors the CPU and IO resources consumed by all foreground tasks in real time, and calculates the resource threshold that can be used by background tasks based on the total resource limit of the instance. All background tasks are restricted by this threshold during execution. +> **Note:** +> +> The background task management in resource control is based on TiKV's dynamic adjustment of resource quotas for CPU/IO utilization. Therefore, it relies on the available resource quota of each instance. If multiple components or instances are deployed on a single server, it is mandatory to set the appropriate resource quota for each instance through `cgroup`. It is difficult to achieve the expected effect in deployment with shared resources such as TiUP Playground. + ## `BACKGROUND` parameters - `TASK_TYPES`: specifies the task types that need to be managed as background tasks. Use commas (`,`) to separate multiple task types. @@ -28,7 +26,7 @@ TiDB supports the following types of background tasks: -- `lightning`: perform import tasks using [TiDB Lightning](/tidb-lightning/tidb-lightning-overview.md) or [`IMPORT INTO`](/sql-statements/sql-statement-import-into.md). Both TiDB Lightning physical and logical import modes are supported. +- `import`: perform import tasks using [TiDB Lightning](/tidb-lightning/tidb-lightning-overview.md) or [`IMPORT INTO`](/sql-statements/sql-statement-import-into.md). Both TiDB Lightning physical and logical import modes are supported. - `br`: perform backup and restore tasks using [BR](/br/backup-and-restore-overview.md). 
PITR is not supported. - `ddl`: control the resource usage during the batch data write back phase of Reorg DDLs. - `stats`: the [collect statistics](/statistics.md#collect-statistics) tasks that are manually executed or automatically triggered by TiDB. @@ -38,7 +36,7 @@ TiDB supports the following types of background tasks: -- `lightning`: perform import tasks using [TiDB Lightning](https://docs.pingcap.com/tidb/stable/tidb-lightning-overview). Both physical and logical import modes of TiDB Lightning are supported. +- `import`: perform import tasks using [TiDB Lightning](https://docs.pingcap.com/tidb/stable/tidb-lightning-overview). Both physical and logical import modes of TiDB Lightning are supported. - `br`: perform backup and restore tasks using [BR](https://docs.pingcap.com/tidb/stable/backup-and-restore-overview). PITR is not supported. - `ddl`: control the resource usage during the batch data write back phase of Reorg DDLs. - `stats`: the [collect statistics](/statistics.md#collect-statistics) tasks that are manually executed or automatically triggered by TiDB. diff --git a/tikv-configuration-file.md b/tikv-configuration-file.md index 566c25a63d835..2d2fe12b27fcf 100644 --- a/tikv-configuration-file.md +++ b/tikv-configuration-file.md @@ -2345,6 +2345,23 @@ Configures the behavior of TiKV automatic compaction. + Controls whether to force compaction on the bottommost files in RocksDB. + Default value: `true` +### `mvcc-read-aware-enabled` New in v8.5.6 + ++ Controls whether to enable MVCC-read-aware compaction. When enabled, TiKV tracks the number of MVCC versions scanned during read requests and uses this information to prioritize compaction for Regions with high MVCC read amplification. This reduces read latency for hot Regions that encounter many stale versions during scans. ++ Default value: `false` + +### `mvcc-scan-threshold` New in v8.5.6 + ++ The minimum number of MVCC versions scanned per read request to mark a Region as a compaction candidate. 
This configuration item takes effect only when [`mvcc-read-aware-enabled`](#mvcc-read-aware-enabled-new-in-v856) is set to `true`. ++ Default value: `1000` ++ Minimum value: `0` + +### `mvcc-read-weight` New in v8.5.6 + ++ The weight multiplier applied to MVCC read activity when calculating the compaction priority score for a Region. A higher value gives more weight to MVCC read amplification relative to other compaction triggers, such as tombstone density. This configuration item takes effect only when [`mvcc-read-aware-enabled`](#mvcc-read-aware-enabled-new-in-v856) is set to `true`. ++ Default value: `3.0` ++ Minimum value: `0.0` + ## backup Configuration items related to BR backup. @@ -2644,6 +2661,24 @@ To reduce write latency, TiKV periodically fetches and caches a batch of timesta + In a default TSO physical time update interval (`50ms`), PD provides at most 262144 TSOs. When requested TSOs exceed this number, PD provides no more TSOs. This configuration item is used to avoid exhausting TSOs and the reverse impact of TSO exhaustion on other businesses. If you increase the value of this configuration item to improve high availability, you need to decrease the value of [`tso-update-physical-interval`](/pd-configuration-file.md#tso-update-physical-interval) at the same time to get enough TSOs. + Default value: `8192` +## resource-metering + +Configuration items related to resource metering. + +### `enable-network-io-collection` New in v8.5.6 + ++ Controls whether to collect TiKV network traffic and logical I/O information in [Top SQL](/dashboard/top-sql.md) in addition to CPU data. ++ When enabled, TiKV additionally records inbound network bytes, outbound network bytes, logical read bytes, and logical write bytes during request processing. 
++ When reporting resource consumption, TiKV filters the Top N records based on CPU time, network traffic, and logical I/O, and additionally reports these statistics by Region for more fine-grained analysis of hotspot requests or resource usage sources. ++ Default value: `false` + +> **Note:** +> +> Logical I/O is not equivalent to physical I/O and cannot be directly correlated: +> +> - Logical I/O refers to the logical amount of data processed by requests at the TiKV storage layer, such as data scanned or processed during reads and data written by write requests. +> - Physical I/O refers to the actual disk read/write traffic on the underlying storage device, which is affected by block cache, compaction, flush, and other factors. + ## resource-control Configuration items related to resource control of the TiKV storage layer. diff --git a/upgrade-tidb-using-tiup.md b/upgrade-tidb-using-tiup.md index c9fc7b65bb798..0d3c5bda5813b 100644 --- a/upgrade-tidb-using-tiup.md +++ b/upgrade-tidb-using-tiup.md @@ -71,6 +71,7 @@ The following provides release notes you need to know when you upgrade from v8.4 - TiDB v8.5.3 [compatibility changes](/releases/release-8.5.3.md#compatibility-changes) - TiDB v8.5.4 [compatibility changes](/releases/release-8.5.4.md#compatibility-changes) - TiDB v8.5.5 [compatibility changes](https://docs.pingcap.com/tidb/stable/release-8.5.5/#compatibility-changes) +- TiDB v8.5.6 [compatibility changes](https://docs.pingcap.com/tidb/stable/release-8.5.6/#compatibility-changes) ### Step 2: Upgrade TiUP or TiUP offline mirror diff --git a/variables.json b/variables.json index 2e2509e7ab677..779b49bfb941d 100644 --- a/variables.json +++ b/variables.json @@ -1,7 +1,7 @@ { "tidb": "TiDB", - "tidb-version": "8.5.5", - "tidb-release-date": "2026-01-15", + "tidb-version": "8.5.6", + "tidb-release-date": "2026-04-14", "tidb-operator-version": "v1.6.4", "self-managed": "TiDB Self-Managed", "starter": "TiDB Cloud Starter",