release notes
release notes
Published 4/7/2021
MajorContains breaking changesAdd the tidb_executor_concurrency system variable to control the concurrency of multiple operators. The previous tidb_*_concurrency settings (such as tidb_projection_concurrency) still take effect but with a warning when you use them.
Add the tidb_skip_ascii_check system variable to specify whether to skip the ASCII validation check when the ASCII character set is written. This default value is OFF.
Add the tidb_enable_strict_double_type_check system variable to determine whether the syntax like double(N) can be defined in the table schema. This default value is OFF.
Change the default value of tidb_dml_batch_size from 20000 to 0. This means that batch DML statements are no longer used by default in LOAD/INSERT INTO SELECT .... Instead, large transactions are used to comply with the strict ACID semantics.
Note:
The scope of the variable is changed from session to global, and the default value is changed from
20000to0. If the application relies on the original default value, you need to use theset globalstatement to modify the variable to the original value after the upgrade.
Control temporary tables’ syntax compatibility using the tidb_enable_noop_functions system variable. When this variable value is OFF, the CREATE TEMPORARY TABLE syntax returns an error.
Add the following system variables to directly control the garbage collection-related parameters:
Change the default value of enable-joint-consensus from false to true, which enables the Joint Consensus feature by default.
Change the value of tidb_enable_amend_pessimistic_txn from 0 or 1 to ON or OFF.
Change the default value of tidb_enable_clustered_index from OFF to INT_ONLY with the following new meanings:
OFF: clustered index is enabled. Adding or deleting non-clustered indexes is supported.ON: clustered index is disabled. Adding or deleting non-clustered indexes is supported.INT_ONLY: the default value. The behavior is consistent with that before v5.0. You can control whether to enable clustered index for the INT type together with alter-primary-key = false.Note:
The
INT_ONLYvalue oftidb_enable_clustered_indexin 5.0 GA has the same meaning as theOFFvalue in 5.0 RC. After upgrading from a 5.0 RC cluster with theOFFsetting to 5.0 GA, it will be displayed asINT_ONLY.
Add the index-limit configuration item for TiDB to be compatible with MySQL’s limit on the maximum number of indexes. If the index setting exceeds the default value of this configuration item, when the table schema is re-imported into MySQL, an error is reported. This item’s value defaults to 64 and ranges between [64,64*8].
Add the enable-enum-length-limit configuration item for TiDB to be compatible and consistent with MySQL’s ENUM/SET length (ENUM length < 255). The default value is true.
Replace the pessimistic-txn.enable configuration item with the tidb_txn_mode environment variable.
Replace the performance.max-memory configuration item with performance.server-memory-quota
Replace the tikv-client.copr-cache.enable configuration item with tikv-client.copr-cache.capacity-mb. If the item’s value is 0.0, this feature is disabled. If the item’s value is greater than 0.0, this feature is enabled. Its default value is 1000.0.
Replace the rocksdb.auto-tuned configuration item with rocksdb.rate-limiter-auto-tuned.
Delete the raftstore.sync-log configuration item. By default, written data is forcibly spilled to the disk. Before v5.0, you can explicitly disable raftstore.sync-log. Since v5.0, the configuration value is forcibly set to true.
Change the default value of the gc.enable-compaction-filter configuration item from false to true.
Change the default value of the enable-cross-table-merge configuration item from false to true.
Change the default value of the rate-limiter-auto-tuned configuration item from false to true.
With the list partitioning feature, you can effectively query and maintain tables with a large amount of data.
With this feature enabled, partitions and how data is distributed among partitions are defined according to the PARTITION BY LIST(expr) PARTITION part_name VALUES IN (...) expression. The partitioned tables’ data set supports at most 1024 distinct integer values. You can define the values using the PARTITION ... VALUES IN (...) clause.
To enable list partitioning, set the session variable tidb_enable_list_partition to ON.
List COLUMNS partitioning is a variant of list partitioning. You can use multiple columns as partition keys. Besides the integer data type, you can also use the columns in the string, DATE, and DATETIME data types as partition columns.
To enable List COLUMNS partitioning, set the session variable tidb_enable_list_partition to ON.
When you tune performance or select optimal indexes, you can set an index to be Visible or Invisible by using SQL statements. This setting can avoid performing resource-consuming operations, such as DROP INDEX and ADD INDEX.
To modify the visibility of an index, use the ALTER INDEX statement. After the modification, the optimizer decides whether to add this index to the index list based on the index visibility.
EXCEPT and INTERSECT operatorsThe INTERSECT operator is a set operator, which returns the intersection of the result sets of two or more queries. To some extent, it is an alternative to the Inner Join operator.
The EXCEPT operator is a set operator, which combines the result sets of two queries and returns elements that are in the first query result but not in the second.
In the pessimistic transaction mode, if the tables involved in a transaction contain concurrent DDL operations or SCHEMA VERSION changes, the system automatically updates the transaction's SCHEMA VERSION to the latest to ensure the successful transaction commit, and to avoid that the client receives the Information schema is changed error when the transaction is interrupted by DDL operations or SCHEMA VERSION changes.
This feature is disabled by default. To enable the feature, modify the value of tidb_enable_amend_pessimistic_txn system variable. This feature is introduced in v4.0.7 and has the following issues fixed in v5.0:
Add Column operationsCurrently, this feature still has the following incompatibility issues:
Change Columnutf8mb4_unicode_ci and utf8_unicode_ci collations. User document, #17596To meet security compliance requirements (such as General Data Protection Regulation, or GDPR), the system supports desensitizing information (such as ID and credit card number) in the output error messages and logs, which can avoid leaking sensitive information.
TiDB supports desensitizing the output log information. To enable this feature, use the following switches:
tidb_redact_log. Its default value is 0, which means that desensitization is disabled. To enable desensitization for tidb-server logs, set the variable value to 1.security.redact-info-log. Its default value is false, which means that desensitization is disabled. To enable desensitization for tikv-server logs, set the variable value to true.security.redact-info-log. Its default value is false, which means that desensitization is disabled. To enable desensitization for pd-server logs, set the variable value to true.security.redact_info_log for tiflash-server and security.redact-info-log for tiflash-learner. Their default values are both false, which means that desensitization is disabled. To enable desensitization for tiflash-server and tiflash-learner logs, set the values of both variables to true.This feature is introduced in v5.0. To use the feature, enable the system variable and all configuration items above.
TiDB introduces the MPP architecture through TiFlash nodes. This architecture allows multiple TiFlash nodes to share the execution workload of large join queries.
When the MPP mode is on, TiDB determines whether to send a query to the MPP engine for computation based on the calculation cost. In the MPP mode, TiDB distributes the computation of table joins to each running TiFlash node by redistributing the join key during data calculation (Exchange operation), and thus accelerates the calculation. Furthermore, with the aggregation computing feature that TiFlash has already supported, TiDB can pushdown the computation of a query to the TiFlash MPP cluster. Then the distributed environment can help accelerate the entire execution process and dramatically increase the speed of analytic queries.
In the TPC-H 100 benchmark test, TiFlash MPP delivers significant processing speed over analytic engines of traditional analytic databases and SQL on Hadoop. With this architecture, you can perform large-scale analytic queries directly on the latest transaction data, with a higher performance than traditional offline analytic solutions. According to the benchmark, with the same cluster resource, TiDB 5.0 MPP shows 2 to 3 times of speedup over Greenplum 6.15.0 and Apache Spark 3.1.1, and some queries have 8 times better performance.
Currently, the main features that the MPP mode does not support are as follows (For details, refer to Use TiFlash):
When you are designing table structures or analyzing database behaviors, it is recommended to use the clustered index feature if you find that some columns with primary keys are often grouped and sorted, queries on these columns often return a certain range of data or a small amount of data with different values, and the corresponding data does not cause read or write hotspot issues.
Clustered indexes, also known as index-organized tables in some database management systems, is a storage structure associated with the data of a table. When creating a clustered index, you can specify one or more columns from the table as the keys for the index. TiDB stores these keys in a specific structure, which allows TiDB to quickly and efficiently find the rows associated with the keys, thus improves the performance of querying and writing data.
When the clustered index feature is enabled, the TiDB performance improves significantly (for example in the Sysbench test, the read and write performance of TiDB, with clustered index enabled, improves by 58.1%) in the following cases:
Each table can either use a clustered or non-clustered index to sort and store data. The differences of these two storage structures are as follows:
When table data is modified, the database system automatically maintains clustered indexes and non-clustered indexes for you.
All primary keys are created as non-clustered indexes by default. You can create a primary key as a clustered index or non-clustered index in either of the following two ways:
It is recommended that users use the CLUSTERED and NON-CLUSTERED usage, deliberately leaving out INT_ONLY
CLUSTERED | NONCLUSTERED in the statement when creating a table, then the system creates the table in the specified way. The syntax is as follows:CREATE TABLE `t` (`a` VARCHAR(255), `b` INT, PRIMARY KEY (`a`, `b`) CLUSTERED);
Or
CREATE TABLE `t` (`a` VARCHAR(255) PRIMARY KEY CLUSTERED, `b` INT);
You can execute the statement SHOW INDEX FROM tbl-name to query whether a table has a clustered index.
tidb_enable_clustered_index to control the clustered index feature. Supported values are ON, OFF, and INT_ONLY.
ON: Indicates that the clustered index feature is enabled for all types of primary keys. Adding and dropping non-clustered indexes are supported.OFF: Indicates that the clustered index feature is disabled for all types of primary keys. Adding and dropping non-clustered indexes are supported.INT_ONLY: The default value. If the variable is set to INT_ONLY and alter-primary-key is set to false, the primary keys which consist of single integer columns are created as clustered indexes by default. The behavior is consistent with that of TiDB v5.0 and earlier versions.If a CREATE TABLE statement contains the keyword CLUSTERED | NONCLUSTERED, the statement overrides the configuration of the system variable and the configuration item.
You are recommended to use the clustered index feature by specifying the keyword CLUSTERED | NONCLUSTERED in statements. In this way, it is more flexible for TiDB to use all data types of clustered and non-clustered indexes in the system at the same time as required.
It is not recommended to use tidb_enable_clustered_index = INT_ONLY, because INT_ONLY is temporarily used to make this feature compatible and will be deprecated in the future.
Limitations for the clustered index are as follows:
ALTER TABLE statements are not supported.UNIQUE KEY as a clustered index is not supported.SHARD_ROW_ID_BITS and PRE_SPLIT_REGIONS is not supported.The client of the database will wait for the database system to complete the transaction commit in two phases (2PC) synchronously. The transaction returns the result to the client after the first phase commit is successful, and the system executes the second phase commit operation in the background asynchronously to reduce the transaction commit latency. If the transaction write involves only one Region, the second phase is omitted directly, and the transaction becomes a one-phase commit.
After the Async Commit feature is enabled, with the same hardware and configuration, when Sysbench is set to test the Update index with 64 threads, the average latency decreases by 41.7% from 12.04ms to 7.01ms.
When Async Commit feature is enabled, to reduce one network interaction latency and improve the performance of data writes, database application developers are recommended to consider reducing the consistency of transactions from linear consistency to causal consistency. The SQL statement to enable causal consistency is START TRANSACTION WITH CAUSAL CONSISTENCY.
After the causal consistency is enabled, with the same hardware and configuration, when Sysbench is set to test oltp_write_only with 64 threads, the average latency decreased by 5.6% from 11.86ms to 11.19ms.
After the consistency of transactions is reduced from the linear consistency to causal consistency, if there is no interdependence between multiple transactions in the application, the transactions do not have a globally consistent order.
The Async Commit feature is enabled by default for newly created v5.0 clusters.
This feature is disabled by default for clusters upgraded from earlier versions to v5.0. You can enable this feature by executing the set global tidb_enable_async_commit = ON; and set global tidb_enable_1pc = ON; statements.
The limitation for the Async Commit feature is as follows:
In 5.0 GA, the Coprocessor cache feature is enabled by default. After this feature is enabled, to reduce the latency of reading data, TiDB caches the calculation results of the operators pushed down to tikv-server in tidb-server.
To disable the Coprocessor cache feature, you can modify the capacity-mb configuration item of tikv-client.copr-cache to 0.0.
delete * from table where id <? Limit ? statementThe p99 performance of the delete * from table where id <? limit ? statement is improved by 4 times.
When TiDB performs garbage collection (GC) and data compaction, partitions occupy CPU and I/O resources. Overlapping data exists during the execution of these two tasks.
To reduce GC’s consumption of CPU and I/O resources, the GC Compaction Filter feature combines these two tasks into one and executes them in the same task. This feature is enabled by default. You can disable it by configuring gc.enable-compaction-filter = false.
INSERT、REPLACE、UPDATE、DELETE statementsWhen tuning performance or maintaining the database, if you find that the system performance is unstable due to unstable execution plans, you can select a manually optimized SQL statement according to your judgement or tested by EXPLAIN ANALYZE. You can bind the optimized SQL statement to the SQL statement to be executed in the application code to ensure stable performance.
When manually binding SQL statements using the SQL BINDING statement, you need to ensure that the optimized SQL statement has the same syntax as the original SQL statement.
You can view the manually or automatically bound execution plan information by running the SHOW {GLOBAL | SESSION} BINDINGS command. The output is the same as that of versions earlier than v5.0.
When upgrading TiDB, to avoid performance jitter, you can enable the baseline capturing feature to allow the system to automatically capture and bind the latest execution plan and store it in the system table. After TiDB is upgraded, you can export the bound execution plan by running the SHOW GLOBAL BINDING command and decide whether to delete these plans.
This feature is disbled by default. You can enable it by modifying the server or setting the tidb_capture_plan_baselines global system variable to ON. When this feature is enabled, the system fetches the SQL statements that appear at least twice from the Statement Summary every bind-info-lease (the default value is 3s), and automatically captures and binds these SQL statements.
Add a system variable tidb_allow_fallback_to_tikv to fall back queries to TiKV when TiFlash fails. The default value is OFF.
Track the memory usage of aggregate functions. This feature is enabled by default. When SQL statements with aggregate functions are executed, if the total memory usage of the current query exceeds the threshold set by mem-quota-query, the system automatically performs operations defined by oom-action.
During the troubleshooting of SQL performance issues, detailed diagnostic information is needed to determine the causes of performance issues. Before TiDB 5.0, the information collected by the EXPLAIN statements was not detailed enough. The root causes of the issues can only be determined based on log information, monitoring information, or even on guess, which might be inefficient.
In TiDB v5.0, the following improvements are made to help you troubleshoot performance issues more efficiently:
EXPLAIN ANALYZE statement to analyze all DML statements to show the actual performance plans and the execution information of each operator. #18056EXPLAIN FOR CONNECTION statement to check the real-time status of all the SQL statements being executed. For example, you can use the statement to check the execution duration of each operator and the number of processed rows. #18233EXPLAIN ANALYZE statement, including the number of RPC requests sent by operators, the duration of resolving lock conflicts, network latency, the scanned volume of deleted data in RocksDB, and the hit rate of RocksDB caches. #18663EXPLAIN ANALYZE statement, which includes the time consumed by each operator, the number of processed rows, and the number of sent RPC requests. #15009TiDB adds cluster usage metrics in telemetry, such as the number of data tables, the number of queries, and whether new features are enabled.
To learn more about details and how to disable this behavior, refer to telemetry.
release notes
Published 4/7/2021
MajorContains breaking changesAdd the tidb_executor_concurrency system variable to control the concurrency of multiple operators. The previous tidb_*_concurrency settings (such as tidb_projection_concurrency) still take effect but with a warning when you use them.
Add the tidb_skip_ascii_check system variable to specify whether to skip the ASCII validation check when the ASCII character set is written. This default value is OFF.
Add the tidb_enable_strict_double_type_check system variable to determine whether the syntax like double(N) can be defined in the table schema. This default value is OFF.
Change the default value of tidb_dml_batch_size from 20000 to 0. This means that batch DML statements are no longer used by default in LOAD/INSERT INTO SELECT .... Instead, large transactions are used to comply with the strict ACID semantics.
Note:
The scope of the variable is changed from session to global, and the default value is changed from
20000to0. If the application relies on the original default value, you need to use theset globalstatement to modify the variable to the original value after the upgrade.
Control temporary tables’ syntax compatibility using the tidb_enable_noop_functions system variable. When this variable value is OFF, the CREATE TEMPORARY TABLE syntax returns an error.
Add the following system variables to directly control the garbage collection-related parameters:
Change the default value of enable-joint-consensus from false to true, which enables the Joint Consensus feature by default.
Change the value of tidb_enable_amend_pessimistic_txn from 0 or 1 to ON or OFF.
Change the default value of tidb_enable_clustered_index from OFF to INT_ONLY with the following new meanings:
OFF: clustered index is enabled. Adding or deleting non-clustered indexes is supported.ON: clustered index is disabled. Adding or deleting non-clustered indexes is supported.INT_ONLY: the default value. The behavior is consistent with that before v5.0. You can control whether to enable clustered index for the INT type together with alter-primary-key = false.Note:
The
INT_ONLYvalue oftidb_enable_clustered_indexin 5.0 GA has the same meaning as theOFFvalue in 5.0 RC. After upgrading from a 5.0 RC cluster with theOFFsetting to 5.0 GA, it will be displayed asINT_ONLY.
Add the index-limit configuration item for TiDB to be compatible with MySQL’s limit on the maximum number of indexes. If the index setting exceeds the default value of this configuration item, when the table schema is re-imported into MySQL, an error is reported. This item’s value defaults to 64 and ranges between [64,64*8].
Add the enable-enum-length-limit configuration item for TiDB to be compatible and consistent with MySQL’s ENUM/SET length (ENUM length < 255). The default value is true.
Replace the pessimistic-txn.enable configuration item with the tidb_txn_mode environment variable.
Replace the performance.max-memory configuration item with performance.server-memory-quota
Replace the tikv-client.copr-cache.enable configuration item with tikv-client.copr-cache.capacity-mb. If the item’s value is 0.0, this feature is disabled. If the item’s value is greater than 0.0, this feature is enabled. Its default value is 1000.0.
Replace the rocksdb.auto-tuned configuration item with rocksdb.rate-limiter-auto-tuned.
Delete the raftstore.sync-log configuration item. By default, written data is forcibly spilled to the disk. Before v5.0, you can explicitly disable raftstore.sync-log. Since v5.0, the configuration value is forcibly set to true.
Change the default value of the gc.enable-compaction-filter configuration item from false to true.
Change the default value of the enable-cross-table-merge configuration item from false to true.
Change the default value of the rate-limiter-auto-tuned configuration item from false to true.
With the list partitioning feature, you can effectively query and maintain tables with a large amount of data.
With this feature enabled, partitions and how data is distributed among partitions are defined according to the PARTITION BY LIST(expr) PARTITION part_name VALUES IN (...) expression. The partitioned tables’ data set supports at most 1024 distinct integer values. You can define the values using the PARTITION ... VALUES IN (...) clause.
To enable list partitioning, set the session variable tidb_enable_list_partition to ON.
List COLUMNS partitioning is a variant of list partitioning. You can use multiple columns as partition keys. Besides the integer data type, you can also use the columns in the string, DATE, and DATETIME data types as partition columns.
To enable List COLUMNS partitioning, set the session variable tidb_enable_list_partition to ON.
When you tune performance or select optimal indexes, you can set an index to be Visible or Invisible by using SQL statements. This setting can avoid performing resource-consuming operations, such as DROP INDEX and ADD INDEX.
To modify the visibility of an index, use the ALTER INDEX statement. After the modification, the optimizer decides whether to add this index to the index list based on the index visibility.
EXCEPT and INTERSECT operatorsThe INTERSECT operator is a set operator, which returns the intersection of the result sets of two or more queries. To some extent, it is an alternative to the Inner Join operator.
The EXCEPT operator is a set operator, which combines the result sets of two queries and returns elements that are in the first query result but not in the second.
In the pessimistic transaction mode, if the tables involved in a transaction contain concurrent DDL operations or SCHEMA VERSION changes, the system automatically updates the transaction's SCHEMA VERSION to the latest to ensure the successful transaction commit, and to avoid that the client receives the Information schema is changed error when the transaction is interrupted by DDL operations or SCHEMA VERSION changes.
This feature is disabled by default. To enable the feature, modify the value of tidb_enable_amend_pessimistic_txn system variable. This feature is introduced in v4.0.7 and has the following issues fixed in v5.0:
Add Column operationsCurrently, this feature still has the following incompatibility issues:
Change Columnutf8mb4_unicode_ci and utf8_unicode_ci collations. User document, #17596To meet security compliance requirements (such as General Data Protection Regulation, or GDPR), the system supports desensitizing information (such as ID and credit card number) in the output error messages and logs, which can avoid leaking sensitive information.
TiDB supports desensitizing the output log information. To enable this feature, use the following switches:
tidb_redact_log. Its default value is 0, which means that desensitization is disabled. To enable desensitization for tidb-server logs, set the variable value to 1.security.redact-info-log. Its default value is false, which means that desensitization is disabled. To enable desensitization for tikv-server logs, set the variable value to true.security.redact-info-log. Its default value is false, which means that desensitization is disabled. To enable desensitization for pd-server logs, set the variable value to true.security.redact_info_log for tiflash-server and security.redact-info-log for tiflash-learner. Their default values are both false, which means that desensitization is disabled. To enable desensitization for tiflash-server and tiflash-learner logs, set the values of both variables to true.This feature is introduced in v5.0. To use the feature, enable the system variable and all configuration items above.
TiDB introduces the MPP architecture through TiFlash nodes. This architecture allows multiple TiFlash nodes to share the execution workload of large join queries.
When the MPP mode is on, TiDB determines whether to send a query to the MPP engine for computation based on the calculation cost. In the MPP mode, TiDB distributes the computation of table joins to each running TiFlash node by redistributing the join key during data calculation (Exchange operation), and thus accelerates the calculation. Furthermore, with the aggregation computing feature that TiFlash has already supported, TiDB can pushdown the computation of a query to the TiFlash MPP cluster. Then the distributed environment can help accelerate the entire execution process and dramatically increase the speed of analytic queries.
In the TPC-H 100 benchmark test, TiFlash MPP delivers significant processing speed over analytic engines of traditional analytic databases and SQL on Hadoop. With this architecture, you can perform large-scale analytic queries directly on the latest transaction data, with a higher performance than traditional offline analytic solutions. According to the benchmark, with the same cluster resource, TiDB 5.0 MPP shows 2 to 3 times of speedup over Greenplum 6.15.0 and Apache Spark 3.1.1, and some queries have 8 times better performance.
Currently, the main features that the MPP mode does not support are as follows (For details, refer to Use TiFlash):
When you are designing table structures or analyzing database behaviors, it is recommended to use the clustered index feature if you find that some columns with primary keys are often grouped and sorted, queries on these columns often return a certain range of data or a small amount of data with different values, and the corresponding data does not cause read or write hotspot issues.
Clustered indexes, also known as index-organized tables in some database management systems, is a storage structure associated with the data of a table. When creating a clustered index, you can specify one or more columns from the table as the keys for the index. TiDB stores these keys in a specific structure, which allows TiDB to quickly and efficiently find the rows associated with the keys, thus improves the performance of querying and writing data.
When the clustered index feature is enabled, the TiDB performance improves significantly (for example in the Sysbench test, the read and write performance of TiDB, with clustered index enabled, improves by 58.1%) in the following cases:
Each table can either use a clustered or non-clustered index to sort and store data. The differences of these two storage structures are as follows:
When table data is modified, the database system automatically maintains clustered indexes and non-clustered indexes for you.
All primary keys are created as non-clustered indexes by default. You can create a primary key as a clustered index or non-clustered index in either of the following two ways:
It is recommended that users use the CLUSTERED and NON-CLUSTERED usage, deliberately leaving out INT_ONLY
CLUSTERED | NONCLUSTERED in the statement when creating a table, then the system creates the table in the specified way. The syntax is as follows:CREATE TABLE `t` (`a` VARCHAR(255), `b` INT, PRIMARY KEY (`a`, `b`) CLUSTERED);
Or
CREATE TABLE `t` (`a` VARCHAR(255) PRIMARY KEY CLUSTERED, `b` INT);
You can execute the statement SHOW INDEX FROM tbl-name to query whether a table has a clustered index.
tidb_enable_clustered_index to control the clustered index feature. Supported values are ON, OFF, and INT_ONLY.
ON: Indicates that the clustered index feature is enabled for all types of primary keys. Adding and dropping non-clustered indexes are supported.OFF: Indicates that the clustered index feature is disabled for all types of primary keys. Adding and dropping non-clustered indexes are supported.INT_ONLY: The default value. If the variable is set to INT_ONLY and alter-primary-key is set to false, the primary keys which consist of single integer columns are created as clustered indexes by default. The behavior is consistent with that of TiDB v5.0 and earlier versions.If a CREATE TABLE statement contains the keyword CLUSTERED | NONCLUSTERED, the statement overrides the configuration of the system variable and the configuration item.
You are recommended to use the clustered index feature by specifying the keyword CLUSTERED | NONCLUSTERED in statements. In this way, it is more flexible for TiDB to use all data types of clustered and non-clustered indexes in the system at the same time as required.
It is not recommended to use tidb_enable_clustered_index = INT_ONLY, because INT_ONLY is temporarily used to make this feature compatible and will be deprecated in the future.
Limitations for the clustered index are as follows:
ALTER TABLE statements are not supported.UNIQUE KEY as a clustered index is not supported.SHARD_ROW_ID_BITS and PRE_SPLIT_REGIONS is not supported.The client of the database will wait for the database system to complete the transaction commit in two phases (2PC) synchronously. The transaction returns the result to the client after the first phase commit is successful, and the system executes the second phase commit operation in the background asynchronously to reduce the transaction commit latency. If the transaction write involves only one Region, the second phase is omitted directly, and the transaction becomes a one-phase commit.
After the Async Commit feature is enabled, with the same hardware and configuration, when Sysbench is set to test the Update index with 64 threads, the average latency decreases by 41.7% from 12.04ms to 7.01ms.
When Async Commit feature is enabled, to reduce one network interaction latency and improve the performance of data writes, database application developers are recommended to consider reducing the consistency of transactions from linear consistency to causal consistency. The SQL statement to enable causal consistency is START TRANSACTION WITH CAUSAL CONSISTENCY.
After the causal consistency is enabled, with the same hardware and configuration, when Sysbench is set to test oltp_write_only with 64 threads, the average latency decreased by 5.6% from 11.86ms to 11.19ms.
After the consistency of transactions is reduced from the linear consistency to causal consistency, if there is no interdependence between multiple transactions in the application, the transactions do not have a globally consistent order.
The Async Commit feature is enabled by default for newly created v5.0 clusters.
This feature is disabled by default for clusters upgraded from earlier versions to v5.0. You can enable this feature by executing the set global tidb_enable_async_commit = ON; and set global tidb_enable_1pc = ON; statements.
The limitation for the Async Commit feature is as follows:
In 5.0 GA, the Coprocessor cache feature is enabled by default. After this feature is enabled, to reduce the latency of reading data, TiDB caches the calculation results of the operators pushed down to tikv-server in tidb-server.
To disable the Coprocessor cache feature, you can modify the capacity-mb configuration item of tikv-client.copr-cache to 0.0.
delete * from table where id <? Limit ? statementThe p99 performance of the delete * from table where id <? limit ? statement is improved by 4 times.
When TiDB performs garbage collection (GC) and data compaction, partitions occupy CPU and I/O resources. Overlapping data exists during the execution of these two tasks.
To reduce GC’s consumption of CPU and I/O resources, the GC Compaction Filter feature combines these two tasks into one and executes them in the same task. This feature is enabled by default. You can disable it by configuring gc.enable-compaction-filter = false.
INSERT、REPLACE、UPDATE、DELETE statementsWhen tuning performance or maintaining the database, if you find that the system performance is unstable due to unstable execution plans, you can select a manually optimized SQL statement according to your judgement or tested by EXPLAIN ANALYZE. You can bind the optimized SQL statement to the SQL statement to be executed in the application code to ensure stable performance.
When manually binding SQL statements using the SQL BINDING statement, you need to ensure that the optimized SQL statement has the same syntax as the original SQL statement.
You can view the manually or automatically bound execution plan information by running the SHOW {GLOBAL | SESSION} BINDINGS command. The output is the same as that of versions earlier than v5.0.
When upgrading TiDB, to avoid performance jitter, you can enable the baseline capturing feature to allow the system to automatically capture and bind the latest execution plan and store it in the system table. After TiDB is upgraded, you can export the bound execution plan by running the SHOW GLOBAL BINDING command and decide whether to delete these plans.
This feature is disbled by default. You can enable it by modifying the server or setting the tidb_capture_plan_baselines global system variable to ON. When this feature is enabled, the system fetches the SQL statements that appear at least twice from the Statement Summary every bind-info-lease (the default value is 3s), and automatically captures and binds these SQL statements.
Add a system variable tidb_allow_fallback_to_tikv to fall back queries to TiKV when TiFlash fails. The default value is OFF.
Track the memory usage of aggregate functions. This feature is enabled by default. When SQL statements with aggregate functions are executed, if the total memory usage of the current query exceeds the threshold set by mem-quota-query, the system automatically performs operations defined by oom-action.
During the troubleshooting of SQL performance issues, detailed diagnostic information is needed to determine the causes of performance issues. Before TiDB 5.0, the information collected by the EXPLAIN statements was not detailed enough. The root causes of the issues can only be determined based on log information, monitoring information, or even on guess, which might be inefficient.
In TiDB v5.0, the following improvements are made to help you troubleshoot performance issues more efficiently:
EXPLAIN ANALYZE statement to analyze all DML statements to show the actual performance plans and the execution information of each operator. #18056EXPLAIN FOR CONNECTION statement to check the real-time status of all the SQL statements being executed. For example, you can use the statement to check the execution duration of each operator and the number of processed rows. #18233EXPLAIN ANALYZE statement, including the number of RPC requests sent by operators, the duration of resolving lock conflicts, network latency, the scanned volume of deleted data in RocksDB, and the hit rate of RocksDB caches. #18663EXPLAIN ANALYZE statement, which includes the time consumed by each operator, the number of processed rows, and the number of sent RPC requests. #15009TiDB adds cluster usage metrics in telemetry, such as the number of data tables, the number of queries, and whether new features are enabled.
To learn more about details and how to disable this behavior, refer to telemetry.
TiDB - the open-source, cloud-native, distributed SQL database designed for modern applications.