From 843f637ac5156c31adbf38472e53f69765bed57f Mon Sep 17 00:00:00 2001
From: wjhuang2016
Date: Thu, 26 Sep 2024 17:37:38 +0800
Subject: [PATCH 1/3] done

Signed-off-by: wjhuang2016
---
 TOC.md | 1 +
 schema-cache.md | 38 ++++++++++++++++++++++++++++++++++++++
 2 files changed, 39 insertions(+)
 create mode 100644 schema-cache.md

diff --git a/TOC.md b/TOC.md
index 19adfdad2bb7..35d49006468c 100644
--- a/TOC.md
+++ b/TOC.md
@@ -1020,6 +1020,7 @@
 - [`schema_unused_indexes`](/sys-schema/sys-schema-unused-indexes.md)
 - [Metadata Lock](/metadata-lock.md)
 - [TiDB Accelerated Table Creation](/accelerated-table-creation.md)
+ - [schema cache](/schema-cache.md)
 - UI
 - TiDB Dashboard
 - [Overview](/dashboard/dashboard-intro.md)
diff --git a/schema-cache.md b/schema-cache.md
new file mode 100644
index 000000000000..3a2bc0af45ff
--- /dev/null
+++ b/schema-cache.md
@@ -0,0 +1,38 @@
+---
+title: schema cache
+aliases: ['/docs-cn/dev/information-schema-cache']
+summary: TiDB adopts an LRU-based caching mechanism for schema information, which significantly reduces memory usage and improves performance in scenarios with a large number of databases and tables.
+---
+
+# schema cache
+
+In some multi-tenant scenarios, there may be tens of thousands or even millions of databases and tables. Loading all the schema information of these databases and tables into memory can consume a large amount of memory and degrade access performance. To address this issue, TiDB introduces a schema caching mechanism similar to LRU. Only the schema information of the most recently accessed databases and tables is cached in memory.
+
+> **Warning:**
+>
+> This feature is currently an experimental feature and it is not recommended to use in a production environment. This feature might change or be removed without prior notice. If you find a bug, please give feedback by raising an [issue](https://github.com/pingcap/tidb/issues) on GitHub.
+
+## Configuration
+
+You can enable the schema caching feature by configuring the system variable [`tidb_schema_cache_size`](/system-variables.md#tidb_schema_cache_size-new-in-v800)
+
+## Best Practices
+
+- In scenarios with a large number of databases and tables (e.g., over 100,000 databases and tables) or when the number of databases and tables is large enough to impact system performance, it is recommended to enable the schema caching feature.
+- You can monitor the hit rate of the schema cache by observing the subpanel "Infoschema v2 Cache Operation" under the "Schema load" section in TiDB monitoring. If the hit rate is low, you can increase the value of [`tidb_schema_cache_size`](/system-variables.md#tidb_schema_cache_size-new-in-v800.
+- You can monitor the current size of the schema cache being used by observing the subpanel "Infoschema v2 Cache Size" under the "Schema load" section in TiDB monitoring.
+- It is recommended to disable [`performance.force-init-stats`](/tidb-configuration-file.md#force-init-stats-new-in-v657-and-v710) to reduce TiDB startup time.
+- If you need to create a large number of tables (e.g., over 100,000 tables), it is recommended to set this parameter to false [`split-table`](/tidb-configuration-file.md#split-table) to reduce the number of regions and thus decrease TiKV's memory usage.
+
+## Known Limitations
+
+In scenarios with a large number of databases and tables, the following known issues exist:
+- When the tables that need to be accessed are irregularly accessed, such as one set of tables accessed by t1 and another set accessed by t2, and the tidb_schema_cache_size setting is small, the schema information may be frequently evicted and cached, leading to performance fluctuations. This feature is more suitable for scenarios where frequently accessed databases and tables are relatively fixed.
+- Statistics information may not be collected in a timely manner.
+- Access to some metadata information may become slower.
+- Switching the schema cache on or off requires a waiting period.
+- Operations that involve enumerating all metadata information may become slower, such as:
+
+ - `SHOW FULL TABLES`
+ - `FLASHBACK`
+ - `ALTER TABLE ... SET TIFLASH MODE ...`

From ad5858c8e911f5611e28cc9fd53e8190482889b9 Mon Sep 17 00:00:00 2001
From: xixirangrang
Date: Thu, 26 Sep 2024 22:45:45 +0800
Subject: [PATCH 2/3] Apply suggestions from code review

---
 TOC.md | 2 +-
 schema-cache.md | 33 +++++++++++++++++----------------
 2 files changed, 18 insertions(+), 17 deletions(-)

diff --git a/TOC.md b/TOC.md
index 35d49006468c..a904a53dc341 100644
--- a/TOC.md
+++ b/TOC.md
@@ -1020,7 +1020,7 @@
 - [`schema_unused_indexes`](/sys-schema/sys-schema-unused-indexes.md)
 - [Metadata Lock](/metadata-lock.md)
 - [TiDB Accelerated Table Creation](/accelerated-table-creation.md)
- - [schema cache](/schema-cache.md)
+ - [Schema Cache](/schema-cache.md)
 - UI
 - TiDB Dashboard
 - [Overview](/dashboard/dashboard-intro.md)
diff --git a/schema-cache.md b/schema-cache.md
index 3a2bc0af45ff..d544165c57fe 100644
--- a/schema-cache.md
+++ b/schema-cache.md
@@ -1,37 +1,38 @@
 ---
-title: schema cache
+title: Schema Cache
 aliases: ['/docs-cn/dev/information-schema-cache']
-summary: TiDB adopts an LRU-based caching mechanism for schema information, which significantly reduces memory usage and improves performance in scenarios with a large number of databases and tables.
+summary: TiDB adopts an LRU-based (Least Recently Used) caching mechanism for schema information, which significantly reduces memory usage and improves performance in scenarios with a large number of databases and tables.
 ---
 
-# schema cache
+# Schema Cache
 
-In some multi-tenant scenarios, there may be tens of thousands or even millions of databases and tables. Loading all the schema information of these databases and tables into memory can consume a large amount of memory and degrade access performance. To address this issue, TiDB introduces a schema caching mechanism similar to LRU. Only the schema information of the most recently accessed databases and tables is cached in memory.
+In some multi-tenant scenarios, there might be tens of thousands or even millions of databases and tables. Loading all the schema information of these databases and tables into memory can consume a large amount of memory and degrade access performance. To address this issue, TiDB introduces a schema caching mechanism similar to LRU (Least Recently Used). Only the schema information of the most recently accessed databases and tables is cached in memory.
 
 > **Warning:**
 >
 > This feature is currently an experimental feature and it is not recommended to use in a production environment. This feature might change or be removed without prior notice. If you find a bug, please give feedback by raising an [issue](https://github.com/pingcap/tidb/issues) on GitHub.
 
-## Configuration
+## Configure schema cache
 
-You can enable the schema caching feature by configuring the system variable [`tidb_schema_cache_size`](/system-variables.md#tidb_schema_cache_size-new-in-v800)
+You can enable the schema caching feature by configuring the system variable [`tidb_schema_cache_size`](/system-variables.md#tidb_schema_cache_size-new-in-v800).
 
-## Best Practices
+## Best practices
 
-- In scenarios with a large number of databases and tables (e.g., over 100,000 databases and tables) or when the number of databases and tables is large enough to impact system performance, it is recommended to enable the schema caching feature.
-- You can monitor the hit rate of the schema cache by observing the subpanel "Infoschema v2 Cache Operation" under the "Schema load" section in TiDB monitoring. If the hit rate is low, you can increase the value of [`tidb_schema_cache_size`](/system-variables.md#tidb_schema_cache_size-new-in-v800.
-- You can monitor the current size of the schema cache being used by observing the subpanel "Infoschema v2 Cache Size" under the "Schema load" section in TiDB monitoring.
+- In scenarios with a large number of databases and tables (for example, more than 100,000 databases and tables) or when the number of databases and tables is large enough to impact system performance, it is recommended to enable the schema caching feature.
+- You can monitor the hit rate of the schema cache by observing the subpanel **Infoschema v2 Cache Operation** under the **Schema load** section in TiDB Dashboard. If the hit rate is low, you can increase the value of [`tidb_schema_cache_size`](/system-variables.md#tidb_schema_cache_size-new-in-v800).
+- You can monitor the current size of the schema cache being used by observing the subpanel **Infoschema v2 Cache Size** under the **Schema load** section in TiDB Dashboard.
 - It is recommended to disable [`performance.force-init-stats`](/tidb-configuration-file.md#force-init-stats-new-in-v657-and-v710) to reduce TiDB startup time.
-- If you need to create a large number of tables (e.g., over 100,000 tables), it is recommended to set this parameter to false [`split-table`](/tidb-configuration-file.md#split-table) to reduce the number of regions and thus decrease TiKV's memory usage.
+- If you need to create a large number of tables (for example, more than 100,000 tables), it is recommended to set this parameter [`split-table`](/tidb-configuration-file.md#split-table) to `false` to reduce the number of regions and thus decrease TiKV's memory usage.
 
-## Known Limitations
+## Known limitations
 
 In scenarios with a large number of databases and tables, the following known issues exist:
-- When the tables that need to be accessed are irregularly accessed, such as one set of tables accessed by t1 and another set accessed by t2, and the tidb_schema_cache_size setting is small, the schema information may be frequently evicted and cached, leading to performance fluctuations. This feature is more suitable for scenarios where frequently accessed databases and tables are relatively fixed.
-- Statistics information may not be collected in a timely manner.
-- Access to some metadata information may become slower.
+
+- When the tables that need to be accessed are irregularly accessed, such as one set of tables accessed by `t1` and another set accessed by `t2`, and the `tidb_schema_cache_size` setting is small, the schema information might be frequently evicted and cached, leading to performance fluctuations. This feature is more suitable for scenarios where frequently accessed databases and tables are relatively fixed.
+- Statistics information might not be collected in a timely manner.
+- Access to some metadata information might become slower.
 - Switching the schema cache on or off requires a waiting period.
-- Operations that involve enumerating all metadata information may become slower, such as:
+- Operations that involve enumerating all metadata information might become slower, such as:
 
 - `SHOW FULL TABLES`
 - `FLASHBACK`

From 5afd9d170e76184fc2204fc5afd83fdc17a3fbdd Mon Sep 17 00:00:00 2001
From: xixirangrang
Date: Fri, 27 Sep 2024 10:41:49 +0800
Subject: [PATCH 3/3] Update schema-cache.md

Co-authored-by: Frank945946 <108602632+Frank945946@users.noreply.github.com>
---
 schema-cache.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/schema-cache.md b/schema-cache.md
index d544165c57fe..029f5a4515df 100644
--- a/schema-cache.md
+++ b/schema-cache.md
@@ -28,7 +28,7 @@ You can enable the schema caching feature by configuring the system variable [`t
 
 In scenarios with a large number of databases and tables, the following known issues exist:
 
-- When the tables that need to be accessed are irregularly accessed, such as one set of tables accessed by `t1` and another set accessed by `t2`, and the `tidb_schema_cache_size` setting is small, the schema information might be frequently evicted and cached, leading to performance fluctuations. This feature is more suitable for scenarios where frequently accessed databases and tables are relatively fixed.
+- When the tables that need to be accessed are irregularly accessed, such as one set of tables accessed at time1 and another set accessed at time2, and the value of `tidb_schema_cache_size` is small, the schema information might be frequently evicted and cached, leading to performance fluctuations. This feature is more suitable for scenarios where frequently accessed databases and tables are relatively fixed.
 - Statistics information might not be collected in a timely manner.
 - Access to some metadata information might become slower.
 - Switching the schema cache on or off requires a waiting period.
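
Note on the "Configure schema cache" section added by this series: the patch names the `tidb_schema_cache_size` variable but gives no statement for setting it. A minimal sketch of how that step might look follows. The 512 MiB value is an arbitrary illustration. The assumptions that the variable has global scope, takes a size in bytes, and treats 0 as "disabled" are not stated in the patch and should be checked against the linked system variable reference.

```sql
/* Inspect the current setting (assumption: 0 means the schema cache is disabled). */
SHOW VARIABLES LIKE 'tidb_schema_cache_size';

/* Enable the schema cache with an illustrative limit of 512 MiB.
   Assumptions: the value is in bytes and the variable is set at GLOBAL scope. */
SET GLOBAL tidb_schema_cache_size = 536870912;
```

If the cache hit rate reported by the monitoring panels named in the patch stays low, the same `SET GLOBAL` statement with a larger value would be the natural adjustment.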