TiKV thread pool performance tuning
2022-07-19 11:42:00 【Tianxiang shop】
This article introduces the main approaches to tuning TiKV thread pool performance, as well as the purposes of the thread pools inside TiKV.
Introduction to thread pool
In TiKV, the thread pools mainly consist of gRPC, Scheduler, UnifyReadPool, Raftstore, StoreWriter, Apply, RocksDB, and a few scheduled tasks and detection components that consume little CPU. This article focuses on the CPU-heavy thread pools that affect the performance of user read and write requests.
- gRPC thread pool: handles all network requests and forwards requests of different task types to the corresponding thread pools.
- Scheduler thread pool: detects write transaction conflicts, converts requests such as two-phase transaction commits, pessimistic locking, and transaction rollbacks into key-value arrays, and passes them to the Raftstore thread for Raft log replication.
- Raftstore thread pool:
  - Processes all Raft messages and the proposals to append new logs (Propose).
  - Writes Raft logs to disk. If the value of the store-io-pool-size configuration item is 0, the Raftstore thread writes the logs to disk itself; if the value is not 0, the Raftstore thread sends the logs to the StoreWriter threads.
  - When the logs have been agreed on by the majority of replicas, the Raftstore thread sends them to the Apply threads.
- StoreWriter thread pool: writes all Raft logs to disk and returns the results to the Raftstore threads.
- Apply thread pool: after receiving the committed logs from the Raftstore thread pool, parses them into key-value requests, writes them to RocksDB, calls the callback functions to notify the gRPC thread pool that the write requests are complete, and returns the results to the client.
- RocksDB thread pool: the thread pool in which RocksDB performs Compact and Flush tasks. For RocksDB's architecture and the Compact process, see RocksDB: A Persistent Key-Value Store for Flash and RAM Storage.
- UnifyReadPool thread pool: a combination of the Coprocessor thread pool and the Storage Read Pool. All read requests, including kv get, kv batch get, raw kv get, and coprocessor requests, are executed in this thread pool.
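As a quick orientation, the following hedged tikv.toml sketch maps the pools above to the configuration items discussed in the rest of this article. The values shown are only the defaults quoted later in the text; treat this as a reference map, not a recommended configuration.

```toml
# Orientation sketch: tikv.toml items that size the CPU-heavy pools above.
[server]
grpc-concurrency = 5                # gRPC pool; default 5

[storage]
scheduler-worker-pool-size = 8      # Scheduler pool; default 8 (>= 16 cores) or 4

[raftstore]
store-pool-size = 2                 # Raftstore pool; default 2
store-io-pool-size = 0              # StoreWriter pool; default 0 (disabled)

[readpool.unified]
max-thread-count = 12               # UnifyReadPool; default 80% of CPU cores (12 on a 16-core machine)

[rocksdb]
# max-background-jobs = ...         # RocksDB Compact/Flush pool (see the RocksDB notes below)
```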
TiKV read-only requests
TiKV has two types of read requests:
- Simple queries that target a specific row or a few rows; these run in the Storage Read Pool.
- Complex aggregate calculations and range queries; these run in the Coprocessor Read Pool.

Starting from TiKV 5.0, all read requests use a unified thread pool by default. If the TiKV cluster was upgraded from TiKV 4.0 and the use-unified-pool configuration of readpool.storage was not enabled before the upgrade, all read requests keep using separate thread pools after the upgrade. In that case, you can set readpool.storage.use-unified-pool to true so that all read requests go through the unified thread pool.
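A minimal tikv.toml sketch for this case, assuming a cluster upgraded from TiKV 4.x that should switch to the unified read pool:

```toml
# Sketch: route all read requests through the unified read pool
# on a cluster upgraded from TiKV 4.x.
[readpool.storage]
use-unified-pool = true
```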
TiKV Thread pool tuning
The default size of the gRPC thread pool (configured by server.grpc-concurrency) is 5. This thread pool has almost no computing overhead; it is mainly responsible for network I/O and request deserialization, so the default usually does not need to be adjusted.
- If the machine TiKV is deployed on has few CPU cores (8 or fewer), consider setting server.grpc-concurrency to 2.
- If the machine has a high-end configuration, TiKV handles a large number of read and write requests, and the gRPC poll CPU value on the Thread CPU panel in Grafana exceeds 80% of server.grpc-concurrency, consider increasing server.grpc-concurrency to keep the thread pool usage below 80% (that is, the Grafana metric stays below 80% * server.grpc-concurrency).
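A hedged tikv.toml sketch of the two adjustments above; the concrete number is only the example given in the text:

```toml
# Sketch: gRPC pool sizing per the rules above.
[server]
# Machines with <= 8 CPU cores: shrink the pool.
grpc-concurrency = 2
# High-end machines under heavy load: raise it instead, keeping the
# Grafana gRPC poll CPU metric below 80% * grpc-concurrency.
```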
The default size of the Scheduler thread pool (configured by storage.scheduler-worker-pool-size) is 8 when TiKV detects that the machine has at least 16 CPU cores, and 4 when it has fewer than 16. This pool mainly converts complex transaction requests into simple key-value reads and writes, but the Scheduler thread pool itself does not perform any writes.
- If it detects a transaction conflict, it returns the conflict result to the client early.
- If no conflict is detected, it merges the key-value writes into a Raft log and passes it to the Raftstore thread for Raft log replication.
Generally, to avoid excessive thread switching, it is best to keep the utilization of the Scheduler thread pool between 50% and 75%. For example, if the thread pool size is 8, it is reasonable for TiKV-Details.Thread CPU.scheduler worker CPU on Grafana to stay within 400%~600%.
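A minimal sketch, assuming a machine with 16 or more cores and the 50%~75% utilization target described above:

```toml
# Sketch: explicit Scheduler pool size on a >= 16-core machine.
[storage]
scheduler-worker-pool-size = 8
# Target: scheduler worker CPU in Grafana between
# 50% * 8 = 400% and 75% * 8 = 600%.
```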
The Raftstore thread pool is the most complex thread pool in TiKV. Its default size (controlled by raftstore.store-pool-size) is 2, and the default size of the StoreWriter thread pool (controlled by raftstore.store-io-pool-size) is 0.

When the StoreWriter thread pool size is 0, all write requests are written into RocksDB by the Raftstore threads in the way of fsync. In this case, the following tuning is recommended:
- Keep the overall CPU utilization of the Raftstore threads below 60%. When the number of Raftstore threads is the default 2, keep the value on the TiKV-Details, Thread CPU, Raft store CPU panel in Grafana within 120%. Because I/O requests are involved, the CPU utilization of the Raftstore threads is theoretically always below 100%.
- Do not blindly increase the Raftstore thread pool size to improve write performance; this may backfire by increasing the disk burden and degrading performance.
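The defaults described above, written out in tikv.toml with the CPU targets as comments (a reference sketch rather than something to change):

```toml
# Sketch: default sizing, where Raftstore threads fsync the Raft logs themselves.
[raftstore]
store-pool-size = 2
store-io-pool-size = 0
# Target: Raft store CPU in Grafana below 60% per thread,
# i.e. below 60% * 2 = 120% with the default pool size.
```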
When the StoreWriter thread pool size is not 0, all write requests are written into RocksDB by the StoreWriter threads in the way of fsync. In this case, the following tuning is recommended:
- Enable the StoreWriter thread pool only when the overall CPU resources are sufficient, and keep the CPU utilization of both the StoreWriter and Raftstore threads below 80%. Compared with having the Raftstore threads handle the writes, letting the StoreWriter threads process write requests can, in theory, significantly reduce write latency and read tail latency. However, faster writes also mean more Raft logs, which increases the CPU overhead of the Raftstore, Apply, and gRPC threads. In that situation, insufficient CPU resources can cancel out the optimization and even make writes slower than before, so enabling the StoreWriter threads is not recommended when CPU resources are insufficient. Because the Raftstore threads hand most of the I/O requests over to the StoreWriter threads, keep the Raftstore thread CPU utilization below 80%.
- In most cases, setting the StoreWriter thread pool size to 1 or 2 is enough. Because this pool size affects the number of Raft logs, the value should not be too large. If its CPU utilization is above 80%, consider increasing the size.
- Note the impact that the increased number of Raft logs has on the CPU overhead of the other thread pools. If necessary, increase the number of Raftstore, Apply, and gRPC threads accordingly.
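A hedged sketch of enabling the StoreWriter pool as described above; whether to enable it at all, and whether to also grow the other pools, depends on the CPU headroom of the deployment:

```toml
# Sketch: enable the StoreWriter pool when CPU resources are sufficient.
[raftstore]
store-io-pool-size = 1   # 1 or 2 is enough in most cases
# Keep StoreWriter and Raftstore CPU utilization below 80%; if the extra
# Raft logs push the Raftstore, Apply, or gRPC pools near their limits,
# raise those pool sizes as noted above.
```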
The UnifyReadPool is responsible for handling all read requests. Its default size (configured by readpool.unified.max-thread-count) is 80% of the machine's CPU count (for example, on a 16-core machine the default pool size is 12). It is generally recommended to adjust the pool size according to the business workload so that its CPU utilization stays between 60% and 90% of the pool size. For example, if the peak of TiKV-Details.Thread CPU.Unified read pool CPU on Grafana does not exceed 800%, it is recommended to set readpool.unified.max-thread-count to 10; too many threads cause more frequent thread switching and steal resources from other thread pools.
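A minimal sketch for the example above (a 16-core machine whose Unified read pool CPU peaks below 800%):

```toml
# Sketch: cap the unified read pool at 10 threads; its 60%-90%
# utilization band (600%-900%) then covers the observed ~800% peak.
[readpool.unified]
max-thread-count = 10
```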
The RocksDB thread pool is the pool in which RocksDB performs Compact and Flush tasks, and it usually does not need to be configured.
- If the machine has few CPU cores, set both rocksdb.max-background-jobs and raftdb.max-background-jobs to 4.
- If you run into Write Stall, check which metrics under Write Stall Reason in the RocksDB-kv panel on Grafana are not 0.
  - If it is caused by reasons related to pending compaction bytes, set rocksdb.max-sub-compactions to 2 or 3 (this item is the number of sub-threads allowed for a single compaction job; the default is 3 in TiKV 4.0 and 1 in TiKV 3.0).
  - If the reason is related to memtable count, it is recommended to increase max-write-buffer-number (5 by default).
  - If the reason is related to the level0 file limit, it is recommended to increase the following parameters to 64 or higher:
    rocksdb.defaultcf.level0-slowdown-writes-trigger
    rocksdb.writecf.level0-slowdown-writes-trigger
    rocksdb.lockcf.level0-slowdown-writes-trigger
    rocksdb.defaultcf.level0-stop-writes-trigger
    rocksdb.writecf.level0-stop-writes-trigger
    rocksdb.lockcf.level0-stop-writes-trigger
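A hedged tikv.toml sketch for the level0-file-limit case above (the pending compaction bytes and memtable count cases would instead raise rocksdb.max-sub-compactions or max-write-buffer-number). Only defaultcf is shown; writecf and lockcf follow the same pattern:

```toml
# Sketch: relieve a level0-related write stall by raising the triggers to 64.
# Repeat for [rocksdb.writecf] and [rocksdb.lockcf].
[rocksdb.defaultcf]
level0-slowdown-writes-trigger = 64
level0-stop-writes-trigger = 64
```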