TiKV thread pool performance tuning

2022-07-19 11:42:00 Tianxiang shop

This document describes the main approaches to tuning TiKV thread pool performance and the main uses of TiKV's internal thread pools.

Introduction to thread pools

In TiKV, the thread pools are mainly gRPC, Scheduler, UnifyReadPool, Raftstore, StoreWriter, Apply, and RocksDB, along with some scheduled tasks and detection components that consume little CPU. This section introduces the thread pools that consume a lot of CPU and affect the performance of user read and write requests.

  • gRPC thread pool: handles all network requests and forwards requests of different task types to different thread pools.
  • Scheduler thread pool: detects write transaction conflicts, converts requests such as two-phase transaction commits, pessimistic locking, and transaction rollbacks into key-value pair arrays, and then sends them to the Raftstore thread for Raft log replication.
  • Raftstore thread pool:
    • Processes all Raft messages and proposals to add new logs (Propose).
    • Processes Raft logs. If the value of the store-io-pool-size configuration item is 0, the Raftstore thread writes the logs to disk; if the value is not 0, the Raftstore thread sends the logs to the StoreWriter thread.
    • When a log reaches consensus among the majority of replicas, the Raftstore thread sends the log to the Apply thread.
  • StoreWriter thread pool: writes all Raft logs to disk and returns the results to the Raftstore thread.
  • Apply thread pool: receives the committed logs sent from the Raftstore thread pool, parses them into key-value requests, writes them to RocksDB, and calls the callback function to notify the gRPC thread pool that the write request is complete, so that the result is returned to the client.
  • RocksDB thread pool: the thread pool in which RocksDB runs Compact and Flush tasks. For RocksDB's architecture and Compact operations, refer to RocksDB: A Persistent Key-Value Store for Flash and RAM Storage.
  • UnifyReadPool thread pool: a combination of the Coprocessor thread pool and the Storage Read Pool. All read requests, such as kv get, kv batch get, raw kv get, and coprocessor, are executed in this thread pool.
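
The write path described in the list above can be sketched as a small function. This is only an illustration of the hand-off order, assuming a simplified linear view (in the description above, StoreWriter returns its result to the Raftstore thread before the log reaches Apply). The function name write_path and its parameter are hypothetical and are not part of TiKV.

```python
from typing import List


def write_path(store_io_pool_size: int) -> List[str]:
    """Return the thread pools a write request involves, in order of first use.

    Hypothetical helper for illustration only; it mirrors the prose
    description and is not TiKV code.
    """
    if store_io_pool_size < 0:
        raise ValueError("store_io_pool_size must not be negative")

    # Every write enters through gRPC, is checked by the Scheduler,
    # and is then replicated by Raftstore.
    stages = ["gRPC", "Scheduler", "Raftstore"]

    # With a non-zero store-io-pool-size, Raftstore hands the disk write
    # to StoreWriter instead of writing the log itself.
    if store_io_pool_size != 0:
        stages.append("StoreWriter")

    # Committed logs are finally applied to RocksDB by the Apply pool.
    stages.append("Apply")
    return stages


print(write_path(0))
print(write_path(1))
```

With store_io_pool_size set to 0 the StoreWriter stage is absent, which matches the case where the Raftstore thread writes logs to disk itself.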

TiKV read-only requests

TiKV has two types of read requests:

  • Simple queries that look up one specific row or several specific rows. These queries run in the Storage Read Pool.
  • Complex aggregate calculations and range queries. These requests run in the Coprocessor Read Pool.

Starting from TiKV 5.0, all read requests use a unified thread pool by default. If a TiKV cluster is upgraded from TiKV 4.0 and the use-unified-pool configuration of readpool.storage was not enabled before the upgrade, all read requests continue to use separate thread pools after the upgrade. To make all read requests use the unified thread pool, set readpool.storage.use-unified-pool to true.
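
The routing rule above can be sketched as follows. This is a minimal illustration that assumes a single boolean stands in for the use-unified-pool setting; the function name read_pool_for and the two request-kind labels are hypothetical and chosen only to mirror the two request types described in this section.

```python
def read_pool_for(request_kind: str, use_unified_pool: bool) -> str:
    """Return the name of the pool that would serve a read request.

    Hypothetical helper for illustration only. request_kind is either
    "point_lookup" (specific rows) or "aggregation_or_range".
    """
    known_kinds = {"point_lookup", "aggregation_or_range"}
    if request_kind not in known_kinds:
        raise ValueError("unknown request kind: " + request_kind)

    # With the unified pool enabled, every read goes to the same pool.
    if use_unified_pool:
        return "UnifyReadPool"

    # Otherwise reads are split by type across the two separate pools.
    if request_kind == "point_lookup":
        return "Storage Read Pool"
    return "Coprocessor Read Pool"


print(read_pool_for("point_lookup", use_unified_pool=False))
print(read_pool_for("aggregation_or_range", use_unified_pool=True))
```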

TiKV Thread pool tuning

  • The default size of the gRPC thread pool (server.grpc-concurrency) is 5. Because the gRPC thread pool has almost no computing overhead and is mainly responsible for network I/O and deserializing requests, this configuration usually does not need to be adjusted.

    • If the deployed machine has very few CPU cores (8 or fewer), consider setting server.grpc-concurrency to 2.
    • If the machine has a high-end configuration, TiKV handles a large number of read and write requests, and the gRPC poll CPU value under Thread CPU in Grafana exceeds 80% of server.grpc-concurrency, consider increasing server.grpc-concurrency to keep thread pool usage below 80% (that is, the metric in Grafana stays below 80% * server.grpc-concurrency).
  • The size of the Scheduler thread pool (storage.scheduler-worker-pool-size) defaults to 8 when TiKV detects 16 or more CPU cores on the machine, and to 4 when it detects fewer than 16. This thread pool is mainly used to convert complex transaction requests into simple key-value reads and writes. However, the Scheduler thread pool itself does not perform any write operation.

    • If it detects a transaction conflict, it returns the conflict result to the client in advance.

    • If it detects no transaction conflict, it merges the key-value pairs to be written into a single Raft log and sends it to the Raftstore thread for Raft log replication.

      Generally, to avoid excessive thread switching, it is best to keep the utilization of the Scheduler thread pool between 50% and 75%. (If the thread pool size is 8, a reasonable value for TiKV-Details.Thread CPU.scheduler worker CPU in Grafana is between 400% and 600%.)

  • The Raftstore thread pool is the most complex thread pool in TiKV. Its default size (controlled by raftstore.store-pool-size) is 2. The default size of the StoreWriter thread pool (controlled by raftstore.store-io-pool-size) is 0.

    • When the StoreWriter thread pool size is 0, all write requests are written to RocksDB by the Raftstore thread using fsync. In this case, the following tuning actions are recommended:

      • Keep the overall CPU usage of the Raftstore threads below 60%. With the default of 2 Raftstore threads, keep the value on the TiKV-Details.Thread CPU.Raft store CPU panel in Grafana within 120%. Because of I/O requests, the CPU usage of a Raftstore thread is in theory always lower than 100%.
      • Do not blindly increase the size of the Raftstore thread pool to improve write performance. Doing so can backfire by increasing the disk burden and degrading performance.
    • When the StoreWriter thread pool size is not 0, all write requests are written to RocksDB by the StoreWriter threads using fsync. In this case, the following tuning actions are recommended:

      • Enable the StoreWriter thread pool only when overall CPU resources are sufficient, and keep the CPU usage of both the StoreWriter threads and the Raftstore threads below 80%.

        Compared with having the Raftstore thread complete write requests, having StoreWriter threads process them can in theory significantly reduce write latency and read tail latency. However, faster writes also produce more Raft logs, which increases the CPU overhead of the Raftstore, Apply, and gRPC threads. In that case, insufficient CPU resources can cancel out the optimization and may even make writes slower than before. Therefore, do not enable StoreWriter threads if CPU resources are insufficient. Because the Raftstore thread hands most I/O requests over to StoreWriter, keeping Raftstore thread CPU usage below 80% is sufficient.

      • In most cases, setting the StoreWriter thread pool size to 1 or 2 is enough. Because the size of the StoreWriter thread pool affects the number of Raft logs, the value should not be too large. If its CPU usage is higher than 80%, consider increasing the pool size.

      • Pay attention to the impact of the additional Raft logs on the CPU overhead of other thread pools. If necessary, increase the number of Raftstore, Apply, and gRPC threads accordingly.

  • UnifyReadPool handles all read requests. Its default size (readpool.unified.max-thread-count) is 80% of the number of CPU cores on the machine (for a 16-core machine, the default thread pool size is 12).

    It is generally recommended to adjust the pool size according to the characteristics of the workload so that CPU usage stays between 60% and 90% of the thread pool size. (If the peak value of TiKV-Details.Thread CPU.Unified read pool CPU in Grafana does not exceed 800%, it is recommended to set readpool.unified.max-thread-count to 10. Too many threads cause more frequent thread switching and take resources away from other thread pools.)

  • The RocksDB thread pool is the thread pool in which RocksDB runs Compact and Flush tasks. It usually does not need to be configured.

    • If the machine has few CPU cores, set both rocksdb.max-background-jobs and raftdb.max-background-jobs to 4.

    • If you encounter a Write Stall, check which metrics under Write Stall Reason in the RocksDB-kv panel in Grafana are not 0.

      • If the cause is related to pending compaction bytes, set rocksdb.max-sub-compactions to 2 or 3 (this configuration is the number of sub-threads allowed for a single compaction job; the default is 3 in TiKV 4.0 and 1 in TiKV 3.0).

      • If the cause is related to memtable count, it is recommended to increase max-write-buffer-number (the default is 5).

      • If the cause is related to level0 file limit, it is recommended to increase the following parameters to 64 or higher:

        rocksdb.defaultcf.level0-slowdown-writes-trigger
        rocksdb.writecf.level0-slowdown-writes-trigger
        rocksdb.lockcf.level0-slowdown-writes-trigger
        rocksdb.defaultcf.level0-stop-writes-trigger
        rocksdb.writecf.level0-stop-writes-trigger
        rocksdb.lockcf.level0-stop-writes-trigger
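
The utilization targets quoted in this section can be collected into a small checker. The sketch below assumes that Grafana's Thread CPU panels report 100% per fully busy thread, which is how the 400%~600% example for an 8-thread Scheduler pool reads. The names TARGETS, assess, and threads_for_peak are hypothetical helpers, not TiKV or Grafana APIs, and the Raftstore band applies to the case where the StoreWriter pool size is 0.

```python
import math

# Target utilization bands, as a percentage of pool capacity, taken from
# the recommendations in this section. Keys are illustrative labels.
TARGETS = {
    "grpc": (0, 80),
    "scheduler": (50, 75),
    "raftstore": (0, 60),  # applies when the StoreWriter pool size is 0
    "unified_read": (60, 90),
}


def assess(pool: str, grafana_cpu_percent: float, threads: int) -> str:
    """Compare a pool's per-thread utilization with its target band."""
    if threads <= 0:
        raise ValueError("threads must be positive")
    low, high = TARGETS[pool]
    # Grafana sums CPU across threads, so divide by the pool size.
    used = grafana_cpu_percent / threads
    if used > high:
        return "above target"
    if used < low:
        return "below target"
    return "within target"


def threads_for_peak(peak_cpu_percent: float, target_percent: float) -> int:
    """Smallest thread count that keeps peak utilization at or below target."""
    if target_percent <= 0:
        raise ValueError("target_percent must be positive")
    return math.ceil(peak_cpu_percent / target_percent)


print(assess("scheduler", 400, 8))
print(assess("grpc", 450, 5))
print(threads_for_peak(800, 80))
```

For the Unified read pool example above, a peak of 800% with an 80% utilization target gives 10 threads, which agrees with the suggested readpool.unified.max-thread-count of 10.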
