当前位置:网站首页>Ten optimization rules of Clickhouse SQL
Ten optimization rules of Clickhouse SQL
2022-07-18 07:02:00 【Aiky WOW】
as everyone knows ,ClickHouse Of SQL The optimization rules are based on RBO(Rule Based Optimization) Of , So do you know the optimization rules ?
What's next , It's the ten optimization rules I sorted out at the end of last year , Not necessarily all. , It's just for throwing bricks and attracting jade . If you have any additional information , Welcome to write to me .
1. COUNT Optimize :
Calling count Function time , If you are using count() perhaps count(*), And there's no where Conditions , It will be used directly system.tables Of total_rows, for example :
EXPLAIN
SELECT count()
FROM test_x
Query id: d255fb14-7160-4f1a-9148-9810494d792d
┌─explain──────────────────────────────────────────────┐
│ Expression ((Projection + Before ORDER BY)) │
│ MergingAggregated │
│ ReadFromPreparedSource (Optimized trivial count) │
└──────────────────────────────────────────────────────┘Be careful Optimized trivial count , That's right count The optimization of the .
If count Specific column fields , This optimization will not be used :
EXPLAIN
SELECT count(id)
FROM test_x
Query id: 170b10db-88d7-45a1-ae8a-8d683707b635
┌─explain───────────────────────────────────────────────────────────────────────┐
│ Expression ((Projection + Before ORDER BY)) │
│ Aggregating │
│ Expression (Before GROUP BY) │
│ SettingQuotaAndLimits (Set limits and quota after reading from storage) │
│ ReadFromStorage (MergeTree) │
└───────────────────────────────────────────────────────────────────────────────┘2. Eliminate duplicate fields in subqueries :
There are two repetitions in the following sentence query id Field , It's going to be removed :
EXPLAIN SYNTAX
SELECT
a.id,
b.name,
a.price,
b.id
FROM id_join_tb1 AS a
LEFT JOIN
(
SELECT
id,
id,
name,
time
FROM join_tb1
) AS b USING (id)
Query id: 6879ecc6-8579-4f01-964c-9eab4b15687a
┌─explain───────────────┐
│ SELECT │
│ id, │
│ name, │
│ price, │
│ b.id │
│ FROM id_join_tb1 AS a │
│ ALL LEFT JOIN │
│ ( │
│ SELECT │
│ id, │
│ name │
│ FROM join_tb1 │
│ ) AS b USING (id) │
└───────────────────────┘3. Predicate push-down :
When group by Yes having Clause , But no with cube、with rollup perhaps with totals When decorating ,having The filter will be pushed down to where Filtering ahead of time . For example, the following query ,HAVING name Turned into WHERE name, stay group by Before filtering :
EXPLAIN SYNTAX
SELECT name
FROM join_tb1
GROUP BY name
HAVING name = ''
Query id: 6eb2f8eb-2e29-43ae-9414-5914b921a622
┌─explain─────────┐
│ SELECT name │
│ FROM join_tb1 │
│ WHERE name = '' │
│ GROUP BY name │
└─────────────────┘alike , Subqueries also support predicate push down , For example, the WHERE id = 10:
EXPLAIN SYNTAX
SELECT *
FROM
(
SELECT id
FROM id_join_tb1
)
WHERE id = 10
Query id: 44a3e084-4b8a-4847-9909-ec34c8d8be74
┌─explain──────────────┐
│ SELECT id │
│ FROM │
│ ( │
│ SELECT id │
│ FROM id_join_tb1 │
│ WHERE id = 10 │
│ ) │
│ WHERE id = 10 │
└──────────────────────┘Let's do another example :
EXPLAIN SYNTAX
SELECT *
FROM
(
SELECT *
FROM
(
SELECT id
FROM id_join_tb1
)
UNION ALL
SELECT *
FROM
(
SELECT id
FROM id_join_tb1
)
)
WHERE id = 10
Query id: a807c968-a4b9-4f84-a80d-48c8385d2206
┌─explain──────────────────┐
│ SELECT id │
│ FROM │
│ ( │
│ SELECT id │
│ FROM │
│ ( │
│ SELECT id │
│ FROM id_join_tb1 │
│ WHERE id = 10 │
│ ) │
│ WHERE id = 10 │
│ UNION ALL │
│ SELECT id │
│ FROM │
│ ( │
│ SELECT id │
│ FROM id_join_tb1 │
│ WHERE id = 10 │
│ ) │
│ WHERE id = 10 │
│ ) │
│ WHERE id = 10 │
└──────────────────────────┘4. Aggregate computing extrapolation :
Calculations in aggregate functions , Will extrapolate , for example :
EXPLAIN SYNTAX
SELECT sum(id * 2)
FROM join_tb1
Query id: 027a5dce-fa57-447a-9615-888881069d61
┌─explain────────────┐
│ SELECT sum(id) * 2 │
│ FROM join_tb1 │
└────────────────────┘5. Aggregate functions eliminate :
If you want to aggregate bonds , That is to say group by key Use min、max、any Aggregate functions , Then the function is eliminated , for example :
EXPLAIN SYNTAX
SELECT
sum(id * 2),
max(name),
max(id)
FROM join_tb1
GROUP BY id
Query id: 4d72f7fa-5146-4365-adc4-260566f5f414
┌─explain──────────┐
│ SELECT │
│ sum(id) * 2, │
│ max(name), │
│ id │
│ FROM join_tb1 │
│ GROUP BY id │
└──────────────────┘6. Delete duplicate group by key
For example, the following statement , Repeated aggregation bonds id Fields will be de duplicated :
EXPLAIN SYNTAX
SELECT *
FROM join_tb1
ORDER BY
id ASC,
id ASC,
name ASC,
name ASC
Query id: 3fc0267a-9bf7-4811-b384-4a9e90517bbf
┌─explain───────┐
│ SELECT │
│ id, │
│ name, │
│ time │
│ FROM join_tb1 │
│ ORDER BY │
│ id ASC, │
│ name ASC │
└───────────────┘7. Delete duplicate limit by key
For example, the following statement , Repeated statement of name Fields will be de duplicated :
EXPLAIN SYNTAX
SELECT *
FROM join_tb1
LIMIT 3 BY
name,
name
LIMIT 10
Query id: e87a0ed9-66b4-49c7-b6ea-b5c8ad3d7901
┌─explain─────────┐
│ SELECT │
│ id, │
│ name, │
│ time │
│ FROM join_tb1 │
│ LIMIT 3 BY name │
│ LIMIT 10 │
└─────────────────┘8. Delete duplicate USING Key
For example, the following statement , Duplicate Association keys id Fields will be de duplicated :
EXPLAIN SYNTAX
SELECT
a.id,
a.id,
b.name,
a.price,
b.id
FROM id_join_tb1 AS a
LEFT JOIN join_tb1 AS b USING (id, id)
Query id: d0917046-71da-469e-b738-14d947bf53e3
┌─explain────────────────────────────────┐
│ SELECT │
│ id, │
│ id, │
│ name, │
│ price, │
│ b.id │
│ FROM id_join_tb1 AS a │
│ ALL LEFT JOIN join_tb1 AS b USING (id) │
└────────────────────────────────────────┘9. Scalar substitution
If a subquery returns only one row of data , Replace with scalar when referenced , For example, in the following statement total_disk_usage Field :
EXPLAIN SYNTAX
WITH
(
SELECT sum(bytes)
FROM system.parts
WHERE active
) AS total_disk_usage
SELECT
(sum(bytes) / total_disk_usage) * 100 AS table_disk_usage,
table
FROM system.parts
GROUP BY table
ORDER BY table_disk_usage DESC
LIMIT 10
Query id: a9c7431f-cd51-4a85-9fba-b6301578a8cd
┌─explain────────────────────────────────────────────────────────┐
│ WITH identity(CAST(0, 'UInt64')) AS total_disk_usage │
│ SELECT │
│ (sum(bytes) / total_disk_usage) * 100 AS table_disk_usage, │
│ table │
│ FROM system.parts │
│ GROUP BY table │
│ ORDER BY table_disk_usage DESC │
│ LIMIT 10 │
└────────────────────────────────────────────────────────────────┘10. Ternary optimization
If it's on optimize_if_chain_to_multiif Parameters , The ternary operator will be replaced with multiIf function , for example :
EXPLAIN SYNTAX
SELECT number = 1 ? 'hello' : (number = 2 ? 'world' : 'xyz')
FROM numbers(10)
settings optimize_if_chain_to_multiif = 1
Query id: fd5cde0f-a73f-4763-b823-42f9367f658b
┌─explain─────────────────────────────────────────────────────────┐
│ SELECT multiIf(number = 1, 'hello', number = 2, 'world', 'xyz') │
│ FROM numbers(10) │
│ SETTINGS optimize_if_chain_to_multiif = 1 │
└─────────────────────────────────────────────────────────────────┘You have all these rules Get Have we arrived ?
author : Zhu Kai ,ClickHouse One of the contributors ,ClickHouse evangelist , Senior architect ,《ClickHouse Principle analysis and practical application 》《 Enterprise level big data platform construction : Architecture and implementation 》 author ,
More than a decade IT Employment experience , Have in-depth research on mainstream technologies and solutions in the field of big data , Good at architecture design and integration of distributed system . He has led the planning of many big data platform level products 、 Design and R & D work , Rich experience in front line combat .
边栏推荐
猜你喜欢

初学者怎么快速学会SQL

Pd-qc-afc multi protocol decoy chip "ldr6328s"

免驱 USB 转串口 Billboard 芯片(LDR2001)

PD QC decoy power application IC "liderui ldr6328s" is widely used in all sizes of household appliances

Idea class document annotation template settings

卖的越多,挣得越少?利润增长率低于销售增长率的财务分析思路
Redis data structure practice, see how microblogging, wechat, shopping cart, lottery applet is used?

国产之光!高分时空表征学习模型 UniFormer

8. MySQL -- 触发器

A rate distortion optimization of h264 encoder (1)
随机推荐
Redis data structure practice, see how microblogging, wechat, shopping cart, lottery applet is used?
What is the difference between PM and Po & the Chinese meaning of PMO
scala Object关键字声明的类直接调用方法、伴生对象
C语言 栈的顺序表实现
Array and string assignment problem (not initialized when defining)
Yotact structure diagram
(一)MATLAB基础知识
markdown学习笔记 第二章 基本语法 (markdown编辑器下显示)
The meaning of sprint in Agile Development
[proper noun]
Reading notes: review of process consulting I II III
"Equal sign" commonly used in physics \ mathematics
C语言 第八章 数组
H264编码器的一种率失真优化(1)
MySQL-自增、索引、外键、其他操作
H264 decoding sequence display sequence reference sequence
TyPora更换文字颜色和换行
room android sqlite
Serialization and deserialization of flip word /maxqueue/ quadratic tree
About some string related functions, memory functions and some simulations