Part two: oracledb_exporter monitors Oracle, a very low-intrusion monitoring scheme (continued)
2022-07-26 03:57:00 【Rainbow O&M Technology Stack Community】
Introduction
Building on the previous article, oracledb_exporter monitors Oracle, a very low-intrusion monitoring scheme, this article continues with the following:
- Writing custom monitoring metrics around actual business needs, so the exporter truly earns its keep in production
- Configuring the oracledb_exporter standby to pull configuration from the master
Revisiting the architecture
This article covers the red box in the figure below.

The red-boxed part is the oracledb_exporter active/standby scheme; combined with the previous design, the figure shows the complete monitoring architecture.

The active/standby design for oracledb_exporter follows the same idea as the Prometheus active/standby design: however the architecture is laid out, the goal is always to avoid single points of failure in production.
Revisiting the author's environment plan
| Purpose | Role | Physical IP | VIP takeover | VIP address |
|---|---|---|---|---|
| oracledb_exporter | Master | 192.168.11.20 | Holds the VIP | 192.168.11.200 |
| oracledb_exporter | Backup | 192.168.11.21 | Takes over on failover | 192.168.11.200 |
Custom metrics specification

- What is a custom metric?

What if oracledb_exporter's default metrics aren't what you want? Simple: oracledb_exporter supports custom metrics. Write the metrics you need, in its standard format, into a configuration file ending in .toml (the metrics file). How does oracledb_exporter pick up this file? There are two ways, as sketched after this list:
- Use the --custom.metrics flag, followed by the path to the metrics file
- Set the CUSTOM_METRICS environment variable, e.g. export CUSTOM_METRICS=my-custom-metrics.toml
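To make the two options concrete, here is a minimal sketch (the file name comes from the example above; the connection environment, e.g. DATA_SOURCE_NAME, is assumed to be set already, as for any oracledb_exporter startup):

```bash
# Option 1: point the --custom.metrics flag at the metrics file
oracledb_exporter --custom.metrics ./my-custom-metrics.toml

# Option 2: set the CUSTOM_METRICS environment variable instead
export CUSTOM_METRICS=./my-custom-metrics.toml
oracledb_exporter
```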
- Custom metrics file format and specification

A custom metrics file must be written according to the specification. Let's dissect the official example:
```toml
[[metric]]
context = "test"
request = "SELECT 1 as value_1, 2 as value_2 FROM DUAL"
metricsdesc = { value_1 = "Simple example returning always 1.", value_2 = "Same but returning always 2." }
```

From this small example you can see the required elements:
- One or more metrics: each metric corresponds to one [metric] section
- Each [metric] section must contain at least the following fields (a sketch of the resulting output follows this list):
  - context: the metric name (make it meaningful)
  - request: the custom SQL
  - metricsdesc: a description of the metric
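For orientation: each context/column pair is exposed as a metric named oracledb_<context>_<column>, so with the official example above loaded, scraping the exporter should show something like the following (the port is the one used later in this article; output abridged):

```bash
curl -s http://localhost:9161/metrics | grep oracledb_test
# Expected, roughly:
#   oracledb_test_value_1 1
#   oracledb_test_value_2 2
```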
Custom metrics in practice

Now for a more practical case. Suppose you want an IOPS metric, which has to be computed. Pay special attention: before writing the custom metric, get the SQL right and debug it thoroughly.
- The SQL I wrote to collect IOPS is as follows:

```sql
select sum(decode(name,'physical read IO requests',value,'physical write IO requests',value,0)) as iops,
       sum(decode(name,'physical read bytes',value,'physical write bytes',value,0)) / 1024 / 1024 as mbps
  from v$sysstat
 where name in ('physical read IO requests','physical write IO requests','physical read bytes',
                'physical read total bytes','physical write bytes','physical write total bytes',
                'physical read total IO requests','physical write total IO requests');
```
- Connect to Oracle with a PL/SQL tool, execute and debug, and check whether the results meet expectations (a command-line alternative follows below):

Perfect! Exactly as expected.
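If you prefer the command line to a GUI tool, the same debugging can be done with sqlplus; a sketch, with an illustrative connection string (substitute your own user, password, and service name):

```bash
sqlplus -s zabbix/your_password@//192.168.11.8:1521/PDB1 << 'EOF'
-- The IN list is trimmed to the four names the DECODEs actually read;
-- the extra names in the full query only ever contribute zeros
select sum(decode(name,'physical read IO requests',value,'physical write IO requests',value,0)) as iops,
       sum(decode(name,'physical read bytes',value,'physical write bytes',value,0)) / 1024 / 1024 as mbps
  from v$sysstat
 where name in ('physical read IO requests','physical write IO requests',
                'physical read bytes','physical write bytes');
EOF
```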
- Create the custom metrics file ./custom_metrics/performance_metric.toml with the following content:

```toml
[[metric]]
context = "reads_and_writes_per_second"
labels = ["iops"]
request = "select sum(decode(name,'physical read IO requests',value,'physical write IO requests',value,0)) as iops, sum(decode(name,'physical read bytes',value,'physical write bytes',value,0)) / 1024 / 1024 as mbps from v$sysstat where name in ('physical read IO requests','physical write IO requests','physical read bytes','physical read total bytes', 'physical write bytes','physical write total bytes','physical read total IO requests','physical write total IO requests')"
metricsdesc = { iops = "The number of read and write operations per second" }
```
- Start oracledb_exporter

The startup script is as follows:
```bash
#!/bin/sh
# Monitor the test-environment Oracle
source .env_var/.9161_192.168.11.8_PDB1_ZABBIX.DB
nohup oracledb_exporter --log.level warn --web.listen-address :9161 --custom.metrics ./custom_metrics/performance_metric.toml >> ./logs/9161_192.168.11.8_PDB1_ZABBIX.DB.log &
```

Start it:
```bash
[[email protected] oracle]# sh start.sh
```

The result:

Perfect! Everything works as expected.
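As a quick command-line cross-check, the custom metric should now be visible on the exporter's metrics endpoint (port from the startup script above):

```bash
curl -s http://localhost:9161/metrics | grep reads_and_writes_per_second
```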
Other metric fields

In practice, the labels and ignorezeroresult fields may also be used in a metric section. Let's briefly look at their usage scenarios.
- labels: as the name suggests, labels. Besides giving a metric a meaningful name, you can also attach labels to it (if needed). The definition format is:

```toml
[[metric]]
...
labels = ["iops", "io", "io_performance"]
...
```

We already used labels in the case above; multiple labels can be defined for the same metric, separated by commas.

- ignorezeroresult: what is this field for? It tells the exporter to ignore zero results. If a custom metric happens to return 0 at some point and you want that value ignored, use this field. The format is:

```toml
ignorezeroresult = true
```

If not explicitly specified, the default is to keep (not ignore) zero results. A combined sketch follows.
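Putting the optional fields together, a hypothetical definition might look like the following, written from the shell for convenience (the request is elided here; substitute real SQL such as the IOPS query above):

```bash
cat > ./custom_metrics/example_with_labels.toml << 'EOF'
[[metric]]
context = "reads_and_writes_per_second"
labels = ["iops", "io", "io_performance"]
request = "select ... from v$sysstat ..."
ignorezeroresult = true
metricsdesc = { iops = "The number of read and write operations per second" }
EOF
```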
oracledb_exporter active/standby configuration

The oracledb_exporter slave needs to pull configuration from the master: when the master's configuration changes, the slave is notified and pulls from the master again. The principle is the same as the configuration-pull design in my earlier Prometheus active/standby scheme, and those scripts can be modified and reused. Let me walk through the setup.
Master configuration

Per our earlier plan, the root directory for all database monitoring is /data/database_monitoring/, so we place the following scripts in that directory and start them there. When the slave pulls configuration it accesses port 8000 (the default port of the SimpleHTTPServer started below), which lets it sync the metric files of every business under this directory.

- Deploy the configuration sync API

Create startOracledbExporterConfSyncApi.sh:
```bash
#!/bin/sh
# Serve the current directory over HTTP (SimpleHTTPServer defaults to port 8000)
nohup /usr/bin/python -m SimpleHTTPServer > /dev/null &
```
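If the host only has Python 3, the SimpleHTTPServer module no longer exists; the equivalent is http.server, which also listens on port 8000 by default:

```bash
nohup /usr/bin/python3 -m http.server > /dev/null &
```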
Start the script and check:

```bash
[[email protected] database_monitoring]# sh startOracledbExporterConfSyncApi.sh
[[email protected] database_monitoring]# netstat -tulnp | grep 8000
tcp        0      0 0.0.0.0:8000        0.0.0.0:*        LISTEN        1462/python
```
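A quick way to confirm the sync API really serves the directory, from any host that can reach the master (address from the plan above):

```bash
curl -s http://192.168.11.20:8000/ | head
```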
- Deploy the detection script that packages the configuration when files change. Note: this script must also run under /data/database_monitoring/.

Create startTarPackConf.sh:
```bash
#!/bin/sh
time_log=`date "+%Y-%m-%d %H:%M:%S"`
echo "${time_log} Configuration checker starts"
task_wait_sec=4
# Take an initial checksum of every file under ./business
find ./business -type f -print0 | xargs -0 md5sum > ./cfmd5/cfmd5.list
while true
do
    time_bak=`date "+%Y%m%d%H%M%S"`
    time_log=`date "+%Y-%m-%d %H:%M:%S"`
    # Re-verify the stored checksums; a FAILED line means a file changed.
    # (Only modified or deleted files produce FAILED lines; newly added
    # files are absorbed silently when the list is refreshed below.)
    md5sum -c ./cfmd5/cfmd5.list > ./cfmd5/check_cfmd5.log
    md5ret=`cat ./cfmd5/check_cfmd5.log | grep "FAILED" | wc -l`
    if [ ${md5ret} -gt 0 ]
    then
        echo "${time_log} The configuration has changed; triggering the packaging action."
        # Keep the previous archive, then repackage the whole business tree
        mv ./business.tar.gz ./backup/business.tar.gz_bak_${time_bak}
        tar -zcf business.tar.gz business/
        echo 1 > ./notice_slave.action
    else
        echo 0 > ./notice_slave.action
    fi
    # Refresh the checksum list and wait for the next cycle
    find ./business -type f -print0 | xargs -0 md5sum > ./cfmd5/cfmd5.list
    sleep ${task_wait_sec}
done
```

Next, create the directories the detection script needs:
```bash
[[email protected] database_monitoring]# mkdir cfmd5
[[email protected] database_monitoring]# mkdir backup
[[email protected] database_monitoring]# mkdir logs
```

Start the script and check:
```bash
[[email protected] database_monitoring]# nohup sh ./startTarPackConf.sh >> ./logs/tar_pack.log &
[[email protected] database_monitoring]# ps -aux | grep Tar
root      1755  0.0  0.6 113292  1464 pts/0    S    19:40   0:00 sh ./startTarPackConf.sh
```
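To sanity-check the watcher, append a harmless change to any metrics file under ./business and watch the flag flip (the file path here is hypothetical; use a real one):

```bash
echo "# touched for sync test" >> ./business/oracle/performance_metric.toml
sleep 5                      # one full check cycle (task_wait_sec is 4)
cat ./notice_slave.action    # should print 1 on the cycle right after the change
```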
Backup configuration

- Create the planned directory under the data directory:

```bash
[[email protected] ~]# mkdir -p /data/database_monitoring
[[email protected] ~]# cd /data/database_monitoring/
[[email protected] database_monitoring]#
```
- Deploy the configuration-pull script

Create the periodic pull script startUpdateSyncConf.sh under this path:
```bash
#!/bin/sh
time_log=`date "+%Y-%m-%d %H:%M:%S"`
echo "${time_log} Configuration updater starts"
pull_wait_sec=2
while true
do
    # Fetch the change flag published by the master
    wget http://192.168.11.20:8000/notice_slave.action -O notice_slave.action > /dev/null 2>&1
    status=`cat ./notice_slave.action`
    if [ ${status} -eq 1 ]
    then
        time_bak=`date "+%Y%m%d%H%M%S"`
        time_log=`date "+%Y-%m-%d %H:%M:%S"`
        echo "${time_log} Downloading the configuration archive from the master"
        wget http://192.168.11.20:8000/business.tar.gz -O business.tar.gz
        echo "${time_log} Backing up the current configuration directory"
        mv ./business ./backup/business_bak_${time_bak}
        echo "${time_log} Unpacking the downloaded configuration archive"
        tar -zxf business.tar.gz
    fi
    sleep ${pull_wait_sec}
done
```

Create the directories the script needs:
```bash
[[email protected] database_monitoring]# mkdir backup
[[email protected] database_monitoring]# mkdir logs
```

Start the script and check:

```bash
nohup sh startUpdateSyncConf.sh > ./logs/update_sync.log &
```

Verifying configuration sync
- On the master, open the metrics file configured earlier and change the context value by appending _test, making it "reads_and_writes_per_second_test":
```toml
[[metric]]
context = "reads_and_writes_per_second_test"
labels = ["iops"]
request = "select sum(decode(name,'physical read IO requests',value,'physical write IO requests',value,0)) as iops, sum(decode(name,'physical read bytes',value,'physical write bytes',value,0)) / 1024 / 1024 as mbps from v$sysstat where name in ('physical read IO requests','physical write IO requests','physical read bytes','physical read total bytes', 'physical write bytes','physical write total bytes','physical read total IO requests','physical write total IO requests')"
metricsdesc = { iops = "The number of read and write operations per second" }
```

- On the backup, check whether the pull was triggered
After modifying the configuration file, I logged in to the backup to take a look: it synced with the master successfully.
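One way to confirm the sync byte for byte is to run the same checksum command the master's watcher uses on both hosts and compare the output:

```bash
find ./business -type f -print0 | xargs -0 md5sum
```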
- Start the backup's oracledb_exporter as well and check the result
- master
- backup
Perfect! Everything is OK and the monitoring metrics are collected normally. One caveat for production use: only start the master's oracledb_exporter, and leave the backup's stopped. When the master goes down, the VIP drifts to the backup, which takes over; at that point you can start oracledb_exporter on the backup manually, or script it to start automatically. A sketch of the automatic option follows, though I won't demonstrate it end to end.
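As that sketch: a small watcher on the backup could poll for the VIP and start the exporter once it arrives. Everything below is hypothetical glue; the VIP comes from the plan above, and the start.sh path must be adjusted to your layout:

```bash
#!/bin/sh
# Hypothetical failover watcher: start oracledb_exporter once the VIP lands here
VIP="192.168.11.200"
while true
do
    # Simplistic check: the VIP appears among this host's addresses
    # and no oracledb_exporter process is running yet
    if ip addr | grep -q "${VIP}" && ! pgrep -f oracledb_exporter > /dev/null
    then
        cd /data/database_monitoring/business/oracle && sh start.sh
    fi
    sleep 5
done
```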
Wrapping up

That's it: the planning and deployment of the oracledb_exporter active/standby solution is complete. You're welcome to try it out following this plan, and to suggest better designs so we can learn and improve together. Thanks again! Please follow, share, save, and like!