A rule-based log analyzer & filter

Overview

Flog

一个根据规则集来处理文本日志的工具。

前言

在日常开发过程中,由于缺乏必要的日志规范,导致很多人乱打一通,一个日志文件夹解压缩后往往有几十万行。

日志泛滥会导致信息密度骤减,给排查问题带来了不小的麻烦。

以前都是用grep之类的工具先挑选出有用的,再逐条进行排查,费时费力。在忍无可忍之后决定写这个工具,根据规则自动分析日志、剔除垃圾信息。

使用方法

安装

python setup.py install

基础用法

flog -r rules.yaml /path/to/1.log /path/to/2.log /path/to/3.log -o /path/to/filtered.log

其中:

  • rules.yaml是规则文件
  • /path/to/x.log是原始的日志文件,支持一次输入多个日志文件。
  • /path/to/filtered.log是过滤后的日志文件,如果不指定文件名(直接一个-o),会自动生成一个。

如果不需要过滤日志内容,只需显示分析结果,可以直接:

flog -r rules.yaml /path/to/your.log

规则语法

基础

name: Rule Name #规则集名称
patterns: #规则列表
  # 单行模式,如果匹配到 ^Hello,就输出 Match Hello
  - match: "^Hello"
    message: "Match Hello"
    action: bypass #保留此条日志(会输出到-o指定的文件中)
    
  # 多行模式,以^Hello开头,以^End结束,输出 Match Hello to End,并丢弃此条日志
  - start: "^Hello"
    end: "^End"
    message: "Match Hello to End"
    action: drop

  - start: "Start"
    start_message: "Match Start" #匹配开始时显示的信息
    end: "End"
    end_messagee: "Match End" #结束时显示的信息

纯过滤模式

name: Rule Name
patterns:
  - match: "^Hello" #删除日志中以Hello开头的行
  - start: "^Hello" #多行模式,删除从Hello到End中间的所有内容
    end: "^End"

过滤日志内容,并输出信息

name: Rule Name
patterns:
  - match: "^Hello" #删除日志中以Hello开头的行
    message: "Match Hello"
    action: drop #删除此行日志

规则嵌套

仅多行模式支持规则嵌套。

name: Rule
patterns:
  - start: "^Response.*{$"
    end: "^}"
    patterns:
      - match: "username = (.*)"
        message: "Current user: {{ capture[0] }}"

输入:

Login Response {
  username = zorro
  userid = 123456
}

输出:

Current user: zorro

action

action字段主要用于控制是否过滤此条日志,仅在指定 -o 参数后生效。 取值范围:【dropbypass】。

为了简化纯过滤类型规则的书写,action默认值的规则如下:

  • 如果规则中包含messagestart_messageend_message字段,action默认为bypass,即输出到文件中。
  • 如果规则中不包含message相关字段,action默认为drop,变成一条纯过滤规则。

message

message 字段用于在标准输出显示信息,并且支持 Jinja 模版语法来自定义输出信息内容,通过它可以实现一些简单的日志分析功能。

目前支持的参数有:

  • lines: (多行模式下)匹配到的所有行
  • content: 匹配到的日志内容
  • captures: 正则表达式(match/start/end)捕获的内容

例如:

name: Rule Name
patterns:
  - match: "^Hello (.*)"
    message: "Match {{captures[0]}}"

如果遇到:"Hello lilei",则会在终端输出"Match lilei"

context

可以把日志中频繁出现的正则提炼出来,放到context字段下,避免复制粘贴多次,例如:

name: Rule Name

context:
  timestamp: "\\d{4}-\\d{2}-\\d{2} \\d{2}:\\d{2}:\\d{2}.\\d{3}"
patterns:
  - match: "hello ([^:]*):"
    message: "{{ timestamp }} - {{ captures[0] }}"

输入:2022-04-08 16:52:37.152 hello world: this is a test message
输出:2022-04-08 16:52:37.152 - world

高亮

内置了一些 Jinjafilter,可以在终端高亮输出结果,目前包含:

black, red, green, yellow, blue, purple, cyan, white, bold, light, italic, underline, blink, reverse, strike

例如:

patterns:
  - match: "Error: (.*)"
    message: "{{ captures[0] | red }}"

输入:Error: file not found
输出:file not found

include

支持引入其它规则文件,例如:

name: Rule
include: base #引入同级目录下的 base.yaml 或 base.yml

include支持引入一个或多个文件,例如:

name: Rule
include:
  - base
  - ../base
  - base.yaml
  - base/base1
  - base/base2.yaml
  - ../base.yaml
  - /usr/etc/rules/base.yml

contextpatterns会按照引用顺序依次合并,如果有同名的context,后面的会替换之前的。

License

MIT

Owner
上山打老虎
专业造工具
上山打老虎
Code for Discriminative Sounding Objects Localization (NeurIPS 2020)

Discriminative Sounding Objects Localization Code for our NeurIPS 2020 paper Discriminative Sounding Objects Localization via Self-supervised Audiovis

51 Dec 11, 2022
GPU Accelerated Non-rigid ICP for surface registration

GPU Accelerated Non-rigid ICP for surface registration Introduction Preivous Non-rigid ICP algorithm is usually implemented on CPU, and needs to solve

Haozhe Wu 144 Jan 04, 2023
Generative Exploration and Exploitation - This is an improved version of GENE.

GENE This is an improved version of GENE. In the original version, the states are generated from the decoder of VAE. We have to check whether the gere

33 Mar 23, 2022
Code and results accompanying our paper titled Mixture Proportion Estimation and PU Learning: A Modern Approach at Neurips 2021 (Spotlight)

Mixture Proportion Estimation and PU Learning: A Modern Approach This repository is the official implementation of Mixture Proportion Estimation and P

Approximately Correct Machine Intelligence (ACMI) Lab 23 Dec 28, 2022
Iranian Cars Detection using Yolov5s, PyTorch

Iranian Cars Detection using Yolov5 Train 1- git clone https://github.com/ultralytics/yolov5 cd yolov5 pip install -r requirements.txt 2- Dataset ../

Nahid Ebrahimian 22 Dec 05, 2022
Research on controller area network Intrusion Detection Systems

Group members information Member 1: Lixue Liang Member 2: Yuet Lee Chan Member 3: Xinruo Zhang Member 4: Yifei Han User Manual Generate Attack Packets

Roche 4 Aug 30, 2022
Jax/Flax implementation of Variational-DiffWave.

jax-variational-diffwave Jax/Flax implementation of Variational-DiffWave. (Zhifeng Kong et al., 2020, Diederik P. Kingma et al., 2021.) DiffWave with

YoungJoong Kim 37 Dec 16, 2022
Code for Low-Cost Algorithmic Recourse for Users With Uncertain Cost Functions

EMS-COLS-recourse Initial Code for Low-Cost Algorithmic Recourse for Users With Uncertain Cost Functions Folder structure: data folder contains raw an

Prateek Yadav 1 Nov 25, 2022
A generalist algorithm for cell and nucleus segmentation.

Cellpose | A generalist algorithm for cell and nucleus segmentation. Cellpose was written by Carsen Stringer and Marius Pachitariu. To learn about Cel

MouseLand 733 Dec 29, 2022
Pytorch implementation of Rosca, Mihaela, et al. "Variational Approaches for Auto-Encoding Generative Adversarial Networks."

alpha-GAN Unofficial pytorch implementation of Rosca, Mihaela, et al. "Variational Approaches for Auto-Encoding Generative Adversarial Networks." arXi

Victor Shepardson 78 Dec 08, 2022
Code for EMNLP 2021 main conference paper "Text AutoAugment: Learning Compositional Augmentation Policy for Text Classification"

Text-AutoAugment (TAA) This repository contains the code for our paper Text AutoAugment: Learning Compositional Augmentation Policy for Text Classific

LancoPKU 105 Jan 03, 2023
Distance correlation and related E-statistics in Python

dcor dcor: distance correlation and related E-statistics in Python. E-statistics are functions of distances between statistical observations in metric

Carlos Ramos Carreño 108 Dec 27, 2022
codes for "Scheduled Sampling Based on Decoding Steps for Neural Machine Translation" (long paper of EMNLP-2022)

Scheduled Sampling Based on Decoding Steps for Neural Machine Translation (EMNLP-2021 main conference) Contents Overview Background Quick to Use Furth

Adaxry 13 Jul 25, 2022
Code and experiments for "Deep Neural Networks for Rank Consistent Ordinal Regression based on Conditional Probabilities"

corn-ordinal-neuralnet This repository contains the orginal model code and experiment logs for the paper "Deep Neural Networks for Rank Consistent Ord

Raschka Research Group 14 Dec 27, 2022
FaceAnon - Anonymize people in images and videos using yolov5-crowdhuman

Face Anonymizer Blur faces from image and video files in /input/ folder. Require

22 Nov 03, 2022
TensorFlow Tutorials with YouTube Videos

TensorFlow Tutorials Original repository on GitHub Original author is Magnus Erik Hvass Pedersen Introduction These tutorials are intended for beginne

9.1k Jan 02, 2023
A PoC Corporation Relationship Knowledge Graph System on top of Nebula Graph.

Corp-Rel is a PoC of Corpartion Relationship Knowledge Graph System. It's built on top of the Open Source Graph Database: Nebula Graph with a dataset

Wey Gu 20 Dec 11, 2022
TLXZoo - Pre-trained models based on TensorLayerX

Pre-trained models based on TensorLayerX. TensorLayerX is a multi-backend AI fra

TensorLayer Community 13 Dec 07, 2022
A GUI to automatically create a TOPAS-readable MLC simulation file

Python script to create a TOPAS-readable simulation file descriring a Multi-Leaf-Collimator. Builds the MLC using the data from a 3D .stl file.

Sebastian Schäfer 0 Jun 19, 2022
Training vision models with full-batch gradient descent and regularization

Stochastic Training is Not Necessary for Generalization -- Training competitive vision models without stochasticity This repository implements trainin

Jonas Geiping 32 Jan 06, 2023