Rocky Linux basics: regular expressions

2022-07-19 10:02:00 51CTO

Regular expressions

REGEXP (Regular Expressions): patterns written with a special class of characters plus ordinary text characters. Some of these characters (metacharacters) do not stand for their literal meaning but serve a control or matching function. Regular expressions resemble an enhanced version of wildcards, with one difference: wildcards are used to match file names, while regular expressions are used to process the characters of text content. Regular expressions are widely supported by many programs and development languages: vim, less, grep, sed, awk, nginx, mysql, etc.
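The difference between wildcards and regular expressions can be sketched by feeding the same pattern string to both. This is only an illustration: `glob_match` and `regex_match` are hypothetical helper names, and the sample names are made up.

```shell
# Hypothetical helpers for this illustration only.
# glob_match NAME PATTERN: does NAME match PATTERN as a shell wildcard?
glob_match() {
    case $1 in
        $2) echo yes ;;
        *)  echo no ;;
    esac
}

# regex_match NAME PATTERN: does the whole NAME match PATTERN as a basic regex?
regex_match() {
    if printf '%s\n' "$1" | grep -qx -- "$2"; then echo yes; else echo no; fi
}

pattern='a*.txt'

# Wildcard reading: "a", then anything, then a literal ".txt".
# Regex reading: zero or more "a", then any one character, then "txt".
glob_abc=$(glob_match 'abc.txt' "$pattern")
regex_abc=$(regex_match 'abc.txt' "$pattern")
glob_dash=$(glob_match 'aaa-txt' "$pattern")
regex_dash=$(regex_match 'aaa-txt' "$pattern")

printf 'abc.txt  glob=%s regex=%s\n' "$glob_abc" "$regex_abc"
printf 'aaa-txt  glob=%s regex=%s\n' "$glob_dash" "$regex_dash"
```

The name abc.txt matches only as a wildcard, and aaa-txt matches only as a regular expression, because `*` and `.` mean different things in the two syntaxes.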

There are two types of regular expressions: basic regular expressions (BRE, Basic Regular Expressions) and extended regular expressions (ERE, Extended Regular Expressions).

Regular expression metacharacters fall into four classes: character matching, number of matches, position anchoring, and grouping.

Help: man 7 regex

Basic regular expression metacharacters

Character matching

      
      
. Matches any single character (except \n); it can be a Chinese character or a character from any other language
[] Matches any single character in the specified set, e.g. [wang] [0-9] [a-z] [a-zA-Z]
[^] Matches any single character outside the specified set, e.g. [^wang]
[:alnum:] Letters and digits
[:alpha:] Any letter, upper or lower case (A-Z, a-z)
[:lower:] Lowercase letters, e.g. [[:lower:]] is equivalent to [a-z]
[:upper:] Uppercase letters
[:blank:] Blank characters (space and tab)
[:space:] All whitespace: space, horizontal and vertical tab, newline, carriage return and so on; a wider set than [:blank:]
[:cntrl:] Non-printable control characters (backspace, delete, bell ...)
[:digit:] Decimal digits
[:xdigit:] Hexadecimal digits
[:graph:] Printable non-blank characters
[:print:] Printable characters
[:punct:] Punctuation

\s # Matches any whitespace character, including space, tab, form feed and so on; equivalent to [[:space:]]. Note: in a Unicode locale this also matches the full-width space
\S # Matches any non-whitespace character; equivalent to [^[:space:]]
\w # Matches one letter, digit or underscore (including Chinese characters and letters of other languages); equivalent to [_[:alnum:]]
\W # Matches one character that is not a letter, digit or underscore; equivalent to [^_[:alnum:]]
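As a rough sketch of how the bracket expressions and classes above behave, the following runs a few of them through grep. It assumes GNU grep (for `-o`, `\w` and `\+`); the `sample` function and its data are hypothetical.

```shell
export LC_ALL=C   # assumption: a fixed locale, so the classes cover ASCII only

# Hypothetical sample data, one item per line.
sample() { printf '%s\n' 'abc' 'ABC' '123' 'a1 b2'; }

lower_only=$(sample | grep '^[[:lower:]]*$')   # lines made only of lowercase letters
no_lower=$(sample | grep '^[^[:lower:]]*$')    # lines without any lowercase letter
has_space=$(sample | grep '[[:space:]]')       # lines containing whitespace
words=$(printf 'a_1 b-2\n' | grep -o '\w\+')   # runs of letters, digits and underscores

printf '%s\n' "$lower_only" "$no_lower" "$has_space" "$words"
```

Note that a class name such as [:lower:] is only valid inside a bracket expression, which is why it is written [[:lower:]]. In the last command the underscore stays inside a word while the hyphen splits one.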

Number of matches

A quantifier is written after a character and specifies how many times that preceding character may appear.

      
      
* # Matches the preceding character any number of times, including 0; greedy mode: matches as much as possible
.* # Any characters, of any length
\? # Matches the preceding character 0 or 1 times, i.e. it is optional
\+ # Matches the preceding character at least once, i.e. >= 1 times
\{n\} # Matches the preceding character exactly n times
\{m,n\} # Matches the preceding character at least m and at most n times
\{,n\} # Matches the preceding character at most n times, i.e. <= n
\{n,\} # Matches the preceding character at least n times
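Greedy matching is easiest to see with `grep -o`, which prints only the matched text. A minimal sketch, assuming a grep that supports `-o` (GNU grep does) and a made-up input line:

```shell
line='<a><b>'

# Greedy: .* runs to the last ">" on the line, so the whole line is one match.
greedy=$(printf '%s\n' "$line" | grep -o '<.*>')

# Excluding ">" from the repeated set stops each match at the first ">".
bounded=$(printf '%s\n' "$line" | grep -o '<[^>]*>')

printf 'greedy:\n%s\nbounded:\n%s\n' "$greedy" "$bounded"
```

The first pattern yields a single match covering the whole line; the second yields two separate matches, one per tag.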

Example:

[~]# cat goole
goole
gole
goooole
gle
go0989779ljiole
g43590kjiwele

[~]# cat goole | grep 'go*le'
goole
gole
goooole
gle
[~]# cat goole | grep 'go.*le'
goole
gole
goooole
go0989779ljiole
[~]# cat goole | grep 'go\?le'
gole
gle
[~]# cat goole | grep 'go\+le'
goole
gole
goooole
[~]# cat goole | grep 'go\{2\}le'
goole
[~]# cat goole | grep 'go\{2,4\}le'
goole
goooole
[~]# cat goole | grep 'go\{,4\}le'
goole
gole
goooole
gle
[~]# cat goole | grep 'go\{2,\}le'
goole
goooole

---------------------------------------------------------------------------
[~]# cat goole
-1
-2
123
-123
-234
32432
[~]# cat goole | grep '\-\?[0-9]\+'
-1
-2
123
-123
-234
32432
[~]# cat goole | grep '\-[0-9]\+'
-1
-2
-123
-234

Position anchoring

Position anchors match a position in the text, such as the start of a line or the edge of a word, rather than a character.

      
      
^ # Start-of-line anchor; placed at the far left of the pattern
$ # End-of-line anchor; placed at the far right of the pattern
^PATTERN$ # The pattern must match the entire line
^$ # Empty line
^[[:space:]]*$ # Blank line (empty, or containing only whitespace)
\< or \b # Word-start anchor; placed on the left side of the word pattern. Prefer \<: in grep, \b matches at either edge of a word, so it does not behave the same as \<
\> or \b # Word-end anchor; placed on the right side of the word pattern. Prefer \> over \b for the same reason
\<PATTERN\> # Matches a whole word
# Note: a word is made up of letters, digits and underscores
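The anchors above can be tried on a few made-up lines. This is a sketch assuming GNU grep (for `\<` and `\>`); the `words` and `text` functions and their data are hypothetical.

```shell
# Hypothetical sample data, one item per line.
words() { printf '%s\n' 'root' 'rooter' 'chroot' 'root user'; }

starts=$(words | grep '\<root')      # "root" at the start of a word
ends=$(words | grep 'root\>')        # "root" at the end of a word
whole=$(words | grep '\<root\>')     # "root" as a whole word
exact_line=$(words | grep '^root$')  # the line is exactly "root"

# Four lines: "a", an empty line, a line of three spaces, "b".
text() { printf '%s\n' 'a' '' '   ' 'b'; }
empty_count=$(text | grep -c '^$')              # only the truly empty line
blank_count=$(text | grep -c '^[[:space:]]*$')  # empty or whitespace-only lines

printf 'starts:\n%s\nends:\n%s\nwhole:\n%s\nexact:\n%s\n' "$starts" "$ends" "$whole" "$exact_line"
printf 'empty=%s blank=%s\n' "$empty_count" "$blank_count"
```

Here `^$` counts one line while `^[[:space:]]*$` counts two, which is the practical difference between an empty line and a blank line.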

Example:

[~]# grep '^[^#]' /etc/fstab
UUID=69617dca-2b4d-4664-ac18-051ffddf7f30 / xfs defaults 0 0
UUID=7c7f9ef6-8873-49ff-8abf-4dd4817ea481 /boot xfs defaults 0 0
UUID=9c5979a7-43fa-436c-8360-c2545dd81120 none swap defaults 0 0
[~]# grep '^$\|^#' /etc/fstab

#
# /etc/fstab
# Created by anaconda on Sat Jul 2 02:32:29 2022
#
# Accessible filesystems, by reference, are maintained under '/dev/disk/'.
# See man pages fstab(5), findfs(8), mount(8) and/or blkid(8) for more info.
#
# After editing this file, run 'systemctl daemon-reload' to update systemd
# units generated from this file.
#
[~]# grep -v '^$\|^#' /etc/fstab
UUID=69617dca-2b4d-4664-ac18-051ffddf7f30 / xfs defaults 0 0
UUID=7c7f9ef6-8873-49ff-8abf-4dd4817ea481 /boot xfs defaults 0 0
UUID=9c5979a7-43fa-436c-8360-c2545dd81120 none swap defaults 0 0

Grouping

Grouping: use () to bind several characters together and treat them as a unit. In a basic regular expression the parentheses are escaped, as in \(root\) or \(root\)\+; in an extended regular expression they are written (root) or (root)+.

Back-references: the text matched by the pattern inside each pair of grouping parentheses is recorded by the regular expression engine in internal variables named \1, \2, \3, ... Here \1 is the text matched by the pattern between the first opening parenthesis from the left and its matching closing parenthesis.

Note: a back-reference refers to the characters that the grouped pattern actually matched, not to the pattern itself.

Note: \0 stands for all the characters matched by the whole regular expression. Support varies by tool; in a sed replacement, & is the portable way to write it.

Example:

user(root):(100)/bin/(nologin).*
\1: the content of the first pair of parentheses from the left: root
\2: the content of the second pair of parentheses from the left: 100
And so on
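The point that a back-reference repeats the matched text, not the pattern, can be checked with a small sketch. It assumes grep and sed with basic regular expressions; the `lines` function and its data are hypothetical.

```shell
# Hypothetical sample data, one item per line.
lines() { printf '%s\n' 'abcabc' 'abcabd' 'aa' 'ab'; }

# \1 must repeat the exact text the group matched ("abc").
repeat_abc=$(lines | grep '\(abc\)\1')

# One lowercase letter followed by that same letter.
same_twice=$(lines | grep '^\([a-z]\)\1$')

# The pattern written twice accepts any two lowercase letters.
pattern_twice=$(lines | grep '^[a-z][a-z]$')

# In a sed replacement, \1 and \2 reuse the groups and & is the whole match.
swapped=$(printf 'root:100\n' | sed 's/\(root\):\([0-9]*\)/\2:\1/')
wrapped=$(printf 'root\n' | sed 's/root/[&]/')

printf '%s\n' "$repeat_abc" "$same_twice" "$pattern_twice" "$swapped" "$wrapped"
```

Only aa passes the back-reference version, while both aa and ab pass when the pattern itself is repeated.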

Alternation

Alternation: \|

a\|b # a or b
C\|cat # C or cat
\(C\|c\)at # Cat or cat
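Alternation applies to everything on each side of it unless a group limits its scope. A short sketch of that difference, assuming GNU grep (where `\|` is available in basic regular expressions) and hypothetical sample lines:

```shell
# Hypothetical sample data, one item per line.
animals() { printf '%s\n' 'Cat' 'cat' 'C' 'bat'; }

# Without a group the alternatives are "C" and "cat".
loose=$(animals | grep 'C\|cat')

# With a group the alternatives are "C" and "c", each followed by "at".
grouped=$(animals | grep '\(C\|c\)at')

printf 'loose:\n%s\ngrouped:\n%s\n' "$loose" "$grouped"
```

The ungrouped pattern also selects the bare line C, which the grouped pattern rejects.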

Example: exclude empty lines and lines beginning with #

[~]# grep -v '^#' /etc/init.d/functions | grep -v '^$' ## ^# matches lines that start with #, -v inverts the match, and ^$ matches a line whose start is immediately followed by its end, i.e. an empty line
[~]# grep -v '^#\|^$' /etc/init.d/functions ## in a basic regular expression, alternation must be written \|
[~]# egrep -v '^(#|$)' /etc/init.d/functions ## egrep is equivalent to grep -E
[~]# grep '^[^#]' /etc/init.d/functions ## the first character is not #, which also skips empty lines; grep prints whole lines by default, whereas with -o only the matched first character would be printed
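The same filters can be tried without touching a system file. The sketch below assumes GNU grep and uses a hypothetical `conf` function that prints a made-up configuration; it also shows what happens when `|` is left unescaped in a basic regular expression.

```shell
# Hypothetical configuration text: a comment, an empty line, two settings, another comment.
conf() { printf '%s\n' '# comment' '' 'key=value' '#other' 'name=x'; }

two_pass=$(conf | grep -v '^#' | grep -v '^$')   # two greps chained
bre=$(conf | grep -v '^#\|^$')                   # BRE alternation with \|
ere=$(conf | grep -Ev '^(#|$)')                  # ERE alternation with |
first_char=$(conf | grep '^[^#]')                # first character is not #

# In a BRE an unescaped | is a literal character, so nothing is filtered out.
unescaped_count=$(conf | grep -vc '^#|^$')

printf '%s\n' "$bre"
printf 'lines kept with unescaped |: %s\n' "$unescaped_count"
```

All four correct variants keep only the two setting lines, while the unescaped version keeps all five lines.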
Extended regular expression metacharacters

Character matching

      
      
. Any single character
[wang] Any single character in the specified set
[^wang] Any single character not in the specified set
[:alnum:] Letters and digits
[:alpha:] Any letter, upper or lower case (A-Z, a-z)
[:lower:] Lowercase letters, e.g. [[:lower:]] is equivalent to [a-z]
[:upper:] Uppercase letters
[:blank:] Blank characters (space and tab)
[:space:] Horizontal and vertical whitespace characters (a wider set than [:blank:])
[:cntrl:] Non-printable control characters (backspace, delete, bell ...)
[:digit:] Decimal digits
[:xdigit:] Hexadecimal digits
[:graph:] Printable non-blank characters
[:print:] Printable characters
[:punct:] Punctuation

Number of matches

* Matches the preceding character any number of times
? 0 or 1 times
+ 1 or more times
{n} Exactly n times
{m,n} At least m and at most n times
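In extended syntax the quantifiers lose their backslashes but mean the same thing. A small sketch comparing the two forms with `grep` and `grep -E`; the `g` function and its data are hypothetical, and the backslashed `\{m,n\}` form is the basic-syntax spelling.

```shell
# Hypothetical sample data, one item per line.
g() { printf '%s\n' 'gle' 'gole' 'goole' 'goooole'; }

bre_range=$(g | grep 'go\{2,4\}le')    # basic syntax: escaped braces
ere_range=$(g | grep -E 'go{2,4}le')   # extended syntax: plain braces
ere_optional=$(g | grep -E 'go?le')    # zero or one "o"
ere_plus=$(g | grep -E 'go+le')        # one or more "o"

printf 'range:\n%s\noptional:\n%s\nplus:\n%s\n' "$ere_range" "$ere_optional" "$ere_plus"
```

Both range patterns select the same lines, which is a quick way to confirm that a pattern was converted correctly between the two syntaxes.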

Position anchoring

      
      
^ Start of line
$ End of line
\<, \b Start of word
\>, \b End of word

Grouping and others

() Grouping
Back-references: \1, \2, ... Note: \0 stands for all the characters matched by the whole regular expression
| Alternation
a|b a or b
C|cat C or cat
(C|c)at Cat or cat
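Groups in extended syntax can be quantified, alternated and back-referenced without backslashes on the parentheses. A brief sketch, assuming GNU grep (back-references with `-E` are an extension there) and hypothetical sample lines:

```shell
# Hypothetical sample data, one item per line.
samples() { printf '%s\n' 'abab' 'ab' 'aabb' 'abc'; }

repeated=$(samples | grep -Ex '(ab)+')    # whole line is "ab" repeated one or more times
twice=$(samples | grep -Ex '(ab){2}')     # whole line is "ab" exactly twice
doubled=$(samples | grep -E '(.)\1')      # some character immediately repeated
either=$(samples | grep -Ex 'ab(ab|c)')   # "ab" followed by "ab" or "c"

printf '%s\n' "$repeated" "$twice" "$doubled" "$either"
```

The `-x` option requires the pattern to match the whole line, which keeps the group quantifiers from matching a mere substring.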

Example:

[~]# ifconfig | egrep -o "(([1-9]?[0-9]|1[0-9]{2}|2[0-4][0-9]|25[0-5])\.){3}([1-9]?[0-9]|1[0-9]{2}|2[0-4][0-9]|25[0-5])" ## [1-9]?[0-9] matches a one- or two-digit number (0-99); | means or; 1[0-9]{2} matches 100-199; 2[0-4][0-9] matches 200-249; 25[0-5] matches 250-255
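The address pattern can be exercised on fixed strings instead of live ifconfig output. This is a sketch only: the variable names `octet` and `ipv4` and the test addresses are made up, and `-o` assumes GNU grep.

```shell
# One octet, 0-255, built from the same four alternatives as in the command above.
octet='([1-9]?[0-9]|1[0-9]{2}|2[0-4][0-9]|25[0-5])'
ipv4="($octet\.){3}$octet"

# -x requires the whole line to match, so malformed addresses are rejected.
valid=$(printf '%s\n' '192.168.1.10' '256.1.1.1' '10.0.0' | grep -Ex "$ipv4")

# -o prints any matching substring, so part of an invalid address still matches.
partial=$(printf '%s\n' '256.1.1.1' | grep -Eo "$ipv4")

printf 'valid: %s\npartial: %s\n' "$valid" "$partial"
```

With `-o` alone, the invalid 256.1.1.1 still produces the match 56.1.1.1, so extraction from free text may need `-x`, `-w` or explicit anchors when strict validation matters.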


Copyright notice
This article was created by [51CTO]. Please include a link to the original when reposting. Thank you.
https://yzsam.com/2022/200/202207171113140867.html