Regular expressions
REGEXP: a regular expression is a pattern written with ordinary text characters plus a special class of characters called metacharacters. A metacharacter does not stand for its literal meaning; it controls or configures how the match is made. Regular expressions resemble an enhanced version of shell wildcards, but the two serve different purposes: wildcards match file names, while regular expressions match the characters in text content. Regular expressions are supported by many programs and programming languages: vim, less, grep, sed, awk, nginx, mysql, etc.
There are two types of regular expressions:
Basic regular expressions: BRE (Basic Regular Expressions)
Extended regular expressions: ERE (Extended Regular Expressions)
Metacharacters fall into four categories: character matching, number of matches, position anchoring, and grouping
Help: man 7 regex
Basic regular expression metacharacters
Character matching
. Matches any single character (except \n); it can be a Chinese character or a character from any other language
[] Matches any single character within the specified set or range, e.g. [wang] [0-9] [a-z] [a-zA-Z]
[^] Matches any single character outside the specified set or range, e.g. [^wang]
[:alnum:] Letters and digits
[:alpha:] Any letter, upper or lower case, i.e. A-Z, a-z
[:lower:] Lowercase letters, e.g. [[:lower:]] is equivalent to [a-z]
[:upper:] Uppercase letters
[:blank:] Blank characters (space and tab)
[:space:] Whitespace of all kinds: space, horizontal and vertical tab, newline, carriage return, and so on; covers more than [:blank:]
[:cntrl:] Non-printable control characters (backspace, delete, bell, ...)
[:digit:] Decimal digits
[:xdigit:] Hexadecimal digits
[:graph:] Printable non-blank characters
[:print:] Printable characters
[:punct:] Punctuation
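The bracket expressions and POSIX classes above can be tried directly with grep. The following is a minimal sketch; the sample input is made up for illustration, and LC_ALL=C is set only so that the class definitions do not depend on the local locale.

```shell
# Hypothetical sample input, one token per line.
sample='abc
ABC
123
a1+'

# Lines containing at least one decimal digit.
digits=$(printf '%s\n' "$sample" | LC_ALL=C grep '[[:digit:]]')

# Lines containing at least one character that is NOT a letter or digit.
nonalnum=$(printf '%s\n' "$sample" | LC_ALL=C grep '[^[:alnum:]]')

# Lines containing at least one uppercase letter.
upper=$(printf '%s\n' "$sample" | LC_ALL=C grep '[[:upper:]]')

printf 'digits:\n%s\nnon-alnum:\n%s\nupper:\n%s\n' "$digits" "$nonalnum" "$upper"
```

Note that a class name such as [:digit:] is only valid inside an outer bracket expression, which is why the patterns are written [[:digit:]] and [^[:alnum:]].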
\s # Matches any whitespace character, including space, tab, form feed, and so on; in GNU grep it is equivalent to [[:space:]]. Note that Unicode-aware regular expressions also match the full-width space character
\S # Matches any non-whitespace character; equivalent to [^[:space:]]
\w # Matches a letter, digit, or underscore (including Chinese characters and letters of other languages); equivalent to [_[:alnum:]]
\W # Matches any character that is not a letter, digit, or underscore; equivalent to [^_[:alnum:]]
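As a rough illustration of \w and \s, the sketch below splits a line into words and counts its whitespace characters. It assumes GNU grep, since \w, \s and the -o option are GNU extensions rather than POSIX features; the input string is invented for the example.

```shell
# -o prints every match on its own line instead of the whole matching line.
# '\w\w*' means "one word character followed by any number of word characters".
words=$(printf 'foo_bar 42\tbaz\n' | LC_ALL=C grep -o '\w\w*')

# Count the whitespace characters (one space and one tab in this input).
spaces=$(printf 'foo_bar 42\tbaz\n' | LC_ALL=C grep -o '\s' | wc -l | tr -d ' ')

printf 'words:\n%s\nwhitespace characters: %s\n' "$words" "$spaces"
```

The underscore counts as a word character, so foo_bar comes out as a single word.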
Number of matches
Placed after a character to specify how many times that preceding character may appear
* # Matches the preceding character any number of times, including 0; greedy mode: matches as much as possible
.* # Any characters, of any length
\? # Matches the preceding character 0 or 1 times, i.e. it is optional
\+ # Matches the preceding character at least 1 time, i.e. it must appear, >=1 times
\{n\} # Matches the preceding character exactly n times
\{m,n\} # Matches the preceding character at least m times and at most n times
\{,n\} # Matches the preceding character at most n times, <=n
\{n,\} # Matches the preceding character at least n times
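The quantifiers above can be compared side by side on a small word list. This is a sketch with made-up input; it assumes GNU grep, because \? and \+ are GNU extensions to BRE (the \{m,n\} form is standard).

```shell
# Hypothetical input: "g", a varying number of "o", then "gle".
input='ggle
gogle
google
gooogle
goooogle'

star=$(printf '%s\n' "$input" | grep 'go*gle')           # zero or more "o": all five lines
opt=$(printf '%s\n' "$input" | grep 'go\?gle')           # zero or one "o"
plus=$(printf '%s\n' "$input" | grep 'go\+gle')          # one or more "o"
range=$(printf '%s\n' "$input" | grep 'go\{2,3\}gle')    # two or three "o"

printf '*:\n%s\n\\?:\n%s\n\\+:\n%s\n\\{2,3\\}:\n%s\n' "$star" "$opt" "$plus" "$range"
```

Each quantifier applies only to the single character (or group) immediately before it, here the letter o.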
Position anchoring
Position anchors pin a match to a particular position in the line or word
^ # Line-start anchor; placed at the far left of the pattern
$ # Line-end anchor; placed at the far right of the pattern
^PATTERN$ # The pattern must match the entire line
^$ # Empty line
^[[:space:]]*$ # Blank line (empty or whitespace only)
\< or \b # Word-start anchor; placed on the left side of the word pattern # Prefer \<: in grep, \b matches at either edge of a word, so \b and \< behave differently
\> or \b # Word-end anchor; placed on the right side of the word pattern # Prefer \>: in grep, \b and \> behave differently
\<PATTERN\> # Matches a whole word
# Note: a word is made up of letters, digits, and underscores
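To see how the anchors change what is matched, the same literal text can be searched with different anchors around it. The following is a small sketch on invented input; \< and \> are GNU extensions, so GNU grep is assumed.

```shell
# Hypothetical input lines.
input='root
rooter
chroot
the root user'

whole_line=$(printf '%s\n' "$input" | grep '^root$')     # the line is exactly "root"
prefix=$(printf '%s\n' "$input" | grep '^root')          # the line starts with "root"
suffix=$(printf '%s\n' "$input" | grep 'root$')          # the line ends with "root"
word=$(printf '%s\n' "$input" | grep '\<root\>')         # "root" appears as a whole word

# ^$ matches only truly empty lines; ^[[:space:]]*$ also matches whitespace-only lines.
empty=$(printf 'a\n\n  \nb\n' | grep -c '^$')
blank=$(printf 'a\n\n  \nb\n' | grep -c '^[[:space:]]*$')

printf 'whole line: %s\nempty lines: %s, blank lines: %s\n' "$whole_line" "$empty" "$blank"
```

The whole-word search skips rooter and chroot because in both cases root is adjacent to another word character.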
Example:
[[email protected] ~]# grep '^[^#]' /etc/fstab
UUID=69617dca-2b4d-4664-ac18-051ffddf7f30 / xfs defaults 0 0
UUID=7c7f9ef6-8873-49ff-8abf-4dd4817ea481 /boot xfs defaults 0 0
UUID=9c5979a7-43fa-436c-8360-c2545dd81120 none swap defaults 0 0
[[email protected] ~]# grep '^$\|^#' /etc/fstab
#
# /etc/fstab
# Created by anaconda on Sat Jul 2 02:32:29 2022
#
# Accessible filesystems, by reference, are maintained under '/dev/disk/'.
# See man pages fstab(5), findfs(8), mount(8) and/or blkid(8) for more info.
#
# After editing this file, run 'systemctl daemon-reload' to update systemd
# units generated from this file.
#
[[email protected] ~]# grep -v '^$\|^#' /etc/fstab
UUID=69617dca-2b4d-4664-ac18-051ffddf7f30 / xfs defaults 0 0
UUID=7c7f9ef6-8873-49ff-8abf-4dd4817ea481 /boot xfs defaults 0 0
UUID=9c5979a7-43fa-436c-8360-c2545dd81120 none swap defaults 0 0
Grouping
Grouping: parentheses bind several characters together so that they are treated as a single unit, e.g. (root) or (root)+. In BRE the grouping parentheses are written \( and \).
Back-reference: the text matched by the pattern inside each pair of grouping parentheses is recorded by the regular expression engine in internal variables named \1, \2, \3, ...
\1 holds the text matched by the pattern between the first left parenthesis (counting from the left) and its matching right parenthesis.
Note: a back-reference refers to the text that the grouped pattern actually matched, not to the pattern itself.
Note: \0 represents all of the text matched by the whole regular expression.
Example:
user(root):(100)/bin/(nologin).*
\1: the content of the first parenthesis from the left: root
\2: the content of the second parenthesis from the left: 100
And so on
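The point that a back-reference repeats the matched text rather than the pattern can be checked with a short experiment. This is a sketch on made-up input, using BRE syntax with grep and sed.

```shell
# Hypothetical input: six lowercase letters per line.
input='abcabc
abcabd
xyzxyz'

# \([a-z]\{3\}\) captures three letters; \1 then requires the SAME three letters again.
# "abcabd" consists of two three-letter groups but is rejected, because "abd" is not "abc".
repeated=$(printf '%s\n' "$input" | grep '^\([a-z]\{3\}\)\1$')

# In a sed replacement, \1 and \2 insert the captured text, here swapping two fields.
swapped=$(printf 'root:100\n' | sed 's/\(.*\):\(.*\)/\2:\1/')

printf 'repeated:\n%s\nswapped: %s\n' "$repeated" "$swapped"
```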
Alternation
Or: \|
a\|b # a or b
C\|cat # C or cat
\(C\|c\)at # Cat or cat
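The effect of grouping on alternation shows up clearly when whole lines are matched. A minimal sketch follows, assuming GNU grep (\| is a GNU extension to BRE); the -x option requires the pattern to match the entire line, and the input is invented.

```shell
# Hypothetical input lines.
input='C
cat
Cat
at'

# Without grouping, the alternatives are the whole strings "C" and "cat".
plain=$(printf '%s\n' "$input" | grep -x 'C\|cat')

# With grouping, only the first letter alternates, followed by "at".
grouped=$(printf '%s\n' "$input" | grep -x '\(C\|c\)at')

printf 'C\\|cat:\n%s\n\\(C\\|c\\)at:\n%s\n' "$plain" "$grouped"
```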
Example: exclude empty lines and lines beginning with #
[[email protected] ~]# grep -v '^#' /etc/init.d/functions | grep -v '^$' # Note: ^# means a line starting with #, -v inverts the match, and ^$ means a line whose start is immediately followed by its end, i.e. an empty line
[[email protected] ~]# grep -v '^#\|^$' /etc/init.d/functions
[[email protected] ~]# egrep -v '^(#|$)' /etc/init.d/functions
[[email protected] ~]# grep '^[^#]' /etc/init.d/functions ## The first character must exist and must not be #, so empty lines and lines beginning with # are both excluded. grep prints whole lines by default; with the -o option it would print only the matched first character of each line
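One pitfall when filtering comments and empty lines is that a bare | is an ordinary character in BRE, so it only acts as "or" when written \| (or when grep -E is used). The sketch below counts the lines that survive each filter; the three-line config text is made up, and GNU grep is assumed.

```shell
# Hypothetical config text: one comment, one empty line, one setting.
conf='# comment

key=value'

# BRE with a bare "|": the pipe is literal, nothing matches, so -v keeps all 3 lines.
literal=$(printf '%s\n' "$conf" | grep -c -v '^#|^$')

# BRE with "\|": comment and empty line are both excluded, 1 line remains.
bre=$(printf '%s\n' "$conf" | grep -c -v '^#\|^$')

# ERE (grep -E): "|" is the alternation operator, same result as above.
ere=$(printf '%s\n' "$conf" | grep -E -c -v '^(#|$)')

# Positive form: require a first character that is not "#".
positive=$(printf '%s\n' "$conf" | grep -c '^[^#]')

printf 'literal |: %s  BRE \\|: %s  ERE |: %s  ^[^#]: %s\n' "$literal" "$bre" "$ere" "$positive"
```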
Extended regular expression metacharacters
Character matching
. Any single character
[wang] Any single character within the specified set
[^wang] Any single character outside the specified set
[:alnum:] Letters and digits
[:alpha:] Any letter, upper or lower case, i.e. A-Z, a-z
[:lower:] Lowercase letters, e.g. [[:lower:]] is equivalent to [a-z]
[:upper:] Uppercase letters
[:blank:] Blank characters (space and tab)
[:space:] Horizontal and vertical whitespace characters (covers more than [:blank:])
[:cntrl:] Non-printable control characters (backspace, delete, bell, ...)
[:digit:] Decimal digits
[:xdigit:] Hexadecimal digits
[:graph:] Printable non-blank characters
[:print:] Printable characters
[:punct:] Punctuation
Number of matches
* Matches the preceding character any number of times
? 0 or 1 times
+ 1 or more times
{n} Exactly n times
{m,n} At least m and at most n times
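In ERE the quantifiers are written without backslashes. The following sketch runs an interval in both syntaxes to show that they select the same lines; the input is invented, and grep -E is used to switch grep to ERE.

```shell
# Hypothetical input: "g", a varying number of "o", then "gle".
input='ggle
gogle
google
gooogle'

bre=$(printf '%s\n' "$input" | grep 'go\{1,2\}gle')        # BRE interval
ere=$(printf '%s\n' "$input" | grep -E 'go{1,2}gle')       # same interval in ERE
ere_plus=$(printf '%s\n' "$input" | grep -E 'go+gle')      # one or more "o"
ere_opt=$(printf '%s\n' "$input" | grep -E 'go?gle')       # zero or one "o"

printf 'BRE:\n%s\nERE:\n%s\n+:\n%s\n?:\n%s\n' "$bre" "$ere" "$ere_plus" "$ere_opt"
```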
Position anchoring
^ Start of line
$ End of line
\<, \b Start of word
\>, \b End of word
Grouping and others
() Grouping
Back-reference: \1, \2, ... Note: \0 represents all of the text matched by the whole regular expression
| Or
a|b a or b
C|cat C or cat
(C|c)at Cat or cat
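ERE grouping combines naturally with quantifiers and back-references. The sketch below assumes GNU grep, because back-references and \< \> inside an ERE are GNU extensions; the input lines are made up.

```shell
# A quantifier after a group repeats the whole group: one or more copies of "ab".
groups=$(printf 'ab\nabab\naba\n' | grep -E -x '(ab)+')

# ([a-z]+) captures a word; \1 requires the same word again after one space,
# which finds accidentally doubled words.
doubled=$(printf 'the the cat\nthe cat\n' | grep -E '\<([a-z]+) \1\>')

printf '(ab)+:\n%s\ndoubled word:\n%s\n' "$groups" "$doubled"
```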
Example:
[[email protected] ~]# ifconfig | egrep -o "(([1-9]?[0-9]|1[0-9]{2}|2[0-4][0-9]|25[0-5])\.){3}([1-9]?[0-9]|1[0-9]{2}|2[0-4][0-9]|25[0-5])" ## [1-9]?[0-9] matches a 1-digit or 2-digit number (0-99); | means or; 1[0-9]{2} matches 100-199; 2[0-4][0-9] matches 200-249; 25[0-5] matches 250-255
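The same octet pattern can also be used to validate whole strings rather than extract matches. This is a sketch only: the variable names octet and ipv4 and the candidate addresses are made up for illustration. It builds the pattern once and compares -x (the whole line must match) with -o (print whatever part matches).

```shell
# One octet, 0-255, using the alternatives from the example above.
octet='([1-9]?[0-9]|1[0-9]{2}|2[0-4][0-9]|25[0-5])'
# Three "octet." groups followed by a final octet.
ipv4="($octet\.){3}$octet"

# Hypothetical candidate strings, one per line.
candidates='192.168.1.10
255.255.255.255
256.1.1.1
1.2.3'

# -x accepts a line only if the pattern covers it completely.
valid=$(printf '%s\n' "$candidates" | grep -E -x "$ipv4")

# -o extracts any matching substring, even from an invalid address.
partial=$(printf '256.1.1.1\n' | grep -E -o "$ipv4")

printf 'valid:\n%s\nextracted from 256.1.1.1: %s\n' "$valid" "$partial"
```

With -o the invalid 256.1.1.1 still yields 56.1.1.1, so -o suits extraction from trusted output such as ifconfig, while -x (or ^ and $ anchors) is the safer choice for validation.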
Original site copyright notice
This article was created by [51CTO]; please include a link to the original when reposting. Thank you.
https://yzsam.com/2022/200/202207171113140867.html