
//////////////////////////////////////////////////////////////////////////
//// This content is shared by all Elastic Beats. Make sure you keep the
//// descriptions here generic enough to work for all Beats that include
//// this file. When using cross references, make sure that the cross
//// references resolve correctly for any files that include this one.
//// Use the appropriate variables defined in the index.asciidoc file to
//// resolve Beat names: beatname_uc and beatname_lc.
//// Use the following include to pull this content into a doc file:
//// include::../../libbeat/docs/regexp.asciidoc[]
//////////////////////////////////////////////////////////////////////////

[[regexp-support]]
== Regular expression support

{beatname_uc} regular expression support is based on https://godoc.org/regexp/syntax[RE2].
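
RE2 omits some constructs found in Perl-compatible engines, such as lookaheads and backreferences. The following Go sketch uses the standard `regexp` package directly, not any {beatname_uc} API, to show such patterns being rejected at compile time. The helper name `isSupported` is hypothetical and exists only for this illustration.

```go
package main

import (
	"fmt"
	"regexp"
)

// isSupported reports whether Go's RE2-based regexp package accepts the pattern.
// The function name is hypothetical and exists only for this illustration.
func isSupported(pattern string) bool {
	_, err := regexp.Compile(pattern)
	return err == nil
}

func main() {
	patterns := []string{
		`^\d{4}-\d{2}-\d{2}`, // plain RE2 syntax
		`foo(?=bar)`,         // lookahead: not part of RE2
		`(a)\1`,              // backreference: not part of RE2
	}
	for _, p := range patterns {
		fmt.Printf("%-22q supported=%v\n", p, isSupported(p))
	}
}
```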

ifeval::["{beatname_lc}"=="filebeat"]

{beatname_uc} has several configuration options that accept regular expressions.
For example, `multiline.pattern`, `include_lines`, `exclude_lines`, and
`exclude_files` all accept regular expressions. Some options, however, such as
the input `paths` option, accept only glob-based paths.
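
The difference matters because the same characters mean different things in each syntax: in a glob, `*` is a wildcard, while in a regular expression it is a repetition operator. The following Go sketch contrasts the two. It assumes that glob matching behaves like Go's `path.Match`, and the helper names and file names are hypothetical.

```go
package main

import (
	"fmt"
	"path"
	"regexp"
)

// globMatches reports whether name matches a shell-style glob pattern.
// A glob must match the whole name, and * does not cross a "/".
func globMatches(glob, name string) bool {
	ok, err := path.Match(glob, name)
	return err == nil && ok
}

// regexpMatches reports whether the regular expression matches anywhere in name.
func regexpMatches(pattern, name string) bool {
	return regexp.MustCompile(pattern).MatchString(name)
}

func main() {
	// Hypothetical file names used only for this example.
	names := []string{"/var/log/app.log", "/var/log/app.log.gz"}
	for _, name := range names {
		fmt.Println(name,
			"glob:", globMatches("/var/log/*.log", name),
			"regexp:", regexpMatches(`\.gz$`, name))
	}
}
```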

endif::[]

Before using a regular expression in the config file, refer to the documentation
to verify that the option you are setting accepts a regular expression.

NOTE: We recommend that you wrap regular expressions in single quotation marks to work around YAML's string escaping rules. For example, `'^\[?[0-9][0-9]:?[0-9][0-9]|^[[:graph:]]+'`.
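
To see how the pattern in the note behaves before you put it in a config file, you can try it with Go's `regexp` package, which implements the same syntax. This is a minimal sketch: the sample lines and the `startsEvent` helper are assumptions made for illustration and are not part of {beatname_uc}.

```go
package main

import (
	"fmt"
	"regexp"
)

// The pattern from the note above. A Go raw string literal, like a
// single-quoted YAML string, leaves backslashes untouched.
var pattern = regexp.MustCompile(`^\[?[0-9][0-9]:?[0-9][0-9]|^[[:graph:]]+`)

// startsEvent is a hypothetical helper that reports whether a line matches.
func startsEvent(line string) bool {
	return pattern.MatchString(line)
}

func main() {
	// Sample lines are invented for this example.
	lines := []string{
		"[12:30 service started",
		"1230 service started",
		"  indented continuation line",
	}
	for _, line := range lines {
		fmt.Printf("%q matches=%v\n", line, startsEvent(line))
	}
}
```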

For more examples of supported regexp patterns, see {filebeat}/multiline-examples.html[Managing Multiline Messages].
Although the examples pertain to Filebeat, the regexp patterns are applicable to other use cases.

The following patterns are supported:

* <<single-characters, Single Characters>>
* <<composites, Composites>>
* <<repetitions, Repetitions>>
* <<grouping, Grouping>>
* <<empty-strings, Empty Strings>>
* <<escape-sequences, Escape Sequences>>
* <<ascii-character-classes, ASCII Character Classes>>
* <<perl-character-classes, Perl Character Classes>>

[options="header"]
|=======================
|Pattern          |Description
|[[single-characters]]*Single Characters* 1+|
|`x`              |single character
|`.`              |any character
|`[xyz]`          |character class
|`[^xyz]`         |negated character class
|`[[:alpha:]]`    |ASCII character class
|`[[:^alpha:]]`   |negated ASCII character class
|`\d`             |Perl character class
|`\D`             |negated Perl character class
|`\pN`            |Unicode character class (one-letter name)
|`\p{Greek}`      |Unicode character class
|`\PN`            |negated Unicode character class (one-letter name)
|`\P{Greek}`      |negated Unicode character class
|[[composites]]*Composites* 1+|
|`xy`             |`x` followed by `y`
|`x\|y`           |`x` or `y` (prefer `x`)
|[[repetitions]]*Repetitions* 1+|
|`x*`             |zero or more `x`
|`x+`             |one or more `x`
|`x?`             |zero or one `x`
|`x{n,m}`         |`n` or `n+1` or ... or `m` `x`, prefer more
|`x{n,}`          |`n` or more `x`, prefer more
|`x{n}`           |exactly `n` `x`
|`x*?`            |zero or more `x`, prefer fewer
|`x+?`            |one or more `x`, prefer fewer
|`x??`            |zero or one `x`, prefer zero
|`x{n,m}?`        |`n` or `n+1` or ... or `m` `x`, prefer fewer
|`x{n,}?`         |`n` or more `x`, prefer fewer
|`x{n}?`          |exactly `n` `x`
|[[grouping]]*Grouping* 1+|
|`(re)`           |numbered capturing group (submatch)
|`(?P<name>re)`   |named & numbered capturing group (submatch)
|`(?:re)`         |non-capturing group
|`(?i)abc`        |set flags within current group, non-capturing
|`(?i:re)`        |set flags during re, non-capturing
|`(?i)PaTTeRN`    |case-insensitive (default false)
|`(?m)multiline`  |multi-line mode: `^` and `$` match begin/end line in addition to begin/end text (default false)
|`(?s)pattern.`   |let `.` match `\n` (default false)
|`(?U)x*abc`      |ungreedy: swap meaning of `x*` and `x*?`, `x+` and `x+?`, etc. (default false)
|[[empty-strings]]*Empty Strings* 1+|
|`^`              |at beginning of text or line (`m`=true)
|`$`              |at end of text (like `\z` not `\Z`) or line (`m`=true)
|`\A`             |at beginning of text
|`\b`             |at ASCII word boundary (`\w` on one side and `\W`, `\A`, or `\z` on the other)
|`\B`             |not at ASCII word boundary
|`\z`             |at end of text
|[[escape-sequences]]*Escape Sequences* 1+|
|`\a`             |bell (same as `\007`)
|`\f`             |form feed (same as `\014`)
|`\t`             |horizontal tab (same as `\011`)
|`\n`             |newline (same as `\012`)
|`\r`             |carriage return (same as `\015`)
|`\v`             |vertical tab character (same as `\013`)
|`\*`             |literal `*`, for any punctuation character `*`
|`\123`           |octal character code (up to three digits)
|`\x7F`           |two-digit hex character code
|`\x{10FFFF}`     |hex character code
|`\Q...\E`        |literal text `...` even if `...` has punctuation
|[[ascii-character-classes]]*ASCII Character Classes* 1+|
|`[[:alnum:]]`    |alphanumeric (same as `[0-9A-Za-z]`)
|`[[:alpha:]]`    |alphabetic (same as `[A-Za-z]`)
|`[[:ascii:]]`    |ASCII (same as `[\x00-\x7F]`)
|`[[:blank:]]`    |blank (same as `[\t ]`)
|`[[:cntrl:]]`    |control (same as `[\x00-\x1F\x7F]`)
|`[[:digit:]]`    |digits (same as `[0-9]`)
|`[[:graph:]]`    |graphical (same as `[!-~] == [A-Za-z0-9!"#$%&'()*+,\-./:;<=>?@[\\\\]^_`` `{\|}~]`)
|`[[:lower:]]`    |lower case (same as `[a-z]`)
|`[[:print:]]`    |printable (same as `[ -~] == [ [:graph:]]`)
|`[[:punct:]]`    |punctuation (same as ++[!-/:-@[-`{-~]++)
|`[[:space:]]`    |whitespace (same as `[\t\n\v\f\r ]`)
|`[[:upper:]]`    |upper case (same as `[A-Z]`)
|`[[:word:]]`     |word characters (same as `[0-9A-Za-z_]`)
|`[[:xdigit:]]`   |hex digit (same as `[0-9A-Fa-f]`)
|[[perl-character-classes]]*Perl Character Classes* 1+|
|`\d`             |digits (same as `[0-9]`)
|`\D`             |not digits (same as `[^0-9]`)
|`\s`             |whitespace (same as `[\t\n\f\r ]`)
|`\S`             |not whitespace (same as `[^\t\n\f\r ]`)
|`\w`             |word characters (same as `[0-9A-Za-z_]`)
|`\W`             |not word characters (same as `[^0-9A-Za-z_]`)
|=======================
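
A few rows of the table are easier to follow when you see them run. The following Go sketch uses the standard `regexp` package to exercise greedy and non-greedy repetition, alternation, flags, and named groups. The helper names `find` and `namedGroups` and the sample strings are assumptions made for this example.

```go
package main

import (
	"fmt"
	"regexp"
)

// find returns the leftmost match of pattern in text, or "" if there is none.
func find(pattern, text string) string {
	return regexp.MustCompile(pattern).FindString(text)
}

// namedGroups returns the named submatches of the first match as a map.
func namedGroups(pattern, text string) map[string]string {
	re := regexp.MustCompile(pattern)
	groups := map[string]string{}
	match := re.FindStringSubmatch(text)
	if match == nil {
		return groups
	}
	for i, name := range re.SubexpNames() {
		if name != "" {
			groups[name] = match[i]
		}
	}
	return groups
}

func main() {
	// Repetitions: greedy versus "prefer fewer".
	fmt.Println(find(`x+`, "xxx"), find(`x+?`, "xxx"))

	// Composites: alternation prefers the left branch.
	fmt.Println(find(`x|xy`, "xy"))

	// Flags: case-insensitive, multi-line, and dot-matches-newline.
	fmt.Println(find(`(?i)error`, "ERROR: disk full"))
	fmt.Printf("%q %q\n", find(`^b$`, "a\nb\nc"), find(`(?m)^b$`, "a\nb\nc"))
	fmt.Printf("%q %q\n", find(`a.b`, "a\nb"), find(`(?s)a.b`, "a\nb"))

	// Grouping: named capturing groups.
	fmt.Println(namedGroups(`(?P<level>[A-Z]+): (?P<msg>.*)`, "WARN: disk full"))
}
```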