Regex-Fu Special Characters
The period .
The dot matches any single character (except line breaks).
To match a literal period (.), such as the one at the end of a sentence, escape it with a backslash (\.) so that it is read as the character and not as the regex special character.
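As a minimal sketch in Python (the sample text and variable names are made up for illustration), the difference between the unescaped and the escaped dot could look like this:

```python
import re

text = "Cats nap. Dogs bark."

# An unescaped dot matches any single character except a line break.
any_char = re.findall(r"a.", text)   # ['at', 'ap', 'ar']

# An escaped dot matches only a literal period.
periods = re.findall(r"\.", text)    # ['.', '.']

print(any_char)
print(periods)
```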
\w and \W
\w matches any alphanumeric character (a letter, a digit, or the underscore).
In contrast, \W matches any character that is not alphanumeric.
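A short Python sketch of both classes, using a made-up sample string. Note that the underscore counts as a \w character:

```python
import re

text = "Hi, Bob_42!"

# \w keeps letters, digits and the underscore.
word_chars = re.findall(r"\w", text)       # ['H', 'i', 'B', 'o', 'b', '_', '4', '2']

# \W keeps everything else: punctuation and whitespace here.
non_word_chars = re.findall(r"\W", text)   # [',', ' ', '!']

print(word_chars)
print(non_word_chars)
```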
The curly brackets {}
Curly brackets specify how many times the preceding element is repeated.
For example, a regex such as \w{4} matches any set of 4 alphanumeric characters.
To match any set of 4 or more alphanumeric characters, use \w{4,}.
To match any set of 4 or 5 alphanumeric characters, use \w{4,5}.
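The three quantifier forms can be compared side by side. This is only a sketch: the patterns assume \w as the alphanumeric class, and the sample text is invented. Because no word boundary is used, a long word can yield a partial match:

```python
import re

text = "ab abcd abcdefg"

# Exactly 4 characters: "abcdefg" contributes only its first 4.
exactly_four = re.findall(r"\w{4}", text)      # ['abcd', 'abcd']

# 4 or more characters: the quantifier is greedy and takes the whole word.
four_or_more = re.findall(r"\w{4,}", text)     # ['abcd', 'abcdefg']

# 4 or 5 characters: at most 5 are taken from "abcdefg".
four_or_five = re.findall(r"\w{4,5}", text)    # ['abcd', 'abcde']

print(exactly_four, four_or_more, four_or_five)
```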
The square brackets []
Square brackets define a set of characters; the regex matches any one character from the set.
A regex such as [a-z]\. matches any letter from a to z followed by a period (.).
We can also include capital letters: [a-zA-Z]\. matches any lowercase or uppercase letter followed by a period.
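A small Python sketch of both character sets, with an invented sample string. The patterns [a-z]\. and [a-zA-Z]\. are assumed from the descriptions above:

```python
import re

text = "End. STOP. x."

# Lowercase letter followed by a literal period.
lower_only = re.findall(r"[a-z]\.", text)      # ['d.', 'x.']

# Lowercase or uppercase letter followed by a literal period.
any_case = re.findall(r"[a-zA-Z]\.", text)     # ['d.', 'P.', 'x.']

print(lower_only)
print(any_case)
```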
The parentheses ()
Parentheses are also used to group characters.
A regex such as (t|T) searches for any t or T character. The | acts like a logical OR.
A regex such as (t|e|r){2,3}\. matches any set of two or three t, e or r letters followed by a period. Here the {} applies to everything contained within the parentheses.
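Here is a hedged Python sketch of grouping with | and {}. The patterns are assumed from the descriptions above and the sample text is made up. re.finditer with group(0) is used because re.findall would return only the content of the capturing group instead of the whole match:

```python
import re

text = "Tom sat. Better."

# (t|T): either a lowercase or an uppercase t.
t_or_T = [m.group(0) for m in re.finditer(r"(t|T)", text)]
# ['T', 't', 't', 't']

# (t|e|r){2,3}\. : two or three letters among t, e, r, then a period.
# "sat." is skipped because only one letter of the group precedes the period.
group_repeat = [m.group(0) for m in re.finditer(r"(t|e|r){2,3}\.", text)]
# ['ter.']

print(t_or_T)
print(group_repeat)
```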
The ^
The ^ means "starts with".
A regex such as ^B matches any line that starts with the letter B.
The multiline option must be enabled to apply the regex to each line instead of to the file as a whole.
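In Python the multiline option is the re.MULTILINE flag. The sketch below (sample text invented, and .* added only to show the whole line) compares the result with and without it:

```python
import re

text = "Bob likes regex.\nAlice too.\nBen agrees."

# Without the flag, ^ only matches at the very start of the text.
without_flag = re.findall(r"^B.*", text)
# ['Bob likes regex.']

# With re.MULTILINE, ^ matches at the start of every line.
with_flag = re.findall(r"^B.*", text, flags=re.MULTILINE)
# ['Bob likes regex.', 'Ben agrees.']

print(without_flag)
print(with_flag)
```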
The dollar sign ($)
The dollar sign means "at the end of".
A regex such as \.$ matches every period (.) at the end of a line. Note that the multiline option is required here to apply the regex to each line instead of to the text as a whole.
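A minimal Python sketch of the end-of-line anchor, assuming the pattern \.$ and an invented three-line text. The start index of each match shows which periods were found:

```python
import re

text = "One. Two.\nThree.\nFour"

# Without the flag, $ only looks at the end of the whole text, which has no period.
without_flag = re.findall(r"\.$", text)   # []

# With re.MULTILINE, $ matches at the end of every line.
# The period after "One" sits mid-line and is not matched.
positions = [m.start() for m in re.finditer(r"\.$", text, flags=re.MULTILINE)]
# [8, 15]

print(without_flag)
print(positions)
```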
The plus sign (+)
The plus sign means "match 1 or more" of the preceding element.
A regex such as e+ matches the letter "e" or a run of several "e" characters in a row.
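As a quick Python sketch (the sample sentence is made up), e+ returns each run of e characters as a single match:

```python
import re

text = "We need to see the tree"

# One or more consecutive "e" characters form one match.
e_runs = re.findall(r"e+", text)   # ['e', 'ee', 'ee', 'e', 'ee']

print(e_runs)
```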
The question mark (?)
The question mark makes the preceding element optional.
A regex such as e+a? matches every set of one or more e characters, and also includes an a if one directly follows the e letters.
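A small Python sketch of the optional element, assuming the pattern e+a? and an invented sample text:

```python
import re

text = "tea bee seat"

# One or more "e", plus an "a" only when it directly follows.
matches = re.findall(r"e+a?", text)   # ['ea', 'ee', 'ea']

print(matches)
```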
The star (*)
The star means "match 0 or more" of the preceding element.
A regex such as re* matches the r character followed by zero or more e letters.
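The following Python sketch assumes the pattern re* and a made-up sample text. Note that an r with no e after it still matches:

```python
import re

text = "tree car reed"

# "r" followed by zero or more "e" characters.
matches = re.findall(r"re*", text)   # ['ree', 'r', 'ree']

print(matches)
```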
Look behind
Positive look behind
A positive look behind checks what comes before the match without capturing it. For example, a regex such as (?<=[Tt]he[ ])\w+ selects every set of alphanumeric characters that comes after the word The or the followed by a space, without capturing the [Tt]he[ ] pattern itself.
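A hedged Python sketch of a positive look behind. The exact pattern is an assumption built from the [Tt]he[ ] fragment above, and the sample sentence is invented. Python requires a look behind to have a fixed width, which is the case here (4 characters):

```python
import re

text = "The cat saw the dog"

# Words that come right after "The " or "the "; the prefix is not part of the match.
after_the = re.findall(r"(?<=[Tt]he[ ])\w+", text)   # ['cat', 'dog']

print(after_the)
```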
Negative look behind
A negative look behind matches what is NOT preceded by the expression specified between the parentheses.
For example, a regex such as (?<![Tt]he[ ])\w matches any alphanumeric character that is NOT preceded by the word The or the followed by a space.
Alphanumeric characters directly following The[ ] or the[ ] are not selected.
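The sketch below assumes the pattern (?<![Tt]he[ ])\w and uses a made-up sample text. Because \w matches a single character, only the character immediately after "the " is skipped; the rest of that word is still matched:

```python
import re

text = "the cat"

# Every alphanumeric character except the one right after "the " (the "c").
chars = re.findall(r"(?<![Tt]he[ ])\w", text)   # ['t', 'h', 'e', 'a', 't']

print(chars)
```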
Look ahead
Positive look ahead
A positive look ahead matches any set of characters that precedes a specific pattern, without capturing that pattern.
A regex such as \w+(?=at) matches any set of alphanumeric characters preceding the pattern at.
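A minimal Python sketch of a positive look ahead, assuming the pattern \w+(?=at) and an invented sample text. The at itself is checked but not included in the match:

```python
import re

text = "cat flat dog"

# Alphanumeric characters that are directly followed by "at".
before_at = re.findall(r"\w+(?=at)", text)   # ['c', 'fl']

print(before_at)
```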
Negative look ahead
A negative look ahead matches any set of characters that is not followed by a specific pattern.
A regex such as \b(?!c)\w+ matches every word except those starting with the letter c.
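One possible way to write this in Python is sketched below. The pattern \b(?!c)\w+ is an assumption (the original pattern is not shown here), and the sample sentence is made up. The \b anchors the check to the start of each word:

```python
import re

text = "cats chase dogs and birds"

# At each word start, refuse the match if the next character is "c".
not_c_words = re.findall(r"\b(?!c)\w+", text)   # ['dogs', 'and', 'birds']

print(not_c_words)
```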