Metacharacters
Anonymous contributor
Published Jun 10, 2022
Updated Sep 5, 2023
In regex, certain metacharacters are used to match and qualify regular character patterns or other expressions.
Common Metacharacters
Metacharacter | Description | Example |
---|---|---|
. | Matches any single character except a newline. | r'.' matches each character in “[email protected]”. |
[] | Matches any one character from the set inside the brackets; a leading ^ excludes the listed characters instead. Inside brackets, a pipe is a literal character, not alternation. | r'Char[mi]' matches the start of “Charmander”, “Charmeleon”, and “Charizard.” |
^ | Matches at the beginning of a string. | r'^C' matches “Codecademy.” |
$ | Matches at the end of a string. | r'y$' matches “Codecademy.” |
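As a minimal sketch of the table above, the snippet below tries each metacharacter with Python's `re.search`. The sample strings (`'cut'`, `'grey'`) and variable names are illustrative assumptions, not part of the original entry.

```python
import re

# re.search scans the whole string and returns a match object or None.
dot = re.search(r'c.t', 'cut')               # '.' stands in for the 'u'
char_class = re.search(r'gr[ae]y', 'grey')   # exactly one character: 'a' or 'e'
negated = re.search(r'gr[^ae]y', 'grey')     # '^' inside [] excludes 'a' and 'e'
starts = re.search(r'^C', 'Codecademy')      # 'C' at the start of the string
ends = re.search(r'y$', 'Codecademy')        # 'y' at the end of the string

print(dot.group())         # cut
print(char_class.group())  # grey
print(negated)             # None
print(starts.group())      # C
print(ends.group())        # y
```

Note that a character class always consumes a single character, however many characters are listed inside the brackets.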
Quantifiers
Some metacharacters are concerned with the frequency of certain character patterns as shown in the table below:
Metacharacter | Description | Example |
---|---|---|
? | Matches zero or one of the preceding character. | r'neighbou?r' matches “neighbor” and “neighbour.” |
* | Matches zero or more of the preceding character. | r're*d' matches “rd”, “red”, and “reed.” |
+ | Matches one or more of the preceding character. | r'tw+o' matches “two” but not “to.” |
{x} | Matches exactly x consecutive occurrences of the preceding character. | r're{2}d' matches “reed” (2 “e”s) but not “red” (only 1 “e”). |

The pipe | is not a quantifier but the alternation operator: it matches either the pattern before or the pattern after it. For example, r'true|false' matches “true” or “false.”
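The quantifiers can be checked with a short sketch like the one below. The helper `fits` is a hypothetical name introduced here for illustration; it wraps `re.fullmatch`, which succeeds only when the entire string fits the pattern.

```python
import re

def fits(pattern, text):
    """Hypothetical helper: True if the whole of text matches pattern."""
    return re.fullmatch(pattern, text) is not None

print(fits(r'neighbou?r', 'neighbor'))   # True  (zero "u")
print(fits(r'neighbou?r', 'neighbour'))  # True  (one "u")
print(fits(r're*d', 'rd'))               # True  (zero "e")
print(fits(r're*d', 'reed'))             # True  (two "e"s)
print(fits(r'tw+o', 'to'))               # False (needs at least one "w")
print(fits(r'tw+o', 'twwo'))             # True
print(fits(r're{2}d', 'red'))            # False (only one "e")
print(fits(r're{2}d', 'reed'))           # True
print(fits(r'true|false', 'false'))      # True  (second alternative)
```

Using `re.fullmatch` rather than `re.search` keeps the results strict: a longer string that merely contains a match would not count.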
Capture Groups
Capture groups treat part of a pattern as a single unit, so that it can be matched, quantified, and retrieved separately. They are written with parentheses (...), as shown in the example below:
    import re

    # Strings to test against the pattern
    pattern_one = 'red'
    pattern_two = 'rad'
    pattern_three = 'rid'

    # The group (e|a) accepts either "e" or "a" between "r" and "d"
    capture_group = r'r(e|a)d'

    print(re.match(capture_group, pattern_one))
    print(re.match(capture_group, pattern_two))
    print(re.match(capture_group, pattern_three))
The capture group uses the pipe | alternation operator to match two of the three strings: “red” and “rad” have an “e” or an “a” between the “r” and the “d”, while “rid” does not, so its result is None. The following output will be printed:
    <re.Match object; span=(0, 3), match='red'>
    <re.Match object; span=(0, 3), match='rad'>
    None
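A match object also exposes what each group captured. The sketch below, which assumes the same `r(e|a)d` pattern plus an illustrative `(ha)+` pattern that is not in the original entry, reads the captured text with `group()` and shows a quantifier applied to a whole group.

```python
import re

match = re.match(r'r(e|a)d', 'rad')
print(match.group(0))  # rad  (the whole match)
print(match.group(1))  # a    (text captured by the first group)

# A quantifier placed after a group applies to the whole group.
repeated = re.match(r'(ha)+', 'hahaha!')
print(repeated.group(0))  # hahaha
print(repeated.group(1))  # ha   (the last repetition captured)
```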
Special Sequences
The backslash \ metacharacter has two primary uses:
- It matches a character literal that would otherwise have meaning (e.g. ^ or $).
- It finds a match within particular character classes or sequences, such as digits.
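Both uses can be seen in a small sketch. The sample string `price` is an assumption made up for illustration.

```python
import re

price = 'Total: $5^2'

# Use 1: escape a metacharacter so it is matched literally.
print(re.search(r'\$5', price).group())  # $5
print(re.search(r'\^2', price).group())  # ^2

# Unescaped, $ is an end-of-string anchor, so nothing can follow it.
print(re.search(r'$5', price))           # None

# Use 2: a special sequence such as \d matches a class of characters.
print(re.search(r'\d', price).group())   # 5
```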
The table below describes the special sequences formed with the \ backslash:
Metacharacter | Description | Example |
---|---|---|
\A | Only matches at the beginning of a string. | r'\AC' matches “Codecademy” but not “codecademy.” |
\b | Matches a word boundary: the position at the beginning or end of a word. | r'\bCode\b' matches “Code Ninja” but not “CodeNinja.” |
\B | Matches a position that is not a word boundary, such as inside a word. | r'Code\B' matches “CodeNinja” but not “Code Ninja.” |
\d | Matches any digit character (0-9) in a string. | r'\d square' matches “4 square” but not “four square.” |
\D | Matches any non-digit character in a string. | r'\D square' matches “four square” but not “4 square.” |
\s | Matches any whitespace character, including tabs and line breaks. | r'Code\sNinja' matches “Code Ninja” but not “CodeNinja.” |
\S | Matches any non-whitespace character. | r'Code Ninja\S' matches “Code Ninjas” but not “Code Ninja.” |
\w | Matches any word character: letters, digits, and the underscore. | r'\w' matches everything in “code_ninja.txt” except for the period. |
\W | Matches any non-word character. | r'\W' only matches the period in “code_ninja.txt.” |
\Z | Only matches at the end of a string. | r'emy\Z' matches “Codecademy” but not “CODECADEMY.” |
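The sequences in the table can be tried out with a sketch such as the following. The sample string `text` is an assumption chosen to reuse the table's “code_ninja.txt” example.

```python
import re

text = 'code_ninja.txt has 42 lines'

print(re.findall(r'\d+', text))  # ['42']
print(re.findall(r'\w+', text))  # ['code_ninja', 'txt', 'has', '42', 'lines']
print(re.findall(r'\W', text))   # ['.', ' ', ' ', ' ']
print(re.split(r'\s+', text))    # ['code_ninja.txt', 'has', '42', 'lines']

# Anchors and boundaries match positions rather than characters.
print(bool(re.search(r'\bCode\b', 'Code Ninja')))  # True
print(bool(re.search(r'\bCode\b', 'CodeNinja')))   # False
print(bool(re.search(r'Code\B', 'CodeNinja')))     # True
print(bool(re.search(r'\AC', 'Codecademy')))       # True
print(bool(re.search(r'emy\Z', 'CODECADEMY')))     # False
```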
Codebyte Example
The following snippet can be used for practicing with regex metacharacters:
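A minimal practice sketch, assuming a made-up list of sample strings and patterns (none of them prescribed by the original entry), might look like this. Editing `samples` or `patterns` and re-running is an easy way to experiment.

```python
import re

# Hypothetical practice data: change these freely.
samples = ['Codecademy', 'code_ninja.txt', '4 square', 'neighbour']
patterns = [r'^C', r'\d', r'\W', r'ou?r$']

# For each pattern, collect the samples in which it finds a match.
results = {}
for pattern in patterns:
    results[pattern] = [s for s in samples if re.search(pattern, s)]

for pattern, hits in results.items():
    print(pattern, '->', hits)
```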
All contributors
- Anonymous contributor
- BrandonDusch