Metacharacters

Published Jun 10, 2022. Updated Sep 5, 2023.

In regular expressions (regex), metacharacters are characters with a special meaning: rather than matching themselves, they control how ordinary characters and sub-patterns are matched.

Common Metacharacters

Metacharacter  Description                                                       Example
.              Matches any single character except a newline.                    r'.' matches “a”, “7”, and “@”.
[]             Matches one character from the bracketed set; [^...] negates it.  r'gr[ae]y' matches “gray” and “grey.”
^              Matches at the beginning of a string.                             r'^C' matches the “C” at the start of “Codecademy.”
$              Matches at the end of a string.                                   r'y$' matches the “y” at the end of “Codecademy.”
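
The behavior of these four metacharacters can be checked with a minimal sketch like the one below. It assumes Python's built-in re module, and the sample strings are illustrative only:

```python
import re

# "." matches any single character except a newline,
# so "c\nt" is not picked up.
dot_matches = re.findall(r"c.t", "cat cot c\nt")

# "[]" matches exactly one character from the listed set.
class_matches = re.findall(r"gr[ae]y", "gray grey groy")

# "^" anchors the match to the start of the string.
starts_with_c = re.search(r"^C", "Codecademy") is not None

# "$" anchors the match to the end of the string.
ends_with_y = re.search(r"y$", "Codecademy") is not None

print(dot_matches)                 # ['cat', 'cot']
print(class_matches)               # ['gray', 'grey']
print(starts_with_c, ends_with_y)  # True True
```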

Quantifiers

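Quantifiers specify how many times the preceding character or group may repeat. The table below lists them, together with the alternation operator |, which chooses between patterns rather than counting repetitions: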

Metacharacter  Description                                                Example
?              Matches zero or one of the preceding character.            r'neighbou?r' matches “neighbor” and “neighbour.”
*              Matches zero or more of the preceding character.           r're*d' matches “rd”, “red”, and “reed.”
+              Matches one or more of the preceding character.            r'tw+o' matches “two” but not “to.”
|              Alternation: matches the pattern before or after the |.    r'true|false' matches “true” or “false.”
{x}            Matches exactly x repetitions of the preceding character.  r're{2}d' matches “reed” (2 “e”s) but not “red” (only 1 “e”).
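
As a rough illustration of the table above, the sketch below tests each pattern against whole strings with re.fullmatch(). The helper matches_fully() is a hypothetical name introduced here for convenience and is not part of the re module:

```python
import re

def matches_fully(pattern, text):
    """Return True when the whole of `text` matches `pattern` (hypothetical helper)."""
    return re.fullmatch(pattern, text) is not None

# "?" allows zero or one "u".
print(matches_fully(r"neighbou?r", "neighbor"))   # True
print(matches_fully(r"neighbou?r", "neighbour"))  # True

# "*" allows zero or more "e"s, so even "rd" matches.
print(matches_fully(r"re*d", "rd"))               # True

# "+" requires at least one "w".
print(matches_fully(r"tw+o", "to"))               # False

# "{2}" requires exactly two "e"s.
print(matches_fully(r"re{2}d", "reed"))           # True
print(matches_fully(r"re{2}d", "red"))            # False

# "|" accepts either alternative.
print(matches_fully(r"true|false", "false"))      # True
```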

Capture Groups

Capture groups bundle part of a pattern so that it can be alternated, quantified, or extracted as a unit. A group is written with parentheses (...), as shown in the example below:

import re

# Strings to test against the pattern
string_one = 'red'
string_two = 'rad'
string_three = 'rid'

# The group (e|a) accepts either "e" or "a" between "r" and "d"
capture_group = r'r(e|a)d'

print(re.match(capture_group, string_one))
print(re.match(capture_group, string_two))
print(re.match(capture_group, string_three))

The capture group uses the | alternation operator to match two of the three strings, since “red” and “rad” have an “e” or an “a” between the “r” and the “d”. In Python 3.7 and later, the following output is printed:

<re.Match object; span=(0, 3), match='red'>
<re.Match object; span=(0, 3), match='rad'>
None
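
Beyond checking whether a match exists, the text captured by a group can be read back from the match object. The following is a minimal sketch using the same r(e|a)d pattern; the (ha)+ pattern and its test string are illustrative additions:

```python
import re

match = re.match(r"r(e|a)d", "rad")
if match:
    print(match.group(0))  # the whole match: rad
    print(match.group(1))  # the text captured by the first group: a

# A quantifier placed after a group applies to the whole group.
laugh = re.fullmatch(r"(ha)+", "hahaha")
print(laugh is not None)   # True
```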

Special Sequences

The backslash \ metacharacter has two primary uses:

  • It escapes a metacharacter so that it is matched as a literal character (e.g. \^ or \$).
  • It introduces a special sequence that matches a character class or a position, such as \d for digits.
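
Both uses of the backslash can be seen in a short sketch like the following. The receipt string is made up for illustration, and re.escape() is shown as one way to escape metacharacters automatically:

```python
import re

receipt = "Total: $5.00 (was $7.50)"

# Unescaped, "$" is an end-of-string anchor, so no digit can follow it.
unescaped = re.findall(r"$\d", receipt)

# Escaped, "\$" and "\." match a literal dollar sign and period,
# while "\d" is a special sequence that matches a digit.
escaped = re.findall(r"\$\d\.\d\d", receipt)

print(unescaped)          # []
print(escaped)            # ['$5.00', '$7.50']

# re.escape() adds the backslashes needed to match a string literally.
print(re.escape("5.00"))  # 5\.00
```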

The table below describes the special sequences formed with the \ backslash:

Metacharacter  Description                                                        Example
\A             Matches only at the beginning of a string.                         r'\AC' matches “Codecademy” but not “codecademy.”
\b             Matches the boundary at the beginning or end of a word.            r'\bCode\b' matches “Code Ninja” but not “CodeNinja.”
\B             Matches a position that is not a word boundary.                    r'Code\B' matches “CodeNinja” but not “Code Ninja.”
\d             Matches any digit character (0-9).                                 r'\d square' matches “4 square” but not “four square.”
\D             Matches any non-digit character.                                   r'\D square' matches “four square” but not “4 square.”
\s             Matches any whitespace character, including tabs and line breaks.  r'Code\sNinja' matches “Code Ninja” but not “CodeNinja.”
\S             Matches any non-whitespace character.                              r'Code Ninja\S' matches “Code Ninjas” but not “Code Ninja.”
\w             Matches any word character: letters, digits, and the underscore.   r'\w' matches every character in “code_ninja.txt” except the period.
\W             Matches any non-word character.                                    r'\W' matches only the period in “code_ninja.txt.”
\Z             Matches only at the end of a string.                               r'emy\Z' matches “Codecademy” but not “CODECADEMY.”
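
Several rows of the table above can be verified with a sketch along these lines. It reuses the table's own sample strings where possible; the "Code\tNinja\nDojo" string is an illustrative addition:

```python
import re

# \b requires a word boundary on each side of "Code".
whole_word = bool(re.search(r"\bCode\b", "Code Ninja"))
joined_word = bool(re.search(r"\bCode\b", "CodeNinja"))

# \B requires that "Code" is followed by another word character.
inside_word = bool(re.search(r"Code\B", "CodeNinja"))

# \d, \w, and \W select digits, word characters, and everything else.
digits = re.findall(r"\d", "4 square")
words = re.findall(r"\w+", "code_ninja.txt")
non_words = re.findall(r"\W", "code_ninja.txt")

# \s matches spaces, tabs, and line breaks alike.
pieces = re.split(r"\s+", "Code\tNinja\nDojo")

# \A and \Z anchor to the very start and end of the string.
anchored = bool(re.search(r"\AC", "Codecademy")) and bool(re.search(r"emy\Z", "Codecademy"))

print(whole_word, joined_word, inside_word)  # True False True
print(digits, words, non_words)              # ['4'] ['code_ninja', 'txt'] ['.']
print(pieces)                                # ['Code', 'Ninja', 'Dojo']
print(anchored)                              # True
```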

Codebyte Example

The following snippet can be used for practicing with regex metacharacters:
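
A minimal practice sketch, assuming Python's built-in re module. The helper find_matches() and the sample list are hypothetical and exist only for experimenting; swap in other patterns from the tables above to see how the results change:

```python
import re

def find_matches(pattern, samples):
    """Return the samples in which `pattern` is found anywhere (hypothetical helper)."""
    return [s for s in samples if re.search(pattern, s)]

samples = [
    "Codecademy",
    "codecademy",
    "Code Ninja",
    "CodeNinja",
    "4 square",
    "four square",
]

# Print which sample strings each pattern finds a match in.
for pattern in (r"^C", r"y$", r"\bCode\b", r"\d square", r"\D square"):
    print(pattern, "->", find_matches(pattern, samples))
```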
