Perl regular expressions

Appendix A - Perl regular expressions

The following table lists and describes some examples of Perl regular expressions.

Expression	Matches
abc	“abc” (the exact character sequence but anywhere in the string).
^abc	“abc” at the beginning of the string.
abc$	“abc” at the end of the string.
a|b	Either “a” or “b”.
^abc|abc$	The string “abc” at the beginning or at the end of the string.
ab{2,4}c	“a” followed by two, three, or four “b”s followed by a “c”.
ab{2,}c	“a” followed by at least two “b”s followed by a “c”.
ab*c	“a” followed by any number (zero or more) of “b”s followed by a “c”.
ab+c	“a” followed by one or more “b”s followed by a “c”.
ab?c	“a” followed by an optional “b” followed by a “c”; that is, either “abc” or “ac”.
a.c	“a” followed by any single character (not newline) followed by a “c”.
a\.c	“a.c” exactly.
[abc]	Any one of “a”, “b”, or “c”.
[Aa]bc	Either “Abc” or “abc”.
[abc]+	Any (nonempty) string of “a”s, “b”s and “c”s (such as “a”, “abba”, “acbabcacaa”).
[^abc]+	Any (nonempty) string that does not contain any of “a”, “b”, and “c” (such as “defg”).
\d\d	Any two decimal digits, such as 42; same as \d{2}.
/i	Makes the pattern case insensitive. For example, /bad language/i blocks any instance of “bad language” regardless of case.
\w+	A “word”: A nonempty sequence of alphanumeric characters and low lines (underscores), such as “foo”, “12bar8”, and “foo_1”.
100\s*mk	The strings “100” and “mk” optionally separated by any amount of white space (spaces, tabs, and newlines).
abc\b	“abc” when followed by a word boundary (for example, in “abc!” but not in “abcd”).
perl\B	“perl” when not followed by a word boundary (for example, in “perlert” but not in “perl stuff”).
/x	Tells the regular expression parser to ignore white space that is neither preceded by a backslash character nor within a character class. Use this option to break up a regular expression into more readable parts.
/pattern/options	Delimited form, used to add regular expressions within other text. If the first character in a pattern is a forward slash (“/”), the “/” is treated as the delimiter, and the pattern must contain a second “/”. The text between the two “/” characters is taken as the regular expression, and anything after the second “/” is parsed as a list of regular expression options (“i”, “x”, and so on). An error occurs if the second “/” is missing. Leading and trailing spaces are treated as part of the regular expression.
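
The delimiter rules in the last rows of the table can be sketched in a few lines. This is only an illustration, assuming Python's re module as a stand-in for the Perl engine (the two agree on the constructs used in this table); the helper name compile_delimited and the OPTION_FLAGS mapping are hypothetical and not part of the product.

```python
import re

# Assumed mapping from the option letters in the table to Python flags.
OPTION_FLAGS = {"i": re.IGNORECASE, "x": re.VERBOSE}


def compile_delimited(text):
    """Compile a pattern that may be written as /pattern/options."""
    if not text.startswith("/"):
        # No leading slash: the whole text is the regular expression.
        return re.compile(text)
    end = text.find("/", 1)  # position of the second "/"
    if end == -1:
        raise ValueError("missing second '/' delimiter")
    flags = 0
    for letter in text[end + 1:]:
        if letter not in OPTION_FLAGS:
            raise ValueError(f"unsupported option: {letter!r}")
        flags |= OPTION_FLAGS[letter]
    # Spaces between the delimiters are kept as part of the pattern.
    return re.compile(text[1:end], flags)


if __name__ == "__main__":
    for pattern, subject in [
        ("/bad language/i", "No BAD Language here"),
        ("/a b c/x", "xabcx"),
        ("/ abc /", "abc"),
        (r"perl\B", "perl stuff"),
    ]:
        found = compile_delimited(pattern).search(subject) is not None
        print(f"{pattern!r} on {subject!r}: {found}")
```

With this sketch, “/ abc /” does not match the subject “abc”, because the spaces inside the delimiters are part of the expression, while “/a b c/x” does match it, because the x option discards them.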

Block common spam phrases

Block common phrases found in spam messages with the following expressions:

/try it for free/i

/student loans/i

/you’re already approved/i

/special[\+\-\*=<>\.\,;!\?%&~#§@\^°\$£\{\}()\[\]\|\\_1]offer/i
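
To see what these expressions do and do not catch, the sketch below runs them against a few sample subjects. It assumes Python's re module in place of the Perl engine, with re.IGNORECASE standing in for the i option; SPAM_PATTERNS and is_spam_phrase are hypothetical names chosen for the example. The third pattern is kept exactly as printed, with a typographic apostrophe.

```python
import re

# The four expressions above, without the /.../i wrapper.
SPAM_PATTERNS = [
    r"try it for free",
    r"student loans",
    r"you’re already approved",
    r"special[\+\-\*=<>\.\,;!\?%&~#§@\^°\$£\{\}()\[\]\|\\_1]offer",
]

# re.IGNORECASE plays the role of the "i" option.
COMPILED = [re.compile(p, re.IGNORECASE) for p in SPAM_PATTERNS]


def is_spam_phrase(message):
    """Return True if any of the expressions matches anywhere in message."""
    return any(rx.search(message) for rx in COMPILED)


if __name__ == "__main__":
    for sample in [
        "TRY IT FOR FREE today",
        "A special-offer just for you",
        "A special offer just for you",
        "Meeting notes attached",
    ]:
        print(f"{sample!r}: {is_spam_phrase(sample)}")
```

Two details are worth noticing. The character class in the last expression requires exactly one separator character from its list, so “special-offer” matches but “special offer” (with a plain space) does not. Likewise, the third expression as printed matches only the typographic apostrophe, not a straight one.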

Block purposely misspelled words

Random characters are often inserted between the letters of a word to bypass spam-blocking software. The following expressions can help to block those messages:

/^.*v.*i.*a.*g.*r.*o.*$/i

/cr[eéèêë][\+\-\*=<>\.\,;!\?%&§@\^°\$£\{\}()\[\]\|\\_01]dit/i
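
The following sketch exercises both expressions, again assuming Python's re module as a stand-in for the Perl engine; the names SPACED_OUT, ACCENTED, and looks_obfuscated are hypothetical.

```python
import re

# First expression: the letters in order, with anything in between.
SPACED_OUT = re.compile(r"^.*v.*i.*a.*g.*r.*o.*$", re.IGNORECASE)

# Second expression: "cr", an accented or plain "e", one filler character, "dit".
ACCENTED = re.compile(
    r"cr[eéèêë][\+\-\*=<>\.\,;!\?%&§@\^°\$£\{\}()\[\]\|\\_01]dit",
    re.IGNORECASE,
)


def looks_obfuscated(message):
    """Return True if either expression matches the message."""
    return bool(SPACED_OUT.search(message) or ACCENTED.search(message))


if __name__ == "__main__":
    for sample in [
        "v1i2a3g4r5o",
        "low rate cré.dit",
        "credit",
        "Having breakfast at grandma's tomorrow",
    ]:
        print(f"{sample!r}: {looks_obfuscated(sample)}")
```

Because “.*” accepts any run of characters, the first expression matches every line in which the six letters appear in order, however far apart. The last sample above matches for that reason, so an expression of this kind is best tested against legitimate mail before it is used for blocking.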

Block any word in a phrase

Use the following expression to block a message that contains any one of the listed words:

/block|any|word/
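
A short sketch shows how the alternation behaves, assuming Python's re module in place of the Perl engine; ANY_WORD and matched_words are hypothetical names.

```python
import re

# The expression above, without the delimiters. No "i" option is given,
# so matching is case sensitive.
ANY_WORD = re.compile(r"block|any|word")


def matched_words(message):
    """Return every alternative found in message, in order of appearance."""
    return ANY_WORD.findall(message)


if __name__ == "__main__":
    for sample in ["any word will do", "swordfish", "BLOCK"]:
        print(f"{sample!r}: {matched_words(sample)}")
```

The alternatives match as substrings, so “swordfish” is caught by “word”, and without the i option “BLOCK” is not caught at all. Adding \b around each alternative or appending the i option changes this behavior.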

