Commit f536098

bunch of readbility fixes, no semantic changes
2 parents f72cc7a + 44e1469 commit f536098

1 file changed

+66
-69
lines changed

cheatsheet.rst

@@ -1,77 +1,74 @@
 Python 2.7 Regular Expressions
 ==============================

-Special characters::
-
-\ escapes special characters.
-. matches any character
-^ matches start of the string (or line if MULTILINE)
-$ matches end of the string (or line if MULTILINE)
-[5b-d] matches any chars '5', 'b', 'c' or 'd'
-[^a-c6] matches any char except 'a', 'b', 'c' or '6'
-R|S matches either regex R or regex S.
-() Creates a capture group, and indicates precedence.
-
-Within ``[]``, no special chars do anything special, hence they don't need
-escaping, except for ``']'`` and ``'-'``, which only need escaping if they are
-not the 1st char. e.g. ``'[]]'`` matches ``']'``. ``'^'`` also has special
-meaning, it negates the group if it's the first character in the ``[]``, and
-needs to be escaped to match it literally.
-
-Quantifiers::
-
-* 0 or more (append ? for non-greedy)
-+ 1 or more "
-? 0 or 1 "
-{m} exactly 'm'
-{m,n} from m to n. 'm' defaults to 0, 'n' to infinity
-{m,n}? from m to n, as few as possible
+Non-special chars match themselves. Exceptions are special characters::
+
+\ Escape special char
+. Match any char except newline, see re.DOTALL
+^ Match start of the string, see re.MULTILINE
+$ Match end of the string, see re.MULTILINE
+[] Enclose a set of matchable chars
+R|S Match either regex R or regex S.
+() Create capture group, and indicate precedence
+
+After '``[``', enclose a set, the only special chars are::
+
+] End the set, if not the 1st char
+- A range, eg. a-c matches a, b or c
+^ Negate the set only if it is the 1st char
+
+Quantifiers (append '``?``' for non-greedy)::
+
+* 0 or more
++ 1 or more
+? 0 or 1
+{m} Exactly 'm'
+{m,n} From m (default 0) to n (default infinity)

 Special sequences::

 \A Start of string
-\b Matches empty string at word boundary (between \w and \W)
-\B Matches empty string not at word boundary
+\b Match empty string at word (\w+) boundary
+\B Match empty string not at word boundary
 \d Digit
 \D Non-digit
-\s Whitespace: [ \t\n\r\f\v], more if LOCALE or UNICODE
+\s Whitespace [ \t\n\r\f\v], see LOCALE,UNICODE
 \S Non-whitespace
-\w Alphanumeric: [0-9a-zA-Z_], or is LOCALE dependant
+\w Alphanumeric: [0-9a-zA-Z_], see LOCALE
 \W Non-alphanumeric
 \Z End of string
-
-\g<id> Match previous group, '<' & '>' are literal
-e.g. \g<0> or \g<name> (not \g0 or \gname)
+\g<id> Match prev named or numbered group,
+'<' & '>' are literal, e.g. \g<0>
+or \g<name> (not \g0 or \gname)

 Special character escapes are much like those already escaped in Python string
 literals. Hence regex '``\n``' is same as regex '``\\n``'::

 \a ASCII Bell (BEL)
 \f ASCII Formfeed
 \n ASCII Linefeed
-\r ASCII Carraige return
+\r ASCII Carriage return
 \t ASCII Tab
 \v ASCII Vertical tab
 \\ A single backslash
-
-\xHH Two digit hex character
-\OOO Three digit octal char
-(or use a preceding zero, e.g. \0, \09)
-\DD Decimal number 1 to 99, matches previous
-numbered group
-
-Extensions. These do not cause grouping, except for ``(?P<name>...)``::
-
-(?iLmsux) Matches empty string, sets re.X flags
-(?:...) Non-capturing version of regular parentheses
-(?P<name>...) Creates a named capturing group.
-(?P=name) Matches whatever matched previously named group
-(?#...) A comment; ignored.
-(?=...) Lookahead assertion: Matches without consuming
-(?!...) Negative lookahead assertion
-(?<=...) Lookbehind assertion: Matches if preceded
-(?<!...) Negative lookbehind assertion
-(?(id)yes|no) Match 'yes' if group 'id' matched, else 'no'
+\xHH Two digit hexadecimal character goes here
+\OOO Three digit octal char (or just use an
+initial zero, e.g. \0, \09)
+\DD Decimal number 1 to 99, match
+previous numbered group
+
+Extensions. Do not cause grouping, except '``P<name>``'::
+
+(?iLmsux) Match empty string, sets re.X flags
+(?:...) Non-capturing version of regular parens
+(?P<name>...) Create a named capturing group.
+(?P=name) Match whatever matched prev named group
+(?#...) A comment; ignored.
+(?=...) Lookahead assertion, match without consuming
+(?!...) Negative lookahead assertion
+(?<=...) Lookbehind assertion, match if preceded
+(?<!...) Negative lookbehind assertion
+(?(id)y|n) Match 'y' if group 'id' matched, else 'n'

 Flags for re.compile(), etc. Combine with ``'|'``::

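A few of the entries touched by the hunk above are easier to follow when exercised. The following is a minimal sketch, not part of the commit: the sample strings and variable names are made up for illustration, and the code is written to run unchanged on Python 2.7 and Python 3. Note that ``\g<name>`` is used here in a replacement template passed to ``re.sub()``, while ``(?P=name)`` is the form used inside a pattern.

```python
import re

text = "<b>bold</b> and <i>italic</i>"

# Greedy '*' takes as much as possible; appending '?' makes it non-greedy.
greedy = re.search(r"<.*>", text).group(0)   # "<b>bold</b> and <i>italic</i>"
lazy = re.search(r"<.*?>", text).group(0)    # "<b>"

# (?P<name>...) creates a named group; \g<name> refers to it in a
# replacement template.
swapped = re.sub(r"(?P<first>\w+) (?P<second>\w+)",
                 r"\g<second> \g<first>",
                 "hello world")              # "world hello"

# (?=...) is a lookahead: it must match, but consumes nothing.
amounts = re.findall(r"\d+(?= USD)", "10 USD, 20 EUR, 30 USD")  # ['10', '30']

# (?P=name) matches whatever the named group matched earlier in the pattern.
doubled = re.search(r"(?P<word>\w+) (?P=word)", "it is is fine").group("word")  # "is"
```

Raw strings (``r"..."``) keep Python's own string escaping out of the way, so the backslashes reach the regex engine as written.
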
@@ -105,30 +102,30 @@ RegexObjects (returned from ``compile()``)::
 .split(string[, maxsplit]) -> list of strings
 .sub(repl, string[, count]) -> string
 .subn(repl, string[, count]) -> (string, int)
-.flags # int passed to compile()
-.groups # int number of capturing groups
-.groupindex # {} maps group names to ints
-.pattern # string passed to compile()
+.flags # int, Passed to compile()
+.groups # int, Number of capturing groups
+.groupindex # {}, Maps group names to ints
+.pattern # string, Passed to compile()

 MatchObjects (returned from ``match()`` and ``search()``)::

-.expand(template) -> string, backslash and group expansion
+.expand(template) -> string, Backslash & group expansion
 .group([group1...]) -> string or tuple of strings, 1 per arg
-.groups([default]) -> (,) of all groups, non-matching=default
-.groupdict([default]) -> {} of named groups, non-matching=default
-.start([group]) -> int, start/end of substring matched by group
-.end([group]) (group defaults to 0, the whole match)
+.groups([default]) -> tuple of all groups, non-matching=default
+.groupdict([default]) -> {}, Named groups, non-matching=default
+.start([group]) -> int, Start/end of substring match by group
+.end([group]) -> int, Group defaults to 0, the whole match
 .span([group]) -> tuple (match.start(group), match.end(group))
-.pos # value passed to search() or match()
-.endpos # "
-.lastindex # int index of last matched capturing group
-.lastgroup # string name of last matched capturing group
-.re # regex passed to search() or match()
-.string # string passed to search() or match()
+.pos int, Passed to search() or match()
+.endpos int, "
+.lastindex int, Index of last matched capturing group
+.lastgroup string, Name of last matched capturing group
+.re regex, As passed to search() or match()
+.string string, "


 Gleaned from the python 2.7 're' docs. http://docs.python.org/library/re.html

-:Version: v0.3.1
-:Contact: tartley@tartley.com
+https://github.com/tartley/python-regex-cheatsheet
+Version: v0.3.3

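The RegexObject and MatchObject entries reworded in the hunk above can be tried out with a short sketch. This is an illustration only, not part of the commit: the date-like pattern, the input string, and the variable names are assumptions chosen so that one group is optional and fails to match, which shows what the ``default`` argument does.

```python
import re

# Hypothetical pattern: year and month are required, the day part is optional.
pattern = re.compile(r"(?P<year>\d{4})-(?P<month>\d{2})(-(?P<day>\d{2}))?")
m = pattern.search("released 2011-07 as planned")

whole = m.group(0)                # "2011-07", group 0 is the whole match
pair = m.group("year", "month")   # ('2011', '07'), one item per argument
all_groups = m.groups("n/a")      # ('2011', '07', 'n/a', 'n/a')
named = m.groupdict("n/a")        # {'year': '2011', 'month': '07', 'day': 'n/a'}
span = m.span()                   # (9, 16), same as (m.start(), m.end())

# Attributes of the compiled pattern and of the match.
group_count = pattern.groups            # 4 capturing groups
name_to_index = dict(pattern.groupindex)  # {'year': 1, 'month': 2, 'day': 4}
last = (m.lastindex, m.lastgroup)       # (2, 'month')

# expand() applies a replacement-style template to this match.
reordered = m.expand(r"\g<month>/\g<year>")  # "07/2011"
```

Groups that did not take part in the match come back as ``None`` unless a ``default`` is passed to ``groups()`` or ``groupdict()``.
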