7.36. Regex Cheatsheet
Also known as: "Regular Expressions", "regexp", "regex" or "re"
7.36.1. Syntax
a - exact match of a literal character
a|b - alternative
[abc] - enumerated character class
[a-z] - range character class
. - any character except a newline (changes meaning with re.DOTALL)
^ - start of string (with re.MULTILINE also the start of each line)
$ - end of string, or just before a trailing newline (with re.MULTILINE also the end of each line)
\A - start of string (doesn't change meaning with re.MULTILINE)
\Z - end of string (doesn't change meaning with re.MULTILINE)
[^abc] - negated character class (any character except those listed)
\d - digit (alias to [0-9] with re.ASCII; otherwise any Unicode decimal digit)
\D - anything but a digit (alias to [^0-9] with re.ASCII)
\s - whitespace (space, tab, newline, non-breaking space)
\S - anything but whitespace
\b - word boundary
\B - anything but a word boundary
\w - any Unicode word character: letters (lower or upper, also with diacritics, e.g. ąćęłńóśżź...), digits and the underscore
\W - anything but a word character (e.g. whitespace, dots, commas, dashes)
{n} - exactly n repetitions
{,n} - maximum n repetitions, greedy (prefer longest)
{n,} - minimum n repetitions, greedy (prefer longest)
{n,m} - minimum n repetitions, maximum m repetitions, greedy (prefer longest)
* - minimum 0 repetitions, no maximum, greedy (prefer longest), alias to {0,}
+ - minimum 1 repetition, no maximum, greedy (prefer longest), alias to {1,}
? - minimum 0 repetitions, maximum 1 repetition, greedy (prefer longest), alias to {0,1}
{,n}? - maximum n repetitions, lazy (prefer shortest)
{n,}? - minimum n repetitions, lazy (prefer shortest)
{n,m}? - minimum n repetitions, maximum m repetitions, lazy (prefer shortest)
*? - minimum 0 repetitions, no maximum, lazy (prefer shortest), alias to {0,}?
+? - minimum 1 repetition, no maximum, lazy (prefer shortest), alias to {1,}?
?? - minimum 0 repetitions, maximum 1 repetition, lazy (prefer shortest), alias to {0,1}?
() - matches whatever regular expression is inside the parentheses, and indicates the start and end of a group
(...) - unnamed group (positional)
(?P<mygroup>...) - named group mygroup
(?:...) - non-capturing group
(?#...) - comment
(?P=name) - backreference by group name (inside the pattern)
\g<number> - backreference by group number (inside the replacement string of re.sub)
\g<name> - backreference by group name (inside the replacement string of re.sub)
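A few of the constructs above are easier to tell apart when run side by side. The following is a minimal sketch, assuming Python 3 and its standard re module; the sample strings and variable names are illustrative only.

```python
import re

text = '<b>bold</b> and <i>italic</i>'

# Greedy: .* takes as much as it can, so one match spans
# from the first '<' to the last '>'.
greedy = re.findall(r'<.*>', text)

# Lazy: .*? takes as little as it can, so every tag is matched separately.
lazy = re.findall(r'<.*?>', text)

# Named groups: read the captured parts by name.
date = re.search(r'(?P<year>\d{4})-(?P<month>\d{2})', 'Launch date: 1969-07')
year = date.group('year')
month = date.group('month')

# (?P=name) refers back to a named group inside the pattern itself;
# here it finds a word that is immediately repeated.
repeated = re.search(r'\b(?P<word>\w+) (?P=word)\b', 'this is is a test')
repeated_word = repeated.group('word')

# \g<name> refers back to a named group inside the replacement string.
swapped = re.sub(
    r'(?P<first>\w+) (?P<last>\w+)',
    r'\g<last> \g<first>',
    'Mark Watney',
)

print(greedy)         # ['<b>bold</b> and <i>italic</i>']
print(lazy)           # ['<b>', '</b>', '<i>', '</i>']
print(year, month)    # 1969 07
print(repeated_word)  # is
print(swapped)        # Watney Mark
```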
7.36.2. Python
re.findall(pattern, string) - all occurrences of a pattern, results as a list[str] (a list of tuples when the pattern contains more than one group)
re.finditer(pattern, string) - all occurrences of a pattern, results as an Iterator[re.Match]
re.search(pattern, string) - checks if the pattern occurs anywhere in the string (stops after the first match), results as re.Match | None
re.match(pattern, string) - checks if the string matches the pattern starting from its beginning (validation, e.g. email, SSN, tax ID, phone; end the pattern with $ or use re.fullmatch to require that the whole string matches), results as re.Match | None
re.split(pattern, string) - splits the string by a pattern, results as a list[str]
re.sub(pattern, replace, string) - replaces occurrences of a pattern in the string with another string, results as a str
re.compile(pattern) - prepares a pattern for repeated use, for example in a loop, results as a re.Pattern
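The functions listed above can be compared on small inputs. This is a rough sketch, assuming Python 3; the deliberately simplified EMAIL pattern and the sample data are made up for illustration and are not suitable for real email validation.

```python
import re

TEXT = 'Contact alice@example.com, bob@example.org'
EMAIL = r'\w+@\w+\.\w+'  # simplified on purpose, illustration only

# findall: every occurrence as a list of strings
emails = re.findall(EMAIL, TEXT)

# finditer: every occurrence as re.Match objects (here only start positions are kept)
starts = [m.start() for m in re.finditer(EMAIL, TEXT)]

# search: first occurrence anywhere in the string, or None
first = re.search(EMAIL, TEXT).group()

# match anchors only at the beginning, so trailing text is accepted;
# fullmatch requires the whole string to match.
prefix_ok = re.match(r'\d{3}-\d{3}', '123-456-789') is not None
whole_ok = re.fullmatch(r'\d{3}-\d{3}', '123-456-789') is not None

# split: cut the string on a comma or semicolon followed by optional whitespace
parts = re.split(r'[,;]\s*', 'a, b;c')

# sub: replace every digit with '#'
masked = re.sub(r'\d', '#', 'PIN 1234')

# compile: build the pattern once, then reuse its methods in a loop
email_re = re.compile(EMAIL)
found = [bool(email_re.search(line)) for line in ['alice@example.com', 'no email here']]

print(emails)               # ['alice@example.com', 'bob@example.org']
print(first)                # alice@example.com
print(prefix_ok, whole_ok)  # True False
print(parts)                # ['a', 'b', 'c']
print(masked)               # PIN ####
print(found)                # [True, False]
```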
7.36.3. Flags
re.ASCII - perform ASCII-only matching instead of full Unicode matching
re.IGNORECASE - case-insensitive search
re.MULTILINE - ^ and $ also match at the start and end of each line, not only at the start and end of the whole string
re.DOTALL - dot (.) also matches newline characters
re.UNICODE - turns on Unicode character support for \w (already the default for str patterns in Python 3)
re.VERBOSE - ignores whitespace in the pattern (except when escaped, inside a character class, or written as \s) and allows comments starting with #
re.DEBUG - display debugging information during pattern compilation
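The effect of each flag is easiest to see by running the same pattern with and without it. Below is a minimal sketch, assuming Python 3; the sample text and variable names are illustrative.

```python
import re

TEXT = 'first line\nSecond line\nthird line'

# MULTILINE: ^ matches at the start of every line, not only of the string
default_starts = re.findall(r'^\w+', TEXT)
line_starts = re.findall(r'^\w+', TEXT, flags=re.MULTILINE)

# IGNORECASE: 'second' also matches 'Second'
case_sensitive = re.findall(r'second', TEXT)
case_insensitive = re.findall(r'second', TEXT, flags=re.IGNORECASE)

# DOTALL: the dot can cross newline characters
without_dotall = re.search(r'first.*third', TEXT)
with_dotall = re.search(r'first.*third', TEXT, flags=re.DOTALL)

# ASCII: \w stops matching letters with diacritics
unicode_words = re.findall(r'\w+', 'zażółć')
ascii_words = re.findall(r'\w+', 'zażółć', flags=re.ASCII)

# VERBOSE: whitespace in the pattern is ignored and # starts a comment
date_re = re.compile(r'''
    (?P<year>\d{4})    # four-digit year
    -                  # separator
    (?P<month>\d{2})   # two-digit month
''', flags=re.VERBOSE)
month = date_re.search('1969-07').group('month')

# Flags are combined with the | operator
combined = re.findall(r'^second', TEXT, flags=re.IGNORECASE | re.MULTILINE)

print(default_starts)   # ['first']
print(line_starts)      # ['first', 'Second', 'third']
print(case_insensitive) # ['Second']
print(unicode_words)    # ['zażółć']
print(ascii_words)      # ['za']
print(month)            # 07
print(combined)         # ['Second']
```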