6.18. Regex RE Compile

  • re.compile()

  • Used when a pattern is reused (especially inside a loop)

Prepare:

>>> x = re.compile(pattern)  

Usage:

>>> x.findall(string)  
>>> x.match(string)  
>>> x.search(string)  
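
The three calls above can be sketched on a single compiled pattern. This is a minimal illustration, not part of the examples that follow: the pattern and the sample strings are made up for the purpose.

```python
import re

# Hypothetical pattern and text, chosen only for illustration
pattern = re.compile(r'[a-z]+@nasa\.gov')
text = 'mwatney@nasa.gov, mlewis@nasa.gov'

# findall() returns every non-overlapping match as a list of strings
all_matches = pattern.findall(text)

# match() only tries to match at the beginning of the string
first = pattern.match(text)

# search() scans the string and returns the first match found anywhere
found = pattern.search('contact: mlewis@nasa.gov')

print(all_matches)
print(first.group())
print(found.group())
```

Note that `match()` and `search()` return `None` when nothing matches, so real code should check the result before calling `.group()`.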

6.18.1. SetUp

>>> import re
>>> DATA = [
...     'mwatney@nasa.gov',
...     'mlewis@nasa.gov',
...     'rmartinez@nasa.gov',
...     'avogel@esa.int',
...     'bjohanssen@nasa.gov',
...     'cbeck@nasa.gov',
... ]

6.18.2. Without Compilation

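  • Python has to obtain a compiled pattern on every loop iteration (the re module caches recently used patterns, so this is usually a cache lookup rather than a full recompilation)

  • Only after that does it perform the matching

The pattern is passed as a string, so each iteration pays the compile-or-lookup cost before matching: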

>>> valid = r'^[a-z]+@nasa\.gov$'
>>>
>>> for email in DATA:
...     if re.match(valid, email):
...         print(f'valid: {email}')
...     else:
...         print(f'error: {email}')
...
valid: mwatney@nasa.gov
valid: mlewis@nasa.gov
valid: rmartinez@nasa.gov
error: avogel@esa.int
valid: bjohanssen@nasa.gov
valid: cbeck@nasa.gov

6.18.3. With Compilation

  • Python compiles the pattern once, before the loop

  • Inside the loop, only matching is performed

The pattern is compiled before the loop, so the loop body only performs matching:

>>> valid = re.compile(r'^[a-z]+@nasa\.gov$')
>>>
>>> for email in DATA:
...     if valid.match(email):
...         print(f'valid: {email}')
...     else:
...         print(f'error: {email}')
...
valid: mwatney@nasa.gov
valid: mlewis@nasa.gov
valid: rmartinez@nasa.gov
error: avogel@esa.int
valid: bjohanssen@nasa.gov
valid: cbeck@nasa.gov
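
To see whether precompiling matters on a given machine, the two approaches can be timed with `timeit`. The sketch below is only a rough measurement under assumptions: the function names and the repetition count are hypothetical, the dot in the pattern is escaped so that it matches a literal `.`, and absolute numbers will vary between runs and Python versions. Because the re module caches recently compiled patterns, the gap may turn out to be small.

```python
import re
import timeit

DATA = [
    'mwatney@nasa.gov',
    'mlewis@nasa.gov',
    'rmartinez@nasa.gov',
    'avogel@esa.int',
    'bjohanssen@nasa.gov',
    'cbeck@nasa.gov',
]

PATTERN = r'^[a-z]+@nasa\.gov$'
compiled = re.compile(PATTERN)


def without_compile():
    # Module-level function: the pattern string is resolved on every call
    return [bool(re.match(PATTERN, email)) for email in DATA]


def with_compile():
    # Precompiled pattern object: only matching happens here
    return [bool(compiled.match(email)) for email in DATA]


# Repetition count is arbitrary; raise it for more stable numbers
t_module = timeit.timeit(without_compile, number=10_000)
t_compiled = timeit.timeit(with_compile, number=10_000)

print(f're.match:       {t_module:.4f} s')
print(f'compiled.match: {t_compiled:.4f} s')
```

Both functions return the same list of booleans; only the time spent obtaining the compiled pattern differs.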