7.22. Regex Backreference

  • \g<number> - backreference by position (used in a replacement string)

  • \g<name> - backreference by name (used in a replacement string)

  • (?P=name) - backreference by name (used inside a pattern)

  • Note that a backreference written with a backslash requires a raw string (r'...') or a doubled backslash
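The difference between a raw string and a doubled backslash can be sketched as follows. The variable names (raw, escaped, broken) are illustrative only. The last call shows what goes wrong when neither is used: Python turns '\1' into the control character chr(1) before the regex engine ever sees it.

```python
import re

html = '<p>Hello World</p>'
search = r'<p>(.+)</p>'

# Raw string: the backslash reaches the regex engine untouched
raw = re.sub(search, r'<strong>\g<1></strong>', html)

# Regular string: the backslash has to be doubled
escaped = re.sub(search, '<strong>\\g<1></strong>', html)

# Regular string with a single backslash: '\1' is an octal escape,
# so the replacement contains chr(1) instead of a backreference
broken = re.sub(search, '<strong>\1</strong>', html)

print(raw)             # <strong>Hello World</strong>
print(raw == escaped)  # True
print(repr(broken))    # '<strong>\x01</strong>'
```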

7.22.1. SetUp

import re

7.22.2. Backreference

html = '<p>Hello World</p>'
tag = re.findall(r'<(?P<tag>.+)>(?:.+)</(?P=tag)>', html)

tag
['p']
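A short sketch of what the backreference buys here: the closing tag has to repeat whatever the group captured, so a mismatched closing tag produces no match. The numbered form \1 is the positional equivalent of (?P=tag) inside a pattern. The variable names below are illustrative.

```python
import re

pattern_named = r'<(?P<tag>.+)>(?:.+)</(?P=tag)>'
pattern_numbered = r'<(.+)>(?:.+)</\1>'  # \1 recalls the first group

# Closing tag repeats the captured name
matching = re.findall(pattern_named, '<p>Hello World</p>')

# Closing tag differs from the captured name
mismatched = re.findall(pattern_named, '<p>Hello World</div>')

# Same check, written with a positional backreference
numbered = re.findall(pattern_numbered, '<p>Hello World</p>')

print(matching)    # ['p']
print(mismatched)  # []
print(numbered)    # ['p']
```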

7.22.3. Recall Group by Position

  • \g<number> - backreference by position, used in the replacement string

html = '<p>Hello World</p>'

search = r'<p>(.+)</p>'
replace = r'<strong>\g<1></strong>'

re.sub(search, replace, html)
'<strong>Hello World</strong>'
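One reason to prefer \g<number> over the shorter \number form in a replacement string is ambiguity when a digit follows the reference. A minimal sketch, using a made-up input string:

```python
import re

text = 'Chapter 7'
search = r'Chapter (\d)'

# \g<1>0 means: group 1, followed by a literal "0"
result = re.sub(search, r'Chapter \g<1>0', text)
print(result)  # Chapter 70

# \10 is read as a reference to group 10, which does not exist here
try:
    re.sub(search, r'Chapter \10', text)
    failed = False
except re.error:
    failed = True

print(failed)  # True
```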

7.22.4. Recall Group by Name

  • \g<name> - backreference by name, used in the replacement string

html = '<p>Hello World</p>'

search = r'<p>(?P<text>.+)</p>'
replace = r'<strong>\g<text></strong>'

re.sub(search, replace, html)
'<strong>Hello World</strong>'

7.22.5. Example

  • (?P<tag><.*?>).+(?P=tag) - matches text between two identical tags; the backreference must repeat the captured text exactly, so <br>...<br> matches, but <p>...</p> does not
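A quick check of this pattern, as a sketch with made-up inputs: it matches only when the captured tag text repeats verbatim, which is why an ordinary opening and closing pair such as <p> and </p> is not matched.

```python
import re

pattern = r'(?P<tag><.*?>).+(?P=tag)'

# The same tag text appears twice
same = re.search(pattern, '<br>Hello World<br>')

# Opening and closing tags differ by the slash
different = re.search(pattern, '<p>Hello World</p>')

print(same.group())       # <br>Hello World<br>
print(same.group('tag'))  # <br>
print(different)          # None
```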

7.22.6. Use Case - 1

import re


year = r'(?P<year>\d{4})'
month = r'(?P<month>[A-Z][a-z]{2})'
day = r'(?P<day>\d{1,2})'
date = f'{month} {day}(?:st|nd|rd|th), {year}'

TEXT = 'On Sun, Jan 1st, 2000 at 12:00 AM Alice <alice@example.com> wrote'

Recall group by position:

re.sub(date, r'\g<3> \g<1> \g<2>', TEXT)
'On Sun, 2000 Jan 1 at 12:00 AM Alice <alice@example.com> wrote'

Recall group by name:

re.sub(date, r'\g<year> \g<month> \g<day>', TEXT)
'On Sun, 2000 Jan 1 at 12:00 AM Alice <alice@example.com> wrote'

The (?P=name) syntax is valid only inside a pattern. In a replacement string it is inserted literally:

re.sub(date, '(?P=day) (?P=month) (?P=year)', TEXT)
'On Sun, (?P=day) (?P=month) (?P=year) at 12:00 AM Alice <alice@example.com> wrote'
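The two named forms can be combined in a single call: (?P=name) recalls the group inside the pattern, and \g<name> recalls it in the replacement. A minimal sketch that collapses accidentally doubled words, using a made-up sentence:

```python
import re

text = 'We choose to to go to the the Moon'

# (?P=word) in the pattern: the same word must appear twice in a row
search = r'\b(?P<word>\w+) (?P=word)\b'

# \g<word> in the replacement: keep a single copy of that word
replace = r'\g<word>'

result = re.sub(search, replace, text)
print(result)  # We choose to go to the Moon
```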

7.22.7. Use Case - 2

import re


HTML = '<p>Hello World</p>'

re.findall(r'<(?P<tag>.*?)>(.*?)</(?P=tag)>', HTML)
[('p', 'Hello World')]

7.22.8. Use Case - 3

import re


html = '<p>We choose to go to the <strong>Moon</strong></p>'


re.findall(r'<(?P<tagname>[a-z]+)>.*</(?P=tagname)>', html)
['p']

re.findall(r'<(?P<tagname>[a-z]+)>(.*)</(?P=tagname)>', html)
[('p', 'We choose to go to the <strong>Moon</strong>')]
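Because re.findall returns non-overlapping matches, the outer <p> element consumes the nested <strong> element. One simple way to reach the inner tag, shown here only as a sketch, is to apply the same pattern again to the captured content:

```python
import re

html = '<p>We choose to go to the <strong>Moon</strong></p>'
pattern = r'<(?P<tagname>[a-z]+)>(.*)</(?P=tagname)>'

# First pass: the outer element and its content
outer = re.findall(pattern, html)
print(outer)  # [('p', 'We choose to go to the <strong>Moon</strong>')]

# Second pass: run the pattern on the captured content
content = outer[0][1]
inner = re.findall(pattern, content)
print(inner)  # [('strong', 'Moon')]
```

Regular expressions are not a general HTML parser, so this works for small, well-formed snippets such as the one above.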

7.22.9. Assignments