7.16. Regex Quantifier Shorthand
7.16.1. SetUp
>>> import re
7.16.2. Greedy
*
- minimum 0 repetitions, no maximum, prefer longer (alias for {0,})
+
- minimum 1 repetition, no maximum, prefer longer (alias for {1,})
?
- minimum 0 repetitions, maximum 1 repetition, prefer longer (alias for {0,1})
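The bounds listed above can be checked with re.fullmatch, which succeeds only when the whole string fits the pattern. The following is a minimal sketch; the letter a, the three sample strings, and the helper name accepted are chosen for illustration and are not part of the original examples.

```python
import re

# Illustrative inputs: zero, one, and three repetitions of 'a'
SAMPLES = ['', 'a', 'aaa']


def accepted(pattern):
    """Return the samples that the pattern matches in full (hypothetical helper)."""
    return [s for s in SAMPLES if re.fullmatch(pattern, s)]


star = accepted(r'a*')      # 0 or more repetitions
plus = accepted(r'a+')      # 1 or more repetitions
optional = accepted(r'a?')  # 0 or 1 repetition

print(star)      # ['', 'a', 'aaa']
print(plus)      # ['a', 'aaa']
print(optional)  # ['', 'a']

# Each shorthand accepts the same strings as its explicit form
same_as_explicit = (
    star == accepted(r'a{0,}')
    and plus == accepted(r'a{1,}')
    and optional == accepted(r'a{0,1}')
)
print(same_as_explicit)  # True
```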
Plus:
>>> TEXT = 'On Sun, Jan 1st, 2000 at 12:00 AM Alice <alice@example.com> wrote'
>>> re.findall(r'\d{1,}', TEXT)
['1', '2000', '12', '00']
>>> TEXT = 'On Sun, Jan 1st, 2000 at 12:00 AM Alice <alice@example.com> wrote'
>>> re.findall(r'\d+', TEXT)
['1', '2000', '12', '00']
Star:
>>> TEXT = 'On Sun, Jan 1st, 2000 at 12:00 AM Alice <alice@example.com> wrote'
>>> re.findall(r'\d{0,}', TEXT)
['', '', '', '', '', '', '', '', '', '', '', '', '1', '', '', '',
'', '2000', '', '', '', '', '12', '', '00', '', '', '', '', '',
'', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '',
'', '', '', '', '', '', '', '', '', '', '', '', '', '', '']
>>> TEXT = 'On Sun, Jan 1st, 2000 at 12:00 AM Alice <alice@example.com> wrote'
>>> re.findall(r'\d*', TEXT)
['', '', '', '', '', '', '', '', '', '', '', '', '1', '', '', '',
'', '2000', '', '', '', '', '12', '', '00', '', '', '', '', '',
'', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '',
'', '', '', '', '', '', '', '', '', '', '', '', '', '', '']
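The empty strings come from positions where no digit is present: a pattern that allows zero repetitions still matches there, with a zero-length result. A short sketch of this, using a three-character string chosen for illustration; the spans shown assume Python 3.7 or later, where an empty match may directly follow a non-empty one.

```python
import re

SAMPLE = 'a1b'  # illustrative input, not from the examples above

# (start, end) of every match; start == end means a zero-length match
spans = [m.span() for m in re.finditer(r'\d*', SAMPLE)]
print(spans)  # [(0, 0), (1, 2), (2, 2), (3, 3)] on Python 3.7+

# Dropping the empty strings leaves the same result as the plus quantifier
TEXT = 'On Sun, Jan 1st, 2000 at 12:00 AM Alice <alice@example.com> wrote'
digits = [m for m in re.findall(r'\d*', TEXT) if m]
print(digits)  # ['1', '2000', '12', '00']
```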
Question mark:
>>> TEXT = 'On Sun, Jan 1st, 2000 at 12:00 AM Alice <alice@example.com> wrote'
>>> re.findall(r'\d{0,1}', TEXT)
['', '', '', '', '', '', '', '', '', '', '', '', '1', '', '', '',
'', '2', '0', '0', '0', '', '', '', '', '1', '2', '', '0', '0',
'', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '',
'', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '',
'', '', '', '']
>>> TEXT = 'On Sun, Jan 1st, 2000 at 12:00 AM Alice <alice@example.com> wrote'
>>> re.findall(r'\d?', TEXT)
['', '', '', '', '', '', '', '', '', '', '', '', '1', '', '', '',
'', '2', '0', '0', '0', '', '', '', '', '1', '2', '', '0', '0',
'', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '',
'', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '',
'', '', '', '']
Note
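Neither star nor question mark is useful on a bare \d, because both allow zero repetitions and therefore also match the empty string at positions without a digit. They work better as part of a longer pattern over text.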
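As a sketch of how these quantifiers read when attached to a longer pattern: the question mark can mark an optional suffix, and the star can span arbitrary content between two delimiters. Both patterns below are illustrative additions, not part of the original examples.

```python
import re

TEXT = 'On Sun, Jan 1st, 2000 at 12:00 AM Alice <alice@example.com> wrote'

# Digits with an optional ordinal suffix; (?:...) groups without capturing
numbers = re.findall(r'\d+(?:st|nd|rd|th)?', TEXT)
print(numbers)  # ['1st', '2000', '12', '00']

# Any characters, possibly none, between angle brackets
brackets = re.findall(r'<.*>', TEXT)
print(brackets)  # ['<alice@example.com>']
```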
7.16.3. Lazy
*?
- minimum 0 repetitions, no maximum, prefer shorter (alias for {0,}?)
+?
- minimum 1 repetition, no maximum, prefer shorter (alias for {1,}?)
??
- minimum 0 repetitions, maximum 1 repetition, prefer shorter (alias for {0,1}?)
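Greedy and lazy forms accept the same number of repetitions; they differ in how much they consume when there is a choice. A minimal sketch with re.match on an illustrative string of three letters (the string and the helper name consumed are assumptions made for this example):

```python
import re

SAMPLE = 'aaa'  # illustrative input


def consumed(pattern):
    """Return the text matched at the start of SAMPLE (hypothetical helper)."""
    return re.match(pattern, SAMPLE).group()


greedy = [consumed(p) for p in (r'a*', r'a+', r'a?')]
lazy = [consumed(p) for p in (r'a*?', r'a+?', r'a??')]

print(greedy)  # ['aaa', 'aaa', 'a']
print(lazy)    # ['', 'a', '']
```

Each lazy form stops at its minimum (0, 1, and 0 repetitions), while each greedy form runs to its maximum or to the end of the input.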
Plus:
>>> TEXT = 'On Sun, Jan 1st, 2000 at 12:00 AM Alice <alice@example.com> wrote'
>>> re.findall(r'\d{1,}?', TEXT)
['1', '2', '0', '0', '0', '1', '2', '0', '0']
>>> TEXT = 'On Sun, Jan 1st, 2000 at 12:00 AM Alice <alice@example.com> wrote'
>>> re.findall(r'\d+?', TEXT)
['1', '2', '0', '0', '0', '1', '2', '0', '0']
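Lazy means prefer shorter, not always one character: when the rest of the pattern cannot match yet, a lazy quantifier extends until it can. A sketch with a trailing colon added to the pattern (the colon is an illustrative addition, not from the original examples):

```python
import re

TEXT = 'On Sun, Jan 1st, 2000 at 12:00 AM Alice <alice@example.com> wrote'

# On its own, the lazy plus stops after a single digit
alone = re.findall(r'\d+?', TEXT)
print(alone)  # ['1', '2', '0', '0', '0', '1', '2', '0', '0']

# Followed by a colon, it has to grow until the colon can match
before_colon = re.findall(r'\d+?:', TEXT)
print(before_colon)  # ['12:']

# The greedy plus finds the same text here
greedy_before_colon = re.findall(r'\d+:', TEXT)
print(greedy_before_colon)  # ['12:']
```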
Star:
>>> TEXT = 'On Sun, Jan 1st, 2000 at 12:00 AM Alice <alice@example.com> wrote'
>>> re.findall(r'\d{0,}?', TEXT)
['', '', '', '', '', '', '', '', '', '', '', '', '', '1', '', '', '',
'', '', '2', '', '0', '', '0', '', '0', '', '', '', '', '', '1', '',
'2', '', '', '0', '', '0', '', '', '', '', '', '', '', '', '', '', '',
'', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '',
'', '', '', '', '', '', '', '']
>>> TEXT = 'On Sun, Jan 1st, 2000 at 12:00 AM Alice <alice@example.com> wrote'
>>> re.findall(r'\d*?', TEXT)
['', '', '', '', '', '', '', '', '', '', '', '', '', '1', '', '', '',
'', '', '2', '', '0', '', '0', '', '0', '', '', '', '', '', '1', '',
'2', '', '', '0', '', '0', '', '', '', '', '', '', '', '', '', '', '',
'', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '',
'', '', '', '', '', '', '']
Question mark:
>>> TEXT = 'On Sun, Jan 1st, 2000 at 12:00 AM Alice <alice@example.com> wrote'
>>> re.findall(r'\d{0,1}?', TEXT)
['', '', '', '', '', '', '', '', '', '', '', '', '', '1', '', '', '',
'', '', '2', '', '0', '', '0', '', '0', '', '', '', '', '', '1', '',
'2', '', '', '0', '', '0', '', '', '', '', '', '', '', '', '', '', '',
'', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '',
'', '', '', '', '', '', '']
>>> TEXT = 'On Sun, Jan 1st, 2000 at 12:00 AM Alice <alice@example.com> wrote'
>>> re.findall(r'\d??', TEXT)
['', '', '', '', '', '', '', '', '', '', '', '', '', '1', '', '', '',
'', '', '2', '', '0', '', '0', '', '0', '', '', '', '', '', '1', '',
'2', '', '', '0', '', '0', '', '', '', '', '', '', '', '', '', '',
'', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '',
'', '', '', '', '', '', '', '', '']
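The difference between greedy and lazy quantifiers is easier to see on text with repeated delimiters. A minimal sketch, assuming a hypothetical HTML-like string that is not part of the original examples:

```python
import re

# Hypothetical sample with two pairs of tags
HTML = '<b>bold</b> and <i>italic</i>'

# Greedy: runs from the first '<' to the last '>'
greedy = re.findall(r'<.+>', HTML)
print(greedy)  # ['<b>bold</b> and <i>italic</i>']

# Lazy: stops at the first '>' that completes a match
lazy = re.findall(r'<.+?>', HTML)
print(lazy)  # ['<b>', '</b>', '<i>', '</i>']
```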