9.7. Iterator Filter
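filter(function, iterable)
Select the elements of an iterable for which the function returns a truthy value
Lazily evaluated iterator (not a generator)
required function - Callable taking one argument (or None to keep truthy elements)
required iterable - Exactly one sequence or iterator object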
>>> def even(x):
...     return x % 2 == 0
>>>
>>> result = (x for x in range(0, 5) if even(x))
>>> result = filter(even, range(0, 5))
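The two assignments above build equivalent lazy objects, and neither shows any values until it is consumed. A minimal sketch that materializes both with `list()` to compare them (the names `from_genexpr`, `from_filter`, `genexpr_values` and `filter_values` are illustrative and not part of the original example):

```python
def even(x):
    return x % 2 == 0

# Both objects are lazy: nothing is computed until they are consumed
from_genexpr = (x for x in range(0, 5) if even(x))
from_filter = filter(even, range(0, 5))

# list() consumes each iterator and collects the selected elements
genexpr_values = list(from_genexpr)
filter_values = list(from_filter)

print(genexpr_values)  # [0, 2, 4]
print(filter_values)   # [0, 2, 4]
```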
9.7.1. Not-a-Generator
>>> from inspect import isgeneratorfunction, isgenerator
>>>
>>>
>>> def even(x):
...     return x % 2 == 0
>>>
>>>
>>> isgeneratorfunction(filter)
False
>>>
>>> result = filter(even, [1, 2, 3])
>>> isgenerator(result)
False
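Both checks return False because `filter` is a built-in class, and calling it creates a `filter` object rather than running a generator function. That object is still an iterator, which can be confirmed with `collections.abc.Iterator`. A short sketch of this check:

```python
from collections.abc import Iterator
from inspect import isclass

def even(x):
    return x % 2 == 0

result = filter(even, [1, 2, 3])

print(isclass(filter))               # True - filter is a class
print(type(result) is filter)        # True - calling it creates an instance
print(isinstance(result, Iterator))  # True - the instance is an iterator
```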
9.7.2. Problem
Plain code:
>>> def even(x):
...     return x % 2 == 0
>>>
>>>
>>> DATA = [1, 2, 3, 4, 5, 6]
>>> result = []
>>>
>>> for x in DATA:
...     if even(x):
...         result.append(x)
>>>
>>> print(result)
[2, 4, 6]
Comprehension:
>>> def even(x):
...     return x % 2 == 0
>>>
>>>
>>> DATA = [1, 2, 3, 4, 5, 6]
>>> result = [x for x in DATA if even(x)]
>>>
>>> print(result)
[2, 4, 6]
9.7.3. Solution
>>> def even(x):
...     return x % 2 == 0
>>>
>>>
>>> DATA = [1, 2, 3, 4, 5, 6]
>>> result = filter(even, DATA)
>>>
>>> list(result)
[2, 4, 6]
9.7.4. Lazy Evaluation
>>> def even(x):
...     return x % 2 == 0
>>>
>>>
>>> DATA = [1, 2, 3, 4, 5, 6]
>>> result = filter(even, DATA)
>>>
>>> next(result)
2
>>> next(result)
4
>>> next(result)
6
>>> next(result)
Traceback (most recent call last):
StopIteration
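Lazy evaluation means the predicate runs only when a value is requested, and the filter object can be consumed just once. A sketch that records predicate calls to make this visible (the `calls` list and the other helper names are added for illustration only):

```python
calls = []

def even(x):
    calls.append(x)  # record every element the predicate is asked about
    return x % 2 == 0

DATA = [1, 2, 3, 4, 5, 6]
result = filter(even, DATA)

calls_before = list(calls)       # [] - creating the filter tests nothing
first = next(result)             # 2 - tests 1 (rejected), then 2 (accepted)
calls_after_first = list(calls)  # [1, 2]

rest = list(result)              # [4, 6] - consumes the remaining elements
again = list(result)             # [] - the iterator is now exhausted
```

To iterate over the selected elements more than once, store them in a list first or create a new `filter` object.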
9.7.5. Performance
>>> def even(x):
...     return x % 2 == 0
>>>
>>>
>>> data = [1, 2, 3, 4, 5, 6]
>>>
... %%timeit -r 1000 -n 1000
... result = [x for x in data if even(x)]
1.11 µs ± 139 ns per loop (mean ± std. dev. of 1000 runs, 1,000 loops each)
>>>
... %%timeit -r 1000 -n 1000
... result = list(filter(even, data))
921 ns ± 112 ns per loop (mean ± std. dev. of 1000 runs, 1,000 loops each)
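The `%%timeit` magic is available only in IPython or Jupyter. Outside of them, a comparable measurement can be sketched with the standard `timeit` module. The repeat and loop counts below are arbitrary assumptions, and timings differ between machines and Python versions, so no output is shown:

```python
from timeit import repeat

def even(x):
    return x % 2 == 0

data = [1, 2, 3, 4, 5, 6]

def with_comprehension():
    return [x for x in data if even(x)]

def with_filter():
    return list(filter(even, data))

# 5 runs of 10_000 loops each; report the best run as nanoseconds per loop
loops = 10_000
for func in (with_comprehension, with_filter):
    best = min(repeat(func, repeat=5, number=loops))
    print(f'{func.__name__}: {best / loops * 1e9:.0f} ns per loop')
```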
9.7.6. Use Case - 1
>>> users = [
...     {'age': 41, 'username': 'mwatney'},
...     {'age': 40, 'username': 'mlewis'},
...     {'age': 39, 'username': 'rmartinez'},
...     {'age': 40, 'username': 'avogel'},
...     {'age': 29, 'username': 'bjohanssen'},
...     {'age': 36, 'username': 'cbeck'},
... ]
>>> def above40(user):
...     return user['age'] >= 40
>>>
>>> def under40(user):
...     return user['age'] < 40
>>> result = filter(above40, users)
>>> list(result)
[{'age': 41, 'username': 'mwatney'},
 {'age': 40, 'username': 'mlewis'},
 {'age': 40, 'username': 'avogel'}]
>>> result = filter(under40, users)
>>> list(result)
[{'age': 39, 'username': 'rmartinez'},
 {'age': 29, 'username': 'bjohanssen'},
 {'age': 36, 'username': 'cbeck'}]
9.7.7. Use Case - 2
>>> users = [
...     {'is_admin': False, 'name': 'Mark Watney'},
...     {'is_admin': True, 'name': 'Melissa Lewis'},
...     {'is_admin': False, 'name': 'Rick Martinez'},
...     {'is_admin': False, 'name': 'Alex Vogel'},
...     {'is_admin': True, 'name': 'Beth Johanssen'},
...     {'is_admin': False, 'name': 'Chris Beck'},
... ]
>>>
>>>
>>> def admin(user):
...     return user['is_admin'] is True
>>>
>>>
>>> result = filter(admin, users)
>>> list(result)
[{'is_admin': True, 'name': 'Melissa Lewis'},
 {'is_admin': True, 'name': 'Beth Johanssen'}]
9.7.8. Use Case - 3
>>> users = [
...     'mwatney',
...     'mlewis',
...     'rmartinez',
...     'avogel',
...     'bjohanssen',
...     'cbeck',
... ]
>>>
>>> admins = [
...     'mlewis',
...     'bjohanssen',
... ]
>>>
>>>
>>> def is_admin(user):
...     return user in admins
>>>
>>>
>>> result = filter(is_admin, users)
>>> list(result)
['mlewis', 'bjohanssen']
9.7.9. Use Case - 4
>>> class User:
...     firstname: str
...     lastname: str
...     groups: list[str]
...
...     def __init__(self, firstname, lastname, groups):
...         self.firstname = firstname
...         self.lastname = lastname
...         self.groups = groups
...
...     def __repr__(self):
...         return f'{self.firstname}'
...
>>> DATABASE = [
...     User('Mark', 'Watney', groups=['user', 'staff']),
...     User('Melissa', 'Lewis', groups=['user', 'staff', 'admin']),
...     User('Rick', 'Martinez', groups=['user', 'staff']),
...     User('Alex', 'Vogel', groups=['user']),
...     User('Beth', 'Johanssen', groups=['user', 'staff', 'admin']),
...     User('Chris', 'Beck', groups=['user', 'staff']),
... ]
>>> def is_user(user: User) -> bool:
...     return 'user' in user.groups
>>>
>>> def is_staff(user: User) -> bool:
...     return 'staff' in user.groups
>>>
>>> def is_admin(user: User) -> bool:
...     return 'admin' in user.groups
>>> users = filter(is_user, DATABASE)
>>> staff = filter(is_staff, DATABASE)
>>> admins = filter(is_admin, DATABASE)
>>> list(users)
[Mark, Melissa, Rick, Alex, Beth, Chris]
>>>
>>> list(staff)
[Mark, Melissa, Rick, Beth, Chris]
>>>
>>> list(admins)
[Melissa, Beth]
9.7.10. Assignments
# %% License
# - Copyright 2025, Matt Harasymczuk <matt@python3.info>
# - This code can be used only for learning by humans
# - This code cannot be used for teaching others
# - This code cannot be used for teaching LLMs and AI algorithms
# - This code cannot be used in commercial or proprietary products
# - This code cannot be distributed in any form
# - This code cannot be changed in any form outside of training course
# - This code cannot have its license changed
# - If you use this code in your product, you must open-source it under GPLv2
# - Exception can be granted only by the author
# %% Run
# - PyCharm: right-click in the editor and `Run Doctest in ...`
# - PyCharm: keyboard shortcut `Control + Shift + F10`
# - Terminal: `python -m doctest -v myfile.py`
# %% About
# - Name: Iterator Filter Apply
# - Difficulty: easy
# - Lines: 3
# - Minutes: 2
# %% English
# 1. Define function `odd()`:
# - takes one argument
# - returns True if argument is odd
# - returns False if argument is even
# 2. Use `filter()` to apply function `odd()` to DATA
# 3. Define `result: filter` with result
# 4. Run doctests - all must succeed
# %% Polish
# 1. Zdefiniuj funkcję `odd()`:
# - przyjmuje jeden argument
# - zwraca True jeżeli argument jest nieparzysty
# - zwraca False jeżeli argument jest parzysty
# 2. Użyj `filter()` aby zaaplikować funkcję `odd()` do DATA
# 3. Zdefiniuj `result: filter` z wynikiem
# 4. Uruchom doctesty - wszystkie muszą się powieść
# %% Hints
# - `filter()`
# - `%` - modulo operator
# %% Tests
"""
>>> import sys; sys.tracebacklimit = 0
>>> assert sys.version_info >= (3, 9), \
'Python 3.9+ required'
>>> from inspect import isfunction
>>> assert isfunction(odd), \
'Object `odd` must be a function'
>>> assert result is not Ellipsis, \
'Assign result to variable: `result`'
>>> assert type(result) is filter, \
'Variable `result` has invalid type, should be filter'
>>> result = list(result)
>>> assert type(result) is list, \
'Evaluated `result` has invalid type, should be list'
>>> assert all(type(x) is int for x in result), \
'All rows in `result` should be int'
>>> from pprint import pprint
>>> pprint(result, width=72, sort_dicts=False)
[1, 3, 5, 7, 9]
"""
DATA = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
# Returns True if number is odd (division by 2 leaves a remainder)
# type: Callable[[int], bool]
def odd(x):
    ...
# Filter odd numbers in DATA
# type: filter
result = ...
# %% License
# - Copyright 2025, Matt Harasymczuk <matt@python3.info>
# - This code can be used only for learning by humans
# - This code cannot be used for teaching others
# - This code cannot be used for teaching LLMs and AI algorithms
# - This code cannot be used in commercial or proprietary products
# - This code cannot be distributed in any form
# - This code cannot be changed in any form outside of training course
# - This code cannot have its license changed
# - If you use this code in your product, you must open-source it under GPLv2
# - Exception can be granted only by the author
# %% Run
# - PyCharm: right-click in the editor and `Run Doctest in ...`
# - PyCharm: keyboard shortcut `Control + Shift + F10`
# - Terminal: `python -m doctest -v myfile.py`
# %% About
# - Name: Iterator Filter Apply
# - Difficulty: easy
# - Lines: 7
# - Minutes: 5
# %% English
# 1. Filter-out lines from `DATA` when:
# - line is empty
# - line has only spaces
# - starts with # (comment)
# 2. Use `filter()` to apply function `valid()` to DATA
# 3. Define `result: filter` with result
# 4. Run doctests - all must succeed
# %% Polish
# 1. Odfiltruj linie z `DATA` gdy:
# - linia jest pusta
# - linia ma tylko spacje
# - zaczyna się od # (komentarz)
# 2. Użyj `filter()` aby zaaplikować funkcję `valid()` do DATA
# 3. Zdefiniuj `result: filter` z wynikiem
# 4. Uruchom doctesty - wszystkie muszą się powieść
# %% Hints
# - `filter()`
# - `str.splitlines()`
# - `str.startswith()`
# - `len()`
# %% Tests
"""
>>> import sys; sys.tracebacklimit = 0
>>> assert sys.version_info >= (3, 9), \
'Python 3.9+ required'
>>> from inspect import isfunction
>>> assert isfunction(valid), \
'Object `valid` must be a function'
>>> assert result is not Ellipsis, \
'Assign result to variable: `result`'
>>> assert type(result) is filter, \
'Variable `result` has invalid type, should be filter'
>>> result = list(result)
>>> assert type(result) is list, \
'Evaluated `result` has invalid type, should be list'
>>> assert all(type(x) is str for x in result), \
'All rows in `result` should be str'
>>> from pprint import pprint
>>> pprint(result, width=72, sort_dicts=False)
['127.0.0.1 localhost',
'127.0.0.1 astromatt',
'10.13.37.1 nasa.gov esa.int',
'255.255.255.255 broadcasthost',
'::1 localhost']
"""
DATA = """##
# `/etc/hosts` structure:
# - ip: internet protocol address (IPv4 or IPv6)
# - hosts: host names
##
127.0.0.1 localhost
127.0.0.1 astromatt
10.13.37.1 nasa.gov esa.int
255.255.255.255 broadcasthost
::1 localhost"""
# Filter-out lines from `DATA` when:
# - line is empty
# - line has only spaces
# - starts with # (comment)
# type: Callable[[str], bool]
def valid(line):
    ...
# Use `filter()` to apply function `valid()` to DATA
# type: filter
result = ...
# %% License
# - Copyright 2025, Matt Harasymczuk <matt@python3.info>
# - This code can be used only for learning by humans
# - This code cannot be used for teaching others
# - This code cannot be used for teaching LLMs and AI algorithms
# - This code cannot be used in commercial or proprietary products
# - This code cannot be distributed in any form
# - This code cannot be changed in any form outside of training course
# - This code cannot have its license changed
# - If you use this code in your product, you must open-source it under GPLv2
# - Exception can be granted only by the author
# %% Run
# - PyCharm: right-click in the editor and `Run Doctest in ...`
# - PyCharm: keyboard shortcut `Control + Shift + F10`
# - Terminal: `python -m doctest -v myfile.py`
# %% About
# - Name: Iterator Filter Apply
# - Difficulty: easy
# - Lines: 3
# - Minutes: 5
# %% English
# 1. Filter-out non-numeric (int or float) values from `DATA`
# 2. Define `result: filter` with result
# 3. Run doctests - all must succeed
# %% Polish
# 1. Odfiltruj nie numeryczne (int lub float) wartości z `DATA`
# 2. Zdefiniuj `result: filter` z wynikiem
# 3. Uruchom doctesty - wszystkie muszą się powieść
# %% Hints
# - `filter()`
# - `isinstance()`
# - `type()`
# %% Tests
"""
>>> import sys; sys.tracebacklimit = 0
>>> assert sys.version_info >= (3, 9), \
'Python 3.9+ required'
>>> from inspect import isfunction
>>> assert result is not Ellipsis, \
'Assign result to variable: `result`'
>>> assert type(result) is filter, \
'Variable `result` has invalid type, should be filter'
>>> result = list(result)
>>> assert type(result) is list, \
'Evaluated `result` has invalid type, should be list'
>>> assert all(type(x) in (int, float) for x in result), \
'All rows in `result` should be int or float'
>>> from pprint import pprint
>>> pprint(result, width=72, sort_dicts=False)
[0, 2.0, 4, 5.0]
"""
DATA = [0, True, 2.0, 'three', 4, 5.0, ['six']]
# Filter-out non-numeric (int or float) values from `DATA`
# type: filter
result = ...