5.5. Case Study: Hosts
Define result: list[dict], where each dict has keys:
ip: str
hosts: list[str]
Iterate over the lines in DATA, skipping comments (lines starting with #) and empty lines
Extract ip and hosts from each line
Append ip and hosts to result as a dict, for example:
{'ip': '127.0.0.1', 'hosts': ['localhost', 'astromatt']}
Each line of DATA must become a separate dict, even when the same ip appears on more than one line
5.5.1. SetUp
>>>
... DATA = """##
... # `/etc/hosts` structure:
... # - ip: internet protocol address (IPv4 or IPv6)
... # - hosts: host names
... ##
...
... 127.0.0.1 localhost
... 127.0.0.1 astromatt
... 10.13.37.1 nasa.gov esa.int
... 255.255.255.255 broadcasthost
... ::1 localhost"""
5.5.2. Solution 1
>>>
... %%timeit -r 1000 -n 1000
... result = []
... for line in DATA.splitlines():
...     line = line.strip()
...     if len(line) == 0:
...         continue
...     if line.startswith('#'):
...         continue
...     ip, *hosts = line.split()
...     result.append({'ip': ip, 'hosts': hosts})
# 4.97 µs ± 443 ns per loop (mean ± std. dev. of 1000 runs, 1,000 loops each)
# 5.18 µs ± 496 ns per loop (mean ± std. dev. of 1000 runs, 1,000 loops each)
# 5.33 µs ± 698 ns per loop (mean ± std. dev. of 1000 runs, 1,000 loops each)
# 5.01 µs ± 432 ns per loop (mean ± std. dev. of 1000 runs, 1,000 loops each)
# 5.16 µs ± 645 ns per loop (mean ± std. dev. of 1000 runs, 1,000 loops each)
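To see what the loop above builds, the same logic can be wrapped in a function and run outside of `%%timeit`. This is a minimal sketch: the name `parse_hosts` is hypothetical, and DATA is repeated from the SetUp only so that the block runs on its own.

```python
DATA = """##
# `/etc/hosts` structure:
# - ip: internet protocol address (IPv4 or IPv6)
# - hosts: host names
##

127.0.0.1 localhost
127.0.0.1 astromatt
10.13.37.1 nasa.gov esa.int
255.255.255.255 broadcasthost
::1 localhost"""


def parse_hosts(data):
    """Hypothetical helper: the Solution 1 loop wrapped in a function."""
    result = []
    for line in data.splitlines():
        line = line.strip()
        if len(line) == 0:
            continue
        if line.startswith('#'):
            continue
        # First field is the ip, all remaining fields are host names
        ip, *hosts = line.split()
        result.append({'ip': ip, 'hosts': hosts})
    return result


result = parse_hosts(DATA)
for entry in result:
    print(entry)
# {'ip': '127.0.0.1', 'hosts': ['localhost']}
# {'ip': '127.0.0.1', 'hosts': ['astromatt']}
# {'ip': '10.13.37.1', 'hosts': ['nasa.gov', 'esa.int']}
# {'ip': '255.255.255.255', 'hosts': ['broadcasthost']}
# {'ip': '::1', 'hosts': ['localhost']}
```

Note that 127.0.0.1 appears twice in the output, once per input line, which is what the "separate dict" requirement asks for.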
5.5.3. Solution 2
>>>
... %%timeit -r 1000 -n 1000
... result = []
... for line in DATA.splitlines():
...     # Note: line[1] raises IndexError on a one-character line that is not '#'
...     if line and not (line[0] == '#' or line[1] == '#'):
...         ip, *hosts = line.split()
...         result.append({'ip': ip, 'hosts': hosts})
# 4.12 µs ± 758 ns per loop (mean ± std. dev. of 1000 runs, 1,000 loops each)
# 3.82 µs ± 384 ns per loop (mean ± std. dev. of 1000 runs, 1,000 loops each)
# 3.89 µs ± 651 ns per loop (mean ± std. dev. of 1000 runs, 1,000 loops each)
# 4.64 µs ± 1.81 µs per loop (mean ± std. dev. of 1000 runs, 1,000 loops each)
# 4.32 µs ± 896 ns per loop (mean ± std. dev. of 1000 runs, 1,000 loops each)
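The index-based check in Solution 2 is fast, but it assumes that every non-empty line has at least two characters and that comments are not indented. The sketch below illustrates those assumptions with made-up input lines that are not part of DATA, and shows one possible guard based on `str.lstrip` and `str.startswith`. Both function names are hypothetical, and the guarded variant has not been timed here.

```python
def is_comment_indexed(line):
    # Same test as in Solution 2: inspects only the first two characters
    return line[0] == '#' or line[1] == '#'


def is_comment_safe(line):
    # Hypothetical guard: ignore leading whitespace, then test the prefix
    return line.lstrip().startswith('#')


# A line holding a single space is truthy, so Solution 2 would index into it
try:
    is_comment_indexed(' ')
    outcome = 'no error'
except IndexError:
    outcome = 'IndexError'

print(outcome)                               # IndexError
print(is_comment_indexed('   # indented'))   # False: the comment is missed
print(is_comment_safe('   # indented'))      # True
print(is_comment_safe(' '))                  # False, and no exception
```

With the guarded check, a whitespace-only line still has to be skipped separately (for example by stripping the line first, as Solution 1 does), because `line.split()` on it returns an empty list and the `ip, *hosts` unpacking would fail.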
5.5.4. Solution 3
>>>
... %%timeit -r 1000 -n 1000
... def is_valid_line(line):
...     # Truthy for a non-empty line with no '#' among its first two characters
...     return line and not (line[0] == '#' or line[1] == '#')
...
... result = []
... for line in DATA.splitlines():
...     if is_valid_line(line):
...         ip, *hosts = line.split()
...         result.append({'ip': ip, 'hosts': hosts})
# 4.97 µs ± 492 ns per loop (mean ± std. dev. of 1000 runs, 1,000 loops each)
# 4.91 µs ± 580 ns per loop (mean ± std. dev. of 1000 runs, 1,000 loops each)
# 5.15 µs ± 836 ns per loop (mean ± std. dev. of 1000 runs, 1,000 loops each)
# 5.08 µs ± 688 ns per loop (mean ± std. dev. of 1000 runs, 1,000 loops each)
# 5.42 µs ± 934 ns per loop (mean ± std. dev. of 1000 runs, 1,000 loops each)
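`%%timeit` is an IPython cell magic, so the cells above only run in IPython or Jupyter. A rough equivalent for a plain Python script can be sketched with the standard `timeit` module. The wrapper names `solution1`, `solution2` and `solution3` are hypothetical, the repeat count is much smaller than in the measurements above, and the extra function call adds overhead, so the numbers are not directly comparable with the ones reported in this section.

```python
import timeit

DATA = """##
# `/etc/hosts` structure:
# - ip: internet protocol address (IPv4 or IPv6)
# - hosts: host names
##

127.0.0.1 localhost
127.0.0.1 astromatt
10.13.37.1 nasa.gov esa.int
255.255.255.255 broadcasthost
::1 localhost"""


def solution1():
    result = []
    for line in DATA.splitlines():
        line = line.strip()
        if len(line) == 0:
            continue
        if line.startswith('#'):
            continue
        ip, *hosts = line.split()
        result.append({'ip': ip, 'hosts': hosts})
    return result


def solution2():
    result = []
    for line in DATA.splitlines():
        if line and not (line[0] == '#' or line[1] == '#'):
            ip, *hosts = line.split()
            result.append({'ip': ip, 'hosts': hosts})
    return result


def solution3():
    # Defined inside the timed function to mirror the original cell,
    # where the def statement is part of the measured code
    def is_valid_line(line):
        return line and not (line[0] == '#' or line[1] == '#')

    result = []
    for line in DATA.splitlines():
        if is_valid_line(line):
            ip, *hosts = line.split()
            result.append({'ip': ip, 'hosts': hosts})
    return result


# All three variants should agree before their speed is compared
assert solution1() == solution2() == solution3()

NUMBER = 1000   # calls per measurement
REPEAT = 5      # measurements per solution

for func in (solution1, solution2, solution3):
    # repeat() returns one total time (in seconds) per measurement
    totals = timeit.repeat(func, repeat=REPEAT, number=NUMBER)
    best = min(totals) / NUMBER * 1_000_000   # microseconds per call
    print(f'{func.__name__}: {best:.2f} usec per loop (best of {REPEAT})')
```

Taking the minimum of the repeated runs is a common way to reduce noise from other processes; `%%timeit` reports the mean and standard deviation instead, which is another reason the two sets of numbers will differ.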