7.1. Mapping Define¶
dict
are key-value storage (HashMap)Mutable - can add, remove, and modify items
Since Python 3.7:
dict
keeps order of elementsBefore Python 3.7:
dict
order is not ensured!!
How are dictionaries implemented in CPython? [1]
CPython's dictionaries are implemented as resizable hash tables. Compared to B-trees, this gives better performance for lookup (the most common operation by far) under most circumstances, and the implementation is simpler.
Dictionaries work by computing a hash code for each key stored in the
dictionary using the hash()
built-in function. The hash code varies
widely depending on the key and a per-process seed; for example, "Python"
could hash to -539294296 while "python", a string that differs by a single
bit, could hash to 1142331976. The hash code is then used to calculate
a location in an internal array where the value will be stored. Assuming
that you're storing keys that all have different hash values, this means
that dictionaries take constant time – O(1), in Big-O notation – to retrieve
a key.
7.1.1. Syntax¶
{}
is used more oftendict()
is more readableComma after last element is optional
>>> data = {}
>>> data = dict()
>>> data = {
... 'commander': 'Melissa Lewis',
... 'botanist': 'Mark Watney',
... 'pilot': 'Rick Martinez',
... }
>>> data = dict(
... commander='Melissa Lewis',
... botanist='Mark Watney',
... pilot='Rick Martinez',
... )
7.1.2. Duplicates¶
Duplicating items are overridden by latter:
>>> data = {
... 'commander': 'Mark Watney',
... 'commander': 'Melissa Lewis',
... }
>>>
>>> data
{'commander': 'Melissa Lewis'}
7.1.3. Dict vs Set¶
Both
set
anddict
keys must be hashableBoth
set
anddict
uses the same{
and}
bracesDespite similar syntax, they are different types
>>> data = {1, 2}
>>> type(data)
<class 'set'>
>>>
>>>
>>> data = {1: 2}
>>> type(data)
<class 'dict'>
>>> data = {1, 2, 3, 4}
>>> type(data)
<class 'set'>
>>>
>>>
>>> data = {1: 2, 3: 4}
>>> type(data)
<class 'dict'>
Empty dict
and empty set
:
>>> data = {1: None}
>>> _ = data.pop(1)
>>>
>>> data
{}
>>> data = {1}
>>> _ = data.pop()
>>>
>>> data
set()
7.1.4. List of Pairs¶
Pair:
>>> data = ('commander', 'Melissa Lewis')
List of pairs:
>>> data = [
... ('commander', 'Melissa Lewis'),
... ('botanist', 'Mark Watney'),
... ('pilot', 'Rick Martinez'),
... ]
Convert list
of pairs to dict
:
>>> data = [
... ('commander', 'Melissa Lewis'),
... ('botanist', 'Mark Watney'),
... ('pilot', 'Rick Martinez')
... ]
>>>
>>> dict(data)
{'commander': 'Melissa Lewis',
'botanist': 'Mark Watney',
'pilot': 'Rick Martinez'}
7.1.5. Length¶
>>> crew = {
... 'commander': 'Melissa Lewis',
... 'botanist': 'Mark Watney',
... 'pilot': 'Rick Martinez',
... }
>>>
>>>
>>> len(crew)
3
>>>
>>> len(crew.keys())
3
>>>
>>> len(crew.values())
3
>>>
>>> len(crew.items())
3
7.1.6. Use Case - 0x1¶
GIT - version control system
>>> git = {
... 'ce16a8ce': 'commit/1',
... 'cae6b510': 'commit/2',
... '895444a6': 'commit/3',
... 'aef731b5': 'commit/4',
... '4a92bc79': 'branch/master',
... 'b3bbd85a': 'tag/v1.0',
... }
7.1.7. References¶
7.1.8. Assignments¶
"""
* Assignment: Mapping Define Dict
* Type: class assignment
* Complexity: easy
* Lines of code: 3 lines
* Time: 3 min
English:
1. Create `result: dict` representing input data
2. Non-functional requirements:
a. Assignmnet verifies creation of `dict()`
b. Do not parse data, simply model it using dict
c. Do not use `str.split()`, `slice`, `getitem`, `for`, `while` or
any other control-flow statement
3. Run doctests - all must succeed
Polish:
1. Stwórz `result: dict` reprezentujący dane wejściowe
2. Wymagania niefunkcjonalne:
a. Zadanie sprawdza tworzenie `dict()`
b. Nie parsuj danych, po prostu zamodeluj je jako dict
c. Nie używaj `str.split()`, `slice`, `getitem`, `for`, `while` lub
jakiejkolwiek innej instrukcji sterującej
3. Uruchom doctesty - wszystkie muszą się powieść
Tests:
>>> import sys; sys.tracebacklimit = 0
>>> assert type(result) is dict, \
'Variable `result` has invalid type, should be dict'
>>> assert 'firstname' in result.keys(), \
'Value `firstname` is not in the result keys'
>>> assert 'lastname' in result.keys(), \
'Value `lastname` is not in the result keys'
>>> assert 'missions' in result.keys(), \
'Value `missions` is not in the result keys'
>>> assert 'Mark' in result['firstname'], \
'Value `Mark` is not in the result values'
>>> assert 'Watney' in result['lastname'], \
'Value `Watney` is not in the result values'
>>> assert 'Ares1' in result['missions'], \
'Value `Ares1` is not in the result values'
>>> assert 'Ares3' in result['missions'], \
'Value `Ares3` is not in the result values'
"""
# firstname - Mark
# lastname - Watney
# missions - Ares1, Ares3
# Define dict with keys: firstname, lastname and missions
# type: dict[str,str|list]
result = ...
"""
* Assignment: Mapping Generate Pairs
* Type: class assignment
* Complexity: easy
* Lines of code: 1 lines
* Time: 2 min
English:
1. Define `result: dict`
2. Convert `DATA` to `dict` and assign to `result`
3. Run doctests - all must succeed
Polish:
1. Zdefiniuj `result: dict`
2. Przekonwertuj `DATA` do `dict` i przypisz do `result`
3. Uruchom doctesty - wszystkie muszą się powieść
Hints:
* `dict()`
Tests:
>>> import sys; sys.tracebacklimit = 0
>>> from pprint import pprint
>>> assert type(result) is dict, \
'Variable `result` has invalid type, should be dict'
>>> assert all(type(x) is str for x in result.keys()), \
'All dict keys should be str'
>>> assert 'sepal_length' in result.keys()
>>> assert 'sepal_width' in result.keys()
>>> assert 'petal_length' in result.keys()
>>> assert 'petal_width' in result.keys()
>>> assert 'species' in result.keys()
>>> assert 5.8 in result.values()
>>> assert 2.7 in result.values()
>>> assert 5.1 in result.values()
>>> assert 1.9 in result.values()
>>> assert 'virginica' in result.values()
>>> pprint(result, sort_dicts=False)
{'sepal_length': 5.8,
'sepal_width': 2.7,
'petal_length': 5.1,
'petal_width': 1.9,
'species': 'virginica'}
"""
DATA = [
('sepal_length', 5.8),
('sepal_width', 2.7),
('petal_length', 5.1),
('petal_width', 1.9),
('species', 'virginica'),
]
# Dict with converted DATA
# type: dict[str,float|str]
result = ...