7.1. Mapping Dict

  • dict are key-value storage (HashMap)

  • Mutable - can add, remove, and modify items

  • Since Python 3.7: dict keeps order of elements

  • Before Python 3.7: dict order is not ensured!!

How are dictionaries implemented in CPython? [1]

CPython's dictionaries are implemented as resizable hash tables. Compared to B-trees, this gives better performance for lookup (the most common operation by far) under most circumstances, and the implementation is simpler.

Dictionaries work by computing a hash code for each key stored in the dictionary using the hash() built-in function. The hash code varies widely depending on the key and a per-process seed; for example, "Python" could hash to -539294296 while "python", a string that differs by a single bit, could hash to 1142331976. The hash code is then used to calculate a location in an internal array where the value will be stored. Assuming that you're storing keys that all have different hash values, this means that dictionaries take constant time – O(1), in Big-O notation – to retrieve a key.

basics/type-mapping/img/type-dict-hashmap.png

7.1.1. Syntax

  • {} is used more often

  • dict() is more readable

  • Comma after last element is optional

>>> data = {}
>>> data = dict()
>>> data = {
...     'commander': 'Melissa Lewis',
...     'botanist': 'Mark Watney',
...     'pilot': 'Rick Martinez',
... }
>>> data = dict(
...     commander='Melissa Lewis',
...     botanist='Mark Watney',
...     pilot='Rick Martinez',
... )

7.1.2. Duplicates

Duplicating items are overridden by latter:

>>> data = {
...     'commander': 'Mark Watney',
...     'commander': 'Melissa Lewis',
... }
>>>
>>> data
{'commander': 'Melissa Lewis'}

7.1.3. Dict vs Set

  • Both set and dict keys must be hashable

  • Both set and dict uses the same { and } braces

  • Despite similar syntax, they are different types

>>> data = {1, 2}
>>> type(data)
<class 'set'>
>>>
>>>
>>> data = {1: 2}
>>> type(data)
<class 'dict'>
>>> data = {1, 2, 3, 4}
>>> type(data)
<class 'set'>
>>>
>>>
>>> data = {1: 2, 3: 4}
>>> type(data)
<class 'dict'>

Empty dict and empty set:

>>> data = {1: None}
>>> _ = data.pop(1)
>>>
>>> data
{}
>>> data = {1}
>>> _ = data.pop()
>>>
>>> data
set()

7.1.4. List of Pairs

Pair:

>>> data = ('commander', 'Melissa Lewis')

List of pairs:

>>> data = [
...     ('commander', 'Melissa Lewis'),
...     ('botanist', 'Mark Watney'),
...     ('pilot', 'Rick Martinez'),
... ]

Convert list of pairs to dict:

>>> data = [
...     ('commander', 'Melissa Lewis'),
...     ('botanist', 'Mark Watney'),
...     ('pilot', 'Rick Martinez')
... ]
>>>
>>> dict(data)  
{'commander': 'Melissa Lewis',
  'botanist': 'Mark Watney',
  'pilot': 'Rick Martinez'}

7.1.5. Length

>>> crew = {
...     'commander': 'Melissa Lewis',
...     'botanist': 'Mark Watney',
...     'pilot': 'Rick Martinez',
... }
>>>
>>>
>>> len(crew)
3
>>>
>>> len(crew.keys())
3
>>>
>>> len(crew.values())
3
>>>
>>> len(crew.items())
3

7.1.6. Use Case - 0x1

  • GIT - version control system

>>> git = {
...    'ce16a8ce': 'commit/1',
...    'cae6b510': 'commit/2',
...    '895444a6': 'commit/3',
...    'aef731b5': 'commit/4',
...    '4a92bc79': 'branch/master',
...    'b3bbd85a': 'tag/v1.0',
... }

7.1.7. References

7.1.8. Assignments

"""
* Assignment: Mapping Dict Create
* Type: class assignment
* Complexity: easy
* Lines of code: 3 lines
* Time: 2 min

English:
    1. Define `result: dict` with:
        a. key `firstname` with value `Mark`
        b. key `lastname` with value `Watney`
        c. key `group` with value `users`
    2. Run doctests - all must succeed

Polish:
    1. Zdefiniuj `result: dict` z:
        a. kluczem `firstname` o wartości `Mark`
        b. kluczem `lastname` o wartości `Watney`
        c. kluczem `group` o wartości `users`
    2. Uruchom doctesty - wszystkie muszą się powieść

Tests:
    >>> import sys; sys.tracebacklimit = 0

    >>> assert type(result) is dict, \
    'Variable `result` has invalid type, should be dict'

    >>> assert 'firstname' in result.keys(), \
    'Value `firstname` is not in the result keys'
    >>> assert 'lastname' in result.keys(), \
    'Value `lastname` is not in the result keys'
    >>> assert 'group' in result.keys(), \
    'Value `group` is not in the result keys'

    >>> assert 'Mark' in result['firstname'], \
    'Value `Mark` is not in the result values'
    >>> assert 'Watney' in result['lastname'], \
    'Value `Watney` is not in the result values'
    >>> assert 'users' in result['group'], \
    'Value `users` is not in the result values'
"""


# Define `result: dict` with:
# - key `firstname` with value `Mark`
# - key `lastname` with value `Watney`
# - key `group` with value `users`
# type: dict[str,str]
result = ...

"""
* Assignment: Mapping Dict ListOfPairs
* Type: class assignment
* Complexity: easy
* Lines of code: 1 lines
* Time: 2 min

English:
    1. Use `DATA: list[tuple]`
    2. Define `result: dict` with `DATA` converted to `dict`
    3. Run doctests - all must succeed

Polish:
    1. Użyj `DATA: list[tuple]`
    2. Zdefiniuj `result: dict` z przekonwertowanym `DATA` do `dict`
    3. Uruchom doctesty - wszystkie muszą się powieść

Hints:
    * dict()

Tests:
    >>> import sys; sys.tracebacklimit = 0
    >>> from pprint import pprint

    >>> assert type(result) is dict, \
    'Variable `result` has invalid type, should be dict'

    >>> assert all(type(x) is str for x in result.keys()), \
    'All dict keys should be str'

    >>> assert 'firstname' in result.keys()
    >>> assert 'lastname' in result.keys()
    >>> assert 'group' in result.keys()

    >>> assert 'Mark' in result.values()
    >>> assert 'Watney' in result.values()
    >>> assert 'users' in result.values()

    >>> pprint(result, width=40, sort_dicts=False)
    {'firstname': 'Mark',
     'lastname': 'Watney',
     'group': 'users'}
"""

DATA = [
    ('firstname', 'Mark'),
    ('lastname', 'Watney'),
    ('group', 'users'),
]

# Define `result: dict` with `DATA` converted to `dict`
# type: dict[str,str]
result = ...