7.1. Serialization Load

7.1.1. Assignments

"""
* Assignment: Serialization Load String
* Complexity: easy
* Lines of code: 1 line
* Time: 3 min

English:
    1. Convert `DATA` to `result: list[tuple[str]]`
    2. Do not convert numeric values to `float`, leave them as `str`
    3. Run doctests - all must succeed

Polish:
    1. Przekonwertuj `DATA` do `result: list[tuple[str]]`
    2. Nie konwertuj wartości numerycznych do `float`, zostaw jako `str`
    3. Uruchom doctesty - wszystkie muszą się powieść

Hints:
    * `tuple()`
    * `str.splitlines()`
    * `str.split()`

Tests:
    >>> import sys; sys.tracebacklimit = 0
    >>> from pprint import pprint

    >>> assert result is not Ellipsis, \
    'Assign result to variable: `result`'
    >>> result = list(result)
    >>> assert type(result) is list, \
    'Variable `result` has invalid type, should be list'
    >>> assert all(type(x) is tuple for x in result), \
    'All rows in `result` should be tuple'

    >>> pprint(result)
    [('sepal_length', 'sepal_width', 'petal_length', 'petal_width', 'species'),
     ('5.8', '2.7', '5.1', '1.9', 'virginica'),
     ('5.1', '3.5', '1.4', '0.2', 'setosa'),
     ('5.7', '2.8', '4.1', '1.3', 'versicolor')]
"""

DATA = """sepal_length,sepal_width,petal_length,petal_width,species
5.8,2.7,5.1,1.9,virginica
5.1,3.5,1.4,0.2,setosa
5.7,2.8,4.1,1.3,versicolor"""

# data from file in list[tuple] format
# type: list[tuple]
result = ...
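
The hints above (`str.splitlines()`, `str.split()`, `tuple()`) can be combined as in the following minimal sketch. It deliberately uses a hypothetical `SAMPLE` string rather than the assignment's `DATA`, and the names `SAMPLE` and `rows` are assumptions made for illustration only.

```python
# Hypothetical sample text (not the assignment's DATA)
SAMPLE = """name,age,city
Alice,30,Warsaw
Bob,25,Krakow"""

# splitlines() yields one string per line, split(',') cuts each line
# into fields, and tuple() freezes the fields into an immutable row.
rows = [tuple(line.split(',')) for line in SAMPLE.splitlines()]

print(rows)
```

All fields stay as `str` because nothing in this sketch casts them.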

"""
* Assignment: Serialization Load Switch
* Complexity: easy
* Lines of code: 6 lines
* Time: 5 min

English:
    1. Convert `DATA` to `result: list[tuple[str]]`
    2. Substitute last element (class label) with value from `LABEL_ENCODER`
    3. Run doctests - all must succeed

Polish:
    1. Przekonwertuj `DATA` do `result: list[tuple[str]]`
    2. Podmień ostatni element (etykietę klasową) na wartość z `LABEL_ENCODER`
    3. Uruchom doctesty - wszystkie muszą się powieść

Hints:
    * `str.splitlines()`
    * `str.split()`
    * `dict.get()`
    * `tuple()`
    * `tuple() + tuple()`
    * `list.append()`

Tests:
    >>> import sys; sys.tracebacklimit = 0
    >>> from pprint import pprint

    >>> assert result is not Ellipsis, \
    'Assign result to variable: `result`'
    >>> result = list(result)  # expand map object
    >>> assert type(result) is list, \
    'Variable `result` has invalid type, should be list'
    >>> assert all(type(x) is tuple for x in result), \
    'All rows in `result` should be tuple'

    >>> pprint(result)
    [('sepal_length', 'sepal_width', 'petal_length', 'petal_width', 'species'),
     ('5.8', '2.7', '5.1', '1.9', 'virginica'),
     ('5.1', '3.5', '1.4', '0.2', 'setosa'),
     ('5.7', '2.8', '4.1', '1.3', 'versicolor')]
"""

DATA = """sepal_length,sepal_width,petal_length,petal_width,species
5.8,2.7,5.1,1.9,0
5.1,3.5,1.4,0.2,1
5.7,2.8,4.1,1.3,2"""

LABEL_ENCODER = {
    '0': 'virginica',
    '1': 'setosa',
    '2': 'versicolor',
}

# data from file (note the list[tuple] format!)
# type: list[tuple]
result = ...
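
One possible way to use the `dict.get()` and `tuple() + tuple()` hints is sketched below, assuming a hypothetical mapping `CODES` and hypothetical rows (none of these names come from the assignment). Passing the key itself as the default to `dict.get()` leaves unknown values, such as a column name in a header row, unchanged.

```python
# Hypothetical mapping and rows, for illustration only
CODES = {'0': 'red', '1': 'green'}
header = ('x', 'y', 'color')
row = ('1.5', '2.5', '1')


def decode_last(record: tuple) -> tuple:
    """Return a copy of `record` with its last element looked up in CODES."""
    *values, label = record
    # dict.get(key, default) returns `default` when `key` is missing,
    # so labels without a mapping pass through untouched.
    return tuple(values) + (CODES.get(label, label),)


decoded_header = decode_last(header)
decoded_row = decode_last(row)
print(decoded_header)
print(decoded_row)
```

Note the trailing comma in `(CODES.get(label, label),)`: it creates a one-element tuple, which is what makes the `tuple + tuple` concatenation work.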

"""
* Assignment: Serialization Load LabelEncoder
* Complexity: medium
* Lines of code: 10 lines
* Time: 13 min

English:
    1. Convert `DATA` to `result: list[tuple[str]]`
    2. Generate `LABEL_ENCODER: dict[int,str]` from `header: list[str]` (the first line of `DATA`; the class names follow the two leading numbers)
    3. Substitute last element (class label) with value from `LABEL_ENCODER`
    4. Run doctests - all must succeed

Polish:
    1. Przekonwertuj `DATA` do `result: list[tuple[str]]`
    2. Wygeneruj `LABEL_ENCODER: dict[int,str]` z `header: list[str]`
    3. Podmień ostatni element (etykietę klasową) na wartość z `LABEL_ENCODER`
    4. Uruchom doctesty - wszystkie muszą się powieść

Hints:
    * `a, *b = ...`
    * `dict(enumerate())`
    * `str.splitlines()`
    * `str.split()`
    * `dict.get()`
    * `int()`
    * `tuple()`
    * `tuple() + tuple()`
    * `list.append()`

Tests:
    >>> import sys; sys.tracebacklimit = 0
    >>> from pprint import pprint

    >>> assert result is not Ellipsis, \
    'Assign result to variable: `result`'
    >>> result = list(result)  # expand map object
    >>> assert type(result) is list, \
    'Variable `result` has invalid type, should be list'
    >>> assert all(type(x) is tuple for x in result), \
    'All rows in `result` should be tuple'

    >>> pprint(result)
    [('5.8', '2.7', '5.1', '1.9', 'virginica'),
     ('5.1', '3.5', '1.4', '0.2', 'setosa'),
     ('5.7', '2.8', '4.1', '1.3', 'versicolor')]
"""

DATA = """3,4,setosa,virginica,versicolor
5.8,2.7,5.1,1.9,1
5.1,3.5,1.4,0.2,0
5.7,2.8,4.1,1.3,2"""

# values from file (note the list[tuple] format!)
# type: list[tuple]
result = ...
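
The `a, *b = ...` and `dict(enumerate())` hints can be illustrated with a short sketch. The header line below is hypothetical and only mimics the shape used in this assignment (two leading numbers, then names); the variable names are assumptions as well.

```python
# Hypothetical header line: two counts followed by class names
header_line = '2,3,apple,banana,cherry'

# Star-unpacking: the first two fields go to named variables,
# everything that remains is collected into the list `labels`.
nrows, nfeatures, *labels = header_line.split(',')

# enumerate() pairs each label with its index; dict() turns the pairs
# into an int -> str mapping.
encoder = dict(enumerate(labels))

print(encoder)
print(encoder.get(int('1')))
```

Because the keys are `int`, a label read from text has to be converted with `int()` before the lookup, as in the last line.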

"""
* Assignment: Serialization Load TypeCast
* Complexity: easy
* Lines of code: 9 lines
* Time: 8 min

English:
    1. Convert `DATA` to `result: list[tuple]`
    2. Convert numeric values to `float`
    3. Run doctests - all must succeed

Polish:
    1. Przekonwertuj `DATA` do `result: list[tuple]`
    2. Przekonwertuj wartości numeryczne do `float`
    3. Uruchom doctesty - wszystkie muszą się powieść

Hints:
    * `a, *b = ...`
    * `str.splitlines()`
    * `str.split()`
    * `dict.get()`
    * `float()`
    * `tuple()`
    * `tuple() + tuple()`
    * `list.append()`

Tests:
    >>> import sys; sys.tracebacklimit = 0
    >>> from pprint import pprint

    >>> assert result is not Ellipsis, \
    'Assign result to variable: `result`'
    >>> result = list(result)  # expand map object
    >>> assert type(result) is list, \
    'Variable `result` has invalid type, should be list'
    >>> assert all(type(x) is tuple for x in result), \
    'All rows in `result` should be tuple'

    >>> pprint(result)
    [('sepal_length', 'sepal_width', 'petal_length', 'petal_width', 'species'),
     (5.8, 2.7, 5.1, 1.9, 'virginica'),
     (5.1, 3.5, 1.4, 0.2, 'setosa'),
     (5.7, 2.8, 4.1, 1.3, 'versicolor')]
"""

DATA = """sepal_length,sepal_width,petal_length,petal_width,species
5.8,2.7,5.1,1.9,virginica
5.1,3.5,1.4,0.2,setosa
5.7,2.8,4.1,1.3,versicolor"""

# values from file (note the list[tuple] format!)
# type: list[tuple]
result = ...
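
Casting only the numeric fields can be approached in several ways. The sketch below shows one of them, a hypothetical helper `to_number()` that tries `float()` and falls back to the original string. It is an assumption for illustration, not the expected solution, and it runs on made-up data.

```python
def to_number(value: str):
    """Return `value` as float if it parses as a number, else unchanged."""
    try:
        return float(value)
    except ValueError:
        # float() raises ValueError for text such as 'blue'
        return value


# Hypothetical row, for illustration only
row = ('1.5', '20', 'blue')
converted = tuple(to_number(x) for x in row)

print(converted)
```

One caveat of this approach: `float()` also accepts strings such as `'nan'` and `'inf'`, so a text column containing those words would be converted too.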

"""
* Assignment: Serialization Load FixedHeader
* Complexity: easy
* Lines of code: 5 lines
* Time: 5 min

English:
    1. Convert `DATA` to `result: list[dict]`
    2. Use `HEADER` as dict keys
    3. Do not convert numeric values to `float`, leave them as `str`
    4. Run doctests - all must succeed

Polish:
    1. Przekonwertuj `DATA` do `result: list[dict]`
    2. Użyj `HEADER` jako kluczy dictów
    3. Nie konwertuj wartości numerycznych do `float`, pozostaw je jako `str`
    4. Uruchom doctesty - wszystkie muszą się powieść

Hints:
    * `str.splitlines()`
    * `str.split()`
    * `dict(zip())`
    * `list.append()`

Tests:
    >>> import sys; sys.tracebacklimit = 0
    >>> from pprint import pprint

    >>> assert result is not Ellipsis, \
    'Assign result to variable: `result`'
    >>> result = list(result)  # expand map object
    >>> assert type(result) is list, \
    'Variable `result` has invalid type, should be list'
    >>> assert all(type(x) is dict for x in result), \
    'All rows in `result` should be dict'

    >>> pprint(result)
    [{'petal_length': '5.1',
      'petal_width': '1.9',
      'sepal_length': '5.8',
      'sepal_width': '2.7',
      'species': 'virginica'},
     {'petal_length': '1.4',
      'petal_width': '0.2',
      'sepal_length': '5.1',
      'sepal_width': '3.5',
      'species': 'setosa'},
     {'petal_length': '4.1',
      'petal_width': '1.3',
      'sepal_length': '5.7',
      'sepal_width': '2.8',
      'species': 'versicolor'}]
"""

DATA = """5.8,2.7,5.1,1.9,virginica
5.1,3.5,1.4,0.2,setosa
5.7,2.8,4.1,1.3,versicolor"""

HEADER = [
    'sepal_length',
    'sepal_width',
    'petal_length',
    'petal_width',
    'species',
]

# rows from `DATA` as dicts keyed by `HEADER`
# type: list[dict[str,str]]
result = ...
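
The `dict(zip())` hint pairs a list of keys with a list of values. A minimal sketch, assuming a hypothetical key list `KEYS` and a single made-up line of text:

```python
# Hypothetical keys and line, for illustration only
KEYS = ['name', 'age', 'city']
line = 'Alice,30,Warsaw'

# zip() pairs items position by position; dict() builds the mapping
# from those (key, value) pairs.
record = dict(zip(KEYS, line.split(',')))

print(record)
```

`zip()` stops silently at the shorter input, so a line with a missing field produces a dict with fewer keys rather than an error. On Python 3.10 and later, `zip(..., strict=True)` raises `ValueError` instead.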

"""
* Assignment: Serialization Load GenerateHeader
* Complexity: easy
* Lines of code: 7 lines
* Time: 8 min

English:
    1. Generate `header: list[str]` from the first line of `DATA`
    2. Convert `DATA` to `result: list[dict]`
    3. Use `header` as keys
    4. Convert numeric values to `float`
    5. Run doctests - all must succeed

Polish:
    1. Wygeneruj `header: list[str]` z pierwszej linii `DATA`
    2. Przekonwertuj `DATA` do `result: list[dict]`
    3. Użyj nagłówka jako kluczy
    4. Przekonwertuj wartości numeryczne do `float`
    5. Uruchom doctesty - wszystkie muszą się powieść

Hints:
    * `str.split()`
    * `map()`
    * `list() + list()`
    * `list.append()`
    * `tuple()`

Tests:
    >>> import sys; sys.tracebacklimit = 0
    >>> from pprint import pprint

    >>> assert result is not Ellipsis, \
    'Assign result to variable: `result`'
    >>> result = list(result)
    >>> assert type(result) is list, \
    'Variable `result` has invalid type, should be list'
    >>> assert all(type(x) is dict for x in result), \
    'All rows in `result` should be dict'

    >>> pprint(result)
    [{'petal_length': 5.1,
      'petal_width': 1.9,
      'sepal_length': 5.8,
      'sepal_width': 2.7,
      'species': 'virginica'},
     {'petal_length': 1.4,
      'petal_width': 0.2,
      'sepal_length': 5.1,
      'sepal_width': 3.5,
      'species': 'setosa'},
     {'petal_length': 4.1,
      'petal_width': 1.3,
      'sepal_length': 5.7,
      'sepal_width': 2.8,
      'species': 'versicolor'}]
"""

DATA = """sepal_length,sepal_width,petal_length,petal_width,species
5.8,2.7,5.1,1.9,virginica
5.1,3.5,1.4,0.2,setosa
5.7,2.8,4.1,1.3,versicolor"""

# rows as dicts keyed by the header taken from the first line of `DATA`
# type: list[dict]
result = ...
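
The pieces from this section (header taken from the first line, `map()` for casting, `dict(zip())` for pairing) can be combined roughly as follows. This is a sketch on hypothetical data. `SAMPLE`, `convert()`, `keys` and `records` are illustrative names, not part of the assignment.

```python
# Hypothetical sample text (not the assignment's DATA)
SAMPLE = """name,age,score
Alice,30,4.5
Bob,25,3.0"""


def convert(value: str):
    """Cast to float where possible, otherwise keep the string."""
    try:
        return float(value)
    except ValueError:
        return value


# The first line becomes the header, the rest are data lines
header, *lines = SAMPLE.splitlines()
keys = header.split(',')

# map() applies convert() lazily to every field of a line;
# zip() then pairs the converted values with the header keys.
records = [dict(zip(keys, map(convert, line.split(',')))) for line in lines]

print(records)
```

Keeping the cast in a separate function makes it easy to swap in a stricter rule later, for example one that converts only selected columns.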