8.2. Unpack Slice

  • Slice arguments (start, stop, step) must be int (positive, negative or zero) or None

  • Positive index starts at 0 (the first element)

  • Negative index starts at -1 (the last element)

8.2.1. Slice Forwards

  • sequence[start:stop]

>>> data = 'abcde'
>>> data[0:3]
'abc'
>>> data = 'abcde'
>>> data[2:5]
'cde'
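
The two index conventions and the exclusive stop can be checked with a short sketch. It is only an illustration; the names `position`, `negative` and `result` are chosen here and do not come from the chapter.

```python
data = 'abcde'

# Every element can be addressed by a positive and by a negative index
for position, char in enumerate(data):
    negative = position - len(data)      # 0 -> -5, 1 -> -4, ..., 4 -> -1
    assert data[position] == data[negative] == char

# `stop` is exclusive: data[0:3] takes indexes 0, 1 and 2
start, stop = 0, 3
result = data[start:stop]
```

Because stop is excluded, the length of a forward slice within range is simply stop - start.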

8.2.2. Slice Defaults

  • sequence[start:stop]

  • start defaults to 0

  • stop defaults to len(sequence)

>>> data = 'abcde'
>>> data[:3]
'abc'
>>> data = 'abcde'
>>> data[3:]
'de'
>>> data = 'abcde'
>>> data[:]
'abcde'
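
How the defaults are filled in can be inspected with the built-in `slice` object and its `indices()` method. This is a minimal sketch; the variable names are illustrative.

```python
data = 'abcde'

# An omitted argument is stored as None in the slice object
defaults = slice(None, 3)

# indices() resolves None for a sequence of the given length
filled = defaults.indices(len(data))     # (start, stop, step)

# All three spellings select the same elements
same = data[:3] == data[0:3] == data[defaults]
```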

8.2.3. Slice Backwards

  • Negative index counts from the end and goes right to left (-1 is the last element)

>>> data = 'abcde'
>>> data[-3:-1]
'cd'
>>> data = 'abcde'
>>> data[-3:]
'cde'
>>> data = 'abcde'
>>> data[0:-3]
'ab'
>>> data = 'abcde'
>>> data[:-3]
'ab'
>>> data = 'abcde'
>>> data[-3:0]
''
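
A negative index `-n` points at the same position as `len(sequence) - n`. The sketch below uses this to show why `data[-3:0]` is empty; the variable names are illustrative.

```python
data = 'abcde'
size = len(data)

# The same slice written with negative and with positive indexes
negative = data[-3:-1]
positive = data[size - 3:size - 1]       # data[2:4]

# data[-3:0] means data[2:0]: start is already past stop, so nothing is taken
empty = data[-3:0]
```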

8.2.4. Step Forward

  • Every n-th element

  • sequence[start:stop:step]

  • start defaults to 0

  • stop defaults to len(sequence)

  • step defaults to 1

>>> data = 'abcde'
>>> data[::1]
'abcde'
>>> data = 'abcde'
>>> data[::2]
'ace'
>>> data = 'abcde'
>>> data[::3]
'ad'
>>> data = 'abcde'
>>> data[1:4:2]
'bd'

8.2.5. Step Backward

  • Every n-th element

  • sequence[start:stop:step]

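  • start defaults to the last index, len(sequence)-1, when step is negative

  • stop defaults to one position before index 0 when step is negative, so the first element is included

  • step defaults to 1, so a negative step must be given explicitly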

>>> data = 'abcde'
>>> data[::-1]
'edcba'
>>> data = 'abcde'
>>> data[::-2]
'eca'
>>> data = 'abcde'
>>> data[::-3]
'eb'
>>> data = 'abcde'
>>> data[4:1:-2]
'ec'
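
With a negative step the sequence is walked from the end towards the beginning. A small sketch with `slice.indices()` and `range()` shows which indexes are visited; the variable names are illustrative.

```python
data = 'abcde'

# Resolve the defaults of [::-1] for a sequence of length 5
backward = slice(None, None, -1)
start, stop, step = backward.indices(len(data))

# The visited indexes, from the last element down to the first
indexes = list(range(start, stop, step))
reversed_text = ''.join(data[i] for i in indexes)

# The same check for data[4:1:-2]
partial = list(range(*slice(4, 1, -2).indices(len(data))))
```

Note that the resolved stop of -1 is a position "before index 0" as used by `range()`, not the negative index of the last element.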

8.2.6. Slice Errors

>>> data = 'abcde'
>>> data[::0]
Traceback (most recent call last):
ValueError: slice step cannot be zero
>>> data = 'abcde'
>>> data[::1.0]
Traceback (most recent call last):
TypeError: slice indices must be integers or None or have an __index__ method
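
The error message mentions the `__index__` method. The sketch below uses a hypothetical `Position` class (not part of the chapter) to show that such objects are accepted, while a float has to be converted explicitly.

```python
class Position:
    """Hypothetical helper: an object usable wherever an int index is expected."""

    def __init__(self, value):
        self.value = value

    def __index__(self):
        # Called by Python when the object is used as an index or slice argument
        return self.value


data = 'abcde'

# Objects implementing __index__ are accepted as slice arguments
result = data[Position(1):Position(4)]

# A float has no __index__, so it must be converted explicitly
step = 2.0
every_second = data[::int(step)]
```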

8.2.7. Out of Range

>>> data = 'abcde'
>>> data[:100]
'abcde'
>>> data = 'abcde'
>>> data[100:]
''
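
Slicing silently clamps out-of-range arguments to the sequence bounds, while indexing a single element does not. A short sketch of the difference, with illustrative variable names:

```python
data = 'abcde'

# Slice: stop is clamped to len(data), no error is raised
clamped = data[2:100]

# Index: accessing a single element out of range raises IndexError
try:
    data[100]
except IndexError as error:
    message = str(error)
else:
    message = None
```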

8.2.8. Slice str

>>> data = 'abcde'
>>>
>>>
>>> data[0:3]
'abc'
>>> data[3:5]
'de'
>>> data[:3]
'abc'
>>> data[3:]
'de'
>>> data[::1]
'abcde'
>>> data[::-1]
'edcba'
>>> data[::2]
'ace'
>>> data[::-2]
'eca'
>>> data[1::2]
'bd'
>>> data[1:4:2]
'bd'

8.2.9. Slice tuple

>>> data = ('a', 'b', 'c', 'd', 'e')
>>>
>>>
>>> data[0:3]
('a', 'b', 'c')
>>> data[3:5]
('d', 'e')
>>> data[:3]
('a', 'b', 'c')
>>> data[3:]
('d', 'e')
>>> data[::2]
('a', 'c', 'e')
>>> data[::-1]
('e', 'd', 'c', 'b', 'a')
>>> data[1::2]
('b', 'd')
>>> data[1:4:2]
('b', 'd')

8.2.10. Slice list

>>> data = ['a', 'b', 'c', 'd', 'e']
>>>
>>>
>>> data[0:3]
['a', 'b', 'c']
>>> data[3:5]
['d', 'e']
>>> data[:3]
['a', 'b', 'c']
>>> data[3:]
['d', 'e']
>>> data[::2]
['a', 'c', 'e']
>>> data[::-1]
['e', 'd', 'c', 'b', 'a']
>>> data[1::2]
['b', 'd']
>>> data[1:4:2]
['b', 'd']

8.2.11. Slice set

Slicing a set is not possible, because a set is unordered and does not support indexing:

>>> data = {'a', 'b', 'c', 'd', 'e'}
>>>
>>> data[:3]
Traceback (most recent call last):
TypeError: 'set' object is not subscriptable
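
One possible workaround, shown here as a sketch rather than a recommendation, is to convert the set to an ordered sequence first. `sorted()` is used so that the order is well defined.

```python
data = {'a', 'b', 'c', 'd', 'e'}

# sorted() returns a list, which has a defined order and supports slicing
ordered = sorted(data)
first_three = ordered[:3]
every_second = ordered[::2]
```

Using `list(data)` instead would also allow slicing, but the order of the elements would then be arbitrary.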

8.2.12. Slice dict

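  • Slicing a dict is not possible; the slice object is treated as a key

  • Python 3.12+ raises KeyError (slice objects became hashable), earlier versions raise TypeError: unhashable type: 'slice'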

>>> crew = {
...     'commander': 'Melissa Lewis',
...     'botanist': 'Mark Watney',
...     'pilot': 'Rick Martinez',
... }
>>>
>>>
>>> crew[1:2]
Traceback (most recent call last):
KeyError: slice(1, 2, None)
>>>
>>> crew[:2]
Traceback (most recent call last):
KeyError: slice(None, 2, None)
>>>
>>> crew[::2]
Traceback (most recent call last):
KeyError: slice(None, None, 2)
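
If the first few items of a dict are needed, the items can be sliced after conversion to a list, or taken lazily with `itertools.islice()`. This is a sketch that relies on dict preserving insertion order (Python 3.7+).

```python
from itertools import islice

crew = {
    'commander': 'Melissa Lewis',
    'botanist': 'Mark Watney',
    'pilot': 'Rick Martinez',
}

# Convert items to a list, slice it, and build a new dict
first_two = dict(list(crew.items())[:2])

# The same result without creating the intermediate list
lazy_two = dict(islice(crew.items(), 2))
```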

8.2.13. Nested Sequences

>>> DATA = [
...     ('sepal_length', 'sepal_width', 'petal_length', 'petal_width', 'species'),
...     (5.8, 2.7, 5.1, 1.9, 'virginica'),
...     (5.1, 3.5, 1.4, 0.2, 'setosa'),
...     (5.7, 2.8, 4.1, 1.3, 'versicolor'),
...     (6.3, 2.9, 5.6, 1.8, 'virginica'),
...     (6.4, 3.2, 4.5, 1.5, 'versicolor'),
...     (4.7, 3.2, 1.3, 0.2, 'setosa'),
... ]
>>>
>>>
>>> DATA[1:]  
[(5.8, 2.7, 5.1, 1.9, 'virginica'),
 (5.1, 3.5, 1.4, 0.2, 'setosa'),
 (5.7, 2.8, 4.1, 1.3, 'versicolor'),
 (6.3, 2.9, 5.6, 1.8, 'virginica'),
 (6.4, 3.2, 4.5, 1.5, 'versicolor'),
 (4.7, 3.2, 1.3, 0.2, 'setosa')]
>>>
>>> DATA[-3:]  
[(6.3, 2.9, 5.6, 1.8, 'virginica'),
 (6.4, 3.2, 4.5, 1.5, 'versicolor'),
 (4.7, 3.2, 1.3, 0.2, 'setosa')]

8.2.14. Column Selection

Unfortunately, column selection does not work on a list of lists:

>>> data = [[1, 2, 3],
...         [4, 5, 6],
...         [7, 8, 9]]
...
>>> data[:]
[[1, 2, 3], [4, 5, 6], [7, 8, 9]]
>>>
>>> data[:, 1]
Traceback (most recent call last):
TypeError: list indices must be integers or slices, not tuple
>>>
>>> data[:][1]
[4, 5, 6]

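Note that data[:][1] returns the second row, not the second column, because data[:] is only a copy of the whole list. However, the data[:, 1] syntax is valid for NumPy arrays, and pandas offers the same selection through .iloc.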
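
In pure Python a column can be selected by indexing or slicing each row in turn. A minimal sketch using a list comprehension, with illustrative variable names:

```python
data = [[1, 2, 3],
        [4, 5, 6],
        [7, 8, 9]]

# Second column: take the element at index 1 from every row
column = [row[1] for row in data]

# First two columns: slice every row
columns = [row[:2] for row in data]
```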

8.2.15. Index Arithmetic

>>> text = 'We choose to go to the Moon!'
>>> first = 23
>>> last = 28
>>> step = 2
>>>
>>> text[first:last]
'Moon!'
>>> text[first:last-1]
'Moon'
>>> text[first:last:step]
'Mo!'
>>> text[first:last-1:step]
'Mo'
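
The same arithmetic can be stored in a named `slice` object, which is convenient when one range is reused. This is a sketch; the name `WORD` is chosen here for illustration.

```python
text = 'We choose to go to the Moon!'
first = 23
last = len(text)                 # computed instead of hard-coded

# A slice object can be built once and reused
WORD = slice(first, last - 1)
word = text[WORD]

every_second = text[first:last:2]
```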

8.2.16. Use Case - 0x01

>>> from pprint import pprint
>>>
>>>
>>> DATA = [
...     ('sepal_length', 'sepal_width', 'petal_length', 'petal_width', 'species'),
...     (5.8, 2.7, 5.1, 1.9, 'virginica'),
...     (5.1, 3.5, 1.4, 0.2, 'setosa'),
...     (5.7, 2.8, 4.1, 1.3, 'versicolor'),
...     (6.3, 2.9, 5.6, 1.8, 'virginica'),
...     (6.4, 3.2, 4.5, 1.5, 'versicolor'),
...     (4.7, 3.2, 1.3, 0.2, 'setosa'),
... ]
>>>
>>>
>>> pprint(DATA[1:])
[(5.8, 2.7, 5.1, 1.9, 'virginica'),
 (5.1, 3.5, 1.4, 0.2, 'setosa'),
 (5.7, 2.8, 4.1, 1.3, 'versicolor'),
 (6.3, 2.9, 5.6, 1.8, 'virginica'),
 (6.4, 3.2, 4.5, 1.5, 'versicolor'),
 (4.7, 3.2, 1.3, 0.2, 'setosa')]
>>>
>>> pprint(DATA[1::2])
[(5.8, 2.7, 5.1, 1.9, 'virginica'),
 (5.7, 2.8, 4.1, 1.3, 'versicolor'),
 (6.4, 3.2, 4.5, 1.5, 'versicolor')]
>>>
>>> pprint(DATA[1::-2])
[(5.8, 2.7, 5.1, 1.9, 'virginica')]
>>>
>>> pprint(DATA[:1:-2])
[(4.7, 3.2, 1.3, 0.2, 'setosa'),
 (6.3, 2.9, 5.6, 1.8, 'virginica'),
 (5.1, 3.5, 1.4, 0.2, 'setosa')]
>>>
>>> pprint(DATA[:-5:-2])
[(4.7, 3.2, 1.3, 0.2, 'setosa'), (6.3, 2.9, 5.6, 1.8, 'virginica')]
>>>
>>> pprint(DATA[1:-5:-2])
[]

8.2.17. Use Case - 0x02

>>> data = [[1, 2, 3],
...         [4, 5, 6],
...         [7, 8, 9]]
...
>>> data[::2]
[[1, 2, 3], [7, 8, 9]]
>>>
>>> data[::2][1]
[7, 8, 9]
>>>
>>> data[::2][:1]
[[1, 2, 3]]
>>>
>>> data[::2][1][1:]
[8, 9]

8.2.18. Use Case - 0x03

>>> text = 'We choose to go to the Moon!'
>>> word = 'Moon'
>>>
>>>
>>> start = text.find(word)
>>> stop = start + len(word)
>>>
>>> text[start:stop]
'Moon'
>>>
>>> text[:start]
'We choose to go to the '
>>>
>>> text[stop:]
'!'
>>>
>>> text[:start] + text[stop:]
'We choose to go to the !'
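
`str.find()` returns -1 when the word is not found, and -1 is also a valid index, so the slices above would then silently cut the wrong characters. A hedged sketch of a guard, using a hypothetical helper `remove_fragment()`:

```python
def remove_fragment(text, fragment):
    """Hypothetical helper: return text without the first occurrence of fragment."""
    start = text.find(fragment)
    if start == -1:
        # Not found: return the text unchanged instead of slicing with -1
        return text
    stop = start + len(fragment)
    return text[:start] + text[stop:]


text = 'We choose to go to the Moon!'
found = remove_fragment(text, 'Moon')
missing = remove_fragment(text, 'Mars')
```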

8.2.19. Assignments

Code 8.6. Solution
"""
* Assignment: Iterable Slice Text
* Type: class assignment
* Complexity: easy
* Lines of code: 8 lines
* Time: 8 min

English:
    1. Remove the academic title and military rank from each variable
    2. Also remove whitespace at the beginning and end of the text
    3. Use only `slice` to clean the text
    4. Run doctests - all must succeed

Tests:
    >>> import sys; sys.tracebacklimit = 0

    >>> assert a is not Ellipsis, \
    'Assign your result to variable `a`'
    >>> assert b is not Ellipsis, \
    'Assign your result to variable `b`'
    >>> assert c is not Ellipsis, \
    'Assign your result to variable `c`'
    >>> assert d is not Ellipsis, \
    'Assign your result to variable `d`'
    >>> assert e is not Ellipsis, \
    'Assign your result to variable `e`'
    >>> assert f is not Ellipsis, \
    'Assign your result to variable `f`'
    >>> assert g is not Ellipsis, \
    'Assign your result to variable `g`'
    >>> assert type(a) is str, \
    'Variable `a` has invalid type, should be str'
    >>> assert type(b) is str, \
    'Variable `b` has invalid type, should be str'
    >>> assert type(c) is str, \
    'Variable `c` has invalid type, should be str'
    >>> assert type(d) is str, \
    'Variable `d` has invalid type, should be str'
    >>> assert type(e) is str, \
    'Variable `e` has invalid type, should be str'
    >>> assert type(f) is str, \
    'Variable `f` has invalid type, should be str'
    >>> assert type(g) is str, \
    'Variable `g` has invalid type, should be str'

    >>> example
    'Mark Watney'
    >>> a
    'Pan Twardowski'
    >>> b
    'Pan Twardowski'
    >>> c
    'Mark Watney'
    >>> d
    'Melissa Lewis'
    >>> e
    'Ryan Stone'
    >>> f
    'Ryan Stone'
    >>> g
    'Pan Twardowski'
"""

EXAMPLE = 'lt. Mark Watney, PhD'
A = 'dr hab. inż. Pan Twardowski, prof. AATC'
B = 'gen. pil. Pan Twardowski'
C = 'Mark Watney, PhD'
D = 'lt. col. ret. Melissa Lewis'
E = 'dr n. med. Ryan Stone'
F = 'Ryan Stone, MD-PhD'
G = 'lt. col. Pan Twardowski\t'

example = EXAMPLE[4:-5]

# String with: 'Pan Twardowski'
# type: str
a = ...

# String with: 'Pan Twardowski'
# type: str
b = ...

# String with: 'Mark Watney'
# type: str
c = ...

# String with: 'Melissa Lewis'
# type: str
d = ...

# String with: 'Ryan Stone'
# type: str
e = ...

# String with: 'Ryan Stone'
# type: str
f = ...

# String with: 'Pan Twardowski'
# type: str
g = ...

Code 8.7. Solution
"""
* Assignment: Iterable Slice Substr
* Type: class assignment
* Complexity: easy
* Lines of code: 3 lines
* Time: 5 min

English:
    1. Use `str.find()`, `len()` and slicing
    2. Define `result` with `TEXT` without the fragment from `REMOVE`
    3. Result should be: 'We choose the Moon!'
    4. Do not use `str.replace()`
    5. Run doctests - all must succeed

Tests:
    >>> import sys; sys.tracebacklimit = 0

    >>> assert result is not Ellipsis, \
    'Assign your result to variable `result`'
    >>> assert type(result) is str, \
    'Variable `result` has invalid type, should be str'

    >>> result
    'We choose the Moon!'
"""

TEXT = 'We choose to go to the Moon!'
REMOVE = 'to go to '

# String with TEXT without REMOVE part
# type: str
result = ...

Code 8.8. Solution
"""
* Assignment: Iterable Slice Sequence
* Type: class assignment
* Complexity: easy
* Lines of code: 2 lines
* Time: 3 min

English:
    1. Create set `result` with every second element of `a` and of `b`,
       starting from the first element of each
    2. Run doctests - all must succeed

Tests:
    >>> import sys; sys.tracebacklimit = 0

    >>> assert result is not Ellipsis, \
    'Assign your result to variable `result`'
    >>> assert type(result) is set, \
    'Variable `result` has invalid type, should be set'

    >>> result
    {0, 2, 4}
"""

a = (0, 1, 2, 3)
b = [2, 3, 4, 5]

# Set with every second element from `a` and `b`
# type: set[int]
result = ...

Code 8.9. Solution
"""
* Assignment: Iterable Slice Header/Rows
* Type: class assignment
* Complexity: easy
* Lines of code: 2 lines
* Time: 3 min

English:
    1. Separate the header (first row) from the data rows:
       a. Define `header: tuple[str]` with the header
       b. Define `rows: list[tuple]` with all the other rows
    2. Run doctests - all must succeed

Tests:
    >>> import sys; sys.tracebacklimit = 0
    >>> from pprint import pprint

    >>> assert header is not Ellipsis, \
    'Assign your result to variable `header`'
    >>> assert rows is not Ellipsis, \
    'Assign your result to variable `rows`'
    >>> assert type(header) is tuple, \
    'Variable `header` has invalid type, should be tuple'
    >>> assert all(type(x) is tuple for x in rows), \
    'All elements in `rows` should be tuple'
    >>> assert header not in rows, \
    'Header should not be in `rows`'

    >>> pprint(header)
    ('sepal_length', 'sepal_width', 'petal_length', 'petal_width', 'species')

    >>> pprint(rows)
    [(5.8, 2.7, 5.1, 1.9, 'virginica'),
     (5.1, 3.5, 1.4, 0.2, 'setosa'),
     (5.7, 2.8, 4.1, 1.3, 'versicolor'),
     (6.3, 2.9, 5.6, 1.8, 'virginica'),
     (6.4, 3.2, 4.5, 1.5, 'versicolor'),
     (4.7, 3.2, 1.3, 0.2, 'setosa'),
     (7.0, 3.2, 4.7, 1.4, 'versicolor'),
     (7.6, 3.0, 6.6, 2.1, 'virginica'),
     (4.9, 3.0, 1.4, 0.2, 'setosa'),
     (4.9, 2.5, 4.5, 1.7, 'virginica')]
"""

DATA = [
    ('sepal_length', 'sepal_width', 'petal_length', 'petal_width', 'species'),
    (5.8, 2.7, 5.1, 1.9, 'virginica'),
    (5.1, 3.5, 1.4, 0.2, 'setosa'),
    (5.7, 2.8, 4.1, 1.3, 'versicolor'),
    (6.3, 2.9, 5.6, 1.8, 'virginica'),
    (6.4, 3.2, 4.5, 1.5, 'versicolor'),
    (4.7, 3.2, 1.3, 0.2, 'setosa'),
    (7.0, 3.2, 4.7, 1.4, 'versicolor'),
    (7.6, 3.0, 6.6, 2.1, 'virginica'),
    (4.9, 3.0, 1.4, 0.2, 'setosa'),
    (4.9, 2.5, 4.5, 1.7, 'virginica'),
]


# Tuple with row at index 0 from DATA
# type: tuple[str]
header = ...

# List with rows at all the other indexes from DATA
# type: list[tuple]
rows = ...

Code 8.10. Solution
"""
* Assignment: Iterable Slice Train/Test
* Type: class assignment
* Complexity: easy
* Lines of code: 4 lines
* Time: 8 min

English:
    1. Divide `rows` into two lists:
       a. `train`: 60% - training data
       b. `test`: 40% - testing data
    2. Calculate the split point:
       a. Multiply the length of `rows` by the percentage and convert the result to `int`
       b. From `rows` slice training data from the start to the split point
       c. From `rows` slice testing data from the split point to the end
    3. Run doctests - all must succeed

Tests:
    >>> import sys; sys.tracebacklimit = 0
    >>> from pprint import pprint

    >>> assert split is not Ellipsis, \
    'Assign your result to variable `split`'
    >>> assert train is not Ellipsis, \
    'Assign your result to variable `train`'
    >>> assert test is not Ellipsis, \
    'Assign your result to variable `test`'
    >>> assert type(split) is int, \
    'Variable `split` has invalid type, should be int'
    >>> assert type(train) is list, \
    'Variable `train` has invalid type, should be list'
    >>> assert type(test) is list, \
    'Variable `test` has invalid type, should be list'
    >>> assert all(type(x) is tuple for x in train), \
    'All elements in `train` should be tuple'
    >>> assert all(type(x) is tuple for x in test), \
    'All elements in `test` should be tuple'

    >>> pprint(split)
    6

    >>> pprint(train)
    [(5.8, 2.7, 5.1, 1.9, 'virginica'),
     (5.1, 3.5, 1.4, 0.2, 'setosa'),
     (5.7, 2.8, 4.1, 1.3, 'versicolor'),
     (6.3, 2.9, 5.6, 1.8, 'virginica'),
     (6.4, 3.2, 4.5, 1.5, 'versicolor'),
     (4.7, 3.2, 1.3, 0.2, 'setosa')]

    >>> pprint(test)
    [(7.0, 3.2, 4.7, 1.4, 'versicolor'),
     (7.6, 3.0, 6.6, 2.1, 'virginica'),
     (4.9, 3.0, 1.4, 0.2, 'setosa'),
     (4.9, 2.5, 4.5, 1.7, 'virginica')]
"""

DATA = [
    ('sepal_length', 'sepal_width', 'petal_length', 'petal_width', 'species'),
    (5.8, 2.7, 5.1, 1.9, 'virginica'),
    (5.1, 3.5, 1.4, 0.2, 'setosa'),
    (5.7, 2.8, 4.1, 1.3, 'versicolor'),
    (6.3, 2.9, 5.6, 1.8, 'virginica'),
    (6.4, 3.2, 4.5, 1.5, 'versicolor'),
    (4.7, 3.2, 1.3, 0.2, 'setosa'),
    (7.0, 3.2, 4.7, 1.4, 'versicolor'),
    (7.6, 3.0, 6.6, 2.1, 'virginica'),
    (4.9, 3.0, 1.4, 0.2, 'setosa'),
    (4.9, 2.5, 4.5, 1.7, 'virginica'),
]


header = DATA[0]
rows = DATA[1:]

# Result of `rows` length multiplied by percent
# type: int
split = ...

# List with first 60% from rows
# type: list[tuple]
train = ...

# List with last 40% from rows
# type: list[tuple]
test = ...