6.2. Type List

  • Mutable - can add, remove, and modify items

  • Stores elements of any type

CPython's lists are really variable-length arrays, not Lisp-style linked lists. The implementation uses a contiguous array of references to other objects, and keeps a pointer to this array and the array's length in a list head structure.

This makes indexing a list data[i] an operation whose cost is independent of the size of the list or the value of the index.

When items are appended or inserted, the array of references is resized. Some cleverness is applied to improve the performance of appending items repeatedly; when the array must be grown, some extra space is allocated so the next few times don't require an actual resize.
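The over-allocation described above can be observed, as a rough sketch, with sys.getsizeof(). The exact sizes and the growth pattern are CPython implementation details and differ between versions; the only assumption made here is that the reported size changes less often than the list grows.

```python
import sys

data = []
sizes = []

# Append 100 items and record the size of the list object after each append
for i in range(100):
    data.append(i)
    sizes.append(sys.getsizeof(data))

# Number of distinct sizes seen, i.e. how many times the array was actually regrown
resizes = len(set(sizes))

print(f'{len(data)} appends, {resizes} distinct sizes')
```

If every append triggered a resize, the two numbers would be equal; on CPython the second one is expected to be much smaller.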

6.2.1. Syntax

  • data = [] - empty list

  • data = [1, 2.2, 'abc'] - list with values

  • data = [] is faster than data = list()

An empty list is usually defined with [], although list() is more explicit:

>>> data = list()
>>> data = []

A comma after the last element is optional:

>>> data = [1]
>>> data = [1,]

Lists can store elements of any type:

>>> data = [1, 2, 3]
>>> data = [1.1, 2.2, 3.3]
>>> data = [True, False]
>>> data = ['a', 'b', 'c']
>>> data = ['a', 1, 2.2, True, None]

Brackets are required (without them, comma-separated values create a tuple):

>>> data = [1, 2, 3]

Performance:

>>> %%timeit -r 10_000 -n 10_000  
... data = list()
...
53.8 ns ± 8.15 ns per loop (mean ± std. dev. of 10000 runs, 10,000 loops each)
>>>
>>>
>>> %%timeit -r 10_000 -n 10_000  
... data = []
...
23.9 ns ± 4.23 ns per loop (mean ± std. dev. of 10000 runs, 10,000 loops each)
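The %%timeit magic above works only in IPython or Jupyter. Outside of them, a comparable measurement can be sketched with the standard timeit module. The number of executions below is an arbitrary choice, and absolute results depend on the machine and the Python version.

```python
import timeit

# Number of executions per measurement (an arbitrary choice for this sketch)
number = 100_000

# Total time in seconds for `number` executions of each statement
time_call = timeit.timeit('data = list()', number=number)
time_literal = timeit.timeit('data = []', number=number)

print(f'list(): {time_call / number * 1e9:.1f} ns per loop')
print(f'[]:     {time_literal / number * 1e9:.1f} ns per loop')
```

The difference comes from the fact that list() requires a name lookup and a function call, while [] is compiled directly into a list-building instruction.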

6.2.2. Type Conversion

  • list() converts argument to list

  • Takes one iterable as an argument

  • Multiple arguments are not allowed

The built-in function list() converts its argument to a list:

>>> text = 'hello'
>>> list(text)
['h', 'e', 'l', 'l', 'o']
>>> colors = ['red', 'green', 'blue']
>>> list(colors)
['red', 'green', 'blue']
>>> colors = ('red', 'green', 'blue')
>>> list(colors)
['red', 'green', 'blue']
>>> list('red', 'green', 'blue')
Traceback (most recent call last):
TypeError: list expected at most 1 argument, got 3
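Since list() accepts any iterable, it is not limited to strings, lists and tuples. The following sketch uses a range and a dict as additional examples, and shows that converting a list produces a new list object.

```python
# A range produces consecutive integers
numbers = list(range(3))

# Iterating over a dict yields its keys (in insertion order)
keys = list({'red': 1, 'green': 2})

# Converting a list creates a new list object with equal content
colors = ['red', 'green', 'blue']
copied = list(colors)

print(numbers)           # [0, 1, 2]
print(keys)              # ['red', 'green']
print(copied == colors)  # True
print(copied is colors)  # False
```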

6.2.3. Get Item

  • Returns a value at given index

  • Note that Python starts counting at zero (zero-based indexing)

  • Raises IndexError if the index is out of range

  • More information in Iterable GetItem

  • More information in Iterable Slice

>>> colors = ['red', 'green', 'blue']
>>>
>>> colors[0]
'red'
>>> colors[1]
'green'
>>> colors[2]
'blue'
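The IndexError mentioned in the list above can be handled with try/except. The sketch below also uses a negative index, which counts from the end of the list.

```python
colors = ['red', 'green', 'blue']

# Negative indexes count from the end of the list
last = colors[-1]

# An index outside of the list raises IndexError
try:
    colors[3]
except IndexError as error:
    message = str(error)

print(last)     # blue
print(message)  # list index out of range
```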

6.2.4. Set Item

>>> colors = ['red', 'green', 'blue']
>>> colors[0] = 'black'
>>>
>>> print(colors)
['black', 'green', 'blue']
>>> colors = ['red', 'green', 'blue']
>>> colors[4] = 'black'
Traceback (most recent call last):
IndexError: list assignment index out of range

6.2.5. Del Item

>>> colors = ['red', 'green', 'blue']
>>> del colors[2]
>>>
>>> print(colors)
['red', 'green']
>>> colors = ['red', 'green', 'blue']
>>> result = colors.pop()
>>>
>>> colors
['red', 'green']
>>> result
'blue'
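Besides del with an index and pop() without arguments, items can be removed by position with pop(index), by value with remove(), or as a whole range with del and a slice. A minimal sketch:

```python
colors = ['red', 'green', 'blue', 'black']

# pop() with an index removes and returns the item at that position
first = colors.pop(0)

# remove() deletes the first item equal to the given value
colors.remove('blue')

# del with a slice removes a whole range of items
del colors[1:]

print(first)   # red
print(colors)  # ['green']
```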

6.2.6. Append

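  • list + list - concatenate, returns a new list

  • list += list - extend the existing list in place

  • list.extend() - add all items from an iterable

  • list.append() - add a single item at the end

  • list.append() has amortized O(1) complexity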

Add:

>>> colors = ['red', 'green', 'blue']
>>> result = colors + ['black']
>>>
>>> print(colors)
['red', 'green', 'blue']
>>>
>>> print(result)
['red', 'green', 'blue', 'black']

Increment Add:

>>> colors = ['red', 'green', 'blue']
>>> colors += ['black']
>>>
>>> print(colors)
['red', 'green', 'blue', 'black']

Extend:

>>> colors = ['red', 'green', 'blue']
>>> colors.extend(['black', 'white'])
>>>
>>> print(colors)
['red', 'green', 'blue', 'black', 'white']

Append:

>>> colors = ['red', 'green', 'blue']
>>> colors.append(['black', 'white'])
>>>
>>> print(colors)
['red', 'green', 'blue', ['black', 'white']]

Errors:

>>> colors + 'black'
Traceback (most recent call last):
TypeError: can only concatenate list (not "str") to list

Pitfall: += accepts any iterable, so a string is added character by character:

>>> colors = ['red', 'green', 'blue']
>>> colors += 'black'
>>>
>>> print(colors)
['red', 'green', 'blue', 'b', 'l', 'a', 'c', 'k']
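The practical difference between + and += is whether a new list is created. The sketch below uses a second name, alias (introduced only for illustration), to make this visible.

```python
colors = ['red', 'green', 'blue']
alias = colors

# `+` builds a new list; `alias` still refers to the old one
colors = colors + ['black']
print(alias)   # ['red', 'green', 'blue']

colors = ['red', 'green', 'blue']
alias = colors

# `+=` extends the existing list in place; `alias` sees the change
colors += ['black']
print(alias)   # ['red', 'green', 'blue', 'black']

# To add a whole string as one item, wrap it in a list or use append()
colors.append('white')
print(colors)  # ['red', 'green', 'blue', 'black', 'white']
```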

6.2.7. Insert

  • list.insert(idx, object)

  • Insert object at specific position

  • O(n) complexity

>>> colors = ['red', 'green', 'blue']
>>> colors.insert(0, 'black')
>>>
>>> print(colors)
['black', 'red', 'green', 'blue']
>>> colors = ['red', 'green', 'blue']
>>> colors.insert(1, 'black')
>>>
>>> print(colors)
['red', 'black', 'green', 'blue']
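list.insert() is forgiving about the index it receives. As a short sketch, an index past the end appends the item, and a negative index counts from the end:

```python
colors = ['red', 'green', 'blue']

# An index past the end does not raise an error; the item is appended
colors.insert(100, 'black')
print(colors)  # ['red', 'green', 'blue', 'black']

# A negative index counts from the end: -1 inserts before the last item
colors.insert(-1, 'white')
print(colors)  # ['red', 'green', 'blue', 'white', 'black']
```

The O(n) cost comes from shifting every item after the insertion point by one position, so inserting at index 0 is the most expensive case.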

6.2.8. Sort

  • sorted() - returns new sorted list, but does not modify the original

  • list.sort() - sorts list and returns None

Why doesn't list.sort() return the sorted list? [3]

In situations where performance matters, making a copy of the list just to sort it would be wasteful. Therefore, list.sort() sorts the list in place. In order to remind you of that fact, it does not return the sorted list. This way, you won't be fooled into accidentally overwriting a list when you need a sorted copy but also need to keep the unsorted version around.

If you want to return a new list, use the built-in sorted() function instead. This function creates a new list from a provided iterable, sorts it and returns it. For example, a loop such as for key in sorted(mydict) iterates over the keys of a dictionary in sorted order.
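The dictionary case mentioned above can be sketched as follows. The dictionary mydict and its content are made up for illustration.

```python
# `mydict` is a hypothetical dictionary used only for illustration
mydict = {'green': 2, 'red': 1, 'blue': 3}

keys = []

# sorted() builds a new sorted list from the keys of the dictionary
for key in sorted(mydict):
    keys.append(key)
    print(key, mydict[key])

# The dictionary itself keeps its insertion order
print(list(mydict))  # ['green', 'red', 'blue']
```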

Timsort is a hybrid stable sorting algorithm, derived from merge sort and insertion sort, designed to perform well on many kinds of real-world data. It was implemented by Tim Peters in 2002 for use in the Python programming language. The algorithm finds subsequences of the data that are already ordered (runs) and uses them to sort the remainder more efficiently. This is done by merging runs until certain criteria are fulfilled. Timsort has been Python's standard sorting algorithm since version 2.3. It is also used to sort arrays of non-primitive type in Java SE 7, on the Android platform, in GNU Octave, on V8, Swift, and Rust. [1]

  • Worst-case performance: \(O(n\log{n})\)

  • Best-case performance: \(O(n)\)

  • Average performance: \(O(n\log{n})\)

  • Worst-case space complexity: \(O(n)\)

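Stability, mentioned in the Timsort description above, means that items which compare equal keep their original relative order. A small sketch, assuming made-up (color, priority) records:

```python
# Hypothetical (color, priority) records used only for illustration
records = [('red', 2), ('green', 1), ('blue', 2), ('black', 1)]

# Sort by priority only; records with equal priority keep their original order
by_priority = sorted(records, key=lambda record: record[1])

print(by_priority)
# [('green', 1), ('black', 1), ('red', 2), ('blue', 2)]
```

Here 'green' stays before 'black' and 'red' stays before 'blue', exactly as in the input.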

Return sorted values without modifying a list:

>>> values = [3, 1, 2]
>>>
>>> sorted(values)
[1, 2, 3]
>>>
>>> sorted(values, reverse=True)
[3, 2, 1]

Permanent sorting with list modification (note that list.sort() modifies values, and returns None, not values):

>>> values = [3, 1, 2]
>>>
>>> values.sort()
>>> values
[1, 2, 3]
>>>
>>> values.sort(reverse=True)
>>> values
[3, 2, 1]

You can also use list.sort() and sorted() with str. Strings are compared character by character using Unicode code points, which for the basic Latin alphabet are the same as in the ASCII table. This kind of ordering is called lexicographic order.

>>> colors = ['red', 'green', 'blue']
>>>
>>> sorted(colors)
['blue', 'green', 'red']
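One consequence of comparing code points is that uppercase letters sort before lowercase ones. The key parameter of sorted() and list.sort() can be used to change what is compared. A minimal sketch:

```python
colors = ['red', 'Green', 'blue']

# Uppercase letters have lower code points than lowercase ones
default_order = sorted(colors)

# A key function changes what is compared, not what is returned
ignore_case = sorted(colors, key=str.lower)
by_length = sorted(colors, key=len)

print(default_order)  # ['Green', 'blue', 'red']
print(ignore_case)    # ['blue', 'Green', 'red']
print(by_length)      # ['red', 'blue', 'Green']
```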

6.2.9. Reverse

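  • reversed() - returns a reverse iterator, does not modify the original

  • list.reverse() - reverses the list in place and returns None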

>>> colors = ['red', 'green', 'blue']
>>> colors.reverse()
>>> colors
['blue', 'green', 'red']
>>> colors = ['red', 'green', 'blue']
>>> result = reversed(colors)
>>>
>>> list(result)
['blue', 'green', 'red']

Why is list() needed? Because reversed() returns an iterator, not a list:

>>> colors = ['red', 'green', 'blue']
>>> result = reversed(colors)
>>>
>>> result  
<list_reverseiterator object at 0x...>
>>>
>>> next(result)
'blue'
>>> next(result)
'green'
>>> next(result)
'red'
>>> next(result)
Traceback (most recent call last):
StopIteration
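Because the object returned by reversed() is an iterator, it can be consumed only once. The following sketch converts the same iterator to a list twice:

```python
colors = ['red', 'green', 'blue']
result = reversed(colors)

# The first pass consumes the iterator
first_pass = list(result)

# The second pass finds nothing left
second_pass = list(result)

print(first_pass)   # ['blue', 'green', 'red']
print(second_pass)  # []
print(colors)       # ['red', 'green', 'blue']
```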

6.2.10. Index

  • list.index() - position at which something is in the list

  • Note that Python starts counting at zero (zero-based indexing)

  • Raises ValueError if the value is not present

>>> colors = ['red', 'green', 'blue']
>>> result = colors.index('blue')
>>>
>>> print(result)
2
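The ValueError mentioned above can be avoided by checking membership first, or handled with try/except. A short sketch of both approaches (the names position and found are introduced only for illustration):

```python
colors = ['red', 'green', 'blue']

# Approach 1: check membership before asking for the position
if 'black' in colors:
    position = colors.index('black')
else:
    position = None

# Approach 2: ask for the position and handle the exception
try:
    colors.index('black')
    found = True
except ValueError:
    found = False

print(position)  # None
print(found)     # False
```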

6.2.11. Count

  • list.count() - number of occurrences of value

>>> colors = ['red', 'green', 'blue', 'red', 'blue', 'red']
>>> result = colors.count('red')
>>>
>>> print(result)
3

6.2.12. Method Chaining

>>> colors = ['red', 'green', 'blue']
>>> colors.sort()
>>> colors.append('black')
>>>
>>> print(colors)
['blue', 'green', 'red', 'black']
>>> colors = ['red', 'green', 'blue']
>>>
>>> colors.sort().append('black')
Traceback (most recent call last):
AttributeError: 'NoneType' object has no attribute 'append'
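The chained call above fails because list.sort() returns None. When the result is needed inside a larger expression, sorted() can be used instead, since it returns a list. A minimal sketch:

```python
colors = ['red', 'green', 'blue']

# sorted() returns a list, so the result can be used in a larger expression
result = sorted(colors) + ['black']

print(result)  # ['blue', 'green', 'red', 'black']
print(colors)  # ['red', 'green', 'blue']
```

Note that, unlike the first example in this section, this leaves colors unchanged.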

6.2.13. Built-in Functions

  • min() - Minimal value

  • max() - Maximal value

  • sum() - Sum of elements

  • len() - Length of a list

  • all() - True if all values are true

  • any() - True if any value is true

List with numeric values:

>>> data = [3, 1, 2]
>>>
>>> len(data)
3
>>> min(data)
1
>>> max(data)
3
>>> sum(data)
6

List with string values:

>>> data = ['a', 'c', 'b']
>>>
>>> len(data)
3
>>> min(data)
'a'
>>> max(data)
'c'
>>> sum(data)
Traceback (most recent call last):
TypeError: unsupported operand type(s) for +: 'int' and 'str'

List with boolean values:

>>> data = [True, False, True]
>>>
>>> any(data)
True
>>> all(data)
False
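A few edge cases of these functions are worth knowing. The sketch below shows how to join strings (where sum() fails), the optional start value of sum(), and the behavior on an empty list.

```python
# Strings are joined with str.join(), not sum()
letters = ['a', 'c', 'b']
joined = ''.join(letters)

# sum() accepts an optional start value
total = sum([3, 1, 2], 10)

# On an empty list all() is True and any() is False
empty_all = all([])
empty_any = any([])

# min() and max() raise ValueError on an empty list unless `default` is given
smallest = min([], default=None)

print(joined)     # acb
print(total)      # 16
print(empty_all)  # True
print(empty_any)  # False
print(smallest)   # None
```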

6.2.14. Memory

Figure 6.2. Memory representation for list

6.2.15. Shallow Copy vs Deep Copy

  • Assignment (b = a) - no copy is made; both identifiers point to the same object in memory

  • Shallow Copy (a.copy()) - creates a new list object whose items are references to the same elements

  • Deep Copy (copy.deepcopy(a)) - creates a new list object and recursively copies all nested objects

  • Assignment is the fastest and requires no extra memory, but every change is visible through both identifiers

  • Deep Copy is the slowest and duplicates every nested object, but the copy is safe to modify

Assignment (both identifiers point to the same object):

>>> a = ['red', 'green', 'blue']
>>> b = a
>>>
>>> a.append('black')
>>>
>>> a
['red', 'green', 'blue', 'black']
>>> b
['red', 'green', 'blue', 'black']
>>>
>>> id(a)
4417433984
>>> id(b)
4417433984

Shallow Copy (identifiers point to distinct list objects):

>>> a = ['red', 'green', 'blue']
>>> b = a.copy()
>>>
>>> a.append('black')
>>>
>>> a
['red', 'green', 'blue', 'black']
>>> b
['red', 'green', 'blue']
>>>
>>> id(a)
4391796976
>>> id(b)
4391797008
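For lists that contain other lists, the difference between list.copy() and copy.deepcopy() becomes visible: list.copy() duplicates only the outer list, while copy.deepcopy() from the standard copy module also duplicates the nested ones. A sketch with a made-up nested list:

```python
import copy

a = [['red', 'green'], ['blue']]

shallow = a.copy()       # new outer list, shared inner lists
deep = copy.deepcopy(a)  # new outer list and new inner lists

# Modify a nested list through the original
a[0].append('black')

print(shallow[0])  # ['red', 'green', 'black']
print(deep[0])     # ['red', 'green']

print(shallow[0] is a[0])  # True
print(deep[0] is a[0])     # False
```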

6.2.16. Recap

  • Mutable - can add, remove, and modify items

  • Stores elements of any type

  • Extensible and flexible

6.2.17. References

6.2.18. Assignments

Code 6.5. Solution
"""
* Assignment: Iterable List Create
* Type: class assignment
* Complexity: easy
* Lines of code: 5 lines
* Time: 5 min

English:
    1. Create lists:
        a. `result_a` without elements
        b. `result_b` with elements: 1, 2, 3
        c. `result_c` with elements: 1.1, 2.2, 3.3
        d. `result_d` with elements: 'a', 'b', 'c'
        e. `result_e` with elements: True, False, None
        f. `result_f` with elements: 1, 2.2, True, 'a'
    2. Run doctests - all must succeed

Tests:
    >>> import sys; sys.tracebacklimit = 0

    >>> assert result_a is not Ellipsis, \
    'Assign your result to variable `result_a`'
    >>> assert result_b is not Ellipsis, \
    'Assign your result to variable `result_b`'
    >>> assert result_c is not Ellipsis, \
    'Assign your result to variable `result_c`'
    >>> assert result_d is not Ellipsis, \
    'Assign your result to variable `result_d`'
    >>> assert result_e is not Ellipsis, \
    'Assign your result to variable `result_e`'
    >>> assert result_f is not Ellipsis, \
    'Assign your result to variable `result_f`'

    >>> assert type(result_a) is list, \
    'Variable `result_a` has invalid type, should be list'
    >>> assert type(result_b) is list, \
    'Variable `result_b` has invalid type, should be list'
    >>> assert type(result_c) is list, \
    'Variable `result_c` has invalid type, should be list'
    >>> assert type(result_d) is list, \
    'Variable `result_d` has invalid type, should be list'
    >>> assert type(result_e) is list, \
    'Variable `result_e` has invalid type, should be list'
    >>> assert type(result_f) is list, \
    'Variable `result_f` has invalid type, should be list'

    >>> assert result_a == [], \
    'Variable `result_a` has invalid value, should be []'
    >>> assert result_b == [1, 2, 3], \
    'Variable `result_b` has invalid value, should be [1, 2, 3]'
    >>> assert result_c == [1.1, 2.2, 3.3], \
    'Variable `result_c` has invalid value, should be [1.1, 2.2, 3.3]'
    >>> assert result_d == ['a', 'b', 'c'], \
    'Variable `result_d` has invalid value, should be ["a", "b", "c"]'
    >>> assert result_e == [True, False, None], \
    'Variable `result_e` has invalid value, should be [True, False, None]'
    >>> assert result_f == [1, 2.2, True, 'a'], \
    'Variable `result_f` has invalid value, should be [1, 2.2, True, "a"]'
"""

# List without elements
# type: list
result_a = ...

# List with elements: 1, 2, 3
# type: list[int]
result_b = ...

# List with elements: 1.1, 2.2, 3.3
# type: list[float]
result_c = ...

# List with elements: 'a', 'b', 'c'
# type: list[str]
result_d = ...

# List with elements: True, False, None
# type: list[bool|None]
result_e = ...

# List with elements: 1, 2.2, True, 'a'
# type: list[int|float|bool|str]
result_f = ...

Code 6.6. Solution
"""
* Assignment: Type List Insert
* Type: class
* Complexity: easy
* Lines of code: 1 lines
* Time: 2 min

English:
    1. Insert character 'x' at the beginning of `result`
    2. Run doctests - all must succeed

Tests:
    >>> import sys; sys.tracebacklimit = 0

    >>> assert result is not Ellipsis, \
    'Assign your result to variable `result`'

    >>> assert type(result) is list, \
    'Variable `result` has invalid type, should be list'

    >>> result
    ['x', 'a', 'b', 'c']
"""

result = ['a', 'b', 'c']


# Insert string 'x' at the beginning of `result`
# type: list
...

Code 6.7. Solution
"""
* Assignment: Type List Append
* Type: class
* Complexity: easy
* Lines of code: 1 lines
* Time: 2 min

English:
    1. Insert character 'x' at the end of `result`
    2. Run doctests - all must succeed

Tests:
    >>> import sys; sys.tracebacklimit = 0

    >>> assert result is not Ellipsis, \
    'Assign your result to variable `result`'

    >>> assert type(result) is list, \
    'Variable `result` has invalid type, should be list'

    >>> result
    ['a', 'b', 'c', 'x']
"""

result = ['a', 'b', 'c']


# Insert string 'x' at the end of `result`
# type: list
...

Code 6.8. Solution
"""
* Assignment: Type List Extend
* Type: class
* Complexity: easy
* Lines of code: 1 lines
* Time: 2 min

English:
    1. Insert all characters from `data` at the end of `result`
    2. Run doctests - all must succeed

Tests:
    >>> import sys; sys.tracebacklimit = 0

    >>> assert result is not Ellipsis, \
    'Assign your result to variable `result`'

    >>> assert type(result) is list, \
    'Variable `result` has invalid type, should be list'

    >>> result
    ['a', 'b', 'c', 'x', 'y', 'z']
"""

data = ['x', 'y', 'z']
result = ['a', 'b', 'c']


# Insert all characters from `data` at the end of `result`
# type: list
...

Code 6.9. Solution
"""
* Assignment: Type List Sort
* Type: class
* Complexity: easy
* Lines of code: 1 lines
* Time: 2 min

English:
    1. Sort `result`
    2. Run doctests - all must succeed

Tests:
    >>> import sys; sys.tracebacklimit = 0

    >>> assert result is not Ellipsis, \
    'Assign your result to variable `result`'

    >>> assert type(result) is list, \
    'Variable `result` has invalid type, should be list'

    >>> result
    ['a', 'b', 'c']
"""

result = ['c', 'a', 'b']


# Sort `result`
# type: list
...

Code 6.10. Solution
"""
* Assignment: Type List Reverse
* Type: class
* Complexity: easy
* Lines of code: 1 lines
* Time: 2 min

English:
    1. Reverse order of `result` (do not sort)
    2. Run doctests - all must succeed

Tests:
    >>> import sys; sys.tracebacklimit = 0

    >>> assert result is not Ellipsis, \
    'Assign your result to variable `result`'

    >>> assert type(result) is list, \
    'Variable `result` has invalid type, should be list'

    >>> result
    ['b', 'a', 'c']
"""

result = ['c', 'a', 'b']


# Reverse order of `result` (do not sort)
# type: list
...