13.6. File Read

  • Works with both relative and absolute paths

  • Fails when the directory containing the file cannot be accessed

  • Fails when the file itself cannot be accessed

  • Use a context manager to make sure the file gets closed

  • The mode parameter of the open() function is optional (defaults to mode='rt')
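The failure modes listed above can be observed with a minimal sketch like the one below. It assumes a fresh temporary directory. The names `missing.txt` and `no-such-dir` are hypothetical and serve only as an illustration. The `PermissionError` branch covers the case where the operating system denies access, which this sketch does not trigger.

```python
import tempfile
from pathlib import Path

errors = []

with tempfile.TemporaryDirectory() as tmp:
    # Hypothetical paths, neither of them exists
    missing_file = Path(tmp) / 'missing.txt'
    missing_dir = Path(tmp) / 'no-such-dir' / 'myfile.txt'

    for path in (missing_file, missing_dir):
        try:
            with open(path, mode='rt') as file:
                file.read()
        except FileNotFoundError as error:
            # The file or one of its parent directories does not exist
            errors.append(type(error).__name__)
        except PermissionError as error:
            # The operating system denied access (not triggered here)
            errors.append(type(error).__name__)

print(errors)
# ['FileNotFoundError', 'FileNotFoundError']
```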

13.6.1. SetUp

>>> lines = [
...     'This is a first line\n',
...     'This is a second line\n',
... ]
>>>
>>> with open('/tmp/myfile.txt', mode='w') as file:
...     file.writelines(lines)

13.6.2. Open for Reading

# By default, a file is opened in read-text mode (rt)

>>> file = open('/tmp/myfile.txt')              # read in text mode
>>> file = open('/tmp/myfile.txt', mode='r')    # read in text mode
>>> file = open('/tmp/myfile.txt', mode='rt')   # read in text mode
>>> file = open('/tmp/myfile.txt', mode='rb')   # read in binary mode
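The difference between text and binary mode shows up in the type returned by `read()`. The sketch below is an illustration only: the file name, the sample content and the explicit `encoding='utf-8'` argument are assumptions, not part of the examples above.

```python
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / 'myfile.txt'   # hypothetical file name

    with open(path, mode='wt', encoding='utf-8') as file:
        file.write('café')

    with open(path, mode='rt', encoding='utf-8') as file:
        text = file.read()            # str, decoded from UTF-8

    with open(path, mode='rb') as file:
        raw = file.read()             # bytes, exactly as stored on disk

print(type(text).__name__, len(text))
# str 4
print(type(raw).__name__, len(raw))
# bytes 5
```

The lengths differ because `é` is one character but takes two bytes in UTF-8.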

13.6.3. Read File at Once

  • Always remember to close the file

  • Note that the whole file must fit into memory

>>> file = open('/tmp/myfile.txt', mode='rt')
>>> data = file.read()
>>> file.close()
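When a file may be too large to fit into memory, one possible approach is to pass a size to `read()` and process the file in chunks. This is a sketch under assumptions: `CHUNK_SIZE` and the file name are hypothetical, and the sample file is generated just for the demonstration.

```python
import tempfile
from pathlib import Path

CHUNK_SIZE = 32   # hypothetical size, in characters

sizes = []

with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / 'myfile.txt'   # hypothetical file name

    with open(path, mode='wt') as file:
        file.write('0123456789' * 10)   # 100 characters

    with open(path, mode='rt') as file:
        # read() returns an empty string at the end of the file
        while chunk := file.read(CHUNK_SIZE):
            sizes.append(len(chunk))

print(sizes)
# [32, 32, 32, 4]
```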

13.6.4. Read One Line from File

  • Always remember to close the file

>>> file = open('/tmp/myfile.txt', mode='rt')
>>> data = file.readline()
>>> file.close()
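Each call to `readline()` returns the next line, including its trailing newline, and an empty string once the end of the file is reached. A small sketch, assuming a two-line file like the one from the setup (the temporary path is hypothetical):

```python
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / 'myfile.txt'   # hypothetical file name

    with open(path, mode='wt') as file:
        file.writelines([
            'This is a first line\n',
            'This is a second line\n',
        ])

    with open(path, mode='rt') as file:
        first = file.readline()
        second = file.readline()
        third = file.readline()   # nothing left to read

print(repr(first))
# 'This is a first line\n'
print(repr(second))
# 'This is a second line\n'
print(repr(third))
# ''
```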

13.6.5. Read All Lines from File

  • Always remember to close the file

  • Note that the whole file must fit into memory

>>> file = open('/tmp/myfile.txt', mode='rt')
>>> data = file.readlines()
>>> file.close()

Read the first ten lines (lines 1-10) of the file:

>>> file = open('/tmp/myfile.txt', mode='rt')
>>> data = file.readlines()[0:10]
>>> file.close()
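The slice above still loads every line before selecting the first ten. As an alternative sketch (not the approach shown above), `itertools.islice` from the standard library stops reading after the requested number of lines. The file name and the 25 generated lines are assumptions made for the demonstration.

```python
import tempfile
from itertools import islice
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / 'myfile.txt'   # hypothetical file name

    with open(path, mode='wt') as file:
        file.writelines(f'line {i}\n' for i in range(1, 26))   # 25 lines

    with open(path, mode='rt') as file:
        data = list(islice(file, 10))   # stops after the tenth line

print(len(data))
# 10
print(repr(data[0]), repr(data[-1]))
# 'line 1\n' 'line 10\n'
```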

13.6.6. Reading File as Generator

  • Always remember to close the file

  • The file object is an iterator: loop over it to read one line at a time

>>> file = open('/tmp/myfile.txt', mode='rt')
>>>
>>> for line in file:
...     line.strip()
'This is a first line'
'This is a second line'
>>>
>>> file.close()
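Because the file object keeps its position, one line can be consumed first and the loop then continues with the remaining lines. The sketch below assumes a small file with a header row. The file name and content are hypothetical.

```python
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / 'myfile.txt'   # hypothetical file name

    with open(path, mode='wt') as file:
        file.write('firstname,lastname\nMark,Watney\nMelissa,Lewis\n')

    with open(path, mode='rt') as file:
        header = next(file).strip()              # consume the first line
        rows = [line.strip() for line in file]   # iteration resumes at line 2

print(header)
# firstname,lastname
print(rows)
# ['Mark,Watney', 'Melissa,Lewis']
```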

13.6.7. Read Using Context Manager

  • Context managers use the with ... as ...: syntax

  • The file is closed automatically when the block ends (on dedent)

  • Using a context manager is the best practice
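To see what the context manager saves, the sketch below compares it with a manual `try ... finally` that guarantees `close()` is called. It is an illustration under assumptions: the temporary file and its content are hypothetical.

```python
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / 'myfile.txt'   # hypothetical file name
    path.write_text('This is a first line\n')

    # Manual version: close() must run even if read() raises an exception
    file = open(path, mode='rt')
    try:
        manual = file.read()
    finally:
        file.close()
    closed_manually = file.closed

    # Context manager version: close() is called when the block ends
    with open(path, mode='rt') as file:
        managed = file.read()
    closed_by_with = file.closed

print(manual == managed)
# True
print(closed_manually, closed_by_with)
# True True
```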

Read whole file:

>>> with open('/tmp/myfile.txt', mode='rt') as file:
...     data = file.read()

Read one line:

>>> with open('/tmp/myfile.txt', mode='rt') as file:
...     data = file.readline()

Read all lines:

>>> with open('/tmp/myfile.txt', mode='rt') as file:
...     data = file.readlines()

Read file as generator:

>>> with open('/tmp/myfile.txt', mode='rt') as file:
...     for line in file:
...         line.strip()
'This is a first line'
'This is a second line'

13.6.8. Reading From One File and Writing to Another

>>> with open('/tmp/myfile1.txt', mode='rt') as infile, \
...      open('/tmp/myfile2.txt', mode='wt') as outfile:
...     data = infile.read()
...     # transform data
...     result = outfile.write(data)
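The example above reads the whole input at once. For larger files, a possible variation is to copy line by line, so only one line is held in memory at a time. In this sketch the temporary files, their content and the `upper()` transformation are assumptions chosen for illustration.

```python
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    source = Path(tmp) / 'myfile1.txt'   # hypothetical input file
    target = Path(tmp) / 'myfile2.txt'   # hypothetical output file
    source.write_text('This is a first line\nThis is a second line\n')

    with open(source, mode='rt') as infile, \
         open(target, mode='wt') as outfile:
        for line in infile:
            outfile.write(line.upper())   # example transformation

    result = target.read_text()

print(result)
# THIS IS A FIRST LINE
# THIS IS A SECOND LINE
```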

13.6.9. Case Study

>>> from pprint import pprint
>>>
>>>
>>> FILE = r'/tmp/myfile.txt'
>>>
>>> DATA = """sepal_length,sepal_width,petal_length,petal_width,species
... 5.8,2.7,5.1,1.9,virginica
... 5.1,3.5,1.4,0.2,setosa
... 5.7,2.8,4.1,1.3,versicolor
... 6.3,2.9,5.6,1.8,virginica
... 6.4,3.2,4.5,1.5,versicolor
... 4.7,3.2,1.3,0.2,setosa
... """

Write:

>>> with open(FILE, mode='wt') as file:
...     file.write(DATA)
210

Read:

>>> with open(FILE) as file:
...     data = file.readlines()
...
>>> lines = [x.strip().split(',') for x in data]
>>> header = tuple(lines[0])
>>>
>>> rows = []
>>> for line in lines[1:]:
...     values = [float(x) for x in line[0:4]]
...     species = line[4]
...     row = tuple(values) + (species,)
...     rows.append(row)
>>>
>>> result = []
>>> result.append(header)
>>> result.extend(rows)

Result:

>>> pprint(result)
[('sepal_length', 'sepal_width', 'petal_length', 'petal_width', 'species'),
 (5.8, 2.7, 5.1, 1.9, 'virginica'),
 (5.1, 3.5, 1.4, 0.2, 'setosa'),
 (5.7, 2.8, 4.1, 1.3, 'versicolor'),
 (6.3, 2.9, 5.6, 1.8, 'virginica'),
 (6.4, 3.2, 4.5, 1.5, 'versicolor'),
 (4.7, 3.2, 1.3, 0.2, 'setosa')]
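The case study splits each line by hand with `str.split()`. As an alternative sketch (not the method used above), the standard library `csv` module can do the splitting. The shortened `DATA` and the temporary file name are assumptions made to keep the example small.

```python
import csv
import tempfile
from pathlib import Path

DATA = """sepal_length,sepal_width,petal_length,petal_width,species
5.8,2.7,5.1,1.9,virginica
5.1,3.5,1.4,0.2,setosa
"""

with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / 'myfile.csv'   # hypothetical file name
    path.write_text(DATA)

    # The csv documentation recommends opening files with newline=''
    with open(path, mode='rt', newline='') as file:
        reader = csv.reader(file)
        header = tuple(next(reader))
        rows = [tuple(float(x) for x in values) + (species,)
                for *values, species in reader]

result = [header, *rows]

print(result[0])
# ('sepal_length', 'sepal_width', 'petal_length', 'petal_width', 'species')
print(result[1])
# (5.8, 2.7, 5.1, 1.9, 'virginica')
```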

13.6.10. Use Case - 1

>>> DATA = """A,B,C,red,green,blue
... 1,2,3,0
... 4,5,6,1
... 7,8,9,2"""
>>>
>>> data = DATA.splitlines()
>>> header = data[0]
>>> lines = data[1:]
>>> colors = header.strip().split(',')[3:]
>>> colors = dict(enumerate(colors))
>>> result = []
>>>
>>> for line in lines:
...     line = line.strip().split(',')
...     *numbers, color = map(int, line)
...     line = numbers + [colors.get(color)]
...     result.append(tuple(line))
>>>
>>> result
[(1, 2, 3, 'red'), (4, 5, 6, 'green'), (7, 8, 9, 'blue')]

13.6.11. Assignments

# %% License
# - Copyright 2025, Matt Harasymczuk <matt@python3.info>
# - This code can be used only for learning by humans
# - This code cannot be used for teaching others
# - This code cannot be used for teaching LLMs and AI algorithms
# - This code cannot be used in commercial or proprietary products
# - This code cannot be distributed in any form
# - This code cannot be changed in any form outside of training course
# - This code cannot have its license changed
# - If you use this code in your product, you must open-source it under GPLv2
# - Exception can be granted only by the author

# %% Run
# - PyCharm: right-click in the editor and `Run Doctest in ...`
# - PyCharm: keyboard shortcut `Control + Shift + F10`
# - Terminal: `python -m doctest -v myfile.py`

# %% About
# - Name: File Read Read
# - Difficulty: easy
# - Lines: 2
# - Minutes: 2

# %% English
# 1. Read `FILE` to `result: str`
# 2. Run doctests - all must succeed

# %% Polish
# 1. Wczytaj `FILE` do `result: str`
# 2. Uruchom doctesty - wszystkie muszą się powieść

# %% Hints
# - `with`
# - `open()`

# %% Tests
"""
>>> import sys; sys.tracebacklimit = 0
>>> assert sys.version_info >= (3, 9), \
'Python 3.9+ required'

>>> from os import remove
>>> remove(FILE)

>>> assert result is not Ellipsis, \
'Assign your result to variable `result`'
>>> assert type(result) is str, \
'Variable `result` has invalid type, should be str'

>>> print(result)
Mark Watney
<BLANKLINE>
"""

FILE = '_temporary.txt'
DATA = 'Mark Watney\n'

with open(FILE, mode='wt') as file:
    file.write(DATA)


# Read `FILE` to `result: str`
# type: str
result = ...


# %% About
# - Name: File Read Readlines
# - Difficulty: easy
# - Lines: 2
# - Minutes: 2

# %% English
# 1. Read `FILE` to `result: list[str]`
# 2. Run doctests - all must succeed

# %% Polish
# 1. Wczytaj `FILE` do `result: list[str]`
# 2. Uruchom doctesty - wszystkie muszą się powieść

# %% Hints
# - `with`
# - `open()`

# %% Tests
"""
>>> import sys; sys.tracebacklimit = 0
>>> assert sys.version_info >= (3, 9), \
'Python 3.9+ required'

>>> from os import remove; remove(FILE)

>>> assert result is not Ellipsis, \
'Assign your result to variable `result`'
>>> assert type(result) is list, \
'Variable `result` has invalid type, should be list'

>>> print(result)
['Mark Watney\\n', 'Melissa Lewis\\n', 'Rick Martinez\\n']
"""

FILE = '_temporary.txt'

DATA = """Mark Watney
Melissa Lewis
Rick Martinez
"""

with open(FILE, mode='wt') as file:
    file.write(DATA)


# Read `FILE` to `result: list[str]`
# type: list[str]
result = ...


# %% About
# - Name: File Read List[str]
# - Difficulty: easy
# - Lines: 3
# - Minutes: 3

# %% English
# 1. Read `FILE` to `result: list[str]`
# 2. Remove newline character
# 3. Split line by comma
# 4. Run doctests - all must succeed

# %% Polish
# 1. Wczytaj `FILE` do `result: list[str]`
# 2. Usuń znak końca linii
# 3. Podziel linię po przecinku
# 4. Uruchom doctesty - wszystkie muszą się powieść

# %% Hints
# - `with`
# - `open()`
# - `str.strip()`
# - `str.split()`

# %% Tests
"""
>>> import sys; sys.tracebacklimit = 0
>>> assert sys.version_info >= (3, 9), \
'Python 3.9+ required'

>>> from os import remove; remove(FILE)

>>> assert result is not Ellipsis, \
'Assign your result to variable `result`'
>>> assert type(result) is list, \
'Variable `result` has invalid type, should be list'
>>> assert all(type(x) is str for x in result), \
'All rows in `result` should be str'

>>> result
['firstname', 'lastname', 'age']
"""

FILE = '_temporary.txt'
DATA = 'firstname,lastname,age\n'

with open(FILE, mode='wt') as file:
    file.write(DATA)

# Read `FILE` to `result: list[str]`
# Remove newline character
# Split line by comma
# type: list[str]
result = ...


# %% About
# - Name: File Read Multiline
# - Difficulty: easy
# - Lines: 5
# - Minutes: 3

# %% English
# 1. Read `FILE` to `result: tuple`
# 2. Remove newline character
# 3. Split line by comma
# 4. Convert numeric values to int
# 5. Run doctests - all must succeed

# %% Polish
# 1. Wczytaj `FILE` do `result: tuple`
# 2. Usuń znak końca linii
# 3. Podziel linię po przecinku
# 4. Przekonwertuj wartości numeryczne do int
# 5. Uruchom doctesty - wszystkie muszą się powieść

# %% Hints
# - `with`
# - `open()`
# - Comprehension
# - `str.strip()`
# - `str.split()`
# - `int()`
# - `tuple()`

# %% Tests
"""
>>> import sys; sys.tracebacklimit = 0
>>> assert sys.version_info >= (3, 9), \
'Python 3.9+ required'

>>> from os import remove; remove(FILE)

>>> assert result is not Ellipsis, \
'Assign your result to variable `result`'
>>> assert type(result) is tuple, \
'Variable `result` has invalid type, should be tuple'
>>> assert all(type(x) in (str, int) for x in result), \
'All values in `result` should be str or int'

>>> print(result)
('Mark', 'Watney', 41)
"""

FILE = '_temporary.txt'
DATA = 'Mark,Watney,41\n'

with open(FILE, mode='wt') as file:
    file.write(DATA)

# Read `FILE` to `result: tuple`
# Remove newline character
# Split line by comma
# Convert numeric values to int
# type: tuple[str, str, int]
result = ...


# %% About
# - Name: File Read CSV
# - Difficulty: easy
# - Lines: 11
# - Minutes: 8

# %% English
# 1. Read `FILE` to `result: list[tuple]`
# 2. Remove newline character
# 3. Split line by comma
# 4. Convert numeric values to int
# 5. Run doctests - all must succeed

# %% Polish
# 1. Wczytaj `FILE` do `result: list[tuple]`
# 2. Usuń znak końca linii
# 3. Podziel linię po przecinku
# 4. Przekonwertuj wartości numeryczne do int
# 5. Uruchom doctesty - wszystkie muszą się powieść

# %% Hints
# - `with`
# - `open()`
# - `str.split()`
# - `str.strip()`
# - Comprehension
# - `int()`
# - `(1,2,3) + ('abc',)`
# - `list.append()`

# %% Tests
"""
>>> import sys; sys.tracebacklimit = 0
>>> assert sys.version_info >= (3, 9), \
'Python 3.9+ required'

>>> from pprint import pprint
>>> from os import remove; remove(FILE)

>>> assert result is not Ellipsis, \
'Assign your result to variable `result`'
>>> assert type(result) is list, \
'Variable `result` has invalid type, should be list'
>>> assert all(type(x) is tuple for x in result), \
'All rows in `result` should be tuple'

>>> pprint(result)
[('firstname', 'lastname', 'age'),
 ('Mark', 'Watney', 41),
 ('Melissa', 'Lewis', 40),
 ('Rick', 'Martinez', 39),
 ('Alex', 'Vogel', 40),
 ('Chris', 'Beck', 36),
 ('Beth', 'Johanssen', 29)]
"""

FILE = '_temporary.csv'

DATA = """firstname,lastname,age
Mark,Watney,41
Melissa,Lewis,40
Rick,Martinez,39
Alex,Vogel,40
Chris,Beck,36
Beth,Johanssen,29
"""

with open(FILE, mode='w') as file:
    file.write(DATA)

# Read `FILE` to `result: list[tuple]`
# Remove newline character
# Split line by comma
# Convert numeric values to int
# type: list[tuple]
result = ...