2.2. String Literals
f'...' - Format String
r'...' - Raw String
u'...' - Unicode String
b'...' - Byte String
Default str:
>>> text = 'Hello'
String literal (lowercase):
>>> text = f'Hello'
>>> text = r'Hello'
>>> text = u'Hello'
>>> text = b'Hello'
String literal (uppercase - works exactly the same):
>>> text = F'Hello'
>>> text = R'Hello'
>>> text = U'Hello'
>>> text = B'Hello'
Multiple string literals:
>>> text = fr'Hello'
>>> text = rf'Hello'
>>>
>>> text = RF'Hello'
>>> text = FR'Hello'
2.2.1. Format String
f-string
String interpolation (variable substitution)
Since Python 3.6
Used for str concatenation
Problem:
>>> name = 'Alice'
>>> text = 'Hello {name}'
>>>
>>> print(text)
Hello {name}
Solution:
>>> name = 'Alice'
>>> text = f'Hello {name}'
>>>
>>> print(text)
Hello Alice
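Beyond plain variable substitution, the braces of an f-string accept any expression, an optional conversion flag, and an optional format specifier. A minimal sketch follows; the `price` and `quantity` names are illustrative and do not come from the examples above:

```python
name = 'Alice'
price = 3.5      # hypothetical value, for illustration only
quantity = 4     # hypothetical value, for illustration only

# Any expression can appear inside the braces
total = f'Total: {price * quantity}'

# A format specifier follows the colon; .2f means two decimal places
formatted = f'{name} pays {price * quantity:.2f}'

# The !r flag applies repr() to the value before it is inserted
quoted = f'Hello {name!r}'

print(total)      # Total: 14.0
print(formatted)  # Alice pays 14.00
print(quoted)     # Hello 'Alice'
```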
2.2.2. Raw String
Escape sequences (such as \n) are not interpreted
Problem:
>>> text = 'Hello\nAlice'
>>> print(text)
Hello
Alice
Solution:
>>> text = r'Hello\nAlice'
>>> print(text)
Hello\nAlice
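One way to see the difference is to compare lengths: in a regular string \n is a single newline character, while in a raw string the backslash and the letter n remain two separate characters. A short sketch:

```python
normal = 'Hello\nAlice'   # \n is one newline character
raw = r'Hello\nAlice'     # backslash and n stay as two characters

print(len(normal))  # 11
print(len(raw))     # 12

# A raw string equals a regular string with the backslash escaped
print(raw == 'Hello\\nAlice')  # True
```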
2.2.3. Unicode Literal
In Python 3 str is Unicode
In Python 2 str is Bytes
In Python 3 u'...' is only for compatibility with Python 2
>>> text = 'Hello' # unicode
>>> text = u'Hello' # unicode
Since Python 3, all string literals are Unicode, so there is no need to add the u prefix anymore.
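The equivalence can be checked directly. In the sketch below (variable names are illustrative), both literals produce the same type and the same value:

```python
plain = 'Hello'
prefixed = u'Hello'

same_type = type(plain) is type(prefixed)
same_value = plain == prefixed

print(type(prefixed))  # <class 'str'>
print(same_type)       # True
print(same_value)      # True

# A str counts Unicode characters (code points), not bytes
print(len('Cześć'))    # 5
```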
2.2.4. Bytes Literal
Used while reading from low level devices and drivers
Used in sockets and HTTP connections
bytes is a sequence of octets (integers between 0 and 255)
bytes.decode() - converts bytes to str (using UTF-8 encoding)
str.encode() - converts str to bytes (using UTF-8 encoding)
For ASCII characters, bytes and str literals look the same (apart from the b prefix):
>>> text = 'Hello' # unicode
>>> text = b'Hello' # bytes
For non-ASCII characters, bytes and str are different
(note, "cześć" means "hello" in Polish):
>>> text = 'Cześć'
>>> text = b'Cze\xc5\x9b\xc4\x87'
Convert str to bytes:
>>> text = 'Cześć'
>>>
>>> text.encode()
b'Cze\xc5\x9b\xc4\x87'
Convert bytes to str:
>>> data = b'cze\xc5\x9b\xc4\x87'
>>>
>>> data.decode()
'cześć'
UTF-8 is the default encoding. You can also pass a different encoding
name to the .encode() and .decode() methods as the first positional
argument.
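As a rough illustration of passing an encoding explicitly, the sketch below encodes the same text with the default UTF-8 and with ISO-8859-2 (a single-byte encoding chosen here only as an example because it covers Polish letters):

```python
text = 'Cześć'

utf8 = text.encode()                # UTF-8 is the default
latin2 = text.encode('iso-8859-2')  # encoding passed as a positional argument

print(len(text))    # 5 characters
print(len(utf8))    # 7 bytes: ś and ć take two bytes each in UTF-8
print(len(latin2))  # 5 bytes: one byte per character

# Decoding must use the same encoding that produced the bytes
restored = latin2.decode('iso-8859-2')
print(restored == text)  # True

# Indexing bytes yields integers between 0 and 255
first = utf8[0]
print(first)  # 67, the code of the letter C
```

Decoding bytes with the wrong encoding either raises UnicodeDecodeError or silently produces different characters, so the encoding should travel together with the data.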
2.2.5. Strings Template
>>> from string import Template
>>>
>>> template = Template("Hello, $name! Today is $day.")
>>> template.substitute(name="Alice", day="Friday")
'Hello, Alice! Today is Friday.'
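When a placeholder has no value, substitute() raises KeyError, while safe_substitute() leaves the placeholder in place. A short sketch of both behaviors, reusing the template from above:

```python
from string import Template

template = Template('Hello, $name! Today is $day.')

# substitute() raises KeyError when a placeholder has no value
try:
    template.substitute(name='Alice')
except KeyError as error:
    missing = error.args[0]

# safe_substitute() leaves unknown placeholders untouched
partial = template.safe_substitute(name='Alice')

print(missing)  # day
print(partial)  # Hello, Alice! Today is $day.
```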
2.2.6. Template Strings
Since Python 3.14
>>> name = 'Alice'
>>> day = 'Friday'
>>>
>>> template = t'Hello {name}! Today is {day}.'
>>>
>>> template
Template(strings=('Hello ', '! Today is ', '.'),
interpolations=(Interpolation('Alice', 'name', None, ''),
Interpolation('Friday', 'day', None, '')))
>>> from string.templatelib import Interpolation, Template
>>>
>>> def parse(template):
...     if not isinstance(template, Template):
...         raise TypeError('t-string expected')
...     result = []
...     for item in template:
...         if isinstance(item, str):
...             # ... <your code here> ...
...             interpolated = item
...             result.append(interpolated)
...         elif isinstance(item, Interpolation):
...             value = item.value
...             expression = item.expression
...             conversion = item.conversion
...             format_spec = item.format_spec
...             interpolated = format(value, format_spec)
...             result.append(interpolated)
...     return ''.join(result)
>>>
>>>
>>> name = 'Alice'
>>> day = 'Friday'
>>>
>>> template = t'Hello {name}! Today is {day}.'
>>>
>>> parse(template)
'Hello Alice! Today is Friday.'
https://docs.python.org/id/3/library/string.templatelib.html#string.templatelib.Interpolation
value - the value of the expression
expression - text found inside the curly brackets ({ and }), including any whitespace, excluding the curly brackets themselves, and ending before the first !, :, or = if any is present
conversion - a, r, s or None, depending on whether a conversion flag was present, e.g. "Hello {user!r}"
format_spec - the format specifier, e.g. "Hello {value:.2f}" or "Hello {value:myfspec}"
2.2.7. Multiple letters
You can combine multiple letters
For example, fr'...' is a format string and raw string
Order does not matter, i.e. rf'...' is the same as fr'...'
Not all combinations are possible, for example ub'...' is not allowed and will raise a SyntaxError
>>> name = 'Alice'
>>> text = fr'Hello\n{name}'
>>> print(text)
Hello\nAlice
>>> name = 'Alice'
>>> text = rf'Hello\n{name}'
>>> print(text)
Hello\nAlice
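The same rules can be checked for bytes. The sketch below combines the r and b prefixes, and uses compile() only as a convenient way to observe the SyntaxError for ub'...' without breaking the example itself:

```python
# rb combines raw and bytes: backslash sequences stay literal
raw_bytes = rb'Hello\nAlice'
print(raw_bytes == b'Hello\\nAlice')  # True
print(len(raw_bytes))                 # 12

# Order and letter case do not matter here either
print(rb'Hello\nAlice' == BR'Hello\nAlice')  # True

# ub is not a valid prefix, so compiling such source fails
try:
    compile("ub'Hello'", '<example>', 'eval')
except SyntaxError:
    ub_allowed = False
else:
    ub_allowed = True
print(ub_allowed)  # False
```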
2.2.8. Lowercase vs Uppercase
Python does not differentiate between lowercase and uppercase prefixes
Both work exactly the same
VS Code highlights r'...' as a regex and R'...' as a raw string
>>> text = r'Hello\nAlice'
>>> print(text)
Hello\nAlice
>>> text = R'Hello\nAlice'
>>> print(text)
Hello\nAlice
>>> text = 'hello' # unicode
>>>
>>> text = u'hello' # unicode
>>> text = b'hello' # bytes
>>> text = f'hello' # f-string
>>> text = r'hello' # raw-string
>>>
>>> text = U'hello' # unicode
>>> text = B'hello' # bytes
>>> text = F'hello' # f-string
>>> text = R'hello' # raw-string
2.2.9. Case Study
Windows absolute path problem
Absolute path includes all entries in the directory hierarchy
Absolute path on *nix starts with the root / directory
Absolute path on Windows starts with a drive letter
Linux (and other *nix):
>>> file = '/home/myuser/newfile.txt'
macOS:
>>> file = '/Users/myuser/newfile.txt'
Windows:
>>> file = 'c:/Users/myuser/newfile.txt'
Problem with paths on Windows
Use backslash (\) as a path separator
Use r-string for paths
Let's say we have a path to a file:
>>> print('C:/Users/myuser/newfile.txt')
C:/Users/myuser/newfile.txt
Paths on Windows conventionally use a backslash (\) rather than a slash (/)
as the path separator. This is where the problems start. Let's change
slashes to backslashes one at a time, starting from the end (the one before newfile.txt):
>>> print('C:/Users/myuser\newfile.txt')
C:/Users/myuser
ewfile.txt
This is because \n is a newline character. For this to work,
we need to escape the backslash.
Now let's convert another slash to a backslash, this time the one before
the directory named myuser:
>>> print('C:/Users\myuser\\newfile.txt')
SyntaxWarning: invalid escape sequence '\m'
C:/Users\myuser\newfile.txt
Since Python 3.12, an unrecognized escape sequence (in this case \m)
raises SyntaxWarning: invalid escape sequence '\m'. The backslash must
be escaped or the literal written as a raw string. It is only a warning,
so the code still runs, but in a future Python version it is planned to
become a SyntaxError, so it is better to fix it now.
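The warning can be observed programmatically. The following sketch compiles the source text of such a literal while recording warnings; compile() and the '<example>' file name are used here only for demonstration. On Python 3.12 and newer the recorded category should be SyntaxWarning, while Python 3.6 to 3.11 reported a DeprecationWarning instead:

```python
import warnings

# Source text of a string literal containing the unrecognized escape \m
source = r"'C:/Users\myuser'"

with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter('always')  # record every warning, hide none
    value = eval(compile(source, '<example>', 'eval'))

# The backslash is kept as-is, so the value is what we wanted
print(value)

# The compiler still complained about the escape sequence
categories = [warning.category.__name__ for warning in caught]
print(categories)
```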
The last slash (the one before Users):
>>> print('C:\Users\\myuser\\newfile.txt')
Traceback (most recent call last):
SyntaxError: (unicode error) 'unicodeescape' codec can't decode bytes in position 2-3: truncated \UXXXXXXXX escape
This time the problem is more serious. The problem is with \Users. After
the escape sequence \U, Python expects eight hexadecimal digits of a Unicode
code point, e.g. \U0001F600, which is the smiley 😀 emoticon. In this example,
Python finds the letter s, which is not a valid hexadecimal digit, and
therefore raises a SyntaxError reporting a decoding error.
The only valid hexadecimal digits are
0123456789abcdefABCDEF, and the letter s isn't one of them.
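A small sketch of both cases: a complete \U escape with eight hexadecimal digits, and the truncated one from the path. The compile() call is used here only so that the SyntaxError can be caught and inspected:

```python
# \U followed by eight hexadecimal digits is a single character
smiley = '\U0001F600'
print(smiley == '😀')  # True
print(len(smiley))     # 1

# The source text 'C:\Users' starts a \U escape that is cut short by "s"
source = r"'C:\Users'"
try:
    compile(source, '<example>', 'eval')
except SyntaxError as error:
    message = error.msg

print(message)
```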
There are two ways to avoid this problem. Escape every backslash:
>>> print('C:\\Users\\myuser\\newfile.txt')
C:\Users\myuser\newfile.txt
Or use r-string:
>>> print(r'C:\Users\myuser\newfile.txt')
C:\Users\myuser\newfile.txt
Both produce the same output, so you can choose either one. In my opinion r-strings are less error-prone, and I use them whenever I have to deal with paths.
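The two spellings can be compared directly. The sketch below also shows one caveat that is not covered above: a raw string cannot end with a single backslash, because the backslash still prevents the closing quote from ending the literal. The compile() call only demonstrates that error, and the `directory` workaround is one possible approach:

```python
escaped = 'C:\\Users\\myuser\\newfile.txt'
raw = r'C:\Users\myuser\newfile.txt'
print(escaped == raw)  # True

# The source text r'C:\Users\' is an unterminated string literal
try:
    compile("r'C:\\Users\\'", '<example>', 'eval')
except SyntaxError:
    trailing_ok = False
else:
    trailing_ok = True
print(trailing_ok)  # False

# One workaround: append the trailing separator as an escaped literal
directory = r'C:\Users' + '\\'
print(directory)  # C:\Users\
```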
2.2.10. Recap
f'...' - f-string - Format String (variable substitution)
r'...' - r-string - Raw String (escape sequences are not interpreted)
u'...' - u-string - Unicode String (for compatibility with Python 2)
b'...' - b-string - Byte String (low level and network communication)
bytes.decode() - converts bytes to str (using UTF-8 encoding)
str.encode() - converts str to bytes (using UTF-8 encoding)
Format string:
>>> name = 'Alice'
>>> text = f'Hello {name}'
>>>
>>> print(text)
Hello Alice
Raw string:
>>> text = r'Hello\nAlice'
>>> print(text)
Hello\nAlice
Unicode string:
>>> text = u'hello' # hello
>>> text = u'cześć' # cześć
Bytes string:
>>> text = b'hello' # hello
>>> text = b'cze\xc5\x9b\xc4\x87' # cześć
Conversion:
>>> text = 'cześć'
>>>
>>> text.encode()
b'cze\xc5\x9b\xc4\x87'
>>> data = b'cze\xc5\x9b\xc4\x87'
>>>
>>> data.decode()
'cześć'
2.2.11. Use Case - 1
Raw-string in Regular Expressions:
>>> '\\b[a-z]+\\b'
'\\b[a-z]+\\b'
>>> r'\b[a-z]+\b'
'\\b[a-z]+\\b'
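To see why the raw string matters in practice, the sketch below runs the same pattern through re.findall() with and without the r prefix. The sample `text` is made up for illustration:

```python
import re

text = 'hello, world 123'  # hypothetical sample input

# With a raw string, \b reaches the regex engine as a word boundary
words = re.findall(r'\b[a-z]+\b', text)
print(words)  # ['hello', 'world']

# Without the r prefix, '\b' is a backspace character (code 8)
print('\b' == '\x08')  # True

# The pattern now requires literal backspace characters, so nothing matches
no_match = re.findall('\b[a-z]+\b', text)
print(no_match)  # []
```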
2.2.12. Assignments
# %% About
# - Name: Syntax Literals F-String
# - Difficulty: easy
# - Lines: 1
# - Minutes: 2
# %% License
# - Copyright 2025, Matt Harasymczuk <matt@python3.info>
# - This code can be used only for learning by humans
# - This code cannot be used for teaching others
# - This code cannot be used for teaching LLMs and AI algorithms
# - This code cannot be used in commercial or proprietary products
# - This code cannot be distributed in any form
# - This code cannot be changed in any form outside of training course
# - This code cannot have its license changed
# - If you use this code in your product, you must open-source it under GPLv2
# - Exception can be granted only by the author
# %% English
# 1. Define `result` with value `Hello X`
# 2. In place of X, insert value of the `NAME` variable
# 3. Use f-string
# 4. Run doctests - all must succeed
# %% Polish
# 1. Zdefiniuj `result` z wartością `Hello X`
# 2. W miejsce X, wstaw wartość zmiennej `NAME`
# 3. Użyj f-string
# 4. Uruchom doctesty - wszystkie muszą się powieść
# %% Expected
# >>> result
# 'Hello Alice'
# %% Hints
# - `f'...'`
# %% Doctests
"""
>>> import sys; sys.tracebacklimit = 0
>>> assert sys.version_info >= (3, 9), \
'Python has an invalid version; expected: `3.9` or newer.'
>>> assert result is not Ellipsis, \
'Variable `result` has an invalid value; assign result of your program to it.'
>>> assert type(result) is str, \
'Variable `result` has an invalid type; expected: `str`.'
>>> result
'Hello Alice'
"""
# %% Run
# - PyCharm: right-click in the editor and `Run Doctest in ...`
# - PyCharm: keyboard shortcut `Control + Shift + F10`
# - Terminal: `python -m doctest -f -v myfile.py`
# %% Imports
# %% Types
result: str
# %% Data
NAME = 'Alice'
# %% Result
result = ...
# %% About
# - Name: Syntax Literals R-String
# - Difficulty: easy
# - Lines: 1
# - Minutes: 2
# %% License
# - Copyright 2025, Matt Harasymczuk <matt@python3.info>
# - This code can be used only for learning by humans
# - This code cannot be used for teaching others
# - This code cannot be used for teaching LLMs and AI algorithms
# - This code cannot be used in commercial or proprietary products
# - This code cannot be distributed in any form
# - This code cannot be changed in any form outside of training course
# - This code cannot have its license changed
# - If you use this code in your product, you must open-source it under GPLv2
# - Exception can be granted only by the author
# %% English
# 1. Define `result` with value `Hello\nAlice`
# 2. Use r-string
# 3. Run doctests - all must succeed
# %% Polish
# 1. Zdefiniuj `result` z wartością `Hello\nAlice`
# 2. Użyj r-string
# 3. Uruchom doctesty - wszystkie muszą się powieść
# %% Expected
# >>> result
# 'Hello\\nAlice'
# %% Hints
# - `r'...'`
# %% Doctests
"""
>>> import sys; sys.tracebacklimit = 0
>>> assert sys.version_info >= (3, 9), \
'Python has an invalid version; expected: `3.9` or newer.'
>>> assert result is not Ellipsis, \
'Variable `result` has an invalid value; assign result of your program to it.'
>>> assert type(result) is str, \
'Variable `result` has an invalid type; expected: `str`.'
>>> result
'Hello\\\\nAlice'
"""
# %% Run
# - PyCharm: right-click in the editor and `Run Doctest in ...`
# - PyCharm: keyboard shortcut `Control + Shift + F10`
# - Terminal: `python -m doctest -f -v myfile.py`
# %% Imports
# %% Types
result: str
# %% Data
# %% Result
result = ...
# %% About
# - Name: Syntax Literals B-String / U-String
# - Difficulty: easy
# - Lines: 2
# - Minutes: 2
# %% License
# - Copyright 2025, Matt Harasymczuk <matt@python3.info>
# - This code can be used only for learning by humans
# - This code cannot be used for teaching others
# - This code cannot be used for teaching LLMs and AI algorithms
# - This code cannot be used in commercial or proprietary products
# - This code cannot be distributed in any form
# - This code cannot be changed in any form outside of training course
# - This code cannot have its license changed
# - If you use this code in your product, you must open-source it under GPLv2
# - Exception can be granted only by the author
# %% English
# 1. Define `result_a` with value `Hello`, use unicode string
# 2. Define `result_b` with value `Hello`, use bytes string
# 3. Run doctests - all must succeed
# %% Polish
# 1. Zdefiniuj `result_a` z wartością `Hello`, użyj ciągu znaków unicode
# 2. Zdefiniuj `result_b` z wartością `Hello`, użyj ciągu bajtów
# 3. Uruchom doctesty - wszystkie muszą się powieść
# %% Expected
# >>> result_a
# 'Hello'
#
# >>> result_b
# b'Hello'
# %% Hints
# - `u'...'`
# - `b'...'`
# %% Doctests
"""
>>> import sys; sys.tracebacklimit = 0
>>> assert sys.version_info >= (3, 9), \
'Python has an invalid version; expected: `3.9` or newer.'
>>> assert result_a is not Ellipsis, \
'Variable `result_a` has an invalid value; assign result of your program to it.'
>>> assert type(result_a) is str, \
'Variable `result_a` has an invalid type; expected: `str`.'
>>> result_a
'Hello'
>>> assert result_b is not Ellipsis, \
'Variable `result_b` has an invalid value; assign result of your program to it.'
>>> assert type(result_b) is bytes, \
'Variable `result_b` has an invalid type; expected: `bytes`.'
>>> result_b
b'Hello'
"""
# %% Run
# - PyCharm: right-click in the editor and `Run Doctest in ...`
# - PyCharm: keyboard shortcut `Control + Shift + F10`
# - Terminal: `python -m doctest -f -v myfile.py`
# %% Imports
# %% Types
result_a: str
result_b: bytes
# %% Data
# %% Result
result_a = ...
result_b = ...
# %% About
# - Name: Syntax Literals Encode
# - Difficulty: easy
# - Lines: 1
# - Minutes: 2
# %% License
# - Copyright 2025, Matt Harasymczuk <matt@python3.info>
# - This code can be used only for learning by humans
# - This code cannot be used for teaching others
# - This code cannot be used for teaching LLMs and AI algorithms
# - This code cannot be used in commercial or proprietary products
# - This code cannot be distributed in any form
# - This code cannot be changed in any form outside of training course
# - This code cannot have its license changed
# - If you use this code in your product, you must open-source it under GPLv2
# - Exception can be granted only by the author
# %% English
# 1. Define `result` with unicode string `DATA` encoded to bytes
# 2. Use UTF-8 encoding
# 3. Run doctests - all must succeed
# %% Polish
# 1. Zdefiniuj `result` ze stringiem unicode `DATA` zakodowanym do bajtów
# 2. Użyj kodowania UTF-8
# 3. Uruchom doctesty - wszystkie muszą się powieść
# %% Expected
# >>> result
# b'Cze\xc5\x9b\xc4\x87'
# %% Hints
# - `str.encode(encoding)`
# %% Doctests
"""
>>> import sys; sys.tracebacklimit = 0
>>> assert sys.version_info >= (3, 9), \
'Python has an invalid version; expected: `3.9` or newer.'
>>> assert result is not Ellipsis, \
'Variable `result` has an invalid value; assign result of your program to it.'
>>> assert type(result) is bytes, \
'Variable `result` has an invalid type; expected: `bytes`.'
>>> result
b'Cze\xc5\x9b\xc4\x87'
"""
# %% Run
# - PyCharm: right-click in the editor and `Run Doctest in ...`
# - PyCharm: keyboard shortcut `Control + Shift + F10`
# - Terminal: `python -m doctest -f -v myfile.py`
# %% Imports
# %% Types
result: bytes
# %% Data
DATA = 'Cześć'
# %% Result
result = ...
# %% About
# - Name: Syntax Literals Decode
# - Difficulty: easy
# - Lines: 1
# - Minutes: 2
# %% License
# - Copyright 2025, Matt Harasymczuk <matt@python3.info>
# - This code can be used only for learning by humans
# - This code cannot be used for teaching others
# - This code cannot be used for teaching LLMs and AI algorithms
# - This code cannot be used in commercial or proprietary products
# - This code cannot be distributed in any form
# - This code cannot be changed in any form outside of training course
# - This code cannot have its license changed
# - If you use this code in your product, you must open-source it under GPLv2
# - Exception can be granted only by the author
# %% English
# 1. Define `result` with bytes `DATA` decoded to unicode string
# 2. Use UTF-8 encoding
# 3. Run doctests - all must succeed
# %% Polish
# 1. Zdefiniuj `result` z bajtami `DATA` zdekodowanym do ciągu znaków unicode
# 2. Użyj kodowania UTF-8
# 3. Uruchom doctesty - wszystkie muszą się powieść
# %% Expected
# >>> result
# 'Cześć'
# %% Hints
# - `bytes.decode(encoding)`
# %% Doctests
"""
>>> import sys; sys.tracebacklimit = 0
>>> assert sys.version_info >= (3, 9), \
'Python has an invalid version; expected: `3.9` or newer.'
>>> assert result is not Ellipsis, \
'Variable `result` has an invalid value; assign result of your program to it.'
>>> assert type(result) is str, \
'Variable `result` has an invalid type; expected: `str`.'
>>> result
'Cześć'
"""
# %% Run
# - PyCharm: right-click in the editor and `Run Doctest in ...`
# - PyCharm: keyboard shortcut `Control + Shift + F10`
# - Terminal: `python -m doctest -f -v myfile.py`
# %% Imports
# %% Types
result: str
# %% Data
DATA = b'Cze\xc5\x9b\xc4\x87'
# %% Result
result = ...