5.5. String Methods
str is immutable
str methods create a new, modified str
5.5.1. Compare
str.casefold() - return a version of the string suitable for caseless comparisons
>>> a = 'Angus MacGyver'
>>> b = 'Angus Macgyver'
>>>
>>>
>>> a == b
False
>>>
>>> a.casefold() == b.casefold()
True
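The difference between casefold() and lower() shows up outside of ASCII. A minimal sketch, assuming the German word 'Straße' as sample data (it is not part of the examples above):

```python
# Sample data chosen for illustration only
a = 'Straße'
b = 'STRASSE'

# lower() keeps the 'ß' character, so the strings still differ
lower_equal = a.lower() == b.lower()

# casefold() maps 'ß' to 'ss', so the comparison becomes caseless
casefold_equal = a.casefold() == b.casefold()

print(lower_equal)     # False
print(casefold_equal)  # True
```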
5.5.2. Strip Whitespace
str.strip() - remove whitespace from both ends
str.lstrip() - remove whitespace from the left side only
str.rstrip() - remove whitespace from the right side only
Strip is a very common method. Call it on any text that comes from user
input (that is, from the input() function), but also on text read from files,
socket communication, and data transferred over the internet. You never know
whether the user pasted text from another source, which can add whitespace
at the beginning or at the end of a string.
There are three strip methods: left strip, right strip, and strip from both ends. The word whitespace refers to:
\n - newline
\t - tab
' ' - space
\v - vertical tab
\f - form feed
Most common is plain strip, which will remove all whitespace characters from both sides at the same time:
>>> name = '\tAngus MacGyver \n'
>>> name.strip()
'Angus MacGyver'
Right strip:
>>> name = '\tAngus MacGyver \n'
>>> name.rstrip()
'\tAngus MacGyver'
Left strip:
>>> name = '\tAngus MacGyver \n'
>>> name.lstrip()
'Angus MacGyver \n'
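The advice about cleaning every piece of external text can be sketched on a few lines at once. The list `raw` below is hypothetical sample input, not data from this chapter:

```python
# Hypothetical raw input, e.g. lines pasted by a user or read from a file
raw = [
    '  mark.watney@nasa.gov\n',
    '\tmelissa.lewis@nasa.gov  ',
    'rick.martinez@nasa.gov\r\n',
]

# strip() returns a new str, the elements of `raw` stay unchanged
clean = [line.strip() for line in raw]

print(clean)
# ['mark.watney@nasa.gov', 'melissa.lewis@nasa.gov', 'rick.martinez@nasa.gov']
```

Note that the carriage return `\r` also counts as whitespace and is removed.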
5.5.3. Change Case
str.upper() - all letters will be uppercase
str.lower() - all letters will be lowercase
str.capitalize() - will uppercase first letter of text, lowercase others
str.title() - will uppercase first letter of each word, lowercase others
str.swapcase() - make lowercase letters upper, and uppercase lower
Comparing strings that are not normalized yields unexpected results:
>>> 'MacGyver' == 'Macgyver'
False
Normalize strings before comparing:
>>> 'MacGyver'.upper() == 'Macgyver'.upper()
True
This is necessary to perform further data analysis.
Upper:
>>> name = 'Angus MacGyver III'
>>> name.upper()
'ANGUS MACGYVER III'
Lower:
>>> name = 'Angus MacGyver III'
>>> name.lower()
'angus macgyver iii'
Title:
>>> name = 'Angus MacGyver III'
>>> name.title()
'Angus Macgyver Iii'
Capitalize:
>>> name = 'Angus MacGyver III'
>>> name.capitalize()
'Angus macgyver iii'
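Swapcase is listed above but not shown. A short sketch using the same name as in the other examples:

```python
name = 'Angus MacGyver III'

# Every uppercase letter becomes lowercase and vice versa
swapped = name.swapcase()

print(swapped)  # aNGUS mACgYVER iii
print(name)     # Angus MacGyver III (the original str is not modified)
```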
5.5.4. Replace
str.replace()
str.removesuffix()
str.removeprefix()
str.strip()
Replace substring:
>>> name = 'Angus MacGyver Iii'
>>> name.replace('Iii', 'III')
'Angus MacGyver III'
Replace is case sensitive:
>>> name = 'Angus MacGyver Iii'
>>> name.replace('iii', 'III')
'Angus MacGyver Iii'
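By default replace substitutes every occurrence. An optional third argument limits the number of replacements. A minimal sketch, with the sentence reused from other sections of this chapter:

```python
text = 'We choose to go to the Moon'

# All occurrences of 'to' are replaced
all_replaced = text.replace('to', 'TO')

# Only the first occurrence is replaced (count=1 passed positionally)
first_only = text.replace('to', 'TO', 1)

print(all_replaced)  # We choose TO go TO the Moon
print(first_only)    # We choose TO go to the Moon
print(text)          # We choose to go to the Moon
```

The last line shows that the original str stays untouched, because str is immutable.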
5.5.5. Starts or Ends With
str.startswith() - return True if str starts with the specified prefix, False otherwise
str.endswith() - return True if str ends with the specified suffix, False otherwise
optional start - test str beginning at that position
optional end - stop comparing str at that position
prefix/suffix can also be a tuple of strings to try
>>> email = 'mark.watney@nasa.gov'
>>>
>>>
>>> email.startswith('mark.watney')
True
>>>
>>> email.startswith('melissa.lewis')
False
It also works with a tuple of strings to try:
>>> email = 'mark.watney@nasa.gov'
>>> vip = ('mark.watney', 'melissa.lewis')
>>>
>>> email.startswith(vip)
True
>>> email = 'mark.watney@nasa.gov'
>>>
>>>
>>> email.endswith('nasa.gov')
True
>>>
>>> email.endswith('esa.int')
False
>>> email = 'mark.watney@nasa.gov'
>>> whitelist = ('nasa.gov', 'esa.int')
>>>
>>> email.endswith(whitelist)
True
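The optional start and end positions are not shown above. A sketch of how they could be used on the same email address; the variable names `a`, `b` and `c` are for illustration only:

```python
email = 'mark.watney@nasa.gov'

# Without start, the test begins at index 0
a = email.startswith('watney')

# 'watney' begins at index 5, right after 'mark.'
b = email.startswith('watney', 5)

# The '@' character is at index 11, so end=11 stops comparing before it
c = email.endswith('watney', 0, 11)

print(a)  # False
print(b)  # True
print(c)  # True
```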
5.5.6. Split by Line
str.splitlines() - split by newline character, does not leave an empty string at the end
str.split('\n') - will leave an empty string if a newline is at the end of str
>>> text = 'Hello\nPython\nWorld'
>>>
>>> text.splitlines()
['Hello', 'Python', 'World']
>>> text = """We choose to go to the Moon!
... We choose to go to the Moon in this decade and do the other things,
... not because they are easy, but because they are hard;
... because that goal will serve to organize and measure the best of our
... energies and skills, because that challenge is one that we are willing
... to accept, one we are unwilling to postpone, and one we intend to win,
... and the others, too."""
>>>
>>>
>>> text.splitlines()
['We choose to go to the Moon!',
'We choose to go to the Moon in this decade and do the other things,',
'not because they are easy, but because they are hard;',
'because that goal will serve to organize and measure the best of our',
'energies and skills, because that challenge is one that we are willing',
'to accept, one we are unwilling to postpone, and one we intend to win,',
'and the others, too.']
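The difference between the two approaches is visible only when the text ends with a newline. A minimal sketch:

```python
text = 'Hello\nPython\nWorld\n'

# splitlines() ignores the trailing newline
a = text.splitlines()

# split('\n') produces an extra empty string after the last newline
b = text.split('\n')

print(a)  # ['Hello', 'Python', 'World']
print(b)  # ['Hello', 'Python', 'World', '']
```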
5.5.7. Split by Character
str.split() - split by the given separator
No argument - split on any run of whitespace
>>> text = '1,2,3,4'
>>> text.split(',')
['1', '2', '3', '4']
>>> setosa = '5.1,3.5,1.4,0.2,setosa'
>>> setosa.split(',')
['5.1', '3.5', '1.4', '0.2', 'setosa']
>>> text = 'We choose to go to the Moon'
>>> text.split(' ')
['We', 'choose', 'to', 'go', 'to', 'the', 'Moon']
>>> text = 'We choose to go to the Moon'
>>> text.split()
['We', 'choose', 'to', 'go', 'to', 'the', 'Moon']
>>> text = '10.13.37.1      nasa.gov esa.int'
>>> text.split(' ')
['10.13.37.1', '', '', '', '', '', 'nasa.gov', 'esa.int']
>>> text = '10.13.37.1      nasa.gov esa.int'
>>> text.split()
['10.13.37.1', 'nasa.gov', 'esa.int']
5.5.8. Join by Character
str.join(iterable) - concatenate strings from the iterable, using str as the separator
Note, this is a method of str, not tuple.join() or list.join()
>>> letters = ['a', 'b', 'c']
>>> ''.join(letters)
'abc'
>>> words = ['We', 'choose', 'to', 'go', 'to', 'the', 'Moon']
>>> ' '.join(words)
'We choose to go to the Moon'
>>> setosa = ['5.1', '3.5', '1.4', '0.2', 'setosa']
>>> ','.join(setosa)
'5.1,3.5,1.4,0.2,setosa'
>>> crew = ['First line', 'Second line', 'Third line']
>>> '\n'.join(crew)
'First line\nSecond line\nThird line'
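Split and join are often used together. One possible use, shown here as a sketch with made-up spacing, is collapsing any run of whitespace into a single space:

```python
# Sample text with irregular whitespace (spaces, tab and newline)
text = 'We  choose\tto go\nto the   Moon'

# split() without arguments drops all whitespace runs,
# ' '.join() glues the words back with exactly one space
words = text.split()
normalized = ' '.join(words)

print(words)       # ['We', 'choose', 'to', 'go', 'to', 'the', 'Moon']
print(normalized)  # We choose to go to the Moon
```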
5.5.9. Join Numbers
(str(x) for x in data) - using comprehension or generator expression
map(str, data) - using map transformation
Type cast won't work: str(data) - it will stringify the whole list
Method str.join() expects that all elements are strings. Therefore it raises
an error if a sequence of numbers is passed:
>>> data = [1, 2, 3]
>>> ','.join(data)
Traceback (most recent call last):
TypeError: sequence item 0: expected str instance, int found
In order to avoid errors, you have to manually convert all the values to strings
before passing them to str.join(). In the following example the generator
expression syntax is used. It will apply str() to all elements in data.
More information in Generator Expression:
>>> data = [1, 2, 3]
>>> ','.join(str(x) for x in data)
'1,2,3'
You can also use the map() function. Map will apply str() to all elements
in data. More information in Generator Mapping:
>>> data = [1, 2, 3]
>>> ','.join(map(str,data))
'1,2,3'
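The remark that a plain type cast will not work can be checked directly. A short sketch comparing both results:

```python
data = [1, 2, 3]

# str() converts the whole list, including brackets and spaces
whole = str(data)

# map(str, data) converts each element, then join glues them together
joined = ','.join(map(str, data))

print(whole)   # [1, 2, 3]
print(joined)  # 1,2,3
```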
5.5.10. Is Whitespace
str.isspace() - is whitespace (space, tab, newline)
' ' - space
\t - tab
\n - newline
>>> text = ''
>>> text.isspace()
False
>>> text = ' '
>>> text.isspace()
True
>>> text = '\t'
>>> text.isspace()
True
>>> text = '\n'
>>> text.isspace()
True
5.5.11. Is Alphabet Characters
str.isalpha() - True if str is not empty and consists only of letters, such as abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ
>>> text = 'hello'
>>> text.isalpha()
True
>>> text = 'hello1'
>>> text.isalpha()
False
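A few edge cases are worth checking. The sketch below assumes the Polish word 'cześć' (used later in this chapter) as an example of non-ASCII letters:

```python
# Letters outside of ASCII also count as alphabetic
polish = 'cześć'.isalpha()

# A space is not a letter, so the whole test fails
spaced = 'hello world'.isalpha()

# An empty str has no characters to test
empty = ''.isalpha()

print(polish)  # True
print(spaced)  # False
print(empty)   # False
```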
5.5.12. Is Numeric
str.isdecimal()
str.isdigit()
str.isnumeric()
str.isalnum()
>>> '1'.isdecimal()
True
>>>
>>> '+1'.isdecimal()
False
>>>
>>> '-1'.isdecimal()
False
>>>
>>> '1.'.isdecimal()
False
>>>
>>> '1,'.isdecimal()
False
>>>
>>> '1.0'.isdecimal()
False
>>>
>>> '1,0'.isdecimal()
False
>>>
>>> '1_0'.isdecimal()
False
>>>
>>> '10'.isdecimal()
True
>>> '1'.isdigit()
True
>>>
>>> '+1'.isdigit()
False
>>>
>>> '-1'.isdigit()
False
>>>
>>> '1.'.isdigit()
False
>>>
>>> '1,'.isdigit()
False
>>>
>>> '1.0'.isdigit()
False
>>>
>>> '1,0'.isdigit()
False
>>>
>>> '1_0'.isdigit()
False
>>>
>>> '10'.isdigit()
True
>>> '1'.isnumeric()
True
>>>
>>> '+1'.isnumeric()
False
>>>
>>> '-1'.isnumeric()
False
>>>
>>> '1.'.isnumeric()
False
>>>
>>> '1.0'.isnumeric()
False
>>>
>>> '1,0'.isnumeric()
False
>>>
>>> '1_0'.isnumeric()
False
>>>
>>> '10'.isnumeric()
True
>>> '1'.isalnum()
True
>>>
>>> '+1'.isalnum()
False
>>>
>>> '-1'.isalnum()
False
>>>
>>> '1.'.isalnum()
False
>>>
>>> '1,'.isalnum()
False
>>>
>>> '1.0'.isalnum()
False
>>>
>>> '1,0'.isalnum()
False
>>>
>>> '1_0'.isalnum()
False
>>>
>>> '10'.isalnum()
True
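For plain ASCII digits the first three methods behave the same. The difference appears with other Unicode characters. A sketch, assuming the superscript two and the one-half fraction as sample characters:

```python
# Sample characters: ASCII digit, superscript two, vulgar fraction one half
samples = ['1', '²', '½']

# For each character store results of (isdecimal, isdigit, isnumeric)
results = {
    char: (char.isdecimal(), char.isdigit(), char.isnumeric())
    for char in samples
}

print(results['1'])  # (True, True, True)
print(results['²'])  # (False, True, True)
print(results['½'])  # (False, False, True)
```

None of these methods accepts a sign or a decimal separator, as the examples above show.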
5.5.13. Find Sub-String Position
str.find() - finds position of a substring in text
returns -1 if not found
Finds position of a letter in text:
>>> text = 'We choose to go to the Moon'
>>> text.find('M')
23
Will find first occurrence:
>>> text = 'We choose to go to the Moon'
>>> text.find('o')
5
Also works on substrings:
>>> text = 'We choose to go to the Moon'
>>> text.find('Moo')
23
Will yield -1 if substring is not found:
>>> text = 'We choose to go to the Moon'
>>> text.find('x')
-1
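Find also accepts an optional start position, which allows searching for the next occurrences. A minimal sketch on the same sentence; the variable names are for illustration only:

```python
text = 'We choose to go to the Moon'

# First occurrence
first = text.find('o')

# Continue searching right after the previous match
second = text.find('o', first + 1)
third = text.find('o', second + 1)

print(first)   # 5
print(second)  # 6
print(third)   # 11
```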
5.5.14. Count Occurrences
str.count() - counts occurrences of a substring
returns 0 if not found
>>> text = 'Moon'
>>>
>>>
>>> text.count('o')
2
>>>
>>> text.count('Moo')
1
>>>
>>> text.count('x')
0
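Two details are easy to miss: occurrences are counted without overlapping, and the search is case sensitive. A short sketch with made-up sample text:

```python
# 'aaaa' contains 'aa' at positions 0 and 2, overlapping matches are skipped
overlapping = 'aaaa'.count('aa')

# Uppercase 'O' does not match lowercase 'o'
case_sensitive = 'Moon'.count('O')

print(overlapping)     # 2
print(case_sensitive)  # 0
```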
5.5.15. Remove Prefix or Suffix
str.removeprefix()
str.removesuffix()
Since Python 3.9: PEP 616 -- String methods to remove prefixes and suffixes
>>> filename = 'myfile.txt'
>>> filename.removeprefix('my')
'file.txt'
>>> filename = 'myfile.txt'
>>> filename.removesuffix('.txt')
'myfile'
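These methods remove an exact substring, whereas strip methods with an argument remove any of the given characters. A sketch of the difference, assuming a hypothetical filename 'text.txt':

```python
filename = 'text.txt'

# Removes exactly the '.txt' suffix
removed = filename.removesuffix('.txt')

# Removes any trailing characters from the set {'.', 't', 'x'}
stripped = filename.rstrip('.txt')

# When the prefix is not present, the str is returned unchanged
unchanged = filename.removeprefix('my')

print(removed)    # text
print(stripped)   # te
print(unchanged)  # text.txt
```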
5.5.16. Method Chaining
>>> text = 'Python'
>>>
>>> text = text.upper()
>>> text = text.replace('P', 'C')
>>> text = text.title()
>>>
>>> print(text)
Cython
>>> text = 'Python'
>>>
>>> text = text.upper().replace('P', 'C').title()
>>>
>>> print(text)
Cython
Note that there cannot be any character, not even a space, after the \ character:
>>> text = 'Python'
>>>
>>> text = text.upper() \
... .replace('P', 'C') \
... .title()
>>>
>>> print(text)
Cython
The backslash method is very error-prone, which is why parentheses are recommended:
>>> text = 'Python'
>>>
>>> text = (
... text
... .upper()
... .replace('P', 'C')
... .title()
... )
>>>
>>> print(text)
Cython
How it works:
text -> 'Python'
'Python'.upper() -> 'PYTHON'
'PYTHON'.replace('P', 'C') -> 'CYTHON'
'CYTHON'.title() -> 'Cython'
Each method in the chain is called on the result of the previous one, so a method that returns something other than str breaks the chain:
>>> text = 'Python'
>>>
>>> text = text.upper().startswith('P').replace('P', 'C')
Traceback (most recent call last):
AttributeError: 'bool' object has no attribute 'replace'
5.5.17. Use Case - 1
>>> text = 'cześć'
>>>
>>> text.find('ś')
3
>>> text[3]
'ś'
5.5.18. Use Case - 2
>>> line = '5.1,3.5,1.4,0.2,setosa'
>>>
>>> line.split(',')
['5.1', '3.5', '1.4', '0.2', 'setosa']
5.5.19. Use Case - 3
>>> line = '5.1,3.5,1.4,0.2,setosa\n'
>>>
>>> line.split(',')
['5.1', '3.5', '1.4', '0.2', 'setosa\n']
>>>
>>> line.strip().split(',')
['5.1', '3.5', '1.4', '0.2', 'setosa']
5.5.20. Use Case - 4
>>> data = ['5.1', '3.5', '1.4', '0.2', 'setosa']
>>>
>>> ','.join(data)
'5.1,3.5,1.4,0.2,setosa'
>>>
>>> ','.join(data) + '\n'
'5.1,3.5,1.4,0.2,setosa\n'
5.5.21. Use Case - 5
map(str, data) - apply function str() to every element in data
(str(x) for x in data) - apply function str() to every element in data
Both are equivalent
More info: Idiom Map
More info: Generator Expression
>>> data = [5.1, 3.5, 1.4, 0.2, 'setosa']
>>> ','.join(data)
Traceback (most recent call last):
TypeError: sequence item 0: expected str instance, float found
>>> ','.join(map(str,data))
'5.1,3.5,1.4,0.2,setosa'
>>> ','.join(str(x) for x in data)
'5.1,3.5,1.4,0.2,setosa'
5.5.22. Use Case - 6
>>> lvl = '[WARNING]'
>>> lvl.removeprefix('[').removesuffix(']')
'WARNING'
>>> lvl = '[WARNING]'
>>> lvl.replace('[', '').replace(']', '')
'WARNING'
>>> lvl = '[WARNING]'
>>> lvl.strip('[]')
'WARNING'
5.5.23. Use Case - 7
>>> line = '1969-07-21,02:56:15,WARNING,First step on the Moon'
>>>
>>> line.split(',', maxsplit=3)
['1969-07-21', '02:56:15', 'WARNING', 'First step on the Moon']
5.5.24. Use Case - 8
>>> line = '1969-07-21T02:56:15.123 [WARNING] First step on the Moon'
>>> dt, lvl, msg = line.split(maxsplit=2)
>>> dt.split('T')
['1969-07-21', '02:56:15.123']
>>> lvl.strip('[]')
'WARNING'
>>> msg.title()
'First Step On The Moon'
5.5.25. Use Case - 9
>>> DATA = """1969-07-14, 21:00:00, INFO, Terminal countdown started
... 1969-07-16, 13:31:53, WARNING, S-IC engine ignition (#5)
... 1969-07-16, 13:33:23, DEBUG, Maximum dynamic pressure (735.17 lb/ft^2)
... 1969-07-16, 13:34:44, WARNING, S-II ignition
... 1969-07-16, 13:35:17, DEBUG, Launch escape tower jettisoned
... 1969-07-16, 13:39:40, DEBUG, S-II center engine cutoff
... 1969-07-16, 16:22:13, INFO, Translunar injection
... 1969-07-16, 16:56:03, INFO, CSM docked with LM/S-IVB
... 1969-07-16, 17:21:50, INFO, Lunar orbit insertion ignition
... 1969-07-16, 21:43:36, INFO, Lunar orbit circularization ignition
... 1969-07-20, 17:44:00, INFO, CSM/LM undocked
... 1969-07-20, 20:05:05, WARNING, LM powered descent engine ignition
... 1969-07-20, 20:10:22, ERROR, LM 1202 alarm
... 1969-07-20, 20:14:18, ERROR, LM 1201 alarm
... 1969-07-20, 20:17:39, WARNING, LM lunar landing
... 1969-07-21, 02:39:33, DEBUG, EVA started (hatch open)
... 1969-07-21, 02:56:15, WARNING, 1st step taken lunar surface (CDR)
... 1969-07-21, 02:56:15, WARNING, Neil Armstrong first words on the Moon
... 1969-07-21, 03:05:58, DEBUG, Contingency sample collection started (CDR)
... 1969-07-21, 03:15:16, INFO, LMP on lunar surface
... 1969-07-21, 05:11:13, DEBUG, EVA ended (hatch closed)
... 1969-07-21, 17:54:00, WARNING, LM lunar liftoff ignition (LM APS)
... 1969-07-21, 21:35:00, INFO, CSM/LM docked
... 1969-07-22, 04:55:42, WARNING, Transearth injection ignition (SPS)
... 1969-07-24, 16:21:12, INFO, CM/SM separation
... 1969-07-24, 16:35:05, WARNING, Entry
... 1969-07-24, 16:50:35, WARNING, Splashdown (went to apex-down)
... 1969-07-24, 17:29, INFO, Crew egress"""
>>>
>>> DATA.splitlines()
['1969-07-14, 21:00:00, INFO, Terminal countdown started',
'1969-07-16, 13:31:53, WARNING, S-IC engine ignition (#5)',
'1969-07-16, 13:33:23, DEBUG, Maximum dynamic pressure (735.17 lb/ft^2)',
'1969-07-16, 13:34:44, WARNING, S-II ignition',
'1969-07-16, 13:35:17, DEBUG, Launch escape tower jettisoned',
'1969-07-16, 13:39:40, DEBUG, S-II center engine cutoff',
'1969-07-16, 16:22:13, INFO, Translunar injection',
'1969-07-16, 16:56:03, INFO, CSM docked with LM/S-IVB',
'1969-07-16, 17:21:50, INFO, Lunar orbit insertion ignition',
'1969-07-16, 21:43:36, INFO, Lunar orbit circularization ignition',
'1969-07-20, 17:44:00, INFO, CSM/LM undocked',
'1969-07-20, 20:05:05, WARNING, LM powered descent engine ignition',
'1969-07-20, 20:10:22, ERROR, LM 1202 alarm',
'1969-07-20, 20:14:18, ERROR, LM 1201 alarm',
'1969-07-20, 20:17:39, WARNING, LM lunar landing',
'1969-07-21, 02:39:33, DEBUG, EVA started (hatch open)',
'1969-07-21, 02:56:15, WARNING, 1st step taken lunar surface (CDR)',
'1969-07-21, 02:56:15, WARNING, Neil Armstrong first words on the Moon',
'1969-07-21, 03:05:58, DEBUG, Contingency sample collection started (CDR)',
'1969-07-21, 03:15:16, INFO, LMP on lunar surface',
'1969-07-21, 05:11:13, DEBUG, EVA ended (hatch closed)',
'1969-07-21, 17:54:00, WARNING, LM lunar liftoff ignition (LM APS)',
'1969-07-21, 21:35:00, INFO, CSM/LM docked',
'1969-07-22, 04:55:42, WARNING, Transearth injection ignition (SPS)',
'1969-07-24, 16:21:12, INFO, CM/SM separation',
'1969-07-24, 16:35:05, WARNING, Entry',
'1969-07-24, 16:50:35, WARNING, Splashdown (went to apex-down)',
'1969-07-24, 17:29, INFO, Crew egress']
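A possible next step is to split each line into fields. The sketch below uses only two lines of the log above and assumes that fields are always separated by a comma followed by a space:

```python
# Two lines taken from the log above
DATA = """1969-07-21, 02:56:15, WARNING, 1st step taken lunar surface (CDR)
1969-07-24, 17:29, INFO, Crew egress"""

result = []

for line in DATA.splitlines():
    # maxsplit=3 keeps any later ', ' inside the message intact
    date, time, level, message = line.split(', ', maxsplit=3)
    result.append([date, time, level, message])

print(result[0])
# ['1969-07-21', '02:56:15', 'WARNING', '1st step taken lunar surface (CDR)']
print(result[1])
# ['1969-07-24', '17:29', 'INFO', 'Crew egress']
```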
5.5.26. Use Case - 10
>>> DATA = 'ul. pANA tWARdoWSKiego 3'
>>>
>>> result = (
... DATA
...
... # Normalize
... .upper()
...
... # Remove whitespace control chars
... .replace('\n', ' ')
... .replace('\t', ' ')
... .replace('\v', ' ')
... .replace('\f', ' ')
...
... # Collapse repeated spaces
... .replace('    ', ' ')
... .replace('   ', ' ')
... .replace('  ', ' ')
...
... # Remove special characters
... .replace('$', '')
... .replace('@', '')
... .replace('#', '')
... .replace('^', '')
... .replace('&', '')
... .replace('.', '')
... .replace(',', '')
... .replace('|', '')
...
... # Remove prefixes
... .removeprefix('ULICA')
... .removeprefix('UL')
... .removeprefix('OSIEDLE')
... .removeprefix('OS')
...
... # Substitute
... .replace('3', 'III')
... .replace('2', 'II')
... .replace('1', 'I')
...
... # Format output
... .title()
... .replace('Iii', 'III')
... .replace('Ii', 'II')
... .strip()
... )
>>>
>>> print(result)
Pana Twardowskiego III
5.5.27. References
5.5.28. Assignments
# %% License
# - Copyright 2025, Matt Harasymczuk <matt@python3.info>
# - This code can be used only for learning by humans
# - This code cannot be used for teaching others
# - This code cannot be used for teaching LLMs and AI algorithms
# - This code cannot be used in commercial or proprietary products
# - This code cannot be distributed in any form
# - This code cannot be changed in any form outside of training course
# - This code cannot have its license changed
# - If you use this code in your product, you must open-source it under GPLv2
# - Exception can be granted only by the author
# %% Run
# - PyCharm: right-click in the editor and `Run Doctest in ...`
# - PyCharm: keyboard shortcut `Control + Shift + F10`
# - Terminal: `python -m doctest -v myfile.py`
# %% About
# - Name: Type Str Splitlines
# - Difficulty: easy
# - Lines: 1
# - Minutes: 2
# %% English
# 1. Split `DATA` by lines
# 2. Run doctests - all must succeed
# %% Polish
# 1. Podziel `DATA` po liniach
# 2. Uruchom doctesty - wszystkie muszą się powieść
# %% Tests
"""
>>> import sys; sys.tracebacklimit = 0
>>> assert sys.version_info >= (3, 9), \
'Python 3.9+ required'
>>> assert result is not Ellipsis, \
'Assign your result to variable `result`'
>>> assert len(result) == 3, \
'Variable `result` length should be 3'
>>> assert type(result) is list, \
'Variable `result` has invalid type, should be list'
>>> line = 'We choose to go to the Moon'
>>> assert line in result, f'Line "{line}" is not in the result'
>>> line = 'in this decade and do the other things.'
>>> assert line in result, f'Line "{line}" is not in the result'
>>> line = 'Not because they are easy, but because they are hard.'
>>> assert line in result, f'Line "{line}" is not in the result'
"""
DATA = """We choose to go to the Moon
in this decade and do the other things.
Not because they are easy, but because they are hard."""
# Split DATA by lines
# type: list[str]
result = ...
# %% About
# - Name: Type Str Join
# - Difficulty: easy
# - Lines: 1
# - Minutes: 3
# %% English
# 1. Join lines of text with newline (`\n`) character
# 2. Run doctests - all must succeed
# %% Polish
# 1. Połącz linie tekstu znakiem końca linii (`\n`)
# 2. Uruchom doctesty - wszystkie muszą się powieść
# %% Hints
# - `str.join()`
# %% Tests
"""
>>> import sys; sys.tracebacklimit = 0
>>> assert sys.version_info >= (3, 9), \
'Python 3.9+ required'
>>> assert result is not Ellipsis, \
'Assign your result to variable `result`'
>>> assert type(result) is str, \
'Variable `result` has invalid type, should be str'
>>> assert result.count('\\n') == 2, \
'There should be only two newline characters in result'
>>> line = 'We choose to go to the Moon'
>>> assert line in result, f'Line "{line}" is not in the result'
>>> line = 'in this decade and do the other things.'
>>> assert line in result, f'Line "{line}" is not in the result'
>>> line = 'Not because they are easy, but because they are hard.'
>>> assert line in result, f'Line "{line}" is not in the result'
"""
DATA = [
'We choose to go to the Moon',
'in this decade and do the other things.',
'Not because they are easy, but because they are hard.',
]
# Join DATA with newline (`\n`) character
# type: str
result = ...
# %% About
# - Name: Type Str Normalize
# - Difficulty: easy
# - Lines: 4
# - Minutes: 8
# %% English
# 1. Use `str` methods to clean `DATA`
# 2. Run doctests - all must succeed
# %% Polish
# 1. Wykorzystaj metody `str` do oczyszczenia `DATA`
# 2. Uruchom doctesty - wszystkie muszą się powieść
# %% Tests
"""
>>> import sys; sys.tracebacklimit = 0
>>> assert sys.version_info >= (3, 9), \
'Python 3.9+ required'
>>> from pprint import pprint
>>> assert result is not Ellipsis, \
'Assign your result to variable `result`'
>>> assert type(result) is str, \
'Variable `result` has invalid type, should be str'
>>> pprint(result)
'Pana Twardowskiego III'
"""
DATA = 'UL. pana \tTWArdoWskIEGO 3'
# Expected result: 'Pana Twardowskiego III'
# type: str
result = ...
# %% About
# - Name: Type Str Normalization
# - Difficulty: easy
# - Lines: 8
# - Minutes: 13
# %% English
# 1. Expected value is `Pana Twardowskiego III`
# 2. Use only `str` methods to clean each variable
# 3. Discuss how to create generic solution which fit all cases
# 4. Implementation of such generic function will be in
# `Function Arguments Clean` chapter
# 5. Run doctests - all must succeed
# %% Polish
# 1. Oczekiwana wartość `Pana Twardowskiego III`
# 2. Wykorzystaj tylko metody `str` do oczyszczenia każdej zmiennej
# 3. Przeprowadź dyskusję jak zrobić rozwiązanie generyczne pasujące
# do wszystkich przypadków
# 4. Implementacja takiej generycznej funkcji będzie w rozdziale
# `Function Arguments Clean`
# 5. Uruchom doctesty - wszystkie muszą się powieść
# %% Tests
"""
>>> import sys; sys.tracebacklimit = 0
>>> assert sys.version_info >= (3, 9), \
'Python 3.9+ required'
>>> assert example is not Ellipsis, \
'Assign your result to variable `example`'
>>> assert a is not Ellipsis, \
'Assign your result to variable `a`'
>>> assert b is not Ellipsis, \
'Assign your result to variable `b`'
>>> assert c is not Ellipsis, \
'Assign your result to variable `c`'
>>> assert d is not Ellipsis, \
'Assign your result to variable `d`'
>>> assert e is not Ellipsis, \
'Assign your result to variable `e`'
>>> assert f is not Ellipsis, \
'Assign your result to variable `f`'
>>> assert g is not Ellipsis, \
'Assign your result to variable `g`'
>>> assert type(example) is str, \
'Variable `example` has invalid type, should be str'
>>> assert type(a) is str, \
'Variable `a` has invalid type, should be str'
>>> assert type(b) is str, \
'Variable `b` has invalid type, should be str'
>>> assert type(c) is str, \
'Variable `c` has invalid type, should be str'
>>> assert type(d) is str, \
'Variable `d` has invalid type, should be str'
>>> assert type(e) is str, \
'Variable `e` has invalid type, should be str'
>>> assert type(f) is str, \
'Variable `f` has invalid type, should be str'
>>> assert type(g) is str, \
'Variable `g` has invalid type, should be str'
>>> example
'Pana Twardowskiego III'
>>> a
'Pana Twardowskiego III'
>>> b
'Pana Twardowskiego III'
>>> c
'Pana Twardowskiego III'
>>> d
'Pana Twardowskiego III'
>>> e
'Pana Twardowskiego III'
>>> f
'Pana Twardowskiego III'
>>> g
'Pana Twardowskiego III'
"""
EXAMPLE = 'UL. Pana \tTWArdoWskIEGO 3'
A = 'ulica Pana Twardowskiego III'
B = 'ul Pana Twardowskiego III'
C = 'ul. Pana Twardowskiego III'
D = 'Pana Twardowskiego 3'
E = ' Pana Twardowskiego III\t'
F = 'Pana\t Twardowskiego III'
G = 'Pana Twardowskiego III\n'
example = (
EXAMPLE
.upper()
.replace('UL. ', '')
.replace('\t', '')
.strip()
.title()
.replace('3', 'III')
)
# Expected result: 'Pana Twardowskiego III'
# type: str
a = ...
# Expected result: 'Pana Twardowskiego III'
# type: str
b = ...
# Expected result: 'Pana Twardowskiego III'
# type: str
c = ...
# Expected result: 'Pana Twardowskiego III'
# type: str
d = ...
# Expected result: 'Pana Twardowskiego III'
# type: str
e = ...
# Expected result: 'Pana Twardowskiego III'
# type: str
f = ...
# Expected result: 'Pana Twardowskiego III'
# type: str
g = ...