16.1. Threading About

  • The advantage of threads is that they share state

  • One thread can write data to memory and another can read it without any communication overhead

  • The drawback is also the shared state, which leads to race conditions

  • The idea behind threads is cheap access to shared memory, but you have to put locks everywhere

  • Threads run very fast, but are hard to get correct

  • It's insanely difficult to create large multi-threaded programs with multiple locks

  • Even if you lock a resource, there is no protection if other parts of the system do not even try to acquire the lock

  • Threads switch preemptively

  • Preemptively means that the thread manager decides when to switch tasks for you (you don't have to explicitly say so). The programmer has to do very little.

  • This is convenient because you don't need to add explicit code to cause a task switch

  • The cost of this convenience is that you have to assume a switch can happen at any time

  • Accordingly, critical sections have to be guarded with locks

  • The limit on threads is total CPU power minus the cost of task switches and synchronization overhead

  • See also: "Why Should Async Get All The Love? Advanced Control Flow With Threads" [6]
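The point about guarding critical sections can be sketched with a shared counter. This is a minimal illustration, not a recommended design: the names `counter`, `counter_lock`, `add_many`, and `run` are made up for this example. Every thread that touches `counter` goes through the same lock, which is exactly the discipline the list above says is easy to break.

```python
import threading

counter = 0                      # shared state visible to every thread
counter_lock = threading.Lock()  # guards the critical section below


def add_many(times: int) -> None:
    """Increment the shared counter `times` times."""
    global counter
    for _ in range(times):
        # Without the lock, a thread switch between reading and writing
        # `counter` could lose an update.
        with counter_lock:
            counter += 1


def run(workers: int = 4, times: int = 10_000) -> int:
    """Start `workers` threads, wait for all of them, return the final count."""
    global counter
    counter = 0
    threads = [
        threading.Thread(target=add_many, args=(times,))
        for _ in range(workers)
    ]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return counter


if __name__ == "__main__":
    print(run())
```

Note that the lock only helps because `add_many` is the sole place that modifies `counter`. Any other code that wrote to it without acquiring `counter_lock` would bypass the protection entirely.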


16.1.1. Frequently Asked Questions

  1. What is a thread?

  2. How long does it take to create a thread?

  3. Who manages threads?

  4. How many threads can run in parallel?

  5. How many threads can exist within a single process?

  6. How do threads communicate with each other?

  7. Is sharing memory between threads good or bad?


Figure 16.4. Green: actual data transfer; blue: waiting; orange: domain name resolution, TLS handshake, etc. Source: Langa, Ł. import asyncio: Learn Python's AsyncIO [3]


Figure 16.5. Source: Michael Kennedy [2]

Every real operating system thread allocates a full-sized call stack, which is pure overhead. As a result, you cannot run hundreds of threads without sacrificing resources [1].

16.1.2. Daemon

Some threads do background tasks, such as sending keepalive packets or performing periodic garbage collection. These are only useful while the main program is running, and it's okay to kill them off once the other, non-daemon threads have exited.

Without daemon threads, you'd have to keep track of them, and tell them to exit, before your program can completely quit. By setting them as daemon threads, you can let them run and forget about them, and when your program quits, any daemon threads are killed automatically.
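As a rough sketch of the idea, the example below starts a never-ending background task as a daemon thread. The `keepalive` function, the `beats` list, and the interval values are hypothetical and exist only for illustration; the `daemon=True` argument of `threading.Thread` is the part that matters.

```python
import threading
import time

beats = []  # heartbeat timestamps collected by the background thread


def keepalive(interval: float) -> None:
    """Hypothetical background task: record a heartbeat forever."""
    while True:
        beats.append(time.monotonic())
        time.sleep(interval)


def start_keepalive(interval: float = 0.05) -> threading.Thread:
    """Start the heartbeat as a daemon thread and return it."""
    # daemon=True: the interpreter will not wait for this thread at exit
    worker = threading.Thread(target=keepalive, args=(interval,), daemon=True)
    worker.start()
    return worker


if __name__ == "__main__":
    worker = start_keepalive()
    time.sleep(0.2)  # the "main program" does its real work here
    print(worker.daemon, len(beats) > 0)
    # Reaching the end of the script ends the process even though
    # keepalive() never returns.
```

Because daemon threads are stopped abruptly at interpreter shutdown, this pattern fits tasks that hold no resources needing an orderly release, such as open files or pending database transactions.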

16.1.3. GIL

  • Global Interpreter Lock

  • CPython has a lock for its internal shared global state

  • One lock instead of hundreds of smaller ones

  • The unfortunate effect of the GIL is that no more than one thread can execute Python bytecode at a time

  • For I/O-bound applications, the GIL doesn't present much of an issue

  • For CPU-bound applications, using threads makes the application slower, because the threads contend for the GIL and add switching overhead

  • Accordingly, that drives us to multiprocessing to gain more CPU cycles

  • Larry Hastings's Gilectomy project removed the GIL, but Python slowed down

Source: [1]
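The I/O-bound case can be illustrated with a small timing sketch. It assumes that `time.sleep` is a fair stand-in for waiting on a network or disk, since CPython releases the GIL while a thread sleeps; `fake_download` and `timed_threads` are hypothetical names used only here.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def fake_download(delay: float) -> float:
    """Stand-in for an I/O-bound call; the GIL is released while sleeping."""
    time.sleep(delay)
    return delay


def timed_threads(delays: list) -> float:
    """Run all fake downloads in a thread pool and return elapsed seconds."""
    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=len(delays)) as pool:
        # list() forces every result to be collected before the timer stops
        list(pool.map(fake_download, delays))
    return time.perf_counter() - start


if __name__ == "__main__":
    delays = [0.2, 0.2, 0.2, 0.2]
    elapsed = timed_threads(delays)
    print(f"sequential would take about {sum(delays):.1f}s, "
          f"threads took {elapsed:.2f}s")
```

The waits overlap, so the total time is close to the longest single delay rather than the sum. Replacing the sleep with a pure-Python computation would not show the same gain, which is the CPU-bound case described above.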


Figure 16.6. Source: Michael Kennedy [2]

16.1.4. Thread-safety

Thread-safe code is code that will work even if many threads are executing it simultaneously. Writing it is a black art. It is extremely difficult to debug, since you can't reproduce all possible interactions between threads. You have to do it by logic. In a computer, something that happens only once in a billion times must be dealt with, because on average it will happen once a second. To write code that will run stably for weeks takes extreme paranoia [4].

A class is thread-safe if it behaves correctly when accessed from multiple threads, regardless of the scheduling or interleaving of the execution of those threads by the runtime environment, and with no additional synchronization or other coordination on the part of the calling code [5].
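One way to read the second definition is that the class owns its lock and callers never see it. The sketch below is an illustration under that assumption; `Account` and `drain` are hypothetical names, and a single internal `threading.Lock` is just one possible strategy.

```python
import threading


class Account:
    """Hypothetical account whose methods may be called from any thread."""

    def __init__(self, balance: int) -> None:
        self._balance = balance
        self._lock = threading.Lock()  # private: callers never touch it

    def withdraw(self, amount: int) -> bool:
        """Withdraw `amount` if funds allow; return whether it succeeded."""
        # The check and the update form one critical section, so two
        # threads cannot both pass the check against the same balance.
        with self._lock:
            if self._balance >= amount:
                self._balance -= amount
                return True
            return False

    @property
    def balance(self) -> int:
        with self._lock:
            return self._balance


def drain(account: Account, attempts: int, results: list) -> None:
    """Try to withdraw one unit `attempts` times, recording each outcome."""
    for _ in range(attempts):
        results.append(account.withdraw(1))
```

The calling code in `drain` uses no locks of its own, which matches the "no additional synchronization on the part of the calling code" requirement. If the balance check and the subtraction were not inside the same `with` block, two threads could both see sufficient funds and overdraw the account.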

16.1.5. References