13.2 The threading Module in Python

The threading module in Python provides the tools for creating and managing threads, so that a program can work on several tasks concurrently. It lets you spawn threads, control their execution, and synchronize them when necessary, which makes it well suited to I/O-bound operations and background tasks. As Section 13.2.7 explains, threads in standard Python take turns executing Python code rather than running it in parallel on multiple cores.

In this section, we will dive deeper into the functionality provided by the threading module, covering important classes, methods, and techniques for managing threads in Python.


13.2.1 Basic Thread Class

The threading module provides a Thread class that represents an individual thread of execution. You can create a thread by either:

  • Passing a target function to the Thread class, or
  • Subclassing the Thread class and overriding the run() method.

Example 1: Creating a Thread Using a Target Function

import threading
import time

def print_numbers():
    for i in range(5):
        time.sleep(1)
        print(f"Thread: {i}")

# Create a thread by passing a target function
thread = threading.Thread(target=print_numbers)
thread.start()

# Main thread continues running
for i in range(5):
    time.sleep(1)
    print(f"Main thread: {i}")

In this example:

  • A new thread is created by passing the print_numbers() function as the target to the Thread class.
  • The thread starts running when thread.start() is called.
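A target function usually needs input. The following is a minimal sketch of passing arguments through the args and kwargs parameters of Thread; the greet function and the results list are hypothetical names chosen for illustration.

```python
import threading

results = []  # Collects output from the worker so the main thread can read it

def greet(name, punctuation="!"):
    # Runs in the worker thread with the arguments supplied at construction time
    results.append(f"Hello, {name}{punctuation}")

# args supplies positional arguments (note the trailing comma for a one-item tuple);
# kwargs supplies keyword arguments
thread = threading.Thread(target=greet, args=("Ada",), kwargs={"punctuation": "?"})
thread.start()
thread.join()  # Wait so that results is populated before it is read

print(results)
```

Writing args=("Ada") without the trailing comma would pass a plain string, which Thread would then unpack character by character, so the one-item tuple form matters.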

Example 2: Creating a Thread by Subclassing Thread

import threading
import time

# Define a class that subclasses Thread
class PrintNumbersThread(threading.Thread):
    def run(self):
        for i in range(5):
            time.sleep(1)
            print(f"Thread: {i}")

# Create and start the custom thread
thread = PrintNumbersThread()
thread.start()

# Main thread continues running
for i in range(5):
    time.sleep(1)
    print(f"Main thread: {i}")

In this example:

  • The PrintNumbersThread class subclasses Thread and overrides the run() method, which contains the logic to be executed when the thread starts.
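Subclassing is most useful when the thread needs its own state. The sketch below assumes a hypothetical SquareThread class that takes an input in __init__ and stores its result on the instance; note that a subclass overriding __init__ must call the base class constructor before doing anything else with the thread.

```python
import threading

class SquareThread(threading.Thread):
    def __init__(self, value):
        super().__init__()   # Required: initializes the underlying Thread machinery
        self.value = value
        self.result = None   # Filled in by run()

    def run(self):
        # Executed in the new thread once start() is called
        self.result = self.value ** 2

thread = SquareThread(7)
thread.start()
thread.join()                # After join(), run() has finished and result is set

print(thread.result)
```

Since run() cannot return a value to the caller of start(), storing the result on the instance is one simple way to hand it back to the main thread.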

13.2.2 Controlling Thread Execution

The threading module provides various methods to control and manage thread execution.

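  • start(): Starts the thread’s activity by arranging for its run() method to be invoked in a separate thread of control. It can be called at most once per thread object.
  • join([timeout]): Waits for the thread to finish execution. If a timeout (in seconds) is given, join() returns when the thread finishes or the timeout elapses, whichever comes first. It always returns None, so call is_alive() afterward to find out which one happened.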
  • is_alive(): Returns True if the thread is still running, and False otherwise.
  • name: The name of the thread, which can be set or retrieved.

Example: Using join() and is_alive()

import threading
import time

def print_numbers():
    for i in range(3):
        time.sleep(1)
        print(f"Thread: {i}")

# Create and start the thread
thread = threading.Thread(target=print_numbers)
thread.start()

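# Check the worker's state before waiting for it
print(f"Alive before join(): {thread.is_alive()}")

# Wait for the thread to finish
thread.join()  # Main thread blocks until the worker completes
print(f"Alive after join(): {thread.is_alive()}")
print("Thread has finished, continuing in the main thread.")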

In this example:

  • thread.join() makes the main thread wait for the worker thread to finish before continuing.
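The methods listed above can also be combined to wait for a bounded time. This is a minimal sketch, assuming a hypothetical slow_task function and thread name, that shows join() with a timeout followed by is_alive() to check whether the worker actually finished.

```python
import threading
import time

def slow_task():
    time.sleep(0.5)  # Stands in for work that takes a while

worker = threading.Thread(target=slow_task, name="slow-worker")
worker.start()

# join() returns after roughly 0.1 s even though the worker needs 0.5 s
worker.join(timeout=0.1)
alive_after_timeout = worker.is_alive()
print(f"{worker.name} still running after timeout: {alive_after_timeout}")

# A second join() without a timeout waits for the worker to complete
worker.join()
alive_after_join = worker.is_alive()
print(f"{worker.name} still running after join: {alive_after_join}")
```

Because join() returns None in both cases, the is_alive() check is the only way to tell a timeout apart from a normal finish.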

13.2.3 Synchronization Primitives

When multiple threads share data or resources, race conditions can occur if threads access or modify the same data at the same time. The threading module provides several synchronization primitives to prevent these issues.

Locks

A lock is the simplest synchronization primitive. A thread can acquire a lock before accessing a shared resource, and release it afterward, ensuring that only one thread at a time can access the resource.

  • Lock(): Creates a new lock object.
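  • acquire(blocking=True, timeout=-1): Acquires the lock, by default blocking until it becomes available. If blocking is set to False, the method returns immediately: True if the lock was acquired, False if it was not. A non-negative timeout limits how many seconds a blocking call will wait.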
  • release(): Releases the lock.

Example: Using Locks to Prevent Race Conditions

import threading

# Shared resource
counter = 0
lock = threading.Lock()

def increment_counter():
    global counter
    for _ in range(1000):
        # Acquire the lock before modifying the shared resource
        with lock:
            counter += 1

# Create two threads that increment the counter
thread1 = threading.Thread(target=increment_counter)
thread2 = threading.Thread(target=increment_counter)

# Start the threads
thread1.start()
thread2.start()

# Wait for both threads to finish
thread1.join()
thread2.join()

print(f"Final counter value: {counter}")  # Output: Final counter value: 2000

In this example:

  • The with lock statement ensures that only one thread at a time can modify the shared counter variable, preventing race conditions.
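The non-blocking form of acquire() described earlier can be sketched in a single thread. The variable names below are illustrative only; the point is that acquire(blocking=False) reports failure by returning False instead of waiting.

```python
import threading

lock = threading.Lock()

first = lock.acquire()                  # Lock is free: acquired, returns True
second = lock.acquire(blocking=False)   # Lock is already held: returns False at once
lock.release()                          # Release the single successful acquisition

third = lock.acquire(blocking=False)    # Lock is free again: returns True
lock.release()

print(first, second, third)
```

A non-blocking attempt like this is handy when a thread has other useful work to do and would rather skip the shared resource than wait for it. Unlike the with statement, manual acquire() calls must be paired with release() by hand, typically in a try/finally block.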

13.2.4 Other Synchronization Primitives

The threading module provides additional synchronization primitives for more complex scenarios.

RLock (Reentrant Lock)

An RLock (Reentrant Lock) allows a thread to acquire the same lock multiple times. The thread must release the lock the same number of times it acquired it before any other thread can acquire the lock.

import threading

rlock = threading.RLock()

def task():
    with rlock:
        print("Acquired lock")
        with rlock:  # Reentrant locking
            print("Re-acquired lock")

thread = threading.Thread(target=task)
thread.start()
thread.join()
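Reentrancy matters most when one locked method calls another locked method on the same object. The following sketch assumes a hypothetical Account class; with a plain Lock, deposit_twice() would deadlock on its inner call to deposit(), while an RLock lets the owning thread pass through.

```python
import threading

class Account:
    def __init__(self, balance=0):
        self._lock = threading.RLock()
        self.balance = balance

    def deposit(self, amount):
        with self._lock:
            self.balance += amount

    def deposit_twice(self, amount):
        with self._lock:          # First acquisition by this thread
            self.deposit(amount)  # Re-acquires the same RLock without blocking
            self.deposit(amount)

account = Account()
threads = [threading.Thread(target=account.deposit_twice, args=(10,)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(account.balance)
```

Four threads each deposit 10 twice, so the final balance is 80. Other threads still have to wait until the owning thread has released the lock as many times as it acquired it.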

Condition

A Condition allows one or more threads to wait until they are notified that a particular condition has been met. It is often used in conjunction with a lock to synchronize access to shared resources.

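  • wait(): Releases the lock, blocks until another thread calls notify() or notify_all() (or an optional timeout expires), and then re-acquires the lock before returning.
  • notify(): Wakes up one of the waiting threads. The calling thread must hold the lock.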
  • notify_all(): Wakes up all waiting threads.

Example: Using Condition for Synchronization

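import threading

condition = threading.Condition()
item_ready = False  # Shared state that the condition protects

def consumer():
    with condition:
        print("Consumer is waiting.")
        # Re-check the flag in a loop: without it, the consumer would block
        # forever if the producer had already called notify() before wait()
        while not item_ready:
            condition.wait()
        print("Consumer is proceeding.")

def producer():
    global item_ready
    with condition:
        print("Producer is producing.")
        item_ready = True
        condition.notify()  # Notify the consumer to proceed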

# Create and start the threads
consumer_thread = threading.Thread(target=consumer)
producer_thread = threading.Thread(target=producer)

consumer_thread.start()
producer_thread.start()

# Wait for the threads to finish
consumer_thread.join()
producer_thread.join()

In this example:

  • The consumer thread waits for the condition to be notified by the producer thread.
  • When the producer calls notify(), the consumer is unblocked and continues its execution.
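A notification is not remembered: if notify() runs before the other thread reaches wait(), the wake-up is lost. A common safeguard is to tie the wait to shared state. The sketch below uses Condition.wait_for(), which takes a predicate and keeps waiting until it is true; the items and received lists are hypothetical names for illustration.

```python
import threading

condition = threading.Condition()
items = []      # Shared state guarded by the condition's lock
received = []   # What the consumer picked up

def consumer():
    with condition:
        # wait_for() checks the predicate first and re-checks it after every
        # wake-up, so it works whether or not the producer has already run
        condition.wait_for(lambda: len(items) > 0)
        received.append(items.pop())

def producer():
    with condition:
        items.append("payload")
        condition.notify()

consumer_thread = threading.Thread(target=consumer)
producer_thread = threading.Thread(target=producer)
consumer_thread.start()
producer_thread.start()
consumer_thread.join()
producer_thread.join()

print(received)
```

Because the consumer decides based on the contents of items rather than on the notification alone, the program finishes correctly regardless of which thread runs first.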

13.2.5 Thread Communication with Queue

When threads need to communicate with each other, the queue.Queue class provides a thread-safe way to exchange data between threads. Queues ensure that data is passed between threads in a safe and orderly fashion without requiring explicit locks.

  • put(item): Adds an item to the queue.
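  • get(): Removes and returns an item from the queue, blocking by default until one is available.
  • task_done(): Signals that a formerly enqueued task is complete. It should be called once for each item obtained with get().
  • join(): Blocks until every item that was put into the queue has been retrieved and marked as done with task_done().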

Example: Using a Queue to Communicate Between Threads

import threading
import queue
import time

# Create a queue
q = queue.Queue()

def producer():
    for i in range(5):
        time.sleep(1)
        item = f"Item {i}"
        q.put(item)
        print(f"Produced: {item}")

def consumer():
    while True:
        item = q.get()  # Blocks until an item is available
        if item is None:
            break
        print(f"Consumed: {item}")
        q.task_done()  # Signal that the task is done

# Create and start the threads
producer_thread = threading.Thread(target=producer)
consumer_thread = threading.Thread(target=consumer)

producer_thread.start()
consumer_thread.start()

# Wait for the producer to finish
producer_thread.join()

# Send a signal to the consumer to exit
q.put(None)
consumer_thread.join()

In this example:

  • The producer thread adds items to the queue using q.put().
  • The consumer thread retrieves items from the queue using q.get() and processes them.
  • After the producer thread has finished, the main thread puts None on the queue as a sentinel that tells the consumer to stop.
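The task_done() and join() methods are most useful with a pool of workers. This is a minimal sketch, assuming three hypothetical worker threads that square numbers; tasks.join() lets the main thread wait until every queued item has been processed before it shuts the workers down.

```python
import queue
import threading

tasks = queue.Queue()
results = []
results_lock = threading.Lock()  # Protects the shared results list

def worker():
    while True:
        n = tasks.get()          # Blocks until an item is available
        try:
            if n is None:        # Sentinel: no more work for this worker
                break
            with results_lock:
                results.append(n * n)
        finally:
            tasks.task_done()    # Exactly one task_done() per get(), sentinel included

workers = [threading.Thread(target=worker) for _ in range(3)]
for w in workers:
    w.start()

for n in range(10):
    tasks.put(n)

tasks.join()                     # Returns once all ten items are marked as done

for _ in workers:
    tasks.put(None)              # One sentinel per worker
for w in workers:
    w.join()

print(sorted(results))
```

The results are sorted before printing because the order in which workers finish is not deterministic.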

13.2.6 Daemon Threads and Background Tasks

Daemon threads are background threads that do not keep the program alive. Once all non-daemon threads have finished, the program exits and any remaining daemon threads are stopped abruptly, so they may not get the chance to release resources such as open files.

Example: Creating a Daemon Thread

import threading
import time

def background_task():
    while True:
        print("Background task running...")
        time.sleep(1)

# Create a daemon thread (the flag must be set before start() is called)
daemon_thread = threading.Thread(target=background_task)
daemon_thread.daemon = True
daemon_thread.start()

# Main thread continues for a few seconds
time.sleep(5)
print("Main thread exiting, background task will be terminated.")

In this example:

  • The daemon thread runs a background task. Because it is the only other thread, the program exits as soon as the main thread finishes, and the daemon thread is terminated along with it.
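The daemon flag can also be passed to the Thread constructor, and it cannot be changed once the thread is running. The sketch below illustrates both points; heartbeat and daemon_error are hypothetical names used only for this example.

```python
import threading
import time

def heartbeat():
    while True:
        time.sleep(0.05)  # Stands in for periodic background work

# daemon=True in the constructor is equivalent to setting thread.daemon = True
background = threading.Thread(target=heartbeat, daemon=True)
background.start()

print(f"Background thread is daemon: {background.daemon}")
print(f"Main thread is daemon: {threading.main_thread().daemon}")

# Changing the flag after start() raises RuntimeError
daemon_error = None
try:
    background.daemon = False
except RuntimeError as exc:
    daemon_error = exc

print(f"Error when changing the flag after start(): {daemon_error!r}")
```

Threads inherit the daemon flag from the thread that creates them, so threads started from the main thread are non-daemon unless the flag is set explicitly.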

13.2.7 Thread Safety and the Global Interpreter Lock (GIL)

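In CPython, the standard Python implementation, the Global Interpreter Lock (GIL) prevents multiple threads from executing Python bytecode simultaneously. This limits the effectiveness of multithreading for CPU-bound tasks, as only one thread can execute Python bytecode at a time, even on multi-core systems. Starting with Python 3.13, CPython also offers an optional free-threaded build without the GIL, but the default build still includes it.

However, for I/O-bound tasks, threading can still be beneficial: threads spend most of their time waiting for input/output operations to complete rather than executing Python code, and the GIL is released while a thread waits, so other threads can run in the meantime.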

For CPU-bound tasks, consider using the multiprocessing module instead of threading to take full advantage of multiple cores.
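The benefit for I/O-bound work can be seen with a small timing sketch. It assumes a hypothetical fake_io function that uses time.sleep() as a stand-in for a blocking I/O call (sleeping releases the GIL, as blocking I/O does); exact timings will vary from machine to machine.

```python
import threading
import time

def fake_io(delay=0.2):
    time.sleep(delay)  # Stand-in for a network or disk wait; releases the GIL

# Run four simulated I/O calls one after another
start = time.perf_counter()
for _ in range(4):
    fake_io()
sequential = time.perf_counter() - start

# Run the same four calls in four threads
start = time.perf_counter()
threads = [threading.Thread(target=fake_io) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
threaded = time.perf_counter() - start

print(f"Sequential: {sequential:.2f} s")
print(f"Threaded:   {threaded:.2f} s")
```

The sequential version takes roughly the sum of the delays, while the threaded version takes roughly as long as a single delay because the waits overlap. Replacing the sleep with a pure-Python computation would remove most of that advantage on a standard CPython build.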


13.2.8 Summary

The threading module in Python provides tools to efficiently manage and synchronize threads. It allows you to run tasks concurrently, communicate between threads, and handle synchronization issues that arise when threads share resources.

  • Thread Class: Represents a single thread of execution.
  • Synchronization Primitives: Locks, RLocks, and Conditions are used to synchronize access to shared resources.
  • Queues: Provide a thread-safe way for threads to communicate and share data.
  • Daemon Threads: Run in the background and do not block program termination.
  • Global Interpreter Lock (GIL): In CPython, limits parallel execution of Python bytecode, which mainly affects CPU-bound tasks.

Threading is particularly useful for I/O-bound tasks, background processing, and concurrent execution in scenarios where tasks can be run independently of each other. Understanding how to manage threads, synchronize their actions, and communicate between them is crucial for writing efficient, concurrent Python programs.