What Exactly is Multithreading?


Exploring multithreading using Python's threading module

Overview

Multithreading is an important concept in computer architecture and programming. It is one of the most common techniques for increasing the number of tasks a computer can carry out in a given period of time and for making better use of its resources. This article explains:

  • how to run multiple threads concurrently
  • how to wait for a thread to finish execution
  • how to avoid concurrency issues, using a real-life situation as an analogy
  • and lastly, the use cases of multithreading

Introduction to Multithreading

First off, I am pretty sure you are reading this article with three or more programs open alongside your browser. Let me guess those programs: Slack, a media player and VS Code. You are able to do that because your operating system can perform multitasking.


Multitasking refers to the ability of an operating system to have more than one program (also called a task) open at one time. For example, multitasking allows a user to edit a spreadsheet file in one window while loading a Web page in another window or to retrieve new e-mail messages in one window while a word processing document is open in another window.


Another thing I am sure of is that you currently have more than one tab open.

Now, what is multithreading? It can simply be defined as executing two or more threads concurrently in order to improve performance and reduce execution time. You can think of multithreading as multitasking that occurs within a single program.

What is a Thread?

A thread is an entity within a process that can be scheduled for execution. It can be described as a sequence of instructions that can be executed independently of other code. The following points should be noted about threads:

  • Each thread has its own unique identifier, register set, stack pointer, parent process pointer and program counter.
  • Global variables are shared by all threads of a process, because they live in the process's memory rather than on any single thread's stack.
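The two points above can be observed with a short sketch. The names shared_messages, all_started and record_identity are invented for this illustration. A Barrier keeps the three threads alive at the same time, since Python may reuse an identifier once a thread has exited.

```python
import threading

shared_messages = []                 # global list: one copy, visible to every thread
all_started = threading.Barrier(3)   # keeps the three threads alive at the same time


def record_identity():
    all_started.wait()               # wait until all three threads are running
    current = threading.current_thread()
    # Each thread reports its own name and identifier into the shared list.
    shared_messages.append((current.name, threading.get_ident()))


workers = [
    threading.Thread(target=record_identity, name=f"worker-{n}") for n in range(3)
]
for worker in workers:
    worker.start()
for worker in workers:
    worker.join()

for name, ident in shared_messages:
    print(name, ident)
```

Every thread prints a different identifier, yet all of them wrote into the same global list.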

Examples of Multithreading

The majority of the applications you use have multiple threads running behind the scenes to reduce processing time, improve efficiency and, more importantly, make better use of the computing resources on high-end PCs.

Consider the following applications:

In VS Code, separate threads of execution handle your typing, check syntax asynchronously, and run the live server. These all happen concurrently, with independent threads performing the tasks internally.

Also, in your browser, you can have multiple tabs open, each one independent of the others. Multiple threads of execution are used to load content, display animations, play sound & video, and so on.

With that being said, let's look at how you can start a function on its own thread in Python.

With the threading module, you can take a function or method that performs a particular job and run it on another thread instead of calling it normally on the main thread of the process. Here is how it can be done 👇

import threading

like_counter = 0

def like():
    global like_counter
    like_counter += 1  # read, add, write back: not a single atomic step

def thread_task():
    # Each thread records 100000 likes.
    for _ in range(100000):
        like()

def main():
    global like_counter
    like_counter = 0
    t1 = threading.Thread(target=thread_task)
    t2 = threading.Thread(target=thread_task)

    t1.start()
    t2.start()

    t1.join()
    t2.join()

for i in range(10):
    main()
    print(f'Iteration {i} like counter: {like_counter}')


What's happening in the above code snippet?

The like_counter variable is global, and it is the shared resource of this program. We use the Thread class from the threading module to create two threads, t1 and t2. The thread_task function, which calls like() 100000 times, is passed as the target argument of the Thread class.

t1.start() & t2.start(): these tell the program to run thread_task on a separate thread instead of calling it normally, as in thread_task().

t1.join() & t2.join(): these make the main thread wait until both threads have finished before it moves on to the next instruction.
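To see what start() and join() do to the flow of the main thread, here is a small sketch. The slow_task function and the events list are invented for this illustration, and the sleep is only a stand-in for real work.

```python
import threading
import time

events = []  # records the order in which things happen


def slow_task():
    time.sleep(0.2)  # stand-in for real work
    events.append("task finished")


worker = threading.Thread(target=slow_task)
worker.start()   # returns immediately; slow_task runs on its own thread
events.append("main continues after start()")
worker.join()    # blocks the main thread until slow_task returns
events.append("main continues after join()")

print(events)
```

The main thread carries on right after start(), but it only gets past join() once the task has finished.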

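On executing the code above, you may get output like this one. The exact numbers change from run to run, and some Python versions hit the problem far less often, so you may even see 200000 every time: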

Iteration 0 like counter: 200000
Iteration 1 like counter: 200000
Iteration 2 like counter: 160179
Iteration 3 like counter: 200000
Iteration 4 like counter: 139116
Iteration 5 like counter: 200000
Iteration 6 like counter: 200000
Iteration 7 like counter: 200000
Iteration 8 like counter: 198254
Iteration 9 like counter: 170863

Below is a diagrammatic representation of the program showing the Thread Control Block (TCB).

[Figure: Thread Control Block (TCB) diagram of the program]

You would expect the final value of like_counter to be 200000 every time, but across the 10 iterations of main() we get several different values.

This happens because both threads access the shared variable like_counter at the same time. This unpredictability in its value is a race condition, which leads us to the next section of this article.

Multithreading Challenges

Race Condition

A race condition occurs when two or more threads read and write the same variable, or modify the same shared resource of a program, at the same time. This usually leaves the variable with an inconsistent value.

Given below is a diagram which shows how a race condition can occur in the above program:

[Figure: two threads incrementing like_counter at the same time]

NB: The expected value of like_counter in the above diagram is 102 but, due to the race condition, it turns out to be 101!
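The lost update can be replayed step by step in plain, single-threaded Python. This is only a simulation of one unlucky interleaving, assuming the counter starts at 100; the names t1_read and t2_read are invented to stand for what each thread has read.

```python
like_counter = 100

# Both "threads" read the counter before either one writes back.
t1_read = like_counter       # thread 1 reads 100
t2_read = like_counter       # thread 2 also reads 100 (thread 1 has not written yet)

like_counter = t1_read + 1   # thread 1 writes 101
like_counter = t2_read + 1   # thread 2 writes 101, overwriting thread 1's update

print(like_counter)          # 101, although two likes were recorded
```

One of the two likes is lost because the second write is based on a stale read.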

The issue described above can be avoided using an approach called thread synchronization.

Thread Synchronization

Thread synchronization is a mechanism that ensures two or more threads are never inside the critical section of a program at the same time.

A critical section is the part of a program where shared resources are accessed, such as global variables, common files or shared memory.

In the preceding program, the critical section is the body of the like function.

Lock

The Lock class is a synchronization tool used to avoid race conditions.

You just saw what happened to the shared variable like_counter in the preceding program. In this section, you are going to use a Lock object to prevent the race condition from happening.

A quick illustration of how a Lock object works:

In real life, locks are used to protect our belongings and to restrict access to a place or thing. You always lock the bathroom door when you are in because you don't want others to use the bathroom at the same time you are using it.

Unless probably you want to have a bath 🚿 together with someone else. You get the logic.

Another person interested in using the bathroom while you are in it won't be able to. What will he or she then do? Well, what most people do (or that's what I always do 😀) is hang around in the lobby, wait for a while before trying again, and keep doing so until the bathroom is available.

I hope you spot the recursion in this case. I'm sure you can also identify the base case and the recursive case.
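The wait-and-retry routine above can be sketched with a non-blocking acquire. This is only an illustration of the analogy: the names bathroom, occupied, first_person and second_person are invented, and the sleep durations are arbitrary.

```python
import threading
import time

bathroom = threading.Lock()
occupied = threading.Event()   # set once the first person is inside
attempts = 0


def first_person():
    with bathroom:             # lock the door
        occupied.set()
        time.sleep(0.3)        # take a bath
    # the door is unlocked when the with-block ends


def second_person():
    global attempts
    occupied.wait()            # arrive only after the bathroom is taken
    while True:
        attempts += 1
        if bathroom.acquire(blocking=False):  # try the door without waiting
            break
        time.sleep(0.05)       # hang around in the lobby, then try again
    bathroom.release()         # done; unlock the door for the next person


people = [
    threading.Thread(target=first_person),
    threading.Thread(target=second_person),
]
for person in people:
    person.start()
for person in people:
    person.join()

print(f"Second person got in after {attempts} attempts")
```

In everyday code you rarely write this loop yourself: a plain lock.acquire() blocks until the lock is free, which amounts to waiting at the door.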

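The Lock class works in a similar way. A thread that calls lock.acquire() while another thread holds the lock simply waits until the lock is released. Under the hood, it relies on a synchronization primitive provided by the operating system, such as a semaphore.

A semaphore is a synchronization object that controls how multiple processes or threads access a common resource in a parallel programming environment.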

import threading

like_counter = 0

def like():
    global like_counter
    like_counter += 1

def thread_task(lock):
    for _ in range(100000):
        lock.acquire()  # wait for the lock before entering the critical section
        like()
        lock.release()  # let the other thread in

def main():
    global like_counter
    like_counter = 0

    lock = threading.Lock()

    t1 = threading.Thread(target=thread_task, args=(lock,))
    t2 = threading.Thread(target=thread_task, args=(lock,))

    t1.start()
    t2.start()

    t1.join()
    t2.join()

for i in range(10):
    main()
    print(f'Iteration {i} like counter: {like_counter}')

🔒 lock.acquire() takes the lock, so any other thread that calls acquire() has to wait before it can touch the shared variable.

🔓 lock.release() gives the lock back, making the shared variable available to the next thread.

If you re-run the code now, you will get a consistent value for like_counter, as opposed to before.

Iteration 0 like counter: 200000
Iteration 1 like counter: 200000
Iteration 2 like counter: 200000
Iteration 3 like counter: 200000
Iteration 4 like counter: 200000
Iteration 5 like counter: 200000
Iteration 6 like counter: 200000
Iteration 7 like counter: 200000
Iteration 8 like counter: 200000
Iteration 9 like counter: 200000
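A Lock can also be used as a context manager, which releases the lock even if the protected code raises an exception. The sketch below is a variation on the program above, not a replacement for it: it moves the lock to module level and takes it inside like().

```python
import threading

like_counter = 0
lock = threading.Lock()


def like():
    global like_counter
    with lock:  # acquired here, released when the block exits
        like_counter += 1


def thread_task():
    for _ in range(100000):
        like()


threads = [threading.Thread(target=thread_task) for _ in range(2)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(like_counter)  # 200000
```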

Use Cases of Multithreading

  • To keep a GUI application responsive: You might have clicked a button in an application and watched the whole interface freeze, only to come back to life a few seconds later. That usually happens because the application runs on a single thread (the main thread), and the function triggered by the click blocks that thread, which every component of the application depends on.


Multiple threads can be used to solve the issue.

Consider a GUI application that fetches data from an Application Programming Interface (API):

  • The GUI can be served by the main thread
  • Another thread can be used to fetch the data from the API
  • Another thread can be used to send the result to the main thread

By doing so, application responsiveness is improved as executing one task/function does not block the execution of the other.
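Here is a minimal sketch of that idea without any GUI toolkit. The fetch_data function is a hypothetical stand-in for an API call (no network request is made), and the while loop plays the role of the GUI's event loop. A queue.Queue carries the result back to the main thread.

```python
import queue
import threading
import time

results = queue.Queue()  # thread-safe channel from the worker to the main thread


def fetch_data():
    # Hypothetical stand-in for a slow API call; no real request is made.
    time.sleep(0.2)
    results.put({"status": "ok", "items": [1, 2, 3]})


worker = threading.Thread(target=fetch_data, daemon=True)
worker.start()

ticks = 0
while True:
    try:
        data = results.get(timeout=0.05)  # check briefly for a result
        break
    except queue.Empty:
        ticks += 1  # the "GUI" stays free to do other work

print(f"Main thread stayed responsive for {ticks} ticks, then received {data}")
```

The main thread never sits inside the slow call, so it can keep redrawing and reacting to clicks while the data is on its way.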

To gain more insight into doing something like this with Tkinter and Python, click here

  • To handle website traffic: A web server uses multiple threads to process requests concurrently. This allows many users to use a website at the same time.
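As a rough sketch of the web server case, a thread pool can serve several requests at once. The handle_request function and the user names are invented for this illustration, and the sleep stands in for database or disk I/O; a real server would do this inside its framework.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def handle_request(user):
    # Hypothetical request handler; the sleep stands in for database or disk I/O.
    time.sleep(0.2)
    return f"response for {user}"


users = ["ada", "grace", "linus", "guido"]

started = time.perf_counter()
with ThreadPoolExecutor(max_workers=4) as pool:
    responses = list(pool.map(handle_request, users))  # results keep the input order
elapsed = time.perf_counter() - started

print(responses)
print(f"Handled {len(users)} requests in {elapsed:.2f}s")
```

Handled one after another, the four requests would take about 0.8 seconds; with four threads they overlap while waiting, so the total stays close to the time of a single request.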

Final Thoughts

You are already aware that your operating system can run multiple programs at the same time. In this article, you learnt how multiple tasks can be done concurrently within a program, how to avoid the issues that come with multithreading, and how multithreading has made our lives easier both as developers and as users of computers and software.

As mentioned in the article, the main purpose of utilizing multithreading in a program is to improve the performance. However, adding multiple threads makes a program more complex thereby making it difficult to debug, test and extend.

In the next article of this series, you will learn how to avoid race conditions in Django.

Reference

  1. Deborah Morley & Charles S. Parker (2015). Understanding Computers: Today and Tomorrow, 15th Edition. Cengage Learning.
  2. Multithreading in Python can be retrieved from here