Alphasite

The programmers site

Estadísticas web

Estadisticas web

Publicidad

Languages

Google AdSense

Poll

Who's online

There are currently 0 users and 7 guests online.

Introduction to multithreaded programming

Before we start

In this article there isn't a single line of code and it's not aimed at teaching the intricacies of multithreaded programming in any given programming language but to give a short introduction, focusing primarily on how and especially why and when multithreaded programming would be useful.

So, what is multithreaded programming?

Nowadays all operating systems (except maybe the educational ones) are multitasking operating systems. This means they can run multiple tasks or processes at the same time. This has not always been so, the ms-dos was a monotasking system which means that only one task could run on the system simultaneously.

Before rushing to explain multithreaded programming is desirable to review certain features common to operating systems.

Multitasking and multiprocessing. The Scheduler.

When we talk about multitasking operating systems we say they can run multiple tasks simultaneously in parallel. This is not entirely true, just multiprocessor systems (with more than one processor) can make real multitasking, in a system with one processor we talk about the illusion of multitasking.

In both cases a mechanism of the operating system called scheduler intervenes. The sheduler function is to administer the different tasks that are in execution and to decide when each task executes and - if it is applicable - the processor in which it will be executed. When a task is in execution it has the whole processor power available for itself, it doesn't share it with any other one. The same thing happens with memory space, the task that is executing has the whole available memory of the system.

So, in order for real multitasking or at least the sensation of it to take place, the tasks have to share the processor, that is to say, the task that is currently executing, will have to get out and leave the processor to another task which will take its place, this - for wich the sheduler is also in charge - is called context switch and, among other things, implies a certain overload of resources (it is necessary to keep certain relative data to the process that comes out and to load other relative to the process that comes in).

Process memory space.

The process memory state is related with the concept of multitasking. Each process in a given instant has its own memory space, that is, the data of the process (variables, the stack, the code and the heap) are in a logical space separated from the rest so that, for the other processes all that data doesn't exist (in fact the rest of the processes are not "aware" that there are more processes, if they wanted to know about the rest of the processes running they should always look it up by means of a call to the operating system API).

Threads inside processes.

When we speak of multithreaded programming we are refering to the situation in wich, inside our own application, there are diverse tasks, that is, our application distributes the work among different tasks some way.

In this sense it is necessary to remark that, for the operating system, we continue being a single process (although the operating system "knows" that we are using several threads), and being so our slice time (the time assigned by the operating system to our process) keeps constant. On the other hand only one of our threads will be executing at a given time (for each processor available).

Where's the difference then? The difference resides in that, when a process stops to execute (due to a I/O read for example) and another process begins executing a context switch takes place, involving the waste of resources stated before. However when the same case takes place among threads that context switch doesn't take place. This is because threads that belong to the same process share the same memory space.

Why and where should we use multithreading?

Deciding when to go for multithreading is not an easy decision. In most cases the application is simple enough to make multithreading needless. In other cases where the aplication might be complex enough maybe the overload or the added complexity of the multithreading way (as multithreading programming is always more complex than sequencial) may cancel the possible advantages.

Generally speaking, the decision of using multiple threads must be taken when tasks exists with a high computational cost (that is, task in wich the computational cost will be clearly higher than a context switch). It will be desirable too, that any given task doesn't rely too much in the results or data provided by other tasks (as it will make no sense to program three task which will be continously wating one another) and when we foresee that they can also be "delayed" in certain moments during its execution (due to disk reads, database accesses ans do on) so that other threads may execute while that happens.

The main reason for the these points lie in what was previosly said about the executing model of the operating system.

  1. If the computational cost of the task is not high enough, the cost of a context switch will make any improvement due to the multithreading orientation.
  2. If a task depends on another one in a great deal (for example because it needs a result provided by another task) then there's little to gain as if task B needs the task B to finish in order to begin or continue executing then concurrency between both task will not occur. Task A will finish executing, then Task B will start executing (wich will be the same - or in fact worst - as if task A would finish the first step and itself execute the second one
  3. The last point is not so important, and may even be discarded in this days of growing multicore CPUs. In scenarios with just one CPU it may mean introducing additional complexity without any gain (as if threads never get blocked they will consume its time slice and then give way to another thread which will do the same). However, as it seems multicore CPU are becoming the new standard, such threads would execute in different cores (processors) so there would be a real improvement ther.

Blocked threads

When we talk about a blocked thread we are talking about a thread which is wating for an extern reource to complete an operation.

There are some operations that, due to the limitations of the resources accessed, are a lot slower than a normal CPU operation. That's specially true in the case of IO operations as well as accesses to remote machines. When a process (or thread) performs one of this operations it switch to a blocked state and another thread takes its place. This way, if we foresee our application can get blocked by one of this operations for a given time (while we're waiting for a database response for example), that time might be used to perform another work. If that is the case, we should consider dividing the application into different task to make the most of this "idle time" while we're accessing the database.

An example of performance improvement

Let's see an example of a situation in which using multiple threads will improve the efficiency of our application. Imagine we have to access several databases located on different machines and operate with the returned results. With a single thread the trace of execution will be:

  1. Open the connection to the first database
  2. Perform the query and wait for the results
  3. Wait for data
  4. Operate with the returned data
  5. Open the connection to the second database
  6. Perform the query and wait for the results
  7. Wait for data
  8. Operate with the returned data

However if we implement a thread by query to the database we'll be able to be make the querys in parallel, and will be obtaining our data concurrently from them.

  1. (Thread1) Open the connection to the first database
  2. (Thread1) Perform the query (Thread2) Open the connection to the second database
  3. (Thread2) Perform the query (Thread1) Wait for data
  4. (Thread1) Operate with the returned data (Thread2) Wait for data
  5. (Thread2) Operate with the returned data

Notice how performance increase even if we have just one processor as, while Thread1 blocks to query the database, Thread2 executes.

An example of conceptual separation.

Although until now we have been talking mainly about the concept of performance improvement by means of a differentiated thread use, sometimes it is interesting to separate the application in several threads although the yield gain is not very great if a suitable conceptual separation is obtained or if we want to guarantee that certain part of the application executes with a greater priority.

For example if the application has a section for loggin, a section for music management and a section for vital support, it seems advisable to separate it in 3 threads and to assign the highest priority to the part that handles the vital support.

Another example could be the design of a videogame in which, for example, we may have parts like the processing of the user input, the renderization process, the artificial intelligence subsystem... which conceptually seem like tasks that would have to be performed simultaneously instead of sequentially.

In addition to all that, we should never despise the possibility of the application being executed in machines with more than one processor (or with some technology like hyperthreading wich provides logical processing units). An application that wouldn't have anything to gain in a single processor scenario(because no input/output operations take place for example) could gain much if there are two processors available and each thread can execute in a different processor. In our videogame example, one thread could be calculating the trajectory of a falling body while the artificial intelligence system decides if it should activate the starboard thrusters.

Apendix

Context switches

In a multitasking operating system the processor "is distributed" between the different tasks in execution. Each task has a set of data associated wich is exclusive of this task. This data is grouped in a structure called PCB (process control block). In the PCB we can find, among other things:

  • The process identifier (pid).
  • The state of the process (in execution, ready, blocked)
  • Information about the input/output
  • Information about the memory management
  • The process priority
  • The state of the CPU registers including the program counter and stack pointer)

In addition to the PCB, part of the information of the process consists on the resident set. The resident set of a process is the portion of a process that exists in physical memory (opposed to the rest of the program which exists in swap.

When a context switch takes place, the PCB from the exiting process is saved, the BCP of the incoming process is loaded and the incoming process begins to execute. If the previous process had cleared data of the process that had just enter (because it needed space for its own data), the data of the entering process will be located in swap, in the hard disk, so it will be necessary to bring it to physical memory again, which is a slow operation. It's also possible that in the process of bringing that data from the hard drive we will need to remove of another process that will in turn need it in its time slice.

Introduction to operating system scheduling

Several reasons exist by which a process that is in execution can get blocked.

The process consumes its time slice

In most of operating systems, the time of the CPU it is organized in time slices, when a process enters to execute in the CPU he is assigned maximum time period which once consumed cause the process to be suspended to allow another process to execute. (This is a simplification, actual operating system have a much more complex way of distributing processes having into account priority, starvation and other things).

In order for this to take place there's a hardware mechanism that acts as a timer firing up what is called an interrupt, wich in place brings up the operating system task an allows it to block (given the case) the thread wich is executing and start executing some segment of the operating system itself which is in charge of deciding whether the current thread must give way to another one and in such case take the actions needed to remove the thread that was executing and start executing the new one.

A process or thread performs a blocking operation

If a process or thread currently in executing performs an IO operation, for example reading the disk, or accesing a memory location which is not currently in main memory, it will automatically get blocked while the data is fetches from the disk or the page of memory is brought to main memory.

The process raises an exception

Lastly, if a process raises an exception, by trying to access a privileged memory location, the thread or process will become blocked as the operating system handles the exception.

Reply

CAPTCHA
Esta pregunta permite identificar a los usuarios humanos de los robtos de spam de internet
Image CAPTCHA
Copy the characters (respecting upper/lower case) from the image.