POSIX Threads: Semi-FAQ Revision 5.2
© 2001-2006 Michael M. Lampkin
email: michael.lampkin<at>ieee.org
This section covers threads in general, the POSIX view of a thread and provides a quick comparison between threads and processes and the advantages of each.
A thread is defined by the POSIX standard to be a single flow of control within a process and the required system information and resource(s) to support that flow of control. To paraphrase and put it in simpler terms, the use of threading allows a single application to appear to perform multiple tasks (execute multiple threads) at the same time.
Notice that I said "appear to perform multiple tasks at the same time". This is called concurrency: multiple flows of control (in our case, multiple threads) are interleaved on a single CPU, with only one of them executing at any given instant. To a user or programmer, if the granularity of the interleaving is fine enough, the threads give the appearance of simultaneous execution even though execution is really occurring in a serial manner.
Parallelism, on the other hand, deals with multiple threads executing at exactly the same time. While a single-core CPU normally cannot perform parallel execution, there are many systems on the market with multiple CPUs, or single CPUs with multiple cores, which are easily capable of such a feat.
So, knowing in very basic terms what a thread is, the difference between concurrent and parallel execution, and the fact that POSIX threads are only required to exhibit concurrency, I should state for the record that most kernel space implementations of POSIX threads do provide for parallelism when there is more than one CPU available on a system. Not only do the implementations provide for parallelism, they provide it in a completely transparent fashion. This means that if you have an application that executes without error on a single CPU system, then the same application will execute and take advantage of parallel execution on multiple CPU systems without requiring modification of its source code.
The description / definition of a thread in section 3.1 makes a thread sound a lot like a process. To be honest, there IS a lot of similarity between the two but there are also many differences once we delve a bit deeper into how a thread is created.
Considering that, perhaps the easiest way to understand the differences between a thread and a process is to look at what information has to be initialized and maintained for a process and a thread when they are created. The following two lists are admittedly incomplete but hopefully will convey the intended point.
A child process created by fork( ) has:
A thread created (using pthread_create(...)) has:
Again, the above lists are incomplete and are meant only to convey a point, not to be definitive statements of what occurs when either of the given functions is called.
The point is that the creation of a process means creating a completely self-sufficient entity and copying over (almost) all the parent code, data, and so forth into a new segment of memory. Once created, the child is completely independent of its parent and modification of either one (in most cases) has no effect on the other. If the parent and child desire to share information, this results in the need to explicitly open routes of communication using sockets, pipes, shared memory or similar mechanisms between the two processes to facilitate it.
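The independence described above can be sketched in a few lines of C. This is only an illustration under stated assumptions: the helper name fork_demo and the values 1 and 42 are made up for the example. The child changes its own copy of a variable, the parent's copy is unaffected, and the only way the parent learns the child's value is through an explicitly created pipe.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper for illustration: forks a child that modifies its
 * own copy of `value` and reports it back through a pipe.
 * Returns 0 on success, -1 on failure. */
int fork_demo(int *parent_value, int *child_value)
{
    int fds[2];
    int value = 1;

    if (pipe(fds) == -1)
        return -1;

    pid_t pid = fork();
    if (pid == -1) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }

    if (pid == 0) {
        /* Child: works on its own copy of the parent's data. */
        close(fds[0]);
        value = 42;
        ssize_t written = write(fds[1], &value, sizeof value);
        close(fds[1]);
        _exit(written == (ssize_t)sizeof value ? 0 : 1);
    }

    /* Parent: the child's assignment is not visible here, so the
     * value has to be read explicitly from the pipe. */
    close(fds[1]);
    ssize_t got = read(fds[0], child_value, sizeof *child_value);
    close(fds[0]);

    int status;
    if (waitpid(pid, &status, 0) == -1)
        return -1;
    if (got != (ssize_t)sizeof *child_value)
        return -1;

    *parent_value = value;   /* still 1 in the parent */
    return 0;
}
```

Calling fork_demo(&p, &c) from main( ) should leave p at 1 and c at 42, assuming fork( ) and pipe( ) succeed.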
On the other hand, a thread is created and exists within the context of a process. This means that a thread and its parent process share the same memory, the same variables (following normal visibility rules), the same file descriptors, and so forth and do not require the use of any additional code to facilitate communication between all threads within a process. This also results in the lifetime of a thread being tied directly to the lifetime of its parent process, so that when the process terminates / exits for any reason, all of its contained threads must also exit.
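As a rough sketch of the sharing described above (the names set_to_42 and thread_demo are invented for this example), a thread can be handed the address of a variable in the creating thread and modify it directly. No pipe or other communication channel is needed, only a join to know when the write has happened.

```c
#include <pthread.h>
#include <stddef.h>

/* Thread start routine: writes through a pointer into memory that
 * belongs to the creating thread. */
static void *set_to_42(void *arg)
{
    int *shared = arg;
    *shared = 42;
    return NULL;
}

/* Hypothetical helper for illustration: returns the value of a local
 * variable after another thread has modified it, or -1 on failure. */
int thread_demo(void)
{
    int value = 1;
    pthread_t tid;

    if (pthread_create(&tid, NULL, set_to_42, &value) != 0)
        return -1;

    /* Joining guarantees the thread has finished before we read. */
    if (pthread_join(tid, NULL) != 0)
        return -1;

    return value;
}
```

On most systems this needs to be compiled with the compiler's thread option (for example, -pthread with gcc or clang). If both calls succeed, thread_demo( ) returns 42 rather than 1.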
Threads which are CPU bound can be thought of as having their execution speed tied directly to the execution speed of the CPU. For example, a thread which had the lone task of calculating the value of PI without performing any IO operations such as writing data to the screen or disk, waiting for input from a user and so forth, would be considered completely CPU bound.
Threads which are IO bound are on the opposite end of the spectrum when compared to threads which are CPU bound. This means that the speed at which an IO bound thread executes is not tied to the speed of the CPU on which it is executing; in fact, such a thread uses very few CPU cycles during its execution.
An example of this type of thread would be one which waited for input from a user and, once the input was received, performed no processing on it. Under those conditions, almost all of the thread's time will (likely) be spent waiting on human interaction, during which time the CPU would be idling or could readily be made available for use by other threads or processes on the system.
The term IO bound is a bit of a misnomer in many cases. While IO operations are very commonly what introduces the described behavior, the truth is that other common functions also introduce it. Two quick examples are the sleep( ) and wait( ) calls. Despite this, I will use the term IO bound as I find it less confusing than using the terms CPU bound and non-CPU bound.
There are a number of advantages to using a threaded model. Some that are commonly given are:
* Applications can be divided into multiple tasks that can then execute in parallel. A common example of this is an application where one thread is created to initialize, display and handle manipulations of a user interface while others are responsible for operations specified via the interface.
* Applications with multiple paths of (slow) input or output can be handled more effectively, without having to resort to system specific functions and tricks.
* On systems with multiple processors, a multi-threaded application will (normally) take advantage of the additional hardware and can result in greatly improved overall performance.
Along with the advantages of employing a threaded paradigm come several distinct disadvantages. A few of these are:
* Threaded applications are notorious for being harder to debug than their equivalent single process based applications. One reason for this is their susceptibility to Heisenberg anomalies (often called "heisenbugs"). Such anomalies can be induced by the very act of running the application in a debugger, which can change the thread execution sequence and consequently prevent the error you are attempting to track from occurring.
* The programming of multi-threaded applications typically requires a stricter adherence to proper programming disciplines. A primary reason for this is the simple fact that all threads within an application execute within the same address space, namely that of their parent process. Keeping this in mind, it is easy to see that the corruption of any data / memory by one thread has the potential of causing unpredictable behavior in another thread in the same process.
* If the number of CPU bound threads in a threaded application is greater than the number of processors on the system, the application will quite often run (significantly) SLOWER than an equivalent process based application. A threaded design in such a case adds little if anything that would allow one to rationalize the additional complexity it introduces.
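The second disadvantage above, the shared address space, is the one most often met in practice. As a minimal sketch (the names, thread count and iteration count are assumptions chosen for the example), several threads increment one shared counter. The mutex around the increment is the "programming discipline" in question. Without it, the read-modify-write sequences of different threads can overlap and updates can be lost.

```c
#include <pthread.h>
#include <stddef.h>

#define NUM_THREADS 4
#define INCREMENTS  100000

/* Data visible to every thread in the process. */
static long counter;
static pthread_mutex_t counter_lock = PTHREAD_MUTEX_INITIALIZER;

static void *increment_worker(void *arg)
{
    (void)arg;
    for (int i = 0; i < INCREMENTS; i++) {
        /* Only one thread at a time may update the counter. */
        pthread_mutex_lock(&counter_lock);
        counter++;
        pthread_mutex_unlock(&counter_lock);
    }
    return NULL;
}

/* Hypothetical helper: returns the final counter value,
 * or -1 if a thread could not be created. */
long run_counter_demo(void)
{
    pthread_t tids[NUM_THREADS];
    int started = 0;

    counter = 0;
    while (started < NUM_THREADS) {
        if (pthread_create(&tids[started], NULL, increment_worker, NULL) != 0)
            break;
        started++;
    }
    for (int i = 0; i < started; i++)
        pthread_join(tids[i], NULL);

    return started == NUM_THREADS ? counter : -1;
}
```

With the lock in place the result is always NUM_THREADS * INCREMENTS. Removing the lock and unlock calls typically produces a smaller, varying total on a multi-processor system, which is exactly the kind of error that is hard to reproduce in a debugger.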
Two general things I personally look for in an application when trying to decide if it would be a good candidate for threading are:
* The application performs some sort of potentially blocking IO operation and either needs to handle multiple such operations at the same time or can use the time during such an operation to perform some other task. A common example is a network server which handles simultaneous client connections.
* The application is performing a mostly CPU bound operation which can be divided into multiple non-dependent operations. A common example is an application performing matrix operations.
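The second case can be sketched with something simpler than a matrix operation: summing an array. This is an illustration under assumptions (the struct, the function names and the fixed count of four workers are invented for the example). Each thread works on its own slice and writes only to its own result field, so the pieces are non-dependent and no locking is needed until the partial sums are combined after the joins.

```c
#include <pthread.h>
#include <stddef.h>

#define NUM_WORKERS 4

/* One independent piece of the overall job. */
struct slice {
    const int *data;
    size_t     len;
    long       sum;   /* written only by the thread owning the slice */
};

static void *sum_slice(void *arg)
{
    struct slice *s = arg;
    long total = 0;
    for (size_t i = 0; i < s->len; i++)
        total += s->data[i];
    s->sum = total;
    return NULL;
}

/* Hypothetical helper: sums `len` ints using NUM_WORKERS threads.
 * Returns 0 and stores the total in *result, or -1 on failure. */
int parallel_sum(const int *data, size_t len, long *result)
{
    pthread_t    tids[NUM_WORKERS];
    struct slice slices[NUM_WORKERS];
    size_t       chunk = len / NUM_WORKERS;
    int          started = 0;
    int          rc = 0;

    for (int i = 0; i < NUM_WORKERS; i++) {
        size_t offset = (size_t)i * chunk;
        slices[i].data = data + offset;
        /* The last worker also takes any leftover elements. */
        slices[i].len  = (i == NUM_WORKERS - 1) ? len - offset : chunk;
        slices[i].sum  = 0;
        if (pthread_create(&tids[i], NULL, sum_slice, &slices[i]) != 0) {
            rc = -1;
            break;
        }
        started++;
    }

    long total = 0;
    for (int i = 0; i < started; i++) {
        if (pthread_join(tids[i], NULL) != 0)
            rc = -1;
        total += slices[i].sum;
    }

    if (rc == 0)
        *result = total;
    return rc;
}
```

For the array 1 through 10 this yields 55. In line with the disadvantages listed earlier, there is little to gain from choosing more workers than the system has processors for a purely CPU bound job like this one.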
Each application is different though so the best advice is to re-read the short list of advantages and disadvantages in sections 3.5 and 3.6 and then make a case by case evaluation based on them and your own experience.