Chapter 7. POSIX Threads (pthreads) Implementations

A thread is a sequence of instructions executed within a program. A normal UNIX process consists of a single thread of execution, along with system resources (such as open files) and a virtual address space. The overhead associated with process creation, destruction, and context switching led to the development of various “lightweight process” and threading libraries. These libraries reduce that overhead by having threads share resources, so the operating system has less work to do when a thread is created, destroyed, or switched.

Historically, various vendors have implemented their own proprietary versions of lightweight processes and threads. For example, IRIX implemented shared lightweight processes, or sprocs. These implementations differed substantially from each other, making it difficult for programmers to develop portable applications.

In 1995 the IEEE provided a standardized thread based programming interface, POSIX 1003.1c (also known as ISO/IEC 9945-1:1996), referred to as POSIX threads or P-threads. The standard provides a variety of application programming interfaces that fall into three categories:

This chapter outlines differences between the Pthreads implementations on IRIX 6.5 and the latest version of ProPack. Note that the Linux information depends heavily on the version of the kernel and threading library in use. The ProPack 2.4 release and glibc 2.2.4 supported the LinuxThreads library. The ProPack 3.0 release and glibc 2.3+ support the Native POSIX Thread Library for Linux (NPTL).

Implementation Differences

As of the IRIX 6.5.20 release, IRIX is Unix98 conformant and fully compliant with the POSIX 1003.1c standard. It implements an M:N threading model in which M threads are mapped onto N kernel processes. This makes it possible to create both kernel-level and user-level threads and to switch quickly between thousands of them. At the same time, it complicates the implementation.

LinuxThreads (http://pauillac.inria.fr/~xleroy/linuxthreads), on the other hand, adopts a 1:1 threading model where each thread is mapped onto a kernel process. Although this should, in theory, increase switching times, the LinuxThreads designers point to the overall low switching overhead of the Linux kernel. They also point to a simplified design that performs well when most threads are blocked or when there are not many runnable threads. While LinuxThreads does implement all of the APIs from the POSIX 1003.1c standard, it does not conform to the standard in the area of signal handling.

The Native POSIX Thread Library (described in http://people.redhat.com/drepper/nptl-design.pdf) provides performance improvements and increased scalability. It aims to overcome most of the deficiencies of LinuxThreads while remaining as compatible as possible with the LinuxThreads API. It also uses a 1:1 (rather than M:N) threading model, but it corrects many of the signal handling issues in LinuxThreads and thus conforms much more closely to the standard. Applications that rely on behavior where the LinuxThreads implementation deviates from the POSIX standard will need to be fixed. These behavior differences include the following:

  • Signal handling has changed from per-thread signal handling to POSIX process signal handling.

  • getpid() returns the same value in all threads.

  • Thread handlers registered with pthread_atfork are not run if vfork() is used.

  • There is no manager thread.

If an application does not work properly with NPTL, it can be run using the old LinuxThreads implementation by setting the following environment variable:

LD_ASSUME_KERNEL=kernel-version

The following versions are available:

2.4.19 -- LinuxThreads with floating stacks

Note that software using errno, h_errno, and _res must #include the appropriate header file (errno.h, netdb.h, and resolv.h respectively) before they are used. However, LD_ASSUME_KERNEL=2.4.19 can be used as a workaround until the software can be fixed.

Differences in Cancellation

Cancellation is the mechanism by which a thread can send a request to terminate the execution of another thread. Depending on its settings, the target thread can ignore the request, honor it immediately, or defer it until it reaches a cancellation point. Cancellation points are those points in the program execution where a test for pending cancellation requests is performed; if a request is pending, the thread is cancelled there.

Under IRIX the following functions are cancellation points:

accept(2)
aio_suspend(3)
close(2)
connect(2)
creat(2)
fcntl(2)
fsync(2)
getmsg(2)
getpmsg(2)
lockf(3C)
mq_receive
mq_send
msgrcv(2)
msgsnd(2)
msync(2)
nanosleep(2)
open(2)
pause(2)
poll(2)
pread(2)
pthread_cond_timedwait(3P)
pthread_cond_wait(3P)
pthread_join(3P)
pthread_testcancel(3P)
putmsg(2)
putpmsg(2)
pwrite(2)
read(2)
readv(2)
recv(2)
recvfrom(2)
recvmsg(2)
select(2)
sem_wait
semop(2)
send(2)
sendmsg(2)
sendto(2)
sigpause(2)
sigsuspend(2)
sigtimedwait(3)
sigwait(3)
sigwaitinfo(3)
sleep(3C)
system(3S)
tcdrain(3t)
usleep(3C)
wait(2)
wait3(2)
waitid(2)
waitpid(2)
write(2)
writev(2)

In contrast the following are cancellation points under Linux:

pthread_join(3)
pthread_cond_wait(3)
pthread_cond_timedwait(3)
pthread_testcancel(3)
sem_wait(3)
sigwait(3)

In particular note that no system call is a cancellation point under Linux. In contrast, under IRIX the system call wrapper checks the caller and enables and disables cancellation around the particular system call.

For more information, see the following man pages on IRIX and Linux: pthread_cancel(3P) and pthread_setcancelstate(3P).

Differences in Mutex Implementations

A mutex (mutual exclusion lock) controls whether threads can execute a critical region of code or modify a shared variable. Mutexes are a primary means of thread synchronization under Pthreads.

A mutex variable acts like a “lock” protecting access to a shared resource, such as shared memory or file descriptors. Only one thread can lock (or own) a mutex variable at any given time. If several threads try to lock a mutex, only one thread will succeed. The other threads will not be granted the mutex until the owner releases it.

A mutex has attributes that control its behavior. Under IRIX the function pthread_mutexattr_settype() defines the type of mutex. The type value may be one of PTHREAD_MUTEX_NORMAL, PTHREAD_MUTEX_ERRORCHECK, PTHREAD_MUTEX_RECURSIVE, PTHREAD_MUTEX_SPINBLOCK_NP, or PTHREAD_MUTEX_DEFAULT.

LinuxThreads supports only one mutex attribute: the mutex kind, which is either PTHREAD_MUTEX_FAST_NP for fast mutexes, PTHREAD_MUTEX_RECURSIVE_NP for recursive mutexes, or PTHREAD_MUTEX_ERRORCHECK_NP for error checking mutexes. In all cases the NP suffix refers to “Non Portable” extensions to the POSIX standard.

IRIX also implements a process-shared attribute. Setting it to PTHREAD_PROCESS_SHARED permits a mutex to be operated upon by any thread that has access to the memory where the mutex is allocated, even if the mutex is allocated in memory that is shared by multiple processes. If the process-shared attribute is PTHREAD_PROCESS_PRIVATE, the mutex will only be operated upon by threads created within the same process as the thread that initialized the mutex; if threads of differing processes attempt to operate on such a mutex, the behavior is undefined. The default value of the attribute is PTHREAD_PROCESS_PRIVATE. For more information on IRIX, see pthread_mutexattr_setpshared(3P). This feature is not implemented under LinuxThreads (glibc 2.2.x) but is supported by NPTL (glibc 2.3+).

It should be pointed out that both IRIX and Linux support optimized atomic operations that are much faster than the following code sequence:

pthread_mutex_lock ( &count_mutex ); 
   count++; 
pthread_mutex_unlock ( &count_mutex );

On IRIX, the __fetch_and_add intrinsic performs the increment atomically; under Linux, __sync_fetch_and_add (gcc) or _InterlockedIncrement (Intel compiler) does the same. Each is much faster than the locked sequence above.

Condition Variables

Condition variables allow threads to suspend execution until some condition is satisfied. Functions are provided to wait on a condition variable and to wake up threads that are waiting on the condition variable.

The type of condition variable used is determined by the attribute structure attr passed with the call to pthread_cond_init(). On IRIX these attributes are set by calls to pthread_condattr_init() and the various condition variable attribute functions such as pthread_condattr_setpshared(). If attr is null (or the condition variable is statically initialized), the default attributes are used. The IRIX implementation supports the process-shared attribute. If this attribute is set to PTHREAD_PROCESS_SHARED, it allows a condition variable to be operated upon by any thread that has access to the memory where the condition variable is allocated, even if the condition variable is allocated in memory that is shared by multiple processes. If the process-shared attribute is PTHREAD_PROCESS_PRIVATE, the condition variable will only be operated upon by threads created within the same process as the thread that initialized the condition variable. The default value of the attribute is PTHREAD_PROCESS_PRIVATE.

The LinuxThreads implementation supports no attributes for conditions, hence the cond_attr parameter is ignored by pthread_cond_init(). Likewise, pthread_condattr_init() and pthread_condattr_destroy() under LinuxThreads do nothing and are only included for compliance with the POSIX APIs.

NPTL supports the process-shared attribute for condition variables.

For more information see the pthread_cond_wait(3) and pthread_condattr_init(3) man pages.

Read-Write Locks

A read-write lock is a software object that gives one thread the right to modify some data, or multiple threads the right to read that data. The pthreads library on IRIX implements several functions for initializing and using read-write locks. For more information, see pthread_rwlock_init(3), pthread_rwlock_rdlock(3), and pthread_rwlock_wrlock(3).

Read-write locks are extensions to the POSIX standard and are not implemented on LinuxThreads but are supported by NPTL.

Signals

A signal is an asynchronous notification of an event.

Each thread has a signal mask that specifies the signals it is willing to receive. This mask can be changed in a pthreads program by calling the pthread_sigmask() function.

As mentioned earlier in this chapter, signal handling in LinuxThreads does not conform to the POSIX standard and is thus significantly different from the IRIX implementation. NPTL, on the other hand, is standard compliant.

According to the standard, external signals are addressed to the whole process (the collection of all threads), which then delivers them to one particular thread. However, since each thread is actually a kernel process with its own process ID (PID) in LinuxThreads, external signals are always directed to one particular thread. If, for instance, another thread is blocked in sigwait on that signal, it will not be restarted. NPTL overcomes this by performing signal handling for multi-threaded processes in the kernel. Signals sent to the process are now delivered to one of the available threads.

The LinuxThreads implementation of sigwait installs dummy signal handlers for the signals in set for the duration of the wait. Since signal handlers are shared between all threads, other threads must not attach their own signal handlers to these signals, or alternatively they should all block these signals.

Another difference between the implementations is that IRIX uses SIGPTRESCHED and SIGPTINTR for scheduling and cancellation, whereas LinuxThreads uses SIGRTMIN and SIGRTMIN+1. NPTL uses SIGRTMIN.

Scheduling Pthreads

Pthreads are scheduled by their scope, policy, and priority. These values are set initially when the thread is created, though policy and priority can also be modified at runtime by the pthread_setschedparam() function.

Scope

IRIX supports three different contention scopes. System and bound scope threads are scheduled by the IRIX kernel, and compete with all other threads on the system. System scope threads are suitable for real-time programming and may only be created by privileged users, whereas bound scope threads are not suitable for real-time programming and do not require special privileges to create. Process scope threads are scheduled by the Pthreads library, and compete with one another for process timeslices. By default Pthreads are created with process scope.

The only scope supported by LinuxThreads is the system scope.

Policy

IRIX supports the following policies:

  • SCHED_RR (default; round robin scheduling)

  • SCHED_FIFO (first in first out)

  • SCHED_TS (time sharing; same as SCHED_RR)

  • SCHED_OTHER (same as SCHED_RR)

LinuxThreads supports these policies:

  • SCHED_OTHER (regular non-realtime scheduling)

  • SCHED_FIFO (realtime, first-in first out)

  • SCHED_RR (realtime, round robin)

Priority

IRIX supports priorities from 0 to 255. The range on LinuxThreads is 1 to 99. Larger numbers represent higher priorities on both implementations.

Environment Variables

IRIX supports the PT_CORE and PT_SPINS environment variables. PT_CORE permits a core file to be generated in certain situations where the Pthreads library would otherwise not allow it; it should generally be used only when debugging an application. PT_SPINS determines how many times a lock is tried before the thread sleeps.

For more information, see the IRIX pthreads(5) man page. Neither variable is supported on Linux.

Summary of Differences in Supported Features

A chart that illustrates various pthreads features that are supported by different variants of Unix can be found at: http://www.tldp.org/FAQ/Threads-FAQ/OSsCompared.html

Table 7-1 reproduces a portion of this chart and includes what is supported on IRIX and ProPack 3.0 (NPTL) and ProPack 2.4 (LinuxThreads) respectively.

Table 7-1. IRIX 6.5 vs. Linux Pthread Feature Comparison

Feature                                   IRIX      NPTL      LinuxThreads
----------------------------------------  --------  --------  ------------
User(U)/Kernel(K)-space                   K&U       K         K
Cancellations                             Yes       Yes       Yes
Priority Scheduling                       Yes       Yes       Yes
Priority Inversion Handling [A]           Yes       Yes       No
Mutex Attributes                          Yes       Yes       Yes
Shared and Private Mutexes [B]            Yes       Yes       No
Thread Attributes                         Yes       Yes       Yes
Synchronization                           Yes       Yes       Yes
Stack Size Control                        Yes       Yes       No
Base Address Control                      Yes       No [1]    No [1]
Detached Threads                          Yes       Yes       Yes
Joinable Threads                          Yes       Yes       Yes
Per-Thread Data Handling Function         Yes       Yes       Yes
Per-Thread Signal Handling                Yes       Yes       Yes
Condition Variables                       Yes       Yes       Yes
Semaphores                                Yes       Yes       No
Thread ID Comparison                      Yes       Yes       Yes
Call-Once Functions                       Yes       Yes       Yes
Thread Suspension                         No [2]    Yes       Yes
Specifying Concurrency [C]                Yes [3]   Yes       No
Reader/Writer Share Locking               Yes       Yes       No
Processor-specific Thread Allocation [D]  Yes [4]   Yes       No
Fork All Threads [E]                      No [5]    No        No
Fork Calling Thread Only                  Yes       Yes       Yes


Feature Definitions:
[A] As threads get blocked on I/O, provide a temporary reprioritization of threads.
[B] Having separate spaces for mutexes.
[C] The ability to identify which threads will be multiprocessed.
[D] The ability to designate a specific thread to a specific processor.
[E] A flag which forces all thread-creation calls to be forks with shared memory.

Notes:
[1] Using cpusets or dplace could accomplish much the same thing.
[2] Only the whole process.
[3] You can specify how many user-level threads you will use at once. The number of kernel-level threads (i.e. concurrency level) is then determined as min([max number of threads to use],[number of available processors]).
[4] Via pthread_setrunon_np(3P).
[5] Available through the IRIX-specific sproc() call. However, it should be noted that sprocs and pthreads are not compatible under IRIX and cannot be intermixed.