Multithreading under Solaris* and Microsoft .NET*

Introduction
By John Sharp, Content Master Ltd.

Multithreading provides an excellent way for an application to perform multiple tasks concurrently. In a rich-client environment, using multiple threads can ensure that a graphical user interface remains responsive to user input while simultaneously performing business processing. Hyper-Threading Technology enabled processors take multithreading to the next level, providing ideal platforms for building rich-client solutions.

Multithreading is not a new concept, although the implementations provided by many operating systems and programming environments have evolved over recent years. This paper contrasts two particular models of multithreading available on different operating systems: Solaris* threads, aimed at Unix* developers building systems for Solaris, and .NET threads, employed by programmers using the Microsoft .NET Framework under Windows*.

This discussion describes two versions of the same multithreaded application. The first version is written in C++ running under Solaris, and the second is written in C#* running within the Microsoft .NET Common Language Runtime (CLR)* under Windows. Although the application does not illustrate every feature of threads available in these two environments, it should give you a feel for how the two different systems function, and how you could convert a multithreaded application written for Solaris to run under .NET.



Threads, Shared Data, and Synchronization
Writing a multithreaded application usually involves creating and managing multiple threads, although that has not always been the case. For example, older versions of Unix supported multitasking by allowing you to create multiple concurrent processes using the fork system call.

Sharing data between multiple processes started in this way involved using the Unix Interprocess Communication (IPC)* facilities such as shared memory, message queues, and signals. Unix provided semaphores to control concurrent access to shared data (to prevent two processes from updating the same information at the same time, for example).

The multi-process model is useful for multitasking operations that need to run concurrently but that, for the most part, execute independently of one another. It is very resource-intensive, however, for tasks that need to cooperate and synchronize with each other on a regular basis. Threads solved this problem.

In an application that uses multiple threads, the threads themselves execute in the context of the same process and can directly share data with each other, reducing the overhead involved with traditional Unix IPC. Synchronization is still a big issue, and most threading models (including those supplied by Solaris and .NET) provide a set of low-level primitives to control concurrent access to shared data.



Solaris Threads
Solaris provides the library libthread.so, which contains the native Solaris thread routines. The header file thread.h contains the definitions of these routines and the data structures they use.

Creating Threads

The thr_create function creates a new thread and starts it running (although you can specify that the thread should be created in a suspended state, in which case it will not begin executing until you invoke the thr_continue function). The thr_create routine expects a number of parameters; you can specify information such as the size and location of the stack to use, and options such as whether the thread should run as a foreground thread or in the background as a daemon. You are also expected to supply a pointer to a function that the thread will execute, and an optional parameter for this function.

The thread function must take a single void * parameter, and return a void *. When the function finishes, the thread will terminate. Each thread is allocated a thread identifier (thread_t), which you can use to control the thread once it is running. Threads can also be bound or unbound. A bound thread is permanently attached to its own lightweight process (LWP), the kernel-scheduled entity that runs the function specified by thr_create. Unbound threads are not tied to a particular LWP; the thread library multiplexes them over a pool of LWPs that Solaris uses to execute functions concurrently.

The most common form of thr_create starts a bound thread running in the foreground using a default stack, as shown in the example in Listing 1 (the variable myData is passed as the parameter arg to runThread).

Note that, unlike most Unix system calls, which return -1 and set errno, thr_create returns 0 on success and a nonzero error number if the thread cannot be created. Strictly speaking, you should test this return value and report an error if problems arise. This paper omits error checking for the sake of simplicity.
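The creation pattern described above (a thread function, a single argument, and a check for failure) can be sketched in portable C++. The sketch below uses std::thread from the C++ standard library as a stand-in; it is not the Solaris API and does not reproduce Listing 1. The WorkItem type and the doubleOnThread helper are hypothetical names introduced for illustration, and std::thread reports a creation failure by throwing std::system_error rather than through a return value.

```cpp
#include <system_error>
#include <thread>

// Hypothetical payload handed to the thread, playing the role of myData.
struct WorkItem {
    int input;
    int result;
};

// Thread function: receives a pointer to its argument, as the text describes.
void runThread(WorkItem* arg) {
    arg->result = arg->input * 2;
}

// Starts one thread, waits for it, and returns its result
// (or -1 if the thread could not be created or joined).
int doubleOnThread(int input) {
    WorkItem myData{input, 0};
    try {
        std::thread worker(runThread, &myData);  // the thread starts running here
        worker.join();                           // wait for it to finish
    } catch (const std::system_error&) {
        return -1;
    }
    return myData.result;
}
```

Calling doubleOnThread(21) runs runThread on a new thread and returns the value that the thread stored in myData.result.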

Managing Threads
The Solaris thread library provides a number of routines you can use to control a thread once it is running. For example, thr_kill will send a signal to a thread, and possibly terminate that thread. (You can only send signals to threads inside the same process.) The function thr_setprio can change the scheduling priority of a thread. The routine thr_suspend will temporarily halt a thread, although it can be resumed later using thr_continue. An executing thread can also voluntarily relinquish the processor by calling thr_yield.

The reason for using multiple threads is to perform tasks concurrently. Unless you are using a computer that has as many processors as executing threads, however, you are unlikely to achieve true concurrent processing. Instead, Solaris will allocate time slots to threads and run them sequentially.

To further conserve resources, Solaris also limits the number of threads that can be active at once inside an application, using an algorithm that ensures that enough threads can be made active for the process to make progress (and that also takes into account the number of unbound threads available in the thread pool). In other words, although you create a number of threads, Solaris will not necessarily allocate time to all of them. This characteristic might not provide the most effective degree of concurrency for your application.

You can use the thr_getconcurrency and thr_setconcurrency functions to ascertain how many concurrent threads Solaris has activated for your program, and to change this value if you feel it is too low.
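The concurrency level that Solaris reports is separate from the number of threads the hardware can genuinely run in parallel. As a rough, portable way to reason about the second figure, standard C++ offers std::thread::hardware_concurrency(). This is a C++ library facility, not a Solaris call, and the helper name threadsPerProcessor is hypothetical. The function may return 0 when the value cannot be determined, hence the fallback in the sketch.

```cpp
#include <thread>

// Returns how many software threads would share each hardware thread.
// hardware_concurrency() may return 0 if the value is not computable,
// in which case a single hardware thread is assumed.
double threadsPerProcessor(unsigned softwareThreads) {
    unsigned hw = std::thread::hardware_concurrency();
    if (hw == 0) hw = 1;
    return static_cast<double>(softwareThreads) / hw;
}
```

A ratio above 1 suggests that the threads will be time-sliced rather than run truly concurrently.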

Synchronizing Threads

The Solaris thread library provides the function thr_join to allow you to synchronize threads. The thread executing thr_join will wait for a specified thread to finish before continuing. You can also obtain the exit status of the thread (when a thread finishes, the return value of its function is passed back as the exit status). Finally, you can wait for the next thread to finish by specifying a thread id of 0; thr_join will return the id of the thread.

Listing 2 shows how to wait for a specific thread to exit. The first parameter to thr_join is the id of the thread to wait for. The second parameter is a pointer to a thread_t and will be populated with the id of the terminating thread if it is not NULL (it is possible to wait for one of a group of threads, and the second parameter will indicate which thread finished). The third parameter will be filled in with the exit status of the terminating thread if it is not NULL.
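Waiting for a thread and collecting its exit status can be sketched in portable C++ as well. std::thread has no exit status of its own, so this sketch swaps in std::async and std::future, which carry the thread function's return value back to the waiting thread. It illustrates the idea only and is not the Solaris thr_join call; the function names countDownFrom and runAndCollectStatus are hypothetical.

```cpp
#include <future>

// Hypothetical thread function whose return value acts as an exit status.
int countDownFrom(int start) {
    int steps = 0;
    while (start > 0) {
        --start;
        ++steps;
    }
    return steps;
}

// Runs the function on another thread and waits for its result.
int runAndCollectStatus(int start) {
    // std::launch::async requests execution on a separate thread.
    std::future<int> status = std::async(std::launch::async, countDownFrom, start);
    return status.get();  // blocks until the thread function has returned
}
```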

Controlling Access to Shared Resources

Concurrent threads sharing access to the same resources need careful coordination. The Solaris thread library provides mutexes and condition variables to help you manage threads manipulating shared data. The functions and data structures are located in the synch.h header file.

A mutex is a mutual exclusion lock designed to prevent two threads from simultaneously executing critical sections of code that read and/or write the same data. You can use mutexes to coordinate threads running in the same or different processes. You can initialize a mutex using mutex_init. A thread can then attempt to lock the mutex using mutex_lock. If the mutex is already locked, mutex_lock will block until it is released.

The function mutex_unlock will release a mutex (a thread can only release a mutex that it has locked). The mutex_trylock function will attempt to lock a mutex, but will return immediately with an error if the mutex is already locked. The mutex_destroy routine will remove a mutex.
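As a portable illustration of locking, unlocking, and the non-blocking try-lock, the following sketch uses std::mutex from the C++ standard library rather than the Solaris mutex routines. The function names countWithMutex and tryLockFailsWhileHeld are hypothetical and exist only for this example.

```cpp
#include <mutex>
#include <thread>
#include <vector>

// Each of threadCount threads adds 1 to a shared counter perThread times.
// The mutex makes every read-modify-write of the counter a critical section.
long countWithMutex(int threadCount, int perThread) {
    long counter = 0;
    std::mutex counterLock;
    std::vector<std::thread> workers;
    for (int t = 0; t < threadCount; ++t) {
        workers.emplace_back([&counter, &counterLock, perThread] {
            for (int i = 0; i < perThread; ++i) {
                // Locks here; unlocks automatically at the end of the scope.
                std::lock_guard<std::mutex> guard(counterLock);
                ++counter;
            }
        });
    }
    for (std::thread& w : workers) w.join();
    return counter;
}

// Shows the non-blocking attempt: try_lock fails, without blocking,
// while another thread holds the mutex.
bool tryLockFailsWhileHeld() {
    std::mutex m;
    m.lock();
    bool acquired = true;
    std::thread other([&] { acquired = m.try_lock(); });
    other.join();
    m.unlock();
    return !acquired;
}
```

Without the lock_guard, the increments from different threads could interleave and the final count would be unpredictable.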

Mutexes are simple to use, but they are not always sufficient by themselves. For example, consider the following scenario that uses a single mutex to protect a queue of data:

  • A writer thread wants to ensure that the queue will not be accessed while it is being written. The thread therefore locks the mutex to guarantee exclusive access to the queue, adds a new item to the queue, and then releases the mutex.

  • A reader thread wants to guarantee that the data in the queue will not change while it is being read. The reader therefore locks the mutex to obtain exclusive access to the queue, retrieves the first item from the queue, and then releases the mutex.

If the reader and writer threads both try to lock the mutex at the same time, one of them will be blocked until the other releases the mutex. It is worth considering, however, what happens if the reader thread locks the mutex and the queue is empty. In many implementations, the reader will wait until an item is available before continuing. It will have to release the mutex first; otherwise the writer thread will be blocked and unable to write to the queue.

The logic in the reader thread can become very contorted, especially if there are many reader threads all accessing the same queue (you also want to avoid performing a busy-wait in the reader thread, continually locking the mutex, examining the queue, and releasing the lock if it is empty before trying again).

Condition variables are designed to solve this type of problem. A condition variable is associated with a mutex and can send a signal to a thread waiting for that mutex to be released. In the meantime, the waiting thread is held in a suspended state. The code snippets in Listing 3 and Listing 4 show how to use a mutex and a condition variable. In Listing 3, the writer thread locks the queueReady mutex, pushes some data onto the queue, signals the condition variable, and releases the mutex.

In Listing 4, the reader thread locks the mutex. If the mutex is already locked by the writer thread, the reader will block until it becomes available. Once the reader has obtained the mutex, it invokes the cond_wait function to wait for the writer to signal the signalQueueReady condition variable. The cond_wait function automatically releases the queueReady mutex while the thread is suspended, and reacquires it before cond_wait returns.

This temporary release of the mutex allows the writer to lock it and write to the queue. Once the writer has written data to the queue and signaled the condition variable, the reader resumes and obtains the lock. The reader then extracts the data from the queue and, finally, releases the lock.

The previous discussion assumes that the reader executes first and waits for the writer to signal the condition variable. The cond_signal function releases a single thread executing cond_wait; if multiple threads are waiting, only one of them will be released, and if no threads are waiting, the signal is lost. Therefore, if the writer has already signaled the condition variable before the reader calls cond_wait, the reader will block indefinitely.
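One common way to guard against a lost signal, different from the Unix-signal approach this paper goes on to use, is to have the reader test the shared state itself and wait only while the queue is empty. The sketch below shows that technique with std::condition_variable from the C++ standard library; it is not the Solaris cond_wait API and does not reproduce Listing 3 or Listing 4. The SharedQueue type and the function names are hypothetical, although the member names echo queueReady and signalQueueReady from the text.

```cpp
#include <condition_variable>
#include <functional>
#include <mutex>
#include <queue>
#include <thread>

// Hypothetical shared state: a queue, its mutex, and a condition variable.
struct SharedQueue {
    std::queue<int> items;
    std::mutex queueReady;
    std::condition_variable signalQueueReady;
};

// Writer: lock, push, signal; the lock is released at the end of the scope.
void writeItem(SharedQueue& q, int value) {
    std::lock_guard<std::mutex> guard(q.queueReady);
    q.items.push(value);
    q.signalQueueReady.notify_one();
}

// Reader: waits only while the queue is empty, so a signal sent before
// the reader arrived is not needed for the reader to make progress.
int readItem(SharedQueue& q) {
    std::unique_lock<std::mutex> lock(q.queueReady);
    q.signalQueueReady.wait(lock, [&q] { return !q.items.empty(); });
    int value = q.items.front();
    q.items.pop();
    return value;
}

// Exercises the "writer signals first" ordering that the text warns about.
int writerFirstThenReader(int value) {
    SharedQueue q;
    std::thread writer(writeItem, std::ref(q), value);
    writer.join();       // the notification has been sent with nobody waiting
    return readItem(q);  // still returns, because the queue is not empty
}
```

Because the predicate is checked before blocking, the order in which the reader and writer run no longer matters.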

There are other ways in which cond_wait can terminate — if the thread is sent a Unix signal by another thread or process, for example (we will use this solution to fix the blocking problem). Whatever the reason for cond_wait finishing, it is always guaranteed that the calling thread will have the corresponding mutex locked so the thread can continue and access the locked data. The following example shows how this is useful.

Solaris Calculator and Printers Example

The program shown in Listing 5 and Listing 6 presents an example of a single writer and multiple readers sharing the same data structure. The writer thread executes the calculator::calcPowersOfTwo method, which calculates powers of two from 2 to 20 and places them on a queue.

On a single-processor machine, it is interesting to observe the different scheduling behavior that occurs when the thr_yield statement in the calcPowersOfTwo method is uncommented.

The reader threads execute the printer::printPowersOfTwo method, which retrieves the values from the queue and prints them together with a string identifying which printer object actually produced the output, as shown in Listing 7 and Listing 8.

The printer class iterates through the items on the queue, as there may be more than one value available; the calculator thread may reacquire the queueReady mutex before the printer does and place another value on the queue.

The test harness (Listing 9) creates the writer and four reader threads. The test harness waits for the writer thread to finish and then sends SIGUSR1 to each printer thread. At that point the printer threads will all be blocked in cond_wait, waiting for the signalQueueReady condition to be signaled. Each printer object catches the signal, which causes cond_wait to terminate and grants the thread the mutex.

The thread can then iterate through any remaining data in the queue. The printer threads are killed when processing has completed; the program waits for the user to press a key before finishing to allow the signal processing to complete.
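The overall shape of this example (one calculator, several printers, one shared queue) can be sketched in portable C++. This is not the paper's Listing 5 through Listing 9: it uses the C++ standard library instead of Solaris threads, it replaces the SIGUSR1 wake-up with a finished flag that the calculator sets under the mutex, and the printers record values instead of printing them so that the result can be checked. The PowerQueue type, the exponent parameters, and runCalculatorAndPrinters are hypothetical.

```cpp
#include <condition_variable>
#include <functional>
#include <mutex>
#include <queue>
#include <thread>
#include <vector>

struct PowerQueue {
    std::queue<long> values;
    bool finished = false;  // set by the calculator when it is done
    std::mutex queueReady;
    std::condition_variable signalQueueReady;
};

// Writer: queues 2^first .. 2^last, then marks the queue as finished.
void calcPowersOfTwo(PowerQueue& q, int first, int last) {
    for (int exp = first; exp <= last; ++exp) {
        std::lock_guard<std::mutex> guard(q.queueReady);
        q.values.push(1L << exp);
        q.signalQueueReady.notify_one();
    }
    std::lock_guard<std::mutex> guard(q.queueReady);
    q.finished = true;
    q.signalQueueReady.notify_all();  // wake every printer so it can exit
}

// Reader: drains whatever is queued and stops once the writer has finished.
void printPowersOfTwo(PowerQueue& q, std::vector<long>& seen) {
    std::unique_lock<std::mutex> lock(q.queueReady);
    for (;;) {
        q.signalQueueReady.wait(lock, [&q] { return !q.values.empty() || q.finished; });
        while (!q.values.empty()) {  // there may be more than one value
            seen.push_back(q.values.front());
            q.values.pop();
        }
        if (q.finished) return;
    }
}

// Test harness: one calculator and several printers; returns the sum of
// all values that the printers consumed.
long runCalculatorAndPrinters(int first, int last, int printers) {
    PowerQueue q;
    std::vector<std::vector<long>> seen(printers);
    std::vector<std::thread> readers;
    for (int p = 0; p < printers; ++p)
        readers.emplace_back(printPowersOfTwo, std::ref(q), std::ref(seen[p]));
    std::thread writer(calcPowersOfTwo, std::ref(q), first, last);
    writer.join();
    for (std::thread& r : readers) r.join();
    long total = 0;
    for (const std::vector<long>& perPrinter : seen)
        for (long v : perPrinter) total += v;
    return total;
}
```

Which printer receives which value varies from run to run, but every queued value is consumed exactly once, so the total is stable.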



.NET Threads
Threads are an integral part of the Windows operating systems, and many parts of the .NET platform rely on them for everyday tasks. The System.Threading namespace in the Microsoft .NET Framework Class Library* contains the methods and classes used to control threads under .NET.

Creating Threads

To create and run a thread, you must create a ThreadStart delegate that refers to a method to be executed by the thread, create a Thread object using this delegate, and then start the thread object running. The ThreadStart delegate must refer to a void method that takes no parameters. Unlike Solaris threads, .NET threads must always be started explicitly after you create them. Listing 10 shows a common way to create and start a thread.

As with the Solaris examples shown earlier, error checking has been omitted for clarity.

The .NET Framework additionally supplies a system-managed thread pool, accessible through the ThreadPool class. If you don't wish to create threads manually, but want to have methods executed asynchronously, you can queue requests for the thread pool. The CLR will allocate threads from the pool and use them to run your requests. You can also provide callback methods that the CLR will invoke once execution has completed. Using the thread pool can greatly conserve resources.
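The queue-and-worker idea behind a thread pool can be illustrated with a small C++ sketch. It is not the .NET ThreadPool class and makes no claim about how the CLR implements its pool; completion callbacks are not modeled. The SimplePool class, its queueWorkItem method, and the sumViaPool helper are hypothetical names chosen for this example.

```cpp
#include <condition_variable>
#include <functional>
#include <mutex>
#include <queue>
#include <thread>
#include <utility>
#include <vector>

// A minimal fixed-size pool: requests are queued and run by worker threads.
class SimplePool {
public:
    explicit SimplePool(unsigned workers) {
        for (unsigned i = 0; i < workers; ++i)
            threads_.emplace_back([this] { workerLoop(); });
    }

    // Lets the workers drain the queue, then stops and joins them.
    ~SimplePool() {
        {
            std::lock_guard<std::mutex> guard(lock_);
            stopping_ = true;
        }
        wake_.notify_all();
        for (std::thread& t : threads_) t.join();
    }

    // Queues a request; some pool thread will run it later.
    void queueWorkItem(std::function<void()> request) {
        {
            std::lock_guard<std::mutex> guard(lock_);
            requests_.push(std::move(request));
        }
        wake_.notify_one();
    }

private:
    void workerLoop() {
        for (;;) {
            std::function<void()> request;
            {
                std::unique_lock<std::mutex> lock(lock_);
                wake_.wait(lock, [this] { return stopping_ || !requests_.empty(); });
                if (requests_.empty()) return;  // stopping and nothing left to run
                request = std::move(requests_.front());
                requests_.pop();
            }
            request();  // run the request outside the lock
        }
    }

    std::mutex lock_;
    std::condition_variable wake_;
    std::queue<std::function<void()>> requests_;
    bool stopping_ = false;
    std::vector<std::thread> threads_;
};

// Queues count requests that each add their index to a shared total.
long sumViaPool(int count) {
    std::mutex totalLock;
    long total = 0;
    {
        SimplePool pool(3);
        for (int i = 1; i <= count; ++i)
            pool.queueWorkItem([&total, &totalLock, i] {
                std::lock_guard<std::mutex> guard(totalLock);
                total += i;
            });
    }  // the pool destructor drains the queue and joins the workers
    return total;
}
```

Three threads serve all of the requests here, which is the resource saving the pool is meant to provide.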

Managing Threads

The Thread class exposes a variety of methods that you can use to control the execution of a thread. The Abort method can be used to request that a thread terminate, the Suspend method will temporarily halt execution of a thread, and the Resume method can cause it to start running again. You can use the static Sleep method to pause the current thread for a short period of time, after which the thread resumes.

Threads also have properties that you can query, and in some cases change. For example, the Priority property lets you adjust the relative priority of a thread, and the IsBackground property allows you to specify that the thread should execute in the background, in which case it will be aborted automatically if it is still running when the program finishes (programs will wait for foreground threads to complete before terminating), as shown in Listing 11.

Synchronizing Threads
Like the features available with Solaris, the Thread class supplies the Join method that you can use to wait for a thread to terminate. An extension not available with Solaris is the ability to specify a timeout parameter — the Join method will also terminate if this timeout expires. Unlike Solaris, threads under .NET do not have exit codes, and the return value from the Join method is a Boolean value indicating whether the joined thread finished (true), or the Join method call timed out (false).
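The Boolean result of a timed wait can be illustrated with a portable C++ analog. The sketch below is not the .NET Join method; it swaps in std::future::wait_for from the C++ standard library, which reports whether the work finished or the wait timed out. The function name joinWithTimeout and its parameters are hypothetical.

```cpp
#include <chrono>
#include <future>
#include <thread>

// Returns true if the work finished within timeout, false if the wait timed out.
bool joinWithTimeout(std::chrono::milliseconds workTime,
                     std::chrono::milliseconds timeout) {
    // std::launch::async requests that the lambda run on its own thread.
    std::future<void> done = std::async(std::launch::async, [workTime] {
        std::this_thread::sleep_for(workTime);  // stand-in for real work
    });
    // wait_for yields ready if the work finished, timeout otherwise.
    return done.wait_for(timeout) == std::future_status::ready;
}
```

One difference from the behavior described above is worth noting: a future obtained from std::async waits for its thread when it is destroyed, so this function does not return until the work has finished, even when it reports a timeout.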

Controlling Access to Shared Resources

The .NET Framework Class Library provides a wealth of facilities for controlling access to shared resources. The System.Threading namespace includes synchronization classes such as Mutex, ManualResetEvent, AutoResetEvent, Monitor, Interlocked, ReaderWriterLock, and Timer.

Describing all these classes is beyond the scope of this paper, but you can use the Monitor class to implement coarse-grained locking and the Mutex to provide a finer level of control, much as with Solaris mutexes. One of the most useful constructs, however, is the AutoResetEvent. This class is similar to a combination of a Solaris mutex and a condition variable.

An AutoResetEvent is essentially a Boolean flag that can block a thread if it is false but raise a signal to release a blocked thread when it is set to true. A thread can wait for an AutoResetEvent object to become signaled by executing the WaitOne method. This method also resets the AutoResetEvent object back to the unsignaled state; if several threads are waiting for the same signal, only one of them will be released.

To signal an AutoResetEvent object, use the Set method. When you create an AutoResetEvent, you can specify its initial value: true means that it is in the signaled state and will not block a thread that executes WaitOne, and false means that it is unsignaled and will block.
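The comparison with a mutex plus a condition variable can be made concrete. The following sketch builds a flag with auto-reset behavior from std::mutex and std::condition_variable in the C++ standard library. It is an illustration of the semantics described above, not the .NET class itself; AutoResetFlag, its set and waitOne members, and autoResetSemanticsHold are hypothetical names.

```cpp
#include <chrono>
#include <condition_variable>
#include <mutex>

// Hypothetical C++ analog of an auto-reset event: a Boolean flag guarded by
// a mutex, with a condition variable to wake a waiter when it becomes true.
class AutoResetFlag {
public:
    explicit AutoResetFlag(bool initiallySignaled) : signaled_(initiallySignaled) {}

    // Signals the flag, allowing one waiting thread to proceed.
    void set() {
        {
            std::lock_guard<std::mutex> guard(lock_);
            signaled_ = true;
        }
        changed_.notify_one();
    }

    // Blocks until signaled, then resets the flag so other waiters stay blocked.
    void waitOne() {
        std::unique_lock<std::mutex> lock(lock_);
        changed_.wait(lock, [this] { return signaled_; });
        signaled_ = false;
    }

    // Timed variant: returns false if the flag was not signaled in time.
    bool waitOne(std::chrono::milliseconds timeout) {
        std::unique_lock<std::mutex> lock(lock_);
        if (!changed_.wait_for(lock, timeout, [this] { return signaled_; }))
            return false;
        signaled_ = false;
        return true;
    }

private:
    std::mutex lock_;
    std::condition_variable changed_;
    bool signaled_;
};

// Demonstrates the semantics: one signal satisfies exactly one wait.
bool autoResetSemanticsHold() {
    AutoResetFlag ready(true);                                   // starts signaled
    bool first  = ready.waitOne(std::chrono::milliseconds(10));  // succeeds and resets
    bool second = ready.waitOne(std::chrono::milliseconds(10));  // times out
    ready.set();
    bool third  = ready.waitOne(std::chrono::milliseconds(10));  // succeeds again
    return first && !second && third;
}
```

Because the flag is reset inside the same critical section that observes it, two waiters can never both consume a single signal.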

The code snippets shown in Listing 12 and Listing 13 illustrate how to create and use an AutoResetEvent object using the same reader/writer scenario described earlier. In Listing 12, the AutoResetEvent object is created in the signaled state. The writer thread takes control of the AutoResetEvent object, setting it to the unsignaled state, pushes some data on the queue, and then signals the AutoResetEvent object.

In Listing 13, the reader thread waits for the AutoResetEvent object to become signaled, retrieves the data from the queue, and then signals that the queue is available again.

.NET Calculator and Printers Example

For comparison purposes, the code in Listing 14, Listing 15, and Listing 16 shows a .NET implementation of the same calculator and printers example as we looked at before, this time written in C#. The Calculator class contains a queue and an AutoResetEvent for controlling access to the queue. The Calculator.CalcPowersOfTwo method generates powers of two and queues them.

As in the Solaris example, it is interesting to observe the different scheduling behavior that occurs on a single-processor machine when the Thread.Sleep statement in the CalcPowersOfTwo method is uncommented.

The Printer class contains the PrintPowersOfTwo method, which the reader thread executes to retrieve and display values from the queue. Notice that in this example, the PrintPowersOfTwo method does not take a parameter — this is a limitation of thread methods with .NET. Instead, the Calculator class exposes its properties (the queue and AutoResetEvent) as static data, accessed through the class name.

The Runner class, shown in Listing 16, is a test harness that creates a single calculator thread and four printer threads. The test harness waits for the Calculator thread to finish. To ensure that the queue is drained correctly, the test harness also signals the QueueReady object (the static AutoResetEvent object that was created by the Calculator class in Listing 14), releasing a Printer thread to display any remaining data.

When the user presses a key, the program terminates and the Printer threads running in the background are killed. (If the Printer threads ran in the foreground, the application would wait for them to finish before terminating.)



Conclusion
Solaris and .NET both provide comprehensive facilities for building multithreaded applications, and multithreading is an integral part of both Solaris and Windows. The features provided by Solaris are relatively low-level compared to those of .NET. The Solaris library is somewhat more mature and uses third-generation function calls and structures.

The threading library under .NET is object-oriented, having been developed specifically for the .NET Framework Class Library. The result is a highly functional library that contains a rich set of synchronization primitives, although richness comes with a price; the Solaris constructs are mean and lean, well-tuned to the underlying SPARC hardware, whereas the .NET mechanisms are more suited to the power provided by an Intel processor.



Additional Resources