TRACING, TESTING, AND REPLAY FOR SEMAPHORES AND LOCKS
Thread1                LocksHeld(Thread1)    CandidateLocks(s)
                       {}                    {mutex1, mutex2}
mutex1.lock();         {mutex1}              {mutex1, mutex2}
s = s+1;               {mutex1}              {mutex1}
mutex1.unlock();       {}                    {mutex1}
mutex2.lock();         {mutex2}              {mutex1}
s = s+1;               {mutex2}              {}
mutex2.unlock();       {}                    {}

Figure 3.30 Lockset algorithm.
// Let LocksHeld(T) denote the set of locks currently held by thread T.
For each shared variable v, initialize CandidateLocks(v) to the set of all locks.
On each read or write access to v by thread T:
    CandidateLocks(v) = CandidateLocks(v) ∩ LocksHeld(T);
    if (CandidateLocks(v) == {}) issue a warning;
For example, in Fig. 3.30, Thread1's access of shared variable s is protected first by mutex1 and then by mutex2. This is a violation of mutual exclusion that can be detected by the lockset algorithm. CandidateLocks(s) is initialized to {mutex1, mutex2} and is refined as s is accessed. When Thread1 locks mutex1, LocksHeld(Thread1) becomes {mutex1}. When s is accessed in the first assignment statement, CandidateLocks(s) becomes {mutex1}, which is the intersection of sets CandidateLocks(s) and LocksHeld(Thread1). When the second assignment statement is executed, Thread1 holds lock mutex2, but the only candidate lock of s is mutex1. After the intersection of CandidateLocks(s) and LocksHeld(Thread1), CandidateLocks(s) becomes empty. The lockset algorithm has detected that no lock protects shared variable s consistently.
We have implemented the lockset algorithm in the mutexLock class and in the C++ sharedVariable class template that was presented in Chapter 2:
• We assume that threads can access a sharedVariable only after it is initialized. Thus, no refinement is performed during initialization (i.e., in the constructor of sharedVariable).
• Read-only variables can be accessed without locking. This means that warnings are issued only after a variable has been initialized and has been accessed by at least one write operation.
• The refinement algorithm can be turned off to eliminate false alarms. For example, the bounded-buffer program in Listing 3.26 can be modified to use a buffer of sharedVariables. These variables are shared by the Producer and Consumer threads, but no locks are needed. Even so, when this program is executed, a warning will be issued by the lockset algorithm. When it is determined that the warning can safely be ignored, the lockset algorithm can be disabled for these variables. Alternatively, the buffer can be implemented without using sharedVariables.
The lockset algorithm was originally implemented in a testing tool called Eraser [Savage et al. 1997]. It has also been implemented as part of a Java virtual machine called Java Pathfinder [Havelund and Pressburger 2000]. The lockset algorithm has been shown to be a practical technique for detecting data races in programs that protect shared variables with locks. However, programs that use semaphores for mutual exclusion, with patterns like passing-the-baton, will generate false alarms. As shown above for the bounded buffer program, lockset can produce false alarms even if locks are used. The number of false alarms can be reduced by allowing users to turn off the detection algorithm for regions of code or to provide the detection algorithm with extra information. As a nondeterministic testing technique, the lockset algorithm cannot prove that a program is free from data races. Still, knowing that a particular execution contains no data races can be helpful during tracing and replay, as we will see next.
3.10.2 Simple SYN-Sequences for Semaphores and Locks
In general, we can characterize an execution of a concurrent program as a sequence of synchronization events on synchronization objects. A sequence of synchronization events is called a SYN-sequence. First we define the types of synchronization events and synchronization objects that occur in programs containing semaphores and locks. Then we show how to collect and replay SYN-sequences of these programs. These two steps are not independent. There are several ways to define a SYN-sequence, and the definition of a SYN-sequence has an effect on the design of a replay solution, and vice versa.
Let CP be a concurrent program that uses shared variables, semaphores, and locks. The result of executing CP with a given input depends on the (unpredictable) order in which the shared variables, semaphores, and locks in CP are accessed. The semaphores are accessed using P and V operations, the locks are accessed using lock and unlock operations, and the shared variables are accessed using read and write operations. Thus, the synchronization objects in CP are its shared variables, semaphores, and locks. The synchronization events in CP are executions of read/write, P /V , and lock/unlock operations on these objects.
We point out that there may be shared variables that are accessed in the implementations of the P/V and lock/unlock operations. In fact, these shared variables may be accessed a large number of times, due to busy-waiting loops. However, we will not trace the read and write operations on these shared variables. Since P/V and lock/unlock operations are atomic, we will consider them to be operating on a single (shared) semaphore or lock variable. This abstraction decreases the number of operations that need to be traced during execution.
A SYN-sequence for a shared variable v is a sequence of read and write operations on v. ReadWrite-sequences for shared variables were defined in Chapter 2. A SYN-sequence for a binarySemaphore or countingSemaphore s is a sequence of events of the following types:
• Completion of a P operation
• Completion of a V operation
binarySemaphore mutex(1);

Thread1                  Thread2
mutex.P();               mutex.P();
x = 1;                   x = 2;
mutex.V();               mutex.P(); // error: should be mutex.V();

Listing 3.31 ReadWrite-sequences and PV-sequences.
• Start of a P operation that is never completed due to a deadlock or an exception
• Start of a V operation that is never completed due to a deadlock or an exception
We refer to such a sequence as a PV-sequence of s. An event in a PV-sequence is denoted by the identifier (ID) of the thread that executed the P or V operation. The order in which threads complete their P and V operations is not necessarily the same as the order in which they call P and V or even the same as the order in which the P and V operations start. For operations that are completed, it is their order of completion that must be replayed, since this order determines the result of the execution. We also replay the starts of operations that do not complete, so that the same events, exceptions, and deadlocks will occur during replay.
A SYN-sequence for a mutexLock l is a sequence of events of the following types:
• Completion of a lock operation
• Completion of an unlock operation
• Start of a lock operation that is never completed due to a deadlock or an exception
• Start of an unlock operation that is never completed due to a deadlock or an exception
We refer to such a sequence as a LockUnlock-sequence of l. An event in a LockUnlock-sequence is denoted by the identifier (ID) of the thread that executed the lock or unlock operation. To illustrate these definitions, consider the simple program in Listing 3.31. The final value of shared variable x is either 1 or 2.
A possible ReadWrite-sequence of shared variable x is
(1, 0, 0), (2, 1, 0). // from Chapter 2, the event format is (thread ID, version number, total readers)
This denotes that x was first accessed by Thread1 and then by Thread2. Since Thread1 accessed x first, the PV-sequence for mutex must be
1, 1, 2, 2
indicating that Thread1 performed its P and V operations before Thread2. The second P operation in Thread2 is an error. This P operation will start but not complete and should be a V operation instead.
A SYN-sequence for concurrent program CP is a collection of ReadWrite-sequences, PV-sequences, and LockUnlock-sequences. There is one sequence for each shared variable, semaphore, and lock in the program. A SYN-sequence for the program in Listing 3.31 contains a ReadWrite-sequence for x and a PV-sequence for mutex:
(ReadWrite-sequence of x: (1, 0, 0), (2, 1, 0); PV-sequence of mutex: (1, 1, 2, 2)).
This is a partial ordering of the synchronization events in the program. That is, the events on a single object are (totally) ordered, but the order of events among different objects is not specified. Alternatively, we can define a SYN-sequence of a program as a single totally ordered sequence of synchronization events over all the synchronization objects. A totally ordered sequence of events that is consistent with the partially ordered sequence above is
1, (1, 0, 0), 1, 2, (2, 1, 0), 2.
In general, there may be two or more totally ordered sequences that are consistent with a given partial ordering since two concurrent events can appear in the total ordering in either order.
The definition of a SYN-sequence is intended to capture what it means for one execution to replay another. Suppose that when the program above is executed, Thread2 executes mutex.P() and blocks because Thread1 is already in its critical section. During the replay of this execution, assume that Thread2 executes its first mutex.P() operation without blocking, because Thread1 has already executed its mutex.P() and mutex.V() operations. These two executions are not identical. However, in both executions:
žThe sequence of completed P() and V() operations is the same.
žThe final value of x is 2.
Thus, we consider the second execution to replay the first. Although we could include events such as “call V()” or “block in P()” in the SYN-sequence of a semaphore object, we are not required to trace these events in order to do a successful replay, so we omit them. The events that we ignore during replay may be important when we are doing other things. Thus, we define different types of SYN-sequences for the different activities that occur during testing and debugging. Since the SYN-sequences that we use for replay tend to be much simpler than the other types of SYN-sequences, we use the term simple SYN-sequences to refer to the SYN-sequences that are used for replay.
According to the foregoing definition of a simple SYN-sequence for CP, we must be prepared to record and replay arbitrary interleavings of read and write operations on shared variables. This is a general replay solution, which can be applied to programs that use shared variables and constructs other than semaphores and locks. However, this general solution may not be easy to implement. Controlling read and write operations during replay adds a significant amount of execution overhead and requires access to the implementation of CP. Hence, we examine some alternative solutions.
In Listing 3.31, observe that the simple ReadWrite-sequence for shared variable x is completely determined by the simple PV-sequence of mutex. That is, if we replay the PV-sequence for mutex, we will, with no additional effort, also replay the ReadWrite-sequence for x. This is an important observation, for it means that if we assume that shared variables are always safely accessed within critical sections, we can develop a simpler and more efficient solution.
Of course, in general, we cannot assume that mutual exclusion holds when shared variables are accessed. A programmer may make an error when writing the program, or even violate mutual exclusion on purpose. If mutual exclusion for a particular shared variable is not critical to a program’s correctness, a critical section can be eliminated to eliminate the overhead of acquiring a lock. Also, a statement accessing a shared variable need not occur within a critical section if the execution of the statement is atomic. (But remember that accessing a shared variable outside a critical section raises the shared memory consistency issues discussed in Section 3.9.)
Our observation about critical sections does not lead to a better replay solution. It does, however, highlight the importance of mutual exclusion and the need for synchronization constructs that aid in implementing mutual exclusion correctly. The monitor construct in Chapter 4 is such a construct. Using monitors greatly improves the chances that shared variables are accessed inside critical sections and, as we will see, makes it easier to replay executions.
For now, we will try to improve our replay solution by using the lockset algorithm in concert with replay. During replay, we will assume that shared variables are accessed inside critical sections that are implemented with mutexLocks that are being used as locks. This allows us to ignore read and write operations on shared variables. During tracing, we will use the lockset algorithm to validate our assumption. The lockset algorithm will tell us when replay may fail.
As we mentioned earlier, in cases where a shared variable can safely be accessed outside a critical section, the lockset algorithm can be turned off for that variable so that no warnings will be issued. However, this may create a problem for replay since we depend on each shared variable to be accessed inside a critical section so that each shared variable access is represented by a lock() or P() operation in the execution trace. In these cases, locks can be used to create critical sections, at least temporarily, so that replay can be performed. Designing a program so that it will be easier to test and debug increases the testability of the program.
Let CP be a concurrent program containing mutex locks, binary semaphores, and counting semaphores. Assume that shared variables are correctly accessed inside critical sections:
• Each semaphore and lock in CP is a synchronization object.
• The synchronization events in a simple PV-sequence for a semaphore are the four types of events defined above.
• The synchronization events in a simple LockUnlock-sequence for a mutex lock are the four types of events defined above.
• Each synchronization event is denoted by the identifier of the thread that executed the event.
To replay an execution of CP, we must replay the simple PV-sequences for the semaphores in CP and the simple LockUnlock-sequences for the locks in CP. In the next section we show how to do this.
3.10.3 Tracing and Replaying Simple PV-Sequences and LockUnlock-Sequences
We now show how to modify the semaphore and lock classes so that they can trace and replay simple PV-sequences and LockUnlock-sequences.
Modifying Methods P() and V() The implementations of methods P() and V() in classes binarySemaphore and countingSemaphore can be modified so that simple PV-sequences can be collected and replayed. Collecting a PV-sequence for a semaphore s is simple. During execution, the identifier of the thread that completes a call to method s.P() or s.V() is recorded and saved to a trace file for s. Methods P() and V() can easily be modified (see below) to collect these events.
For the purpose of explaining our replay method, we assume that each semaphore has a permit, called a PV-permit. A thread must hold a semaphore's PV-permit before it executes a P() or V() operation on that semaphore. The order in which threads receive a semaphore's PV-permit is based on the PV-sequence that is being replayed. A thread requests and releases a semaphore's PV-permit by calling methods requestPermit() and releasePermit(). These calls are added to the implementations of methods P() and V(). Method P() becomes
void P() {
    if (replayMode) control.requestPermit(ID);
    // code to lock this semaphore appears here
    if (replayMode) control.releasePermit();
    /* rest of body of P() */
    if (traceMode) control.traceCompleteP(ID);
    // code to unlock this semaphore appears here
}
Method V() becomes:
void V() {
    if (replayMode) control.requestPermit(ID);
    // code to lock this semaphore appears here
    if (replayMode) control.releasePermit();
    /* rest of body of V() */
    if (traceMode) control.traceCompleteV(ID);
    // code to unlock this semaphore appears here
}
We assume that the implementations of P() and V() contain critical sections for accessing the variables that they share. We have indicated this with comments referring to the lock and unlock operations that create these critical sections. Calls to requestPermit(), releasePermit(), traceCompleteP(), and traceCompleteV() must be positioned correctly with respect to the critical sections in P() and V(). The call to requestPermit() appears before the lock operation, and the call to releasePermit() appears right after the lock operation. The calls to traceCompleteP() and traceCompleteV() both appear inside the critical section. This ensures that events are recorded in the execution trace in the order in which they actually occur. If the calls to the trace function are made outside the critical section, the order of the calls may not be consistent with the order in which the threads actually complete their P() and V() operations.
Modifying Methods lock() and unlock() The implementations of methods lock() and unlock() in class mutexLock are modified just like methods P() and V(). Class mutexLock contains calls to requestPermit() and releasePermit() before and after, respectively, the lock operation in mutexLock. Calls to traceCompleteLock() and traceCompleteUnlock() appear at the end of their respective critical sections.
To simplify tracing and replay, we do not trace events that represent the start of a P , V , lock, or unlock operation that never completes due to a deadlock or exception. Thus, when deadlock-producing operations are replayed, the calling threads will not be blocked inside the body of the operation; rather, they will be blocked forever on the call to requestPermit() before the operation. Since in either case the thread will be blocked forever, the effect of replay is the same. Similarly, events involving exceptions that occur during the execution of a P , V , lock, or unlock operation will not be replayed. But the trace would indicate that the execution of these operations would raise an error, which probably provides enough help to debug the program. Our solution can be extended so that incomplete events are actually executed.
Class Control Each semaphore and lock is associated with a control object. In replay mode, the control object inputs the simple SYN-sequence of the semaphore or lock and handles the calls to requestPermit() and releasePermit(). In trace mode, the control object collects the synchronization events that occur and records them in a trace file. When a thread calls requestPermit() or one of the trace methods, it passes its identifier (ID). We assume that all threads are instances of class TDThread, which was described in Chapter 1. TDThread handles the creation of unique thread IDs.
A C++ control class is shown in Listing 3.32. Consider a semaphore s and a simple PV-sequence for s that was recorded during a previous execution. When the control object for s is created, it reads the simple PV-sequence for s into vector SYNsequence. Assume that the first j operations of SYNsequence have been completed (i.e., index == j) and the value of SYNsequence[j] is k, which means that thread Tk is required to execute the next event. When thread Ti, i ≠ k, calls method requestPermit(), it blocks itself by executing threads[i].P(). When thread Tk calls requestPermit(), it is allowed to return from the call (i.e., it receives the PV-permit). Thread Tk then starts the jth operation on s. Assume that this operation is a completed P() or V() operation. After completing this operation, thread Tk calls method releasePermit() to increment index to j + 1 and allow the next operation in the SYNsequence. The ID of the thread that is to execute the next operation is the value SYNsequence[index]. If this thread was previously blocked by a P() operation in method requestPermit(), method releasePermit() performs operation threads[SYNsequence[index]].V() to unblock this thread. Otherwise, when the next thread eventually calls requestPermit(), it will not block itself since its ID will match the next ID in the SYNsequence.
To illustrate the operation of the controller, consider a corrected version of the simple program in Listing 3.31. In this version, Thread2 calls mutex.P() followed by mutex.V(). Assume that the PV-sequence traced during an execution of this program is
1, 1, 2, 2
which indicates that Thread1 entered its critical section first. During replay, the controller will guarantee that Thread1 will be the first thread to enter its
class control {
public:
    control() {
        /* input integer IDs into SYNsequence;
           initialize arrays threads and hasRequested */
    }
    void requestPermit(int ID) {
        mutex.lock();
        if (ID != SYNsequence[index]) {  // should thread ID execute the next event?
            hasRequested[ID] = true;     // No; set flag to remember ID's request
            mutex.unlock();
            threads[ID].P();             // wait for permission
            hasRequested[ID] = false;    // reset flag and exit requestPermit
        }
        else mutex.unlock();             // Yes; exit requestPermit
    }
    void releasePermit() {
        mutex.lock();
        ++index;
        if (index < SYNsequence.size()) {  // Are there more events to replay?
            // Has the next thread already requested permission?
            if (hasRequested[SYNsequence[index]])
                threads[SYNsequence[index]].V();  // Yes; wake it up.
        }
        mutex.unlock();
    }
    void traceCompleteP(int ID) {...}  // record integer ID
    void traceCompleteV(int ID) {...}  // record integer ID
private:
    // PV-sequence or LockUnlock-sequence; a sequence of integer thread IDs
    vector<int> SYNsequence;
    binarySemaphore* threads;  // all semaphores are initialized to 0
    // hasRequested[i] is true if Thread i is delayed in requestPermit(); initialized to false
    bool* hasRequested;
    int index = 0;   // SYNsequence[index] is the ID of the next thread to execute an event
    mutexLock mutex; // note: no tracing or replay is performed for this lock
};

Listing 3.32 C++ class control for replaying PV-sequences and LockUnlock-sequences.
critical section. Suppose, however, that Thread2 tries to execute mutex.P() first and thus calls requestPermit(2) before Thread1 calls requestPermit(1). Since the value of index is 0 and the value of SYNsequence[index] is 1, not 2, Thread2 blocks itself in requestPermit() by executing threads[2].P(). When Thread1
eventually calls requestPermit(1), it will be allowed to exit requestPermit() and execute its mutex.P() operation. Thread1 will then call releasePermit(). Method releasePermit() increments index to 1 and checks whether the thread that is to execute the next P/V operation has already called requestPermit(). The next thread is SYNsequence[1], which is 1. Thread1 has not called requestPermit() for the next operation, so nothing further happens in releasePermit(). Eventually, Thread1 calls requestPermit(1) to request permission to execute its mutex.V() operation. Thread1 receives permission, executes mutex.V(), and calls releasePermit(). Method releasePermit() increments index to 2 and finds that the thread to execute the next P/V operation is Thread2. Thread2, having already called requestPermit(), is still blocked on its call to threads[2].P(). This is indicated by the value of hasRequested[2], which is true. Thus, releasePermit() calls threads[2].V(). This allows Thread2 to exit requestPermit() and perform its mutex.P() operation. Thread2 will eventually request and receive permission for its mutex.V() operation, completing the replay.
3.10.4 Deadlock Detection
Deadlock is a major problem, especially for concurrent programming novices. In Chapter 2 we defined a deadlock as a situation in which one or more threads become blocked forever. In this chapter, we learned that threads may be blocked forever when they call P(), V(), or lock() operations. Let CP be a concurrent program containing threads that use semaphores and locks for synchronization. Assume that there is an execution of CP that exercises a SYN-sequence S, and at the end of S, there exists a thread T that satisfies these conditions:
• T is blocked due to the execution of a P(), V(), or lock() statement.
• T will remain blocked forever, regardless of what the other threads will do.
Thread T is said to be deadlocked at the end of S, and CP is said to have a deadlock. A deadlock in CP is a global deadlock if every thread in CP is either blocked or completed; otherwise, it is a local deadlock.
Programmers typically become aware of a deadlock when their programs don’t terminate; however, there is usually no indication about how the deadlock was created. Below, we describe a deadlock detection method that we have built into our synchronization classes. When a deadlock is detected, the threads and synchronization operations that are involved in the deadlock are displayed.
Deadlock prevention, avoidance, and detection algorithms are covered in most operating system textbooks. In operating systems, processes request resources (e.g., printers and files) and enter a wait-state if the resources are held by other processes. If the resources requested can never become available, the processes can never leave their wait-state and a deadlock occurs. The information about which process is waiting for a resource held by which other process can be represented by a wait-for graph. An edge in a wait-for graph from node Pi to
