Mutual Exclusion Definitions

Posted in Concurrency on July 29, 2008 by moffdub

In my world, concurrency is poorly understood. Most of my experience with it has been with simple locks to coordinate threads. In college, they made vague reference to higher-level language constructs and mechanisms, but dared not elaborate. Such mechanisms might actually be useful in the workplace, and therefore they had no place in the classroom. After all, the students might get their money’s worth.

Mutexes. Locks. Semaphores. Monitors. I’ve never been clear on the relationship between these terms, or whether that relationship changes depending on your operating system and/or platform. So in this post, I will do my best to clarify the differences for myself and anyone who is reading this.


(the obligatory lock picture)

I’m going to illustrate the terms with equivalents in C/C++, C#/VB.NET, and Java. And I make no claim that I am an expert on any of this; I am but a mere serf trying to improve his condition.

Locks

Here is a term that I wasn’t even sure was defined as anything more than a way to implement mutual exclusion. If I had to come up with my own definition, it would be anything in a language that allows one and only one thread at a time into a delimited block of code.

In C#/VB.NET land, this is essentially what the lock keyword accomplishes.

On .NET Rocks show #166, with guest Joe Duffy, Carl Franklin implies as much just before the guys get into a discussion of the differences between the various synchronization mechanisms:

Carl Franklin: Yeah boy, I — don’t I know it, that is tough. I specially find it challenging because there is two things that you lock right, you can lock an object with sync locker, the lock keyword in C# and you can also lock that around a block of code. So what you are really doing is, you are locking that code. It’s not just about the object you are using as the lock object then there’s two different scenarios there.

In Java land, the synchronized keyword gets this job done. Not much more to say there.
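
Just to have something concrete on the page, here is a minimal sketch; the Counter class and its field are mine, purely for illustration:

public class Counter {
    private int count = 0;

    public void increment() {
        // Only one thread at a time may execute this block;
        // the lock is the Counter instance itself.
        synchronized (this) {
            count++;
        }
    }

    // Equivalent shorthand: lock the entire method on "this".
    public synchronized int get() {
        return count;
    }
}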

In C/C++ land, where I will focus on POSIX because that is what they made us use in college, pthread_mutex_lock is what I dealt with the most. Note the confusing name: “lock” is the operation, and you’re locking a “mutex”, which is the next category. Functionally, though, I am putting pthread_mutex_lock in the simple lock category.

The verdict: a lock is a simple mechanism used to allow only one thread at a time access to a block of code.

Mutexes

Here is where it gets kind of fun. As far as I knew, mutex is just short for mutual exclusion. That sort of clarity does not hold as you venture onto different platforms, though.

Carl Franklin: So, the difference between a lock and Mutex; that’s always a fun thing to talk about.

Joe Duffy: Well, Mutex is kind of a lock but the neat thing about Mutexes on Windows is that unlike say a monitor on the CLR you can use it to synchronize access to shared resources across multiple processes on the machine.

That answers what a mutex is in C#/VB.NET land: a synchronization mechanism that can coordinate access between both threads and processes. I found a cool example of using this class to ensure only one copy of a process is running at a time.

In Java land, I did not find an equivalent mechanism. In fact, doing anything involving IPC in Java that does not use sockets or RMI seems to involve an exercise in acrobatics. Consider using shared memory and JNI in concert with the synchronized keyword.

In C/C++ land, the same POSIX library, namely the functions pthread_mutex_lock and pthread_mutex_unlock and type pthread_mutex_t, describes a mutex as “a MUTual EXclusion device, and is useful for protecting shared data structures from concurrent modifications, and implementing critical sections and monitors” (source).

There is nothing here that implies IPC, as there is in .NET. The closest POSIX C/C++ gets to a cross-process mutex is a binary semaphore, which is defined later in this post and which can be used to synchronize OS-level goodies. The boys on show #166 tend to agree with the binary-semaphore characterization:

Joe Duffy: So a Mutex is like a semaphore with a count of one, right.

The conclusion? Mutex is an ill-defined term. Sometimes it is a shorthand for the idea of mutual exclusion. Sometimes it is a synonym for simple locks. Sometimes it is a wrapper for simple locks around IPC mechanisms. I’d prefer not to use it.

Semaphores

Sure enough, the most esoteric of the terms is the most clearly defined.

Joe Duffy: A semaphore is kind of like a Mutex, except — well let me reverse that. A Mutex is kind of a like a semaphore. Semaphore has a count right, and when somebody acquires it, they decrement the count and if the count ever reaches zero and somebody tries to acquire it, they block waiting on it. And the only when the semaphore is incremented will they actually be able to move forward.

Indeed, the System.Threading.Semaphore class in C#/VB.NET fulfills the job.

Java land, not to be outdone, provides its own Semaphore class.
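
To keep myself honest, a quick sketch of it in use; the pool-of-three scenario and the names are mine, not from the show:

import java.util.concurrent.Semaphore;

public class ScarceResource {
    // At most three threads may be inside useResource() at once.
    private final Semaphore permits = new Semaphore(3);

    public void useResource() throws InterruptedException {
        permits.acquire();           // decrement; blocks if the count is already zero
        try {
            // ... touch the shared resource ...
        } finally {
            permits.release();       // increment; wakes a blocked acquirer if there is one
        }
    }
}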

POSIX C/C++ supports semaphores as well, though the interface is in dire need of an application of the Facade pattern. Semaphores here can be used for cross-process access via shared memory, in contrast to Java and .NET’s implementations.

Speaking of facades, here is exactly what I am talking about, given to us students by Dan Duchamp to help us with our assignment (P and V come from Dijkstra’s Dutch names for “try-and-decrease” and “increase”):

#include <stdio.h>
#include <sys/types.h>
#include <sys/ipc.h>
#include <sys/sem.h>
#include <stdlib.h>

/* P ("try-and-decrease"): acquire the semaphore, blocking while its value is zero */
void P(int semid)
{
	struct sembuf sbuf;
	sbuf.sem_num = 0;	/* first (and only) semaphore in the set */
	sbuf.sem_op = -1;	/* decrement by one */
	sbuf.sem_flg = 0;

	if (semop(semid, &sbuf, 1) < 0)
		perror("semop P"), exit(1);

	return;
}

/* V ("increase"): release the semaphore, waking a blocked waiter if there is one */
void V(int semid)
{
	struct sembuf sbuf;
	sbuf.sem_num = 0;
	sbuf.sem_op = 1;	/* increment by one */
	sbuf.sem_flg = 0;

	if (semop(semid, &sbuf, 1) < 0)
		perror("semop V"), exit(1);

	return;
}

/* Init: set the semaphore's starting count */
void Init(int semid, int value)
{
	union
	{
		int val;
		struct semid_ds *buf;
		u_short *array;
	} arg;

	arg.val = value;

	if (semctl(semid, 0, SETVAL, arg) < 0)
		perror("semctl SETVAL"), exit(1);

	return;
}

/* Print: report the semaphore's current count */
void Print(int semid)
{
	int val;

	if ((val = semctl(semid, 0, GETVAL)) < 0)
		perror("semctl GETVAL"), exit(1);

	printf("Semaphore %d's value = %d\n", semid, val);

	return;
}

The conclusion here: a semaphore is, abstractly, a counter; a thread that tries to decrement the counter when it is already at zero blocks until another thread increments it. The only ambiguity is whether a given implementation works across processes or not.

Monitors

In the operating system course, monitors were alluded to for about 4 minutes as language support for mechanisms like locks and semaphores. It sounded kind of mystical. Thanks to this post, they sound as pedestrian as anything else I will have the misfortune of using.

One of the biggest surprises to me was finding out that in .NET land, monitors are below the lock statement; the lock statement is a higher-level version of monitors. Huh?

Joe Duffy: Well, Monitor gives you a couple of things that just using the lock keyword doesn’t; one is time out, so you can actually specify time out when you say monitor.enter. … In general, its easier to use the lock keyword because making sure — guaranteeing that you actually exit the Monitor requires that you use like a try-finally block. And the actual coding pattern in there is hidden behind the scenes with the lock keyword. So it makes it a lot easier for you.

OK then. For now, it is enough to know that the Monitor class provides an interface through which you can wait for a lock to become available and signal one thread waiting on that lock (or all of them) that it is.
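
As far as I can tell, the closest Java analogue to the timeout-and-try-finally dance Duffy describes is java.util.concurrent.locks.ReentrantLock; this sketch is mine and only meant to show the shape of the pattern:

import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.ReentrantLock;

public class Account {
    private final ReentrantLock lock = new ReentrantLock();
    private int balance = 0;

    public boolean tryDeposit(int amount) throws InterruptedException {
        // Roughly Monitor.TryEnter with a timeout: give up after one second.
        if (!lock.tryLock(1, TimeUnit.SECONDS))
            return false;
        try {
            balance += amount;
            return true;
        } finally {
            lock.unlock();   // the explicit release that the lock/synchronized keyword hides
        }
    }
}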

On the wait/signal side, Java land follows suit. The compulsive neurotic in me cheers the consistency: there are wait, notify, and notifyAll methods, analogous to Wait, Pulse, and PulseAll in .NET.
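
A bare-bones sketch of that handshake, using a made-up gate class of my own:

public class Gate {
    private boolean isOpen = false;

    public synchronized void waitUntilOpen() throws InterruptedException {
        while (!isOpen)     // re-check the condition; guards against spurious wakeups
            wait();         // release the lock and sleep until notified
    }

    public synchronized void open() {
        isOpen = true;
        notifyAll();        // wake every thread blocked in wait()
    }
}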

POSIX gets similar mileage out of pthread_cond_wait, pthread_cond_signal, and pthread_cond_broadcast. Hooray for consistency! You can no doubt find a C++ monitor class somewhere that wraps these calls into something Java- or .NET-like.

The conclusion: monitors, at their lowest level, are condition variables: the ability for a thread to wait until it is signaled that a lock is available, and to signal one or more waiting threads that it is.

Writing it all down has a way of putting the issue to bed for me, or at least tucking it in. If we are going to be stuck with the thread model for a while, I might as well know what everyone’s talking about.

Death of the Thread Model

Posted in Paradigms on July 19, 2008 by moffdub

You haven’t really had the fear of Turing struck into you until you’ve tried concurrent programming. Any lofty visions you’ve had of a normal, happy life as a programmer won’t survive your first pthread or Thread.run() call.

My most recent non-trivial attempt at threads was way back when in college. The course was “TCP/IP Programming.” It was the kind of course that didn’t interest me just from its title and description. Really, I signed up for it because the best professor in the entire school was delivering it, Dan Duchamp, and this was my last semester. In the end, it actually wound up being an informative course, because networking is not my strong point.

We were warned that the assignments would be in C and would be back-breakers. This almost turned out to be true. I had Duchamp in three other courses, and sometimes he would overstate the scope of a project.

Our projects were:

  • a DNS resolver that sent UDP queries to a DNS server and relayed the result via TCP back to whoever requested it
  • a simulation of ICMP and IP; this is one of the courses that I look back on and think “I can’t believe I pulled that off”
  • a link-state and distance vector routing simulation
  • a simulation of TCP
  • a port of the first assignment to the TCP simulation

Somehow, someway, I had no problem with the first three. Then I read the description of the TCP simulation, and I shuddered when I saw that three threads were required.

I pretty much knew right then that it was over.

In fact, after getting extra help from Duchamp, it became apparent that I needed to use pthread_cond_wait, pthread_cond_timedwait, pthread_mutex_t, and pthread_cond_signal, to name a few. These were library functions and types that I had never used before.

Duchamp, bless his heart, claimed that he covered them in the OS course, which I also took with him. While he covered mutexes and locking, we did not make use of condition variables in assignments. So on top of the intellectual burden of dealing with my half-finished attempt at multi-threading, which was already based on a bad thread design (and it was too late to start over), I had to learn and use these new calls correctly.

My point is that my last semester could have been much less stressful if programming with threads weren’t such a death trap. At work, I asked our technical lead if they made use of multi-threading in the code, and the answer was a resounding “No, it becomes too ugly too quickly.”

What’s odd is that in most CRUD apps, concurrency is transparent and under the hood. Most RDBMSes give you this for free in the form of transaction processing.

So something can’t be right. It shouldn’t be such a nightmare to coordinate something like three threads, one for each layer boundary plus another to do the real work, when concurrent data access and processing come so effortlessly elsewhere. Why do I have to suffer through concurrency whenever it is not DB-related?

Then, once upon a time, I was listening to Dot Net Rocks show #269. The guest was Larry O’Brien, who was arguing that the thread-and-lock model is garbage:

Larry O’Brien: Oh, absolutely, but even if you get it and even if you think it’s fun, there’s a thing about that thread and process model that we have in the CLR today, which I would argue makes it fundamentally broken for business programming. So, check out this scenario. You have a library and it has some publicly available method by which it acquires a lock, right?

Carl Franklin: Yeah.

Larry O’Brien: And holds that lock and returns to you and you go about and do your other thing. If you have another method that calls that library on which that library performs a callback and by a callback I mean all sorts of important things. I’m talking about virtual method calls in a world of object orientation. I’m talking about delegates and events central to graphics programming and I’m talking about lambda expressions, which are coming out in C# 3.0 and they’re very important for LINQ. You can never safely have a library that simultaneously has a publicly triggerable lock that executes a callback to client code because it’s perfectly legal for the client code to start up a new thread…

Carl Franklin: Yeah.

Larry O’Brien: And within that new thread, call that method attempts to get the lock which is already held by the previous thread and the end result of that is deadlock.

I’d take this even a step further: the thread-and-lock model of concurrency is broken because it is too low-level. Why, in this age of garbage collection, managed code, and workflow engines, hasn’t the abstraction level been raised on concurrency as well?

The itch that needs scratching is that the abstraction level has been raised for quite some time, but only in the niche of databases. Memory management used to be a pest; now there is platform support for garbage collection. We need similar support for concurrency. It was just then that the guys on the show got to Software Transactional Memory:

Larry O’Brien: Okay. Software Transactional Memory is essentially optimistic locking for memory.

Carl Franklin: Yeah.

Larry O’Brien: You wrap a call or a block in what’s essentially a transaction. You say “Atomic, start block” and then you finish the block and it runs at full speed and then you have a follow-along processor that goes through it and it looks for that one in a thousand, one in a million memory collision. If a memory collision has occurred, it retries it.

Carl Franklin: Yeah.

Larry O’Brien: Yeah. It’s not at all a bad model.

This would be a giant step forward. It does have its problems, namely that it still requires the programmer to be aware of, and to predict, which blocks of code are critical sections in order to mark them off as transactions.
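
There is no STM in the standard Java library that I know of, but the optimistic read-compute-retry loop at the heart of the idea can be sketched with java.util.concurrent.atomic; this is only meant to show the flavor of “run at full speed, detect the collision, retry”:

import java.util.concurrent.atomic.AtomicInteger;

public class OptimisticCounter {
    private final AtomicInteger value = new AtomicInteger(0);

    public void increment() {
        while (true) {
            int seen = value.get();                   // read without taking a lock
            int computed = seen + 1;                  // do the work at full speed
            if (value.compareAndSet(seen, computed))  // commit only if nothing changed underneath
                return;
            // Collision detected: another thread got there first, so retry.
        }
    }
}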

To take it a step higher, using Futures, this sort of demarcation can be delegated to the compiler and/or runtime environment entirely:

Larry O’Brien: Instead of trying to second guess the creation of thread, what you do is you say, “Well, I want this value” you know this value being returned from a function, “but I can either have the value or I can have” what Sutter calls “a Future.”

A Future is basically an IOU. Okay, so you have this reference to a Future and when you need it, when you say, “Okay, I want to cash in this IOU” at that point, the system will block and guarantee that the value becomes concrete. So, just by sort of flipping things around from “Okay, let’s create processes and kind of second guess how many cores I have” and those sorts of approaches.

This approach says, okay by simply having a whole bunch of IOUs floating around then it becomes possible for the compiler on the runtime to do a lot of really clever reasoning about “Oh hey, look. Here’s this IOU, but I know that this IOU won’t be required for several hundred milliseconds and so, therefore, I have enough information to distribute that to another core.”

The only drawback here is, again, a programmer-dependent pitfall. Functions that return Futures work best, if at all, when they are pure functions with no side effects. This is a throwback to functional programming, and to my personal least favorite, Scheme.
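
For what it is worth, Java’s java.util.concurrent already ships a modest version of the IOU; this little sketch (the computation and names are mine) shows the “cash in the IOU” moment O’Brien describes:

import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class FutureDemo {
    public static void main(String[] args) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(2);

        // Hand out the IOU: the computation may start on another core right away.
        Future<Integer> iou = pool.submit(new Callable<Integer>() {
            public Integer call() {
                return expensiveComputation();
            }
        });

        doOtherWork();               // carry on while the IOU is outstanding

        int result = iou.get();      // cash in the IOU: block until the value is concrete
        System.out.println(result);

        pool.shutdown();
    }

    private static int expensiveComputation() { return 42; }

    private static void doOtherWork() { }
}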

I still prefer these approaches to the rather draconian no-shared-mutable-state edict of Erlang:

Carl Franklin: There’s no shared memory, right?

Larry O’Brien: Oh, that’s huge.

Carl Franklin: Between processes?

Larry O’Brien: That’s a huge thing. Really to make Concurrency work at the language level, the number one thing that you want to get rid of is shared mutable state.

Carl Franklin: Yup.

Larry O’Brien: Erlang is single assignment to variables…

This is why I think Polyphonic C# is a nice try but gives me some of the same headaches I felt at the start of this post.

It is ultimately a trade-off between shared mutable state (and thus performance) a la Futures and functional programming, and clarity (and thus less performance) a la STM. I think this trade-off is a familiar one to most people in this field. Moreover, I’ll say that this trade-off is as innate to our field as Brooks’ Law.

I maintain that abstractions like STM and Futures are going to be very important for us software developers in the coming decade if we are to make full, affordable use of the next wave of hardware.
