Re: [Nomen-dev] A radical idea: a non-threaded language
From: Brian H. <bh...@sp...> - 2002-03-28 21:39:36
Recommended reading for this thread:
http://www.amazon.com/exec/obidos/ASIN/1893115100/qid=1017350701/sr=1-1/ref=sr_1_1/002-6318547-1132830

On Thu, 28 Mar 2002, Denis wrote:
> Hi Brian
>
> Still coming with more revolutionary ideas! :-)

It's easy: just question the obvious. Pick a random assumption ("Threads
are good", in this discussion) and ask what the world would look like if
that assumption were false.

> IMO threads are a low-level facility, like raw access to console
> input etc. They are difficult to use properly.

And pointer arithmetic. As a systems programmer, I've had to deal with
multiple threads (generally in different contexts) and reentrant code on
a regular basis. My experiences have made me paranoid about the problems
of threading. After you spend a couple of weeks tracking down and fixing
a bug that only occurs on average once every 48 hours, you'd be paranoid
too. Unfortunately this is not an attitude shared by many, let alone
most, programmers. If it doesn't crash in testing, it must be OK.

> I would use threads if I'd want to do asynchronised I/O... Wouldn't
> you?

No, I'd use async I/O. Do what you mean.

<LECTURE>
OK, a quick introduction for those on the list who haven't used aio.
Normal I/O is synchronous: you do a write() or read(), and the program
blocks and doesn't do anything more until the I/O completes. This is an
easy way to program- at the end of the read() all the data is available
for your immediate inspection, and at the end of the write() the buffer
you just wrote is safe to be overwritten.

With async I/O this isn't true. The system starts the I/O (generally, it
starts a DMA transfer, and the hardware starts transferring data). It
then returns control to the process. There are means to come along later
and test whether the I/O is finished. Synchronous I/O generally isn't a
bottleneck if you're not handling lots of different streams.
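To make the start-then-poll pattern concrete, here is a minimal sketch
in Python, using a worker thread as a stand-in for a kernel aio
facility: the read is kicked off, control returns immediately, and the
caller only blocks when it finally needs the data. The file and helper
names are made up for this example.

```python
import os
import tempfile
from concurrent.futures import ThreadPoolExecutor

# Create a small file to read from.
fd, path = tempfile.mkstemp()
os.write(fd, b"hello, async world\n")
os.close(fd)

def start_read(path):
    """Kick off a read in the background; return a handle to poll later."""
    executor = ThreadPoolExecutor(max_workers=1)
    return executor.submit(lambda: open(path, "rb").read())

handle = start_read(path)         # the "I/O" is now in flight
busy_work = sum(range(100_000))   # the CPU keeps working meanwhile
data = handle.result()            # block only when we need the data
print(data)
os.unlink(path)
```

The shape matches the lecture above: initiate, go do other work, then
test or wait for completion.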
But consider the case of a (simple) HTTP server: HTTP requests come in,
and in response the server sends files back. With async I/O the server
could be doing multiple I/Os simultaneously. The main loop might look
like this (in pseudocode):

    while (1) {
        wait for a new connection or an async io to complete;
        while we have a new connection {
            open the new connection;
            add the new connection to our list of connections;
        }
        for all open connections {
            if the async write to the socket just completed {
                if we have the next buffer to write {
                    start an async io writing that buffer down the
                    socket;
                }
                if we don't have an async read going {
                    start the async read from the file, reusing the
                    buffer we just finished writing from;
                }
            } else if the async read from the file just completed {
                if we don't have an async write going {
                    start the async write to the socket with the buffer
                    that just finished being filled;
                }
                if we have a spare buffer to read into {
                    start the next async read from the file into that
                    buffer;
                }
            }
        }
    }

This structure acts sort of like a juggler, or better yet, a grand
master playing 50 different people simultaneously. The grand master only
needs two seconds to see through the other player's clumsy bluff and
make the pawn advance that annihilates the strategy- at which point the
other player needs ten minutes to figure out what to do, during which
the chess master is off playing other people. This is the situation the
CPU is in with regard to the I/O subsystem- it only takes the CPU a
microsecond (which, with today's multi-gigahertz CPUs, may be thousands
of instructions) to decide to do a disk write that may take milliseconds
to complete.

Note that the only danger of using async io is that you can corrupt
things by accessing the I/O targets (the buffers) before the I/O is
completed. This is called a bug, generally. But no race conditions, no
deadlocks, no obscure bugs that can't be reproduced consistently.
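A runnable miniature of that main loop, using Python's selectors
module: one thread multiplexes several connections and services
whichever one is ready, instead of blocking on any single read. Socket
pairs stand in for client connections here.

```python
import selectors
import socket

sel = selectors.DefaultSelector()

# Three socket pairs stand in for three client connections.
pairs = [socket.socketpair() for _ in range(3)]
for i, (server_side, client_side) in enumerate(pairs):
    server_side.setblocking(False)
    sel.register(server_side, selectors.EVENT_READ, data=i)
    client_side.sendall(b"request %d" % i)

received = {}
while len(received) < 3:
    # Like the "wait for ... an async io to complete" step: sleep until
    # at least one connection has data ready, then service it.
    for key, _events in sel.select():
        conn, idx = key.fileobj, key.data
        received[idx] = conn.recv(1024)
        sel.unregister(conn)

for server_side, client_side in pairs:
    server_side.close()
    client_side.close()
print(sorted(received.items()))
```

One thread, many outstanding "requests", no locks- the juggler pattern
in about twenty lines.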
We could add a facility in the language to "lock" the buffer until the
async I/O is done. He he. If we did that, we wouldn't have to
differentiate between async I/O and sync I/O. You can lock objects, such
that any reference to the object blocks the accessing thread. All I/O is
then async but locks the buffers until it's done. So if you have code
like:

    buf is string;
    f is file;
    f.readln(buf); /* read the next line */
    if (buf[0] == '\n') { /* empty line */
        ...

the readln would fire off and start the I/O, lock buf, and return. But
the next thing the thread does is try to read buf[0], at which point it
blocks. If the code were changed to say:

    f.readln(buf);
    go_factor_a_large_number();
    if (buf[0] == '\n') {
        ...

then the procedure go_factor_a_large_number() could occur in parallel
with the read. When that procedure returned, if the I/O had completed
and buf was unlocked, the inspection of buf[0] would continue without
blocking- the read was implicitly changed into a nonblocking read. If
the I/O hadn't quite completed yet (either it was a really long latency
read, or it didn't take that long to factor the number), the thread
would then block until the I/O completed. As much I/O parallelism as is
possible gets used, with no special logic on behalf of the programmer.

I comment that we already need to do this for asynchronous garbage
collection- GC needs a way to say "hey, I'm changing that- don't go
changing it until I'm done, then make sure you change the new object and
not the old!" We just link I/O into it. Obviously, there can only be one
asynchronous transaction outstanding against a given object at any given
time. We associate the I/O transaction with that object, and provide the
aio interface with that object. So we could have a function

    aio_pending(arg is object): boolean

which returns true if an asynchronous I/O is pending against that
object, and attempting to access it would cause the thread to block.
Etc.
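The proposed semantics can be simulated in an existing language. Below
is a sketch, with an invented LockedBuffer class (not a real library
API): fill_async() starts the "read" in the background and returns at
once, and any attempt to look at the data blocks until the I/O has
finished- exactly the implicit block-on-access described above.

```python
import threading
import time

class LockedBuffer:
    """A buffer that is 'locked' while an async fill is in flight."""
    def __init__(self):
        self._data = None
        self._ready = threading.Event()
        self._ready.set()           # no I/O pending initially

    def fill_async(self, produce):
        """Start the 'I/O' in a background thread; return immediately."""
        self._ready.clear()
        def run():
            self._data = produce()  # the actual (slow) read
            self._ready.set()       # unlock the buffer
        threading.Thread(target=run).start()

    def get(self):
        """Accessing the buffer blocks until any pending I/O is done."""
        self._ready.wait()
        return self._data

def slow_readln():
    time.sleep(0.1)                 # simulate I/O latency
    return "a line of input\n"

buf = LockedBuffer()
buf.fill_async(slow_readln)
# ... go_factor_a_large_number() could run here, overlapped with the read ...
line = buf.get()                    # implicit block, only if I/O still going
print(repr(line[0]))
```

The thread never sees a half-filled buffer; it either runs in parallel
with the read or transparently waits for it.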
We don't have to provide an aio class to encapsulate the async I/O
state.

> Ok fork() is an alternative, but I don't see async i/o as an
> alternative. I must be missing the point.

The nice thing about fork() is that it pushes all the multithreading
problems onto the OS. Which is a significant advantage- solve the
problem once, solve it correctly, and solve it in a way where everyone
can simply reuse the solution. And the OS has to deal with all the
multithreading problems anyway...

The advantage async I/O has over fork() is fewer task switches. You only
have a single thread in a single process running. This is also the
advantage of green threads, as to the hardware (which is where the
performance cost of multiple processes is incurred) it's the same diff.
But note that most green thread implementations replace calls to sync
I/O with calls to async I/O, so that when one thread blocks, the others
don't.

> For this case it would be nice to have a construct to run several
> pieces of computation in parallel, and send/receive data with the
> forks. I am thinking of a pipe- or message-based protocol.

Several protocols already exist for this- both specialized (Beowulf and
MPI both specialize in parallelizing numeric problems across a network)
and generalized protocols that can be used for it (CORBA, COM/DCOM, RPC,
RMI, SOAP, probably others).

> > C) Maintain the illusion of responsiveness in a slow GUI. This
> > implies both bad library design and bad code architecture. The
> > fact that the MFC is rife with these problems doesn't make it
> > acceptable.
>
> Mmh... What about BeOS? It doesn't seem to me that the heavy
> multi-threading is in any way there for 'bad library design and bad
> code architecture'. The mere fact that threads helped to improve
> Windows performance, /despite/ the shortcomings, rather supports
> the use of threads in GUIs.
> BTW, Java uses threads in the GUI frameworks but the programmer
> don't deal with them...
> unless he implements his own threads, in which case it becomes a mess.

No, you can get into thread problems without spawning your own threads-
just have a callback from the GUI (which runs in the GUI's thread)
interact with the main thread. Also, the GC runs in a different thread.
Did you know that destructors are run in a different thread (the GC
thread) than the main thread? So if your destructor follows a reference
to an object which is still visible, you have a race condition between
the GC thread and the main thread.

What this means is that thread problems are all the more unexpected when
they bite you. "But I'm not using multithreading! All I'm doing is using
a GUI, and I have a couple of destructors!" For what it's worth, Unix
signals also have the same problem. I've seen a number of race
conditions using signals.

> I don't want threads. I want parallelism.

Yes. Back off from what everyone else gives us, to think about what we
want and what we need.

> I would prefer to have kernel threads (using multiple CPUs), but
> hidden from the programmer. In a first time we could implement them
> with green threads to be sure we are doing the things right.

Green threads allow us to provide some guarantees that we don't get with
any other sort of threads. Take the classic race condition problem:

    semaphore is int;
    if (semaphore > 0) {
        semaphore -= 1;
    }

With green threads, I can guarantee this works correctly- that you won't
get some other thread sneaking in to modify semaphore between our test
and our modification. The easiest way to do this is simply to do
cooperative multitasking between the threads. Which causes other
problems, as the code:

    while (true) { }

halts the entire program. Hmm. You're right, cooperatively multitasking
threads just move the problem around. If we're doing multithreading, we
should do real multithreading.

> Choice #2 seems interesting, I am sure there are valid abstractions
> out there to do what threads do with more control.
> If following memory references allows such an abstraction to be
> implemented, I am all for it.

Here's a simple problem which gives a flavor of the difficulties this
idea faces. Consider you have the following:

    thread A --> object X <--> object Y <-- thread B

where <-- and --> mean "has a reference to". Obviously, both object X
and object Y are "shared" objects (reachable from both thread A and
thread B), and thus protected by locks. So thread A goes and accesses
object X, grabbing its lock. Thread B then accesses object Y, grabbing
its lock. Thread A then follows the reference in object X to object Y,
tries to get the lock on object Y, and blocks waiting for thread B to
release it (it keeps its lock on object X). Thread B then follows the
reference from object Y to object X, and blocks waiting for A.
Deadlock! Obviously, some sort of lock agglutination needs to happen-
both object X and Y need to be protected by the same lock.

Now, consider a doubly linked list in shared memory space. We could
protect the entire linked list with a single lock. This would work, and
if we don't have a lot of contention on the data structure, it would be
the most efficient approach (you only have to grab one lock, not lots).
And it would work in all conditions. But if we had a lot of contention
on the list, it might be worthwhile to lock only subportions of the
list. This means traversing the list is more expensive, as you have to
grab and release a number of locks, not just one, but it'd allow
multiple threads to access the list simultaneously. Plus we may want to
use reader/writer locks- allow multiple readers but disallow writers.
And we may have threads traversing the list both forwards and
backwards, and at different rates. Etc.

Brian
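P.S. The test-then-decrement race discussed earlier in this message is
easy to reproduce with OS threads: two threads can both see semaphore >
0 and both decrement. A sketch of the standard fix- make the test and
the update one atomic critical section:

```python
import threading

semaphore = 5
lock = threading.Lock()

def try_acquire():
    global semaphore
    with lock:                 # test and decrement as one atomic step
        if semaphore > 0:
            semaphore -= 1
            return True
        return False

results = []
threads = [threading.Thread(target=lambda: results.append(try_acquire()))
           for _ in range(10)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(sum(results), semaphore)
```

Ten threads compete for five slots; with the lock, exactly five
acquisitions succeed and the count never goes negative- the guarantee
green threads would give for free.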
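P.P.S. The X <--> Y deadlock above also has a classic manual remedy,
sketched here: impose a global lock order so every thread grabs both
locks in the same sequence. Both workers below touch both objects from
"opposite directions", but neither can end up holding one lock while
waiting on the other in the reverse order.

```python
import threading

lock_x = threading.Lock()
lock_y = threading.Lock()
shared = {"x": 0, "y": 0}

def ordered(first, second):
    # Always acquire in a fixed (id-based) order, ruling out the
    # A-waits-for-B / B-waits-for-A cycle.
    return sorted((first, second), key=id)

def worker(first, second):
    for _ in range(1000):
        a, b = ordered(first, second)
        with a, b:
            shared["x"] += 1
            shared["y"] += 1

t1 = threading.Thread(target=worker, args=(lock_x, lock_y))
t2 = threading.Thread(target=worker, args=(lock_y, lock_x))  # reversed
t1.start(); t2.start()
t1.join(); t2.join()
print(shared)
```

It works, but notice the programmer had to know about both locks and
order them by hand- exactly the burden the automatic lock-agglutination
idea would have to lift.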