The error messages don't indicate whether POSIX locks are involved. In OpenSSI, POSIX locks are cluster-wide, and so are semaphores.

In the case of POSIX locks, OpenSSI has a bug that causes a race across the cluster: when two or more processes on different nodes contend for the same file lock, there is no guarantee that they will be served in the order their requests were submitted. Instead, they are served in the order their requests arrive at the CFS server that owns the file. A possible fix is to serialize the requests by timestamp.
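
To illustrate the timestamp idea, here is a minimal sketch in Python. It is not OpenSSI code, and every name in it (LockRequest, TimestampOrderedQueue, grant_order) is hypothetical. It assumes each request carries the time it was submitted on its originating node, that node clocks are synchronized closely enough for those timestamps to be comparable, and that the node id is an acceptable tie-breaker. It can only reorder requests that are already waiting at the server while the lock is held.

```python
import heapq
from dataclasses import dataclass, field


@dataclass(order=True)
class LockRequest:
    # Hypothetical request record; ordering uses (submitted_at, node_id).
    submitted_at: float              # submission time on the originating node
    node_id: int                     # tie-breaker for equal timestamps
    pid: int = field(compare=False)  # requesting process, not used for ordering


class TimestampOrderedQueue:
    """Waiting lock requests, granted by submission time rather than arrival."""

    def __init__(self):
        self._heap = []

    def arrive(self, request):
        # Called in whatever order the requests reach the server.
        heapq.heappush(self._heap, request)

    def grant_next(self):
        # Hand the lock to the waiting request with the earliest timestamp.
        return heapq.heappop(self._heap) if self._heap else None


def grant_order(requests_in_arrival_order):
    """Return (node_id, pid) pairs in the order the lock would be granted."""
    queue = TimestampOrderedQueue()
    for request in requests_in_arrival_order:
        queue.arrive(request)
    order = []
    while (request := queue.grant_next()) is not None:
        order.append((request.node_id, request.pid))
    return order
```

With arrival order as the only criterion, a request submitted later on a node close to the server could overtake an earlier one from a distant node. The heap removes that dependency for queued requests, at the cost of relying on synchronized clocks.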

On Feb 17, 2013 12:48 AM, "Mulyadi Santosa" <> wrote:
On Fri, Feb 15, 2013 at 5:49 PM, Oliver Urbann
<> wrote:
> the process is migrated but the benchmark crashes with different
> messages, e.g.:
> Client 3 aborted in state 11: FATAL:  semop(id=491526) failed:
> Identifier removed
> or
> Client 2 aborted in state 8: ERROR:  lock RowExclusiveLock on object
> 16384/16391/0 is already held
> Did anybody have success load balancing postgresql?

IIRC, locks in OpenSSI are not system-wide. Furthermore, when a
process (the whole thread group in this case) is migrated, it is
migrated entirely, in the sense that it physically moves.

This is different from, say, MOSIX, which uses a "stub" approach, so locks
might still work because there is still communication between the origin
node and the new node (from the migration point of view).

So that might explain the lock error message. To test this, try
installing MOSIX and see whether the crashes/errors disappear during your test.


Mulyadi Santosa
Freelance Linux trainer and consultant


Ssic-linux-users mailing list