
#2 Trying to take a lock hangs after node down event

open
nobody
None
5
2003-06-15
2003-06-15
No

This is what I did:

Two-node ci-linux cluster.
Nodes 1 and 2 work smoothly, taking and
releasing locks.
Bring node 2 down. Try to acquire the lock on node 1.
No response. It seems to be waiting for some information.
^C ends the application.

Analysis:
There seems to be a problem with the node state and lock
status. Attached is the dmesg output from trying to
take the lock on node 1 after node 2 has gone down.

Trying to acquire the lock with a new name works.

Discussion

  • Aneesh Kumar K.V

    dmesg output during lock operation

     
  • Aneesh Kumar K.V

    Logged In: YES
    user_id=230991

    Another observation :
    Bringing node 2 back makes the locking work again.

     
  • Aneesh Kumar K.V

    Logged In: YES
    user_id=230991

    I am taking the lock using:
    status = dlmlock_sync(LKM_EXMODE, &lockst, LKM_VALBLK,
                          LOCK_NAME, strlen(LOCK_NAME), NULL, NULL);

    and unlocking it using:
    status = dlmunlock_sync(lockst.lockid,
                            NULL,
                            LKM_FORCE | LKM_INVVALBLK);

    With a single node, /proc/haDLM/rldb shows the following entry:
    72719 | Openlocktest | VMS | 0005 | 3

    With two nodes:
    72719 | Openlocktest | VMS | 03E5 | 3

    The flag value changed from 0005 to 03E5.
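To see exactly which bits differ between the two flag values, an XOR is enough. This is only a minimal sketch: the helper names `changed_bits` and `print_changed_bits` are made up for illustration, and the report does not say what the individual bits mean, so the code only identifies their positions.

```c
#include <stdio.h>

/* Return the bits that differ between two flag words. */
unsigned int changed_bits(unsigned int before, unsigned int after)
{
    return before ^ after;
}

/* Print each differing bit position, lowest first; return how many differ. */
int print_changed_bits(unsigned int before, unsigned int after)
{
    unsigned int diff = changed_bits(before, after);
    int count = 0;

    for (int bit = 0; diff != 0; bit++, diff >>= 1) {
        if (diff & 1u) {
            printf("bit %d changed\n", bit);
            count++;
        }
    }
    return count;
}
```

For the values in this report, `changed_bits(0x0005, 0x03E5)` is `0x03E0`, that is, bits 5 through 9 were set when the second node joined.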

    When the same test case is run on the newly added node, I get:

    72719 | Openlocktest | VMS | 0005 | 2
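Comparing these entries by eye is error-prone, so it may help to parse them. The sketch below assumes the five-column, pipe-separated layout quoted in this report; the struct and its field names are guesses for illustration (the report does not label the columns), and lock names containing spaces would not parse.

```c
#include <stdio.h>

/* Hypothetical field names: the columns are not labelled in the report. */
struct rldb_entry {
    unsigned long id;     /* first column, e.g. 72719 */
    char name[64];        /* second column, e.g. "Openlocktest" */
    char type[16];        /* third column, e.g. "VMS" */
    unsigned int flags;   /* fourth column, hexadecimal, e.g. 03E5 */
    int last;             /* fifth column, e.g. 3 */
};

/* Parse one entry line; returns 0 on success, -1 if the line is malformed. */
int parse_rldb_line(const char *line, struct rldb_entry *out)
{
    int n = sscanf(line, " %lu | %63[^| ] | %15[^| ] | %x | %d",
                   &out->id, out->name, out->type, &out->flags, &out->last);
    return n == 5 ? 0 : -1;
}
```

Parsing the entries from both nodes this way makes it easy to log the flag and last-column values before and after the node down event.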

    After this, when the newly added node goes down, any
    lock operation on Openlocktest hangs.

    -aneesh

     
