#3763 async-4.3 sometimes fails

Milestone: obsolete: 8.5a7
Status: closed-fixed
Priority: 7
Updated: 2011-08-19
Created: 2007-08-15
Creator: Don Porter
Private: No

A thread-enabled Tcl build is needed to reproduce this.
On FC 6 with Tcl HEAD, the test fails about every
other run:

$ make test TESTFLAGS="-file async.test"
...
async.test

==== async-4.3 async interrupting loop-less bytecode sequence FAILED
==== Contents of test case:

hang3 $hm

---- Result was:
Async event not delivered
---- Result should have been (exact matching):
test pattern
==== async-4.3 FAILED

Discussion

  • Donal K. Fellows

    • assigned_to: nobody --> dgp
    • priority: 5 --> 7
     
  • Don Porter

    Don Porter - 2007-08-21

    Logged In: YES
    user_id=80530
    Originator: YES

    Async Events has no maintainer?!

     
  • Don Porter

    Don Porter - 2007-08-21
    • assigned_to: dgp --> nobody
     
  • Don Porter

    Don Porter - 2008-02-29

    Just saw this again.

     
  • Don Porter

    Don Porter - 2008-08-08

    Testing 8.5.4, I now see this
    test *crashing* !! on thread-enabled Tcl.

     
  • Don Porter

    Don Porter - 2008-08-08
    • priority: 7 --> 9
     
  • Don Porter

    Don Porter - 2008-08-08

    Grrr... doesn't crash every time.
    Sometimes merely fails.
    Sometimes passes.

     
  • Don Porter

    Don Porter - 2008-08-08

    Back to prio-7 since it's not
    repeatable.

     
  • Don Porter

    Don Porter - 2008-08-08
    • priority: 9 --> 7
     
  • Donal K. Fellows

    • assigned_to: nobody --> mistachkin
     
  • Joe Mistachkin

    Joe Mistachkin - 2009-10-02

    This may be caused by a subtle (and rare) race condition in the test itself.

     
  • Joe Mistachkin

    Joe Mistachkin - 2009-10-18

    To see this test fail reliably, simply change the line in proc hang3
    from: [string repeat {;incr i;} 1500000] to: [string repeat {;incr i;} 150]

    The test itself assumes that the constructed script will keep running long enough for the async event to be delivered. That assumption is not guaranteed to hold.
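
The race described in this comment can be modelled outside Tcl. The following is a minimal Python sketch, not the code from async.test: `run_bounded_script`, `mark_later`, and the `delay` parameter are hypothetical names chosen for illustration. A helper thread marks an event after a short sleep, while the main thread performs a fixed number of increments and checks for the event between steps. If the increments run out first, the event is never observed, which corresponds to the "Async event not delivered" result.

```python
import threading
import time


def run_bounded_script(steps, delay=0.001):
    """Return True if the event fired before `steps` increments finished."""
    fired = threading.Event()

    def mark_later():
        # Stand-in for the helper thread: sleep briefly, then mark the event.
        time.sleep(delay)
        fired.set()

    worker = threading.Thread(target=mark_later)
    worker.start()

    delivered = False
    i = 0
    for _ in range(steps):
        i += 1
        # The event is only noticed between steps of the bounded sequence.
        if fired.is_set():
            delivered = True
            break

    worker.join()
    return delivered
```

With a small `steps` value the loop almost always ends before the helper thread wakes up, which mirrors the reliable failure seen with a repeat count of 150. With a large value the outcome depends on how fast the machine runs the loop, which is why the original test only fails some of the time.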

     
  • Joe Mistachkin

    Joe Mistachkin - 2009-10-18

    Changing the integer constant to 150 causes a segfault on my system.

     
  • Joe Mistachkin

    Joe Mistachkin - 2009-10-18

    There are actually two distinct bugs here: the test failure and the segfault. Further research reveals that the "testasync marklater" sub-command creates and uses a child thread and shares data with it without using thread-safe mechanisms. This change was introduced in 2003 (revision 1.73). The bug appears to exist only in 8.5 and HEAD. Previously, the code assumed that only one thread was accessing the data stored in firstHandler.
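
As a rough illustration of what "thread-safe mechanisms" could mean for a handler list shared between two threads, here is a Python sketch in which every traversal and mutation of the list happens under one lock. It is an assumption-laden model, not the fix that was applied to Tcl: the `Handler` and `HandlerList` classes and their methods are hypothetical, and `_first` merely plays the role of firstHandler.

```python
import threading


class Handler:
    """One entry in the shared list (hypothetical model of a test handler)."""

    def __init__(self, handler_id):
        self.id = handler_id
        self.marked = False
        self.next = None


class HandlerList:
    """Singly linked list whose head and links are only touched under a lock."""

    def __init__(self):
        self._lock = threading.Lock()
        self._first = None  # plays the role of firstHandler

    def create(self, handler_id):
        node = Handler(handler_id)
        with self._lock:
            node.next = self._first
            self._first = node

    def delete(self, handler_id):
        with self._lock:
            prev, node = None, self._first
            while node is not None:
                if node.id == handler_id:
                    if prev is None:
                        self._first = node.next
                    else:
                        prev.next = node.next
                    return True
                prev, node = node, node.next
        return False

    def mark(self, handler_id):
        # Safe to call from a helper thread: the walk cannot race with delete().
        with self._lock:
            node = self._first
            while node is not None:
                if node.id == handler_id:
                    node.marked = True
                    return True
                node = node.next
        return False
```

Without the lock, a helper thread walking the list in `mark` could follow a link to an entry that the main thread is unlinking at the same moment. In C that kind of access to freed memory is one plausible way to get a segfault.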

     
  • Alexandre Ferrieux

    No need to keep two tickets open. Marking this one as a duplicate of 2981154.

     
  • Alexandre Ferrieux

    • status: open --> closed-duplicate
     
  • Alexandre Ferrieux

    Reopening: segfault fixed in 2981154; failure remains.

    Again, this failure is a mundane one: it simply indicates that the finite stretch of bytecodes that we meant to interrupt somehow managed to outpace a thread with a Tcl_Sleep(1). Maybe we should just pour an [after 1] in the stretch of [incr].

     
  • Alexandre Ferrieux

    • status: closed-duplicate --> open
     
  • Alexandre Ferrieux

    Fixed in HEAD, by adding an [after 10] to the sequence to interrupt.
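
The idea behind this fix can be sketched in Python as follows. This is a model under stated assumptions rather than the actual test code: `run_with_pause`, `marker_delay`, and `pause` are hypothetical names, and the `time.sleep(pause)` call stands in for the added [after 10]. The sequence to be interrupted now contains a pause longer than the helper thread's sleep, so the sequence no longer finishes before the event is marked.

```python
import threading
import time


def run_with_pause(steps, marker_delay=0.001, pause=0.010):
    """Return True if the event was marked by the time the sequence ends."""
    fired = threading.Event()

    def mark_later():
        # Stand-in for the helper thread that sleeps, then marks the event.
        time.sleep(marker_delay)
        fired.set()

    worker = threading.Thread(target=mark_later)
    worker.start()

    i = 0
    for _ in range(steps):
        i += 1

    # Analogue of the added [after 10]: the sequence itself now lasts
    # longer than the helper thread's sleep.
    time.sleep(pause)

    delivered = fired.is_set()
    worker.join()
    return delivered
```

Note that a fixed pause widens the timing margin instead of removing the dependence on timing altogether. A heavily loaded machine could in principle still delay the helper thread past the pause.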

     
  • Alexandre Ferrieux

    • status: open --> closed-fixed
     
  • Don Porter

    Don Porter - 2011-08-19

    Can the fix be applied to 8.5.11 as well?

     
  • Alexandre Ferrieux

    Backported.