From: SourceForge.net <no...@so...> - 2008-08-18 22:27:19
Bugs item #2023112, was opened at 2008-07-21 01:13
Message generated for change (Comment added) made by dkf
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=110894&aid=2023112&group_id=10894

Please note that this message will contain a full copy of the comment
thread, including the initial issue submission, for this request,
not just the latest update.

Category: 35. TclOO Package
Group: current: 8.6a1
Status: Open
Resolution: None
Priority: 7
Private: No
Submitted By: Daniel A. Steffen (das)
Assigned to: Don Porter (dgp)
Summary: oo-18.3 fails with -singleproc 1

Initial Comment:
oo-18.3 fails when running the testsuite with -singleproc 1.

I've tracked this down to a conflict between the tests NRE-oo.4,
namespace-42.8 and oo-18.3, all three are needed to get the failure...

i.e. running the testsuite with the following options gives the output
below:

    -singleproc 1 -verbose "bet" -file "NRE.test namespace.test oo.test" \
        -match "NRE-oo.4 namespace-42.8 oo-18.3"

Tests running in interp: build/tcltest
Tests located in: tcl/tests
Tests running in: build
Temporary files stored in build
Test files sourced into current interpreter
Running tests that match: NRE-oo.4 namespace-42.8 oo-18.3
Skipping test files that match: l.*.test
Only running test files that match: NRE.test namespace.test oo.test
Tests began at Mon Jul 21 01:57:46 CEST 2008
NRE.test
---- NRE-oo.4 start
namespace.test
---- namespace-42.8 start
oo.test
---- oo-18.3 start

==== oo-18.3 OO: define command support FAILED
==== Contents of test case:

    list [catch {oo::class create foo {error bar}} msg] $msg $errorInfo

---- Result was:
1 bar {bar
    while executing
"error bar"
    (in definition script for object "::foo" line 1)
    invoked from within
"return -options $opt $msg"
    (class "::oo::class" constructor line 6)
    invoked from within
"oo::class create foo {error bar}"}
---- Result should have been (exact matching):
1 bar {bar
    while executing
"error bar"
    (in definition script for object "::foo" line 1)
    invoked from within
"oo::class create foo {error bar}"}
==== oo-18.3 FAILED

Tests ended at Mon Jul 21 01:57:47 CEST 2008
all.tcl: Total 419 Passed 2 Skipped 416 Failed 1
Sourced 3 Test Files.
Files with failing tests: oo.test

----------------------------------------------------------------------

>Comment By: Donal K. Fellows (dkf)
Date: 2008-08-18 23:27

Message:
Logged In: YES
user_id=79902
Originator: NO

1) would seem to indicate that there is a mistake in the copying of the
epoch change detection code from the code to compile a procedure body
to the code to compile a method body. That's a pure bug; my intention
was that they should detect changes identically.

2) is a hack. Non-hack replacements that are valid even with arbitrary
traces are welcome. (I don't like using string processing for this
because I worry about newlines. I suppose I could instead have written
the constructor using a C-implemented method; the APIs for that are now
much stronger than they were.)

----------------------------------------------------------------------

Comment By: Don Porter (dgp)
Date: 2008-08-04 17:22

Message:
Logged In: YES
user_id=80530
Originator: NO

see 2037727.

----------------------------------------------------------------------

Comment By: Don Porter (dgp)
Date: 2008-08-04 05:35

Message:
Logged In: YES
user_id=80530
Originator: NO

see 2037338

----------------------------------------------------------------------

Comment By: Don Porter (dgp)
Date: 2008-08-03 05:54

Message:
Logged In: YES
user_id=80530
Originator: NO

The original prescription no longer reveals a bug, since NRE.test was
renamed nre.test, and even sticking in the new names has the three
tests in a different sequence that shows no test failure. Attached
patch adds a new test oo-18.6 to replace the lucky trio.

Attached patch also includes a fix for one of the bugs contributing to
the test failure.
When an in-progress TEBC detects a bump of the compile epoch it calls
on Tcl_EvalEx to evaluate (possibly redefined) commands rather than
continue executing invalid bytecode. Tcl_EvalEx does its own logging of
errors to errorInfo. When the Tcl_EvalEx returns, though, TEBC still
routes exceptions through its own handling machinery, which here
attempts to log to errorInfo a second time and also replaces a
carefully crafted errorLine value. Fixing that is sufficient to make
the test pass.

There are other bugs discovered on this adventure.

1) This scenario should never get into the mid-execution invalid
bytecode situation to begin with. The bytecode is known invalid before
execution begins, but OO method body dispatch is currently failing to
make the epoch check.

2) The trickery with a magic errorLine value truly is an ugly hack, and
at least to my eyes should be easy enough to replace with something
cleaner and more robust.

I'll try to craft additional tests to reveal these other bugs
independently. Once I get those filed, I'll be able to close this one.

File Added: 2023112.patch

----------------------------------------------------------------------

Comment By: Don Porter (dgp)
Date: 2008-08-01 22:05

Message:
Logged In: YES
user_id=80530
Originator: NO

In the buggy scenario, Tcl_LogCommandInfo is getting called between the
time the magic errorLine is set, and the time it is checked.

----------------------------------------------------------------------

Comment By: Don Porter (dgp)
Date: 2008-08-01 21:18

Message:
Logged In: YES
user_id=80530
Originator: NO

Somehow the "Horrible hack" at line 930 of tclOOMethod.c is being
disrupted. Indicates the value of iPtr->errorLine is being overwritten
when not expected.

----------------------------------------------------------------------

Comment By: Don Porter (dgp)
Date: 2008-08-01 20:40

Message:
Logged In: YES
user_id=80530
Originator: NO

Confirming that theory, replace

    proc demo args {}; rename demo {}

with

    testcmdtrace tracetest {}

and see the same failure. What those have in common is a bump of the
compileEpoch.

----------------------------------------------------------------------

Comment By: Don Porter (dgp)
Date: 2008-08-01 20:33

Message:
Logged In: YES
user_id=80530
Originator: NO

Oops. More likely is that the

    proc demo args {}; rename demo {}

sequence has the effect of bumping the compileEpoch.

----------------------------------------------------------------------

Comment By: Don Porter (dgp)
Date: 2008-08-01 20:28

Message:
Logged In: YES
user_id=80530
Originator: NO

Comment out line 343 of tclProc.c,

    ((Command *) cmd)->compileProc = TclCompileNoOp;

and the test failure goes away (as well as no-op proc optimization).

Suggests ugly possibility of memory corruption at the root of the
problem?

----------------------------------------------------------------------

Comment By: Don Porter (dgp)
Date: 2008-08-01 19:35

Message:
Logged In: YES
user_id=80530
Originator: NO

The proc name "demo" is irrelevant, but the argument spec and body
matter. There must be some interaction with the special "compile to no
instructions" procs.

----------------------------------------------------------------------

Comment By: Don Porter (dgp)
Date: 2008-08-01 19:23

Message:
Logged In: YES
user_id=80530
Originator: NO

So this interactive session demo's the bug, which appears to be within
TclOO:

% oo::class create foo {error bar}
bar
% set errorInfo
bar
    while executing
"error bar"
    (in definition script for object "::foo" line 1)
    invoked from within
"oo::class create foo {error bar}"
%
%
% oo::class create foo
::foo
% foo destroy
% proc demo args {}
% rename demo {}
% oo::class create foo {error bar}
bar
% set errorInfo
bar
    while executing
"error bar"
    (in definition script for object "::foo" line 1)
    invoked from within
"return -options $opt $msg"
    (class "::oo::class" constructor line 6)
    invoked from within
"oo::class create foo {error bar}"

----------------------------------------------------------------------

Comment By: Don Porter (dgp)
Date: 2008-08-01 19:18

Message:
Logged In: YES
user_id=80530
Originator: NO

Probing reveals that the role of namespace-42.8 is just the command
sequence:

    proc demo args {}
    rename demo {}

And the role of NRE-oo.4 is just the command sequence

    oo::class create foo
    foo destroy

----------------------------------------------------------------------

Comment By: miguel sofer (msofer)
Date: 2008-07-29 22:58

Message:
Logged In: YES
user_id=148712
Originator: NO

Confirm reproducible, concur hard to explain. Relieved that it exists
pre-nre, so that that's not where to look. Specifically, it exists
since the very first commit of TclOO to the Tcl tree on 2008-05-31 (on
my machine setting 'ulimit -s unlimited' allows the NRE tests to run to
completion).

----------------------------------------------------------------------

Comment By: Donal K. Fellows (dkf)
Date: 2008-07-21 09:13

Message:
Logged In: YES
user_id=79902
Originator: NO

Huh!? That's bizarre! (I confirm that this is reproducible.)

OK, initial suspicion is something to do with representation caching
and literals that is ending up allowing the error to escape the [catch]
in the class constructor script. Handing off to an expert on such
things for a second opinion. (I've no idea what to do about this!)

----------------------------------------------------------------------
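
For anyone retracing the thread, the reproduction narrowed down in the 2008-08-01 comments (19:18 and 19:23) can be collected into a single script. This is only a sketch built from the commands quoted in those comments: it assumes a tclsh in which TclOO can be loaded, and the variable names (code1, msg1, before, and so on) are made up for illustration. On a build where the problem has been fixed, the two traces should simply compare equal.

```tcl
# Sketch of the reproduction described in the thread. Assumes TclOO is
# available; all variable names here are illustrative, not from the bug.
package require TclOO

# Baseline: the definition script fails; remember the resulting trace.
set code1 [catch {oo::class create foo {error bar}} msg1]
set before $::errorInfo

# Role of NRE-oo.4 (per the 19:18 comment): create and destroy a class.
oo::class create foo
foo destroy

# Role of namespace-42.8: define and delete a no-op proc. The 20:33
# comment attributes the effect to a bump of the compile epoch.
proc demo args {}
rename demo {}

# Same failing definition again. In the buggy build the trace gained an
# extra 'return -options $opt $msg' / constructor frame.
set code2 [catch {oo::class create foo {error bar}} msg2]
set after $::errorInfo

if {$before eq $after} {
    puts "errorInfo unchanged"
} else {
    puts "errorInfo differs:\n$after"
}
```

Comparing the two saved traces directly, rather than against a fixed expected string, keeps the script independent of the exact errorInfo wording of a given Tcl version.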