From: SourceForge.net <no...@so...> - 2012-12-07 16:07:36
|
Bugs item #3588687, was opened at 2012-11-19 22:41 Message generated for change (Comment added) made by dgp You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=110894&aid=3588687&group_id=10894 Please note that this message will contain a full copy of the comment thread, including the initial issue submission, for this request, not just the latest update. Category: 39. Package Manager Group: None Status: Open Resolution: None Priority: 8 Private: No Submitted By: Don Porter (dgp) Assigned to: Don Porter (dgp) Summary: Tcl_InitStubs() Broken As Designed Initial Comment: A stubs-enabled extension encounters Tcl in three distinct ways. 1) It compiles against a tcl.h file from some release of Tcl. With USE_TCL_STUBS active, the particular tcl.h file determines the layout of the stubs table that the extension is compiled to assume. 2) It links against some libtclstub.a file. That libtclstub.a file in turn was compiled against some tcl.h file and from it got a value of TCL_STUBS_MAGIC compiled into it. As far as I can tell there is nothing in place to make sure that the two tcl.h files so far mentioned are the same file, or at a minimum define the same TCL_STUBS_MAGIC value. Until now it's never been an issue since we've never changed that value. 3) The extenson [load]s into some Tcl interp. The extension calls Tcl_InitStubs() passing in the interp value. T_IS pulls a table pointer out of the iPtr->stubTable field of that interp for vetting for suitability. What's important is that the table in iPtr->stubTable has a compatible layout to the one that was compiled into the extension. But the mechanism in place cannot check this. To illustrate this, consider compiling an extension containing the call Tcl_InitStubs(interp, "9.0", 0); but compiling it against a Tcl 8 header file, so all of it's other calls into the Tcl library are coded based on the assumptions of the Tcl 8 stub table layout. Next imagine linking this against libtclstubs9.a so that the TCL_STUBS_MAGIC value the Tcl_InitStubs() call will be checking will be the one for the Tcl 9 table. Then [load] in a Tcl 9 interp. All of Tcl_InitStubs() checks will pass, but we'll be left with an extension compiled to Tcl 8 layout expectations making use of a Tcl 9 stub table, and that's gonna crash hard. Now, if the extension writer would have just stuck with the recommended convention of Tcl_InitStubs(interp, TCL_VERSION, 0) he could not get in this trouble. But that is our fault for giving him that knob to tweak in the first place. Tcl_InitStubs ought not have a "version" argument for the coder to get wrong. ---------------------------------------------------------------------- >Comment By: Don Porter (dgp) Date: 2012-12-07 08:07 Message: That's not a fix. That's a demonstration of the wrongheadedness of the goal. Do you know anyone else who has the ability to [load] Tcl 8 compiled stub-enabled extensions into Tcl 9 on their list of desirables? Everyone else I'm aware of is happy to completely terminate stub compatibility at the 8 -> 9 transition and have only source compatibility as the migration support path. ---------------------------------------------------------------------- Comment By: Don Porter (dgp) Date: 2012-12-07 08:04 Message: Regarding trunk updates. Dear God No. ---------------------------------------------------------------------- Comment By: Don Porter (dgp) Date: 2012-12-07 08:01 Message: I don't expect a mere change of argument type in Tcl_EvalEx is our Tcl 9 destiny. Rather, I presume with the exception of Tcl_PkgRequireEx() the whole table is going to be torn apart and put back together again. At least. ---------------------------------------------------------------------- Comment By: Jan Nijtmans (nijtmans) Date: 2012-12-07 07:58 Message: Two remarks: - your commit broke test load-8.4 - pkgb.c wouldn't even compile with Tcl 9 headers. That's the difference with the Tcl_EvalEx() example: with Tcl_EvalEx, the compiler will not tell you that the signature is wrong. With Tcl_GetDefaultEncodingDir the compiler will tell you all about the problem. Both problems fixed on trunk now. I don't know if you really want to keep this on trunk, but it would be a good example how deprecated functions could be handled. It's (almost) the same way as Thread 2.7 does it, only in this case we know the stub entry exists, so we can simply test for NULL. :-> ---------------------------------------------------------------------- Comment By: Don Porter (dgp) Date: 2012-12-07 07:06 Message: Using 'make shell` on novem branch, here's the result of [load]ing the updated pkgb.so files on the trunk and bug-3588687 branches: % load path/to/bug-3588687/pkgb.so version conflict for package "Tcl": have 9.0a0, need 8 % load path/to/trunk/pkgb.so % pkgb_demo make: *** [shell] Segmentation fault The [pkgb_demo] command calls Tcl_GetDefaultEncodingDir() which has been removed from the novem stubs table, causing the crash. The bug fix on bug-3588687 prevents this crash from happening by preventing the [load] of the unsupportable extension. Now you may object Tcl_GetDefaultEncodingDir isn't in the Tcl 9 API! So it's a bug for pkgb.c to be passing "8.5-9.1" to Tcl_InitStubs(). Two replies: 1) I submit that Tcl_EvalEx() with an int numBytes argument is likewise not in the Tcl 9 stubs API. So this is just the same bug already demonstrated. Except this new test makes clear this is not some problem that only shows up in a 64-bit scenario. It's implicit in the freedom to change the stubs tables incompatibly on a new major version of Tcl. 2) Even if it is a bug, the common theme in much of this discussion is how much better it is to react to a bug with an error message, or even a controlled panic, rather than a crash. ---------------------------------------------------------------------- Comment By: Jan Nijtmans (nijtmans) Date: 2012-12-07 06:29 Message: I would concentrate on Tcl9 getting better error-messages. We have the 'tclversion' as argument now, which can be used in generating the error-mesage for Tcl 9 extensions. Let's do it right first for Tcl9, then see how we can improve things for Tcl 8. Closed the "novem-demo-bug-3588687" branch now. demo has ended ---------------------------------------------------------------------- Comment By: Don Porter (dgp) Date: 2012-12-07 06:26 Message: Demonstrating on the tip of novem-demo-bug-3588687: % load path/to/trunk/pkgb.so interpreter uses an incompatible stubs mechanism % load path/to/bug-3588687/pkgb.so interpreter uses an incompatible stubs mechanism And in reality, the first result for deployed foo.so will be: "This interpreter does not support stubs-enabled extensions." ---------------------------------------------------------------------- Comment By: Don Porter (dgp) Date: 2012-12-07 06:17 Message: Ok, you're just repeating my starting point. The routine Tcl_InitStubs() and therefore all deployed binary stubs-enabled extensions -- the ones linked against the "versioned stubs lib" -- suffer from a fundamentally broken design. If any of them passed arguments requesting or accepting "Tcl 9", then those deployed, binaries are going to crash if they are [load]ed into novem. It's ugly and it's broken, but we can't turn back time. This was botched, and we're going to suffer for that botch. All we can do is get a fixed unversioned stubs library out and people started using it before Tcl 9 loading attempts get widespread. If we really must must must do something that will allows any old foo.so linked to a versioned stubs library that's been lying around untouched to [load] into Tcl 9 and not crash, then the only solution is to change TCL_STUB_MAGIC in Tcl 9 (for all platforms, which I will demonstrate). Then the [load] will not crash, but will instead generate the error message: "This interpreter does not support stubs-enabled extensions." and as a consequence, those new extensions linked to the new unversioned stubs library would be unable to generate any better error message than: "interpreter uses an incompatible stubs mechanism" Consider, though, which foo.so are going to crash. Only those which passed a 'version" string argument that accepted a Tcl 9 interp. I submit that no one has done this, other than folks like us who are probing and demonstrating the brokenness if Tcl_InitStubs(). So, we would be sacrificing the ability to produce more meaningful error messages like "version conflict for package "Tcl": have 9.0a0, need 8" for people actively making extensions today, using the latest tools and advice for the sake of preventing crashes in foo.so that do not exist outside the labs of torture testers like us. I prefer that we keep TCL_STUBS_MAGIC unchanged. Get better error messages. And if any of those mythical foo.so appear, then they crash. Sorry. We limped along with a misdesigned Tcl_InitStubs for a decade, and that's the pain. ---------------------------------------------------------------------- Comment By: Jan Nijtmans (nijtmans) Date: 2012-12-07 06:16 Message: Please try again with the current "novem-demo-bug-3588687" branch. You will get: % load path/to/trunk/pkgb.so interpreter uses an incompatible stubs mechanism This demonstrates that this solution works for already-built extensions, no need to recompile and link to a newer stub library. ---------------------------------------------------------------------- Comment By: Jan Nijtmans (nijtmans) Date: 2012-12-07 05:54 Message: The challenge is: Some person A build an extension foo.so. Another person B, which doesn't have the source code of that extension, tries to load it in Tcl9, and it crashes. So, the challenge is, how can Tcl9 detect that without making modifications to foo.so. We don't want to backport tclStubLib.c modifications to the field everywhere, giving additional restrictions to people. Joe English already expressed his reluctance to modify tclStubLib.c in current releases <= 8.6. I tried it already - before I found the alternative solution in tclLoad.c - which was put on-hold on your request..... It became clear to me that the solution must be found in Tcl9, not by modifying the 8.x stub library! ---------------------------------------------------------------------- Comment By: Jan Nijtmans (nijtmans) Date: 2012-12-07 05:42 Message: >I submit this demonstrates that bug-3588687 branch >is a fix for the problem demonstrated by pkgb.so. But for the fix you had to recompile pkgb.so and link to a modified stub library. So this doesn't work with already-compiled pkgb.so binaries. ---------------------------------------------------------------------- Comment By: Don Porter (dgp) Date: 2012-12-07 05:35 Message: Ok, two things are becoming clear. 1) How I was misunderstanding/forgetting how Tcl_PkgRequireEx() had expanded in function, and how that impacts *all deployed* stubs-enabled extensions. (eeeuuuuwwww!!!) 2) Those areas where we still suffer from fundamental disagreement. I'm going to work on code demonstrations. That seems to be the way we're most successful clearly demonstrating points to one another. While I work on that... Thanks for the test case. I've merged it over to the bug-3588687 branch. On that branch, build the pkgb.so for testing. Built another on the trunk. Then over on the novem-demo-bug-3588687 branch, `make shell`: % load path/to/bug-3588687/pkgb.so version conflict for package "Tcl": have 9.0a0, need 8 % load path/to/trunk/pkgb.so % pkgb_unsafe Crash! make: *** [shell] Aborted I submit this demonstrates that bug-3588687 branch is a fix for the problem demonstrated by pkgb.so. ---------------------------------------------------------------------- Comment By: Jan Nijtmans (nijtmans) Date: 2012-12-07 02:54 Message: My idea of an unversioned stub library is now in the "novem-unversioned-stub" branch. The only thing missing is the MAGIC value change when the API changes in an incompatible way (novem-64bit-sizes) ---------------------------------------------------------------------- Comment By: Jan Nijtmans (nijtmans) Date: 2012-12-07 02:12 Message: I now turned pkgb.so in core-8-5-branch and trunk into a Tcl9 interoperability test. And closed the "novem-support" branch. Tcl_InitStubs(interp, "8.6 9", 0) unfortunately doesn't work, it leads to a crash. That must be a bug somewhere in tclPkg.c. Luck enough, "8.6-9.1" works, and that's good enough for this test. Now the challenge is that pkgb.so loaded in any Tcl interpreter should either give an error message, either succeed, but never crash. On 32-bit platforms that's no problem: The signature of the used API will at most change some parameters from int to size_t, but they all are the same size. On 64-bit platforms I see no other solution than to change the MAGIC value. ---------------------------------------------------------------------- Comment By: Jan Nijtmans (nijtmans) Date: 2012-12-07 00:48 Message: >we agree that Tcl_InitStubs(interp, "8.6-", 0) >is *never* a sensible thing to do, right Thinking more about it, I agree! It would promise that it will work with Tcl 10 as well, which is nonsence, of course. Tcl_InitStubs(interp, "8.6 9", 0) should be used. ---------------------------------------------------------------------- Comment By: Jan Nijtmans (nijtmans) Date: 2012-12-07 00:26 Message: >we agree that Tcl_InitStubs(interp, "8.6-", 0) >is *never* a sensible thing to do, right On 64-bit platforms we agree on that, but on 32-bit platforms I see no problem. > Compiling against Tcl 8 headers, but passing > "9.0" as an argument to Tcl_InitStubs() cannot ever > work and is always a mistake.) There is no reason at all to do that: It can only work when the extension uses the common subset of Tcl 8.6 and 9.0 API's, but then it makes no sense to specify "9.0". Specifying "8.6-" would make more sense. The consensus should be: Tcl_InitStubs(interp, "8.6", 0) means that this extension uses the 8.6 API, so compiles and runs with any 8.x release with x >= 6. Tcl_InitStubs(interp, "9.0", 0) means that this extension uses the 9.0 API, so compiles and runs with any 9.x release. Tcl_InitStubs(interp, "8.6-", 0) means that when compiled with Tcl 8.6 headers it will run with Tcl 8.6, and when compiled with Tcl 9.0 headers it will run with any 9.x release. In Tcl9 some API's will be removed, others will be added. Any extension specifying "8.6-" can only use the common subset API's of 8.6 and 9.0. ( assuming the stub entrie positions are kept equal) This also means that no extension can claim "8.6-" during the alpha/beta period of Tcl 9. One method to enforce that is in tcl.h 9.0 specifying: #if TCL_RELEASE_LEVEL == TCL_FINAL_RELEASE #define Tcl_InitStubs(interp, version, exact) \ TclInitStubs(interp, version, exact, TCL_VERSION, TCL_STUB_MAGIC) else #define Tcl_InitStubs(interp, version, exact) \ TclInitStubs(interp, TCL_PATCH_LEVEL, 1, TCL_VERSION, TCL_STUB_MAGIC) #endif But - as proven - claiming "8.6-" compatibility cannot ever work across the 8.6/9.0 border on 64-bit platforms. That's what the MAGIC change is needed for. Regards, Jan Nijtmans ---------------------------------------------------------------------- Comment By: Don Porter (dgp) Date: 2012-12-06 19:40 Message: Just so I can start from an agreed assumption in the morning, we agree that Tcl_InitStubs(interp, "8.6-", 0) is *never* a sensible thing to do, right? Any code including that is wrongheaded and buggy from the get-go. (In that sense it's like my example in the original ticket here. Compiling against Tcl 8 headers, but passing "9.0" as an argument to Tcl_InitStubs() cannot ever work and is always a mistake.) There's no compat issue with making Tcl_InitStubs refuse to follow broken orders. We just need to redesign it so that such mistakes are reported errors, not panics or crashes. ---------------------------------------------------------------------- Comment By: Don Porter (dgp) Date: 2012-12-06 19:32 Message: I'm getting a clearer picture now. Hope to have more extended remarks tomorrow. Meanwhile, just for fun you might give this one a try: Tcl_InitStubs(interp, "8.6 9", 0); :) ---------------------------------------------------------------------- Comment By: Jan Nijtmans (nijtmans) Date: 2012-12-06 15:08 Message: > >Tcl_PkgRequireEx() doesn't accept arguments like "8.6-" > Somehow it works, so we might have hit an unrelated bug in tclPkg.c > somewhere here ....... TIP #268 specifies that ''The existing functions Tcl_PkgRequire(Ex) are re-implemented in terms of this function (=Tcl_PkgRequireProc)", so Tcl_PkgRequireEx() accepts "8.6-" just fine. That's what I see and it fits with the TIP description (and with my expectations). I don't see a bug here, after all. ---------------------------------------------------------------------- Comment By: Don Porter (dgp) Date: 2012-12-06 13:57 Message: I've got trains and family to content with, so may be until tomorrow until I can reply well, but if you haven't checked what the latest bug-3588687 code would do in this case, that might be worth trying. ---------------------------------------------------------------------- Comment By: Jan Nijtmans (nijtmans) Date: 2012-12-06 13:55 Message: >Tcl_PkgRequireEx() doesn't accept arguments like "8.6-" Somehow it works, so we might have hit an unrelated bug in tclPkg.c somewhere here ....... ---------------------------------------------------------------------- Comment By: Don Porter (dgp) Date: 2012-12-06 13:07 Message: I'm only starting, but wanted to quickly observe that Tcl_PkgRequireEx() doesn't accept arguments like "8.6-". It's only built for single version numbers, not requirement ranges. For those you have to call Tcl_PkgRequireProc(), which TclInitStubs doesn't do. ---------------------------------------------------------------------- Comment By: Jan Nijtmans (nijtmans) Date: 2012-12-06 12:27 Message: Let's do an experiment, to show what I want to tell. Please check out the branche "novem-support". It is a modified 8.6, mainly pkgb.so is modified to use Tcl_PkgRequireEx(..., "8.6-". ...), and uses the Tcl_EvalEx() function. Clearly this is wrong: It would allow pkgb.so to be loaded by both Tcl 8.6 and Tcl 9.0, but Tcl_EvalEx will have a different signature, so that will lead to a crash. I'll show that. Then there is the branch "novem-demo-bug-3588687", which is a modified "novem" with a Tcl_EvalEx signature change: the numBytes parameter is changed to a Tcl_WideInt (will be size_t in Tcl9, but this way I can demonstrate the crash on 32-bit platforms as well) Let's run this tclsh9.0 and load the pkgb.so from the "novem-support" 8.6 build: $ tclsh9.0 % load ../../tcl8.6/unix/dltest/pkgb.so % info commands pkgb_* pkgb_unsafe pkgb_sub % pkgb_unsafe Crash! Aborted (core dumped) $ This pkgb.so was compiled and linked against the unmodified Tcl 8.6 tcl.h and libtclstub86.a. How come that this was not detected? Whatever test in tclStubLib.c cannot detect that: pkgb.so was compiled with a legacy stub library. So, the only way I see to make Tcl_StubInit give a nice error message is change the TCL_STUB_MAGIC magic value for 64-bit builds in Tcl9. Then, even though pkgb.so was linked with a legacy stub library it would detect that and give: "interpreter uses an incompatible stubs mechanism" in stead of crashing. If you have another idea how to prevent a crash in this case, I'm all ear. That's why my conclusion is that TCL_STUB_MAGIC should change on 64-bit platforms, it can stay the same on 32-bit platforms. Do you get the point from this? Regards, Jan Nijtmans ---------------------------------------------------------------------- Comment By: Don Porter (dgp) Date: 2012-12-06 11:39 Message: We may have different visions of what the TCL_STUB_MAGIC value tells us. For me, if the TCL_STUB_MAGIC value stays the same, that means: 1) The interp has a stubs table. 2) The first entry in the table is a routine with the same signature and compatible functioning as Tcl_PkgRequireEx(). and that's it! Other compatibility checks get deferred to what T_PRE judges and that gets controlled by the usual version numbering scheme. In that vision, the only reason to change TCL_STUB_MAGIC is either 1) A stubs table rethink so radical it no longer allows the first slot to be assigned to T_PRE(); or 2) An incompatibility that cannot be sorted out by T_PRE() using Tcl version numbers. An example of 2) would be some kind of development fork where the forks had incompatible stub tables, but were making claim to the same set of version numbers. I don't fully grok your 64-bit claims yet, but it's possible they'd be in this category. ---------------------------------------------------------------------- Comment By: Don Porter (dgp) Date: 2012-12-06 11:25 Message: I must be slow or something (Don't answer that! :) ) If the tclStubLibCompat.c gets left out, then you do not get a libtclstubs.a that's version-free. You get a libtclstubs.a that will not link against an extension compiled with Tcl 8 headers. Right? If you commit to keeping including tclStubLibCompat.c in the stubs library, though, seems you could end up with a libtclstubs.a that's truly version-free, and sensitive only to changes in TCL_STUB_MAGIC. But if we need to keep including it, then the value of multiple source files no longer makes sense to me. That said, please have a look at my new checkin on the bug-3588687 branch. I think it moves things closer to achieving our increasingly shared goals. ---------------------------------------------------------------------- Comment By: Jan Nijtmans (nijtmans) Date: 2012-12-06 11:02 Message: The value of a separate tclStubLibCompat.c is, that it can be left out when the stub library is linked using Tcl 9 headers. I am trying to reach the goal of a version-independant stub library, the difference is in tcl.h only. The minimum requirement of Tcl_PkgRequireEx to work is that: - the entry in the stub table stays the same - the signature of Tcl_PkgRequireEx stays the same. I see no reason to change the entry or the signature, so there is no magic check needed to make it work for sure. The additional checks are meaningless IMHO. The real problem is the signature change in other functions, which are not used in the stub library, such as Tcl_EvalEx(). Or in the Tcl_Obj structure, which will (I assume) change the length field from int to size_t. On 32-bit platforms that doesn't make a difference (all those types are 32-bit) but on 64-bit platforms it totally breaks. Therefore I propose to change the magic value from (int) 0xFCA3BACF to (int) (0xFCA3BACB + sizeof(size_t)) Then the magic value on 32-bit platforms stays the same, on 64-bit platforms it changes. Then the current check suffices to detect the problem. So, I agree with you - as you stated earlier: >Since I'd prefer not to change the TCL_STUBS_MAGIC value I prefer not to change the TCL_STUBS_MAGIC as well. But striving for a version-independant stub library, on 64-bit platforms I don't see another possibility. On 32-bit platforms I don't see a reason to change it. ---------------------------------------------------------------------- Comment By: Don Porter (dgp) Date: 2012-12-06 10:31 Message: The calls to Tcl_PkgRequireEx() within TclInitStubs are themselves relying on the stubs mechanism. In order to call them with certainty they will work, we need to check that the magic value of the stub table in the loading interp matches the TCL_STUB_MAGIC value with which the stubs library was compiled. We must check for three-way agreement. Two-way is not enough. ---------------------------------------------------------------------- Comment By: Don Porter (dgp) Date: 2012-12-06 09:37 Message: Found time to examine this again. First a simple Q. What's the value of partitioning some of the code into another new file, tclStubLibCompat.c ? ---------------------------------------------------------------------- Comment By: Jan Nijtmans (nijtmans) Date: 2012-12-05 01:56 Message: Updated the "bug-3588687" branch, trying to get closer to our common goal. Two remarks: - I don't think tcl.h should be modified, for binary compatibility with older 8.6.x releases. So, in stead defined TclInitStubs() in tclInt.h, and let the linking occur through Tcl_InitStubs(). We have it, so why not use it? In Tcl 9.0 we can then make the step to change Tcl_InitStubs to a macro. - The usage of TCL_MAJOR_VERSION and TCL_VERSION in tclStubLib.c hinders the adoption of a version-less stub library. I was under the impression that the tcl version should be brought in through tcl.h and that tclStubLib.c should not reference TCL_VERSION nor TCL_MAJOR_VERSION. So I suggest to remove the check in HasStubSupport(), in stead check for tclversion[0] == version[0] in TclInitStubs. Still that would not be 100% as desired, but at least the stub library then becomes completely independant from tcl.h. In Tcl 9.0 we could rename tclstub9.0.a to tclstub.a (and change tcl.h to use a macro), and this stub library would be usable for Tcl 8.6 (using Tcl_InitStubs), 9.0, 9.1 ... (using TclInitStubs) Wouldn't that be better (although we aren't there yet)? Feedback appreciated! ---------------------------------------------------------------------- Comment By: Don Porter (dgp) Date: 2012-11-30 09:03 Message: Merge completed. Thanks. We're closing in on agreement what the reformed TclInitStubs() should look like and do. Just not quite there yet. That can continue without the merge held back, and that's a good thing. ---------------------------------------------------------------------- Comment By: Jan Nijtmans (nijtmans) Date: 2012-11-30 07:48 Message: Merged the (modified) "novem-review" branch to "novem" Whatever check we will do in HasStubSupport still has to be decided, it now has a full tclversion and a magic parameter. so anything can be put there as need arises. For now, I kept the check the same as in Tcl 8.6. I don't see the need to remove the Tcl_InitStubs() function itself. It is fully hidden, but allows the future (unversioned) tclstub.a to function for Tcl 8 as well. Further on, the win32 build was broken due to other changes, fixed that. Don, Is that sufficient for you to merge the novem-remove-string-result branch? ---------------------------------------------------------------------- Comment By: Don Porter (dgp) Date: 2012-11-29 09:29 Message: I've created a branch "novem-review" that rolls back the 'premature' change to the TCL_STUB_MAGIC value. If this is acceptable for merging to novem, please do so. Then I will be able to merge the novem-remove-string-result branch in and declare that project complete. If the 64 bit work, or something else, ends up requiring that we change the TCL_STUB_MAGIC value, then we can do that when the need arises. I think the revised Tcl_InitStubs() routines in novem, and pending for 8.6 will handle that transition if/when it happens. ---------------------------------------------------------------------- Comment By: Don Porter (dgp) Date: 2012-11-29 07:30 Message: Update committed to branch. Now I have TclInitStubs() checking both TCL_MAJOR_VERSION and TCL_STUB_MAGIC agreement between extension and stubs library. I had been thinking about TCL_STUB_MAGIC as a value that flagged incompatible stubs mechanisms over time as new Tcl major versions would arrive. In that line of thought, it seems to duplicate what's already known from TCL_MAJOR_VERSION. However, the TCL_STUB_MAGIC value is also good for flagging incompatibilities that are not temporal in nature. In particular, should any forks arise in Tcl development that get reflected in incompatible stub tables, we'll want this value to make those distinctions. Such forks might well be temporary as we undertake various radical experiments in separate fossil branches, that only get resolved when merged. Another possibility that comes to mind is the ability to flag our known problems with needing a TCL_MEM_DEBUG interp to successfully [load] a TCL_MEM_DEBUG extension. ---------------------------------------------------------------------- Comment By: Don Porter (dgp) Date: 2012-11-29 06:47 Message: Correcting the misleading error message is an obviously correct and risk-free thing to do in all active branches. The stubs-failure world will soon be larger than 8.0 interps, and the message should reflect that. Committing.... ---------------------------------------------------------------------- Comment By: Jan Nijtmans (nijtmans) Date: 2012-11-29 03:23 Message: B.T.W.: The current error message in tclStubLib.c is: "This interpreter does not support stubs-enabled extensions." Why don't we simply change that? Suggestion: "Detected incompatible Tcl version: expected 8.x" That can still be done in 8.6.0 without any risk. ---------------------------------------------------------------------- Comment By: Jan Nijtmans (nijtmans) Date: 2012-11-29 02:59 Message: Joe English wrote: >Is there a need to change the 8.6.* stub mechanism at all? No, I don't see the need now, I would prefer to wait until Tcl 9.0a1 is out (or just before that, with the Tcl 8.6.x release just before that) so the changes can be tested against the 9.0 signature changes. Now we are just guessing..... > But before I work on those (a boring boring process) I'd like to know > whether what I'm proposing is the right thing to try to do I think it is the right thing to do. We might discuss about individual functions, whether some parameter or return values should be int, size_t or ssize_t. I know that "novem-64bit-sizes" is not ready yet to be merged to trunk, I thrust dkf to decide when it is. Still it would be good to discuss and vote for TIP #115 before doing the merge. As long as signatures don't change, there is no need to change the TCL_STUBS_MAGIC value ( I was a little bit premature doing that in "novem" already..) Regards, Jan Nijtmans ---------------------------------------------------------------------- Comment By: Donal K. Fellows (dkf) Date: 2012-11-29 01:10 Message: Re the 64 bit work, it works on my machine (i.e., clean build, clean test suite pass) but I could believe the existence of problems on other systems. There are a lot of signed/unsigned hazards in the code (which gcc only flags with extra warnings turned on) but many of those are problems in Tcl 8.* too; working through them all is a huge job. But before I work on those (a boring boring process) I'd like to know whether what I'm proposing is the right thing to try to do. I guess that requires me to update TIP #115 and get talk going in tcl-core… ---------------------------------------------------------------------- Comment By: Don Porter (dgp) Date: 2012-11-28 18:20 Message: I think the answer to that question depends on how Jan and I resolve our difference about whether the TCL_STUBS_MAGIC value needs to change on the 8->9 transition. If it does not change, then the scenario in the original report can still happen. Since I'd prefer not to change the TCL_STUBS_MAGIC value, I'd want to have a redesigned stubs initialization in Tcl 8.6 for protection against that scenario. And yes, it offers much nicer error message too. I would not make the change for Tcl 8.5.14, however, because that would create the situation where extensions compiled against some 8.5 header files would not be able to link against some 8.5 stubs libraries, and that's just too messy to explain. Extensions compiling against 8.5 and older headers will just continue to suffer from this broken design. The redesign would neatly arrive with Tcl 8.6. ---------------------------------------------------------------------- Comment By: Joe English (jenglish) Date: 2012-11-28 17:06 Message: Is there a need to change the 8.6.* stub mechanism at all? No matter what else, if a novem tclsh tries to [load] an extension buitl against 8.4.19 or 8.5.13, or conversely if an 8.4.19 or 8.5.13 tclsh tries to [load] a novem-compatible extension, it must fail gracefully. "Fails gracefully with a slightly better error message" might be a worthwhile goal for 8.6.0, but I don't think it justifies a whole lot of effort or additional complexity. ---------------------------------------------------------------------- Comment By: Don Porter (dgp) Date: 2012-11-28 15:03 Message: On other matters, Jan, the value of checking TCL_STUBS_MAGIC is dawning on me. I'll comment after updating the bug-3588687 branch. ---------------------------------------------------------------------- Comment By: Don Porter (dgp) Date: 2012-11-28 14:59 Message: Although the action on novem is a lot freer than on trunk, it's still quite useful to let novem be a branch with some stability. Best if checkins there are build-able and don't panic attempting the test suite, which novem-64bit-sizes did the last I checked. ---------------------------------------------------------------------- Comment By: Don Porter (dgp) Date: 2012-11-28 14:57 Message: Please not yet. ---------------------------------------------------------------------- Comment By: Donal K. Fellows (dkf) Date: 2012-11-28 14:37 Message: If someone wants to merge novem-64bit-sizes back to novem, I wouldn't object in principle, but we probably need to take some decisions about whether what I've done there is The Right Thing before we do. That's orthogonal to the stub table stuff or Tcl_PkgRequireEx; they are parts we don't need to touch for 64-bit fixing (if the stub tables ever get so they have as many as 1M entries - which will still fit in an int just fine - we've got *much* bigger problems). After all, 64-bit architectures already have 64-bit pointers; it is their defining feature. ---------------------------------------------------------------------- Comment By: Jan Nijtmans (nijtmans) Date: 2012-11-28 13:21 Message: >I don't see the need to change the TCL_STUBS_MAGIC value >to make the 8->9 transition As soon as your work and dkf's is on "novem" I will construct a test-case which proves that changing TCL_STUBS_MAGIC is necessary for 64-bit platforms (not for 32-bit platforms) > I believe that > the dlopen() or equivalent routine will detect that dlopen will detect the difference between a 32-bit and 64-bit platform on the same machine, e.g. win32/win64. It will not detect whether Tcl_NewStringObj() has a size_t or an int as last argument..... But let's concentrate on the things we agree on. The change from Tcl_InitStubs to a macro which calls TclInitStubs which has additinal aguments, that we agree on. The desire for an unversioned stub library, that we agree on. What checks the TclInitStubs() function should do can be decided later, after there is more clarity on the stub changes, so let's put that aside for now. ---------------------------------------------------------------------- Comment By: Don Porter (dgp) Date: 2012-11-28 10:04 Message: Ah, thanks. Something else to clarify. I'm aiming this patch at bridging the transition to the novem-remove-string-result branch. I'm not aiming at the stubs changes that are currently on the tip of the novem branch. I don't see the need to change the TCL_STUBS_MAGIC value to make the 8->9 transition. The functioning of the Tcl_PkgRequireEx() routine in the first stubs slot is enough to manage the version checking. The constraint that Tcl 9 has to keep that entry, and keep it first in the stubs table is a reasonable one, I think. I also see no need to have a TCL_STUBS_MAGIC value that depends on sizeof(size_t). If a [load]ing program and a [load]ed extension don't agree at that level, I believe that the dlopen() or equivalent routine will detect that and fail long before our stubs checking mechanisms are ever called on. That said, yes I agree that if we keep a Tcl_InitStubs() function to offer linkability of all libtclstubs8*.a libraries, we should use a literal value 8 for the argument instead of TCL_MAJOR_VERSION. ---------------------------------------------------------------------- Comment By: Jan Nijtmans (nijtmans) Date: 2012-11-28 00:21 Message: I only have two additional suggestions: >Suggest simply leaving Tcl_InitStubs out of novem tclStubLib.c. That way >the extension will get a linker error if there is a mismatch My suggestion would be to leave Tcl_InitStubs in there, but don't use TCL_MAJOR_VERSION, but simply hardcoded '8': We already know that Tcl 9 doesn't use Tcl_InitStubs, and in Tcl 7 it didn't even exist yet, so we know it must be 8. Then we again have no reason for versioning libstubs.a I would pass TCL_STUB_MAGIC as argument as well. For Tcl 8 we know it's a fixed number, but then we could keep the signature for TclInitStubs the same in Tcl8 and Tcl9. Another little bit of more compatibility. Thanks! ---------------------------------------------------------------------- Comment By: Don Porter (dgp) Date: 2012-11-27 18:49 Message: The branch/patch is off the trunk, not off novem, This is a fix proposed to go into libtclstub8.6.a. When I merge it forward to the novem branch, I would do just what you suggest, and omit the Tcl_InitStubs() call completely. An extension compiled against 8.* headers will simply refuse to link to a Tcl 9 stubs library. Ok, so the patch would be improved if we just force that linking error to show up at the 8.5->8.6 transition and start using the runtime error detection right away. I shied away from that because you've observed many times that we version the libstubs.a file for no apparent reason, and now we would at last have one. ---------------------------------------------------------------------- Comment By: Joe English (jenglish) Date: 2012-11-27 16:52 Message: I don't think that this part does what it claims to do: 90 /* 91 * Detect whether the extension and the stubs library were built 92 * against Tcl header files from different major versions. That's 93 * seriously broken. 94 */ If an extension was compiled against 8.5.13 headers with -DUSE_TCL_STUBS (where Tcl_InitStubs is declared as a symbol with external linkage) but mistakenly linked against novem libtclstub.a, it'll get novem's definition of Tcl_InitStubs(), which passes novem's TCL_VERSION_MAJOR to the worker routine TclInitStubs(). Suggest simply leaving Tcl_InitStubs out of novem tclStubLib.c. That way the extension will get a linker error if there is a mismatch (old headers+new libtclstub => Tcl_InitStubs not found; new headers+old libtclstub => TclInitStubs not found). The change could be backported to core-8-5-branch and core-8-6-branch, but that would just change a link-time error into a runtime one -- probably not worth doing (if there's a mismatch, it's better to detect it earlier). ---------------------------------------------------------------------- Comment By: Don Porter (dgp) Date: 2012-11-27 13:11 Message: Branch bug-3588687 has a patch to address these problems. Would appreciate your comments if you can spare the time. I think it's not as radical a change as you might choose, but I think it fixes the worst problem identified above. ---------------------------------------------------------------------- Comment By: Donal K. Fellows (dkf) Date: 2012-11-20 12:01 Message: To further concur, the 9.0 series does not need to load any extension compiled against 8.* at all *and* the stub table can be considered to be wholly in a state of flux right now. For sure, there won't be stabilization until *at least* 9.0a1 is released, and probably not until 9.0.0 (with things slushy after 9.0b1). Now is not the time to consider the impact on extensions. Now is not the time to put in lots of compatibility goop. ---------------------------------------------------------------------- Comment By: Andreas Kupries (andreas_kupries) Date: 2012-11-20 10:07 Message: Regarding "This means that 8.x extensions cannot be used in a 9.0 interpreter." IMHO this is a good thing, and should be kept. 8 and 9 are different major versions, and we expect incompatibility at C API level, i.e. stubs tables. We expect that an extension must be re-compiled against 9.x to pick up on these changes and be loadable in 9.x. A clean break between the versions of the core. Putting code into 9.x to shoehorn an extension build against 8.x into it, that is a source of complexity in runtime checks and/or ifdeffery we can, should!, do without, at this point. We are currently in _compatibility breaking_ mode, where we are not even have fully worked out what we are breaking and how. But we know that we want to be free of constraints in our thinking, and 8.x compat work is such a constraint. IOW trying to re-create/shoehorn compatibility now is IMHO much to early in the dev cycle for 9.x. Given the moving target that is 9.x it could possibly be called a 'fool's errand' also (with apologies if that is considered insulting). ---------------------------------------------------------------------- Comment By: Jan Nijtmans (nijtmans) Date: 2012-11-20 00:43 Message: Please consider: <http://core.tcl.tk/tcl/info/16299cb2af> I think 2 of the 3 points you raised are solved now in the "novem" branch. Feedback welcome! ---------------------------------------------------------------------- Comment By: Jan Nijtmans (nijtmans) Date: 2012-11-20 00:10 Message: Thanks, Don, for this clear description. Regarding the second point, the TCL_STUBS_MAGIC value built-in the stubs library, that's completely right. I already did work on this in the "novem" branch which IMHO corrects that, but feedback is welcome. >What's important is that the table in iPtr->stubTable has a compatible >layout to the one that was compiled into the extension. But the >mechanism in place cannot check this Yes, that's a problem. So, let's put a mechanism in place to check this. Your last point, the version argument, the problem now is that Tcl_InitStubs(interp, "9.0", 0) returns false when called in a "8.6" interpreter. This means that 8.x extensions cannot be used in a 9.0 interpreter. I don't have an immediate solution for that. Easiest is to let extesions use Tcl_InitStubs(interp, "8.6+", 0) if they want to indicate that they can run under both 8.6 and 9.0, but that could - at best - only work on 32-bit platforms. 64-bit platforms should have a different MAGIC value, given the big reform that is being worked on now by dkf. Regards, Jan Nijtmans ---------------------------------------------------------------------- You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=110894&aid=3588687&group_id=10894 |