From: SourceForge.net <no...@so...> - 2012-11-29 14:47:23
|
Bugs item #3588687, was opened at 2012-11-19 22:41 Message generated for change (Comment added) made by dgp You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=110894&aid=3588687&group_id=10894 Please note that this message will contain a full copy of the comment thread, including the initial issue submission, for this request, not just the latest update. Category: 39. Package Manager Group: None Status: Open Resolution: None Priority: 5 Private: No Submitted By: Don Porter (dgp) Assigned to: Joe English (jenglish) Summary: Tcl_InitStubs() Broken As Designed Initial Comment: A stubs-enabled extension encounters Tcl in three distinct ways. 1) It compiles against a tcl.h file from some release of Tcl. With USE_TCL_STUBS active, the particular tcl.h file determines the layout of the stubs table that the extension is compiled to assume. 2) It links against some libtclstub.a file. That libtclstub.a file in turn was compiled against some tcl.h file and from it got a value of TCL_STUBS_MAGIC compiled into it. As far as I can tell there is nothing in place to make sure that the two tcl.h files so far mentioned are the same file, or at a minimum define the same TCL_STUBS_MAGIC value. Until now it's never been an issue since we've never changed that value. 3) The extenson [load]s into some Tcl interp. The extension calls Tcl_InitStubs() passing in the interp value. T_IS pulls a table pointer out of the iPtr->stubTable field of that interp for vetting for suitability. What's important is that the table in iPtr->stubTable has a compatible layout to the one that was compiled into the extension. But the mechanism in place cannot check this. To illustrate this, consider compiling an extension containing the call Tcl_InitStubs(interp, "9.0", 0); but compiling it against a Tcl 8 header file, so all of it's other calls into the Tcl library are coded based on the assumptions of the Tcl 8 stub table layout. Next imagine linking this against libtclstubs9.a so that the TCL_STUBS_MAGIC value the Tcl_InitStubs() call will be checking will be the one for the Tcl 9 table. Then [load] in a Tcl 9 interp. All of Tcl_InitStubs() checks will pass, but we'll be left with an extension compiled to Tcl 8 layout expectations making use of a Tcl 9 stub table, and that's gonna crash hard. Now, if the extension writer would have just stuck with the recommended convention of Tcl_InitStubs(interp, TCL_VERSION, 0) he could not get in this trouble. But that is our fault for giving him that knob to tweak in the first place. Tcl_InitStubs ought not have a "version" argument for the coder to get wrong. ---------------------------------------------------------------------- >Comment By: Don Porter (dgp) Date: 2012-11-29 06:47 Message: Correcting the misleading error message is an obviously correct and risk-free thing to do in all active branches. The stubs-failure world will soon be larger than 8.0 interps, and the message should reflect that. Committing.... ---------------------------------------------------------------------- Comment By: Jan Nijtmans (nijtmans) Date: 2012-11-29 03:23 Message: B.T.W.: The current error message in tclStubLib.c is: "This interpreter does not support stubs-enabled extensions." Why don't we simply change that? Suggestion: "Detected incompatible Tcl version: expected 8.x" That can still be done in 8.6.0 without any risk. ---------------------------------------------------------------------- Comment By: Jan Nijtmans (nijtmans) Date: 2012-11-29 02:59 Message: Joe English wrote: >Is there a need to change the 8.6.* stub mechanism at all? No, I don't see the need now, I would prefer to wait until Tcl 9.0a1 is out (or just before that, with the Tcl 8.6.x release just before that) so the changes can be tested against the 9.0 signature changes. Now we are just guessing..... > But before I work on those (a boring boring process) I'd like to know > whether what I'm proposing is the right thing to try to do I think it is the right thing to do. We might discuss about individual functions, whether some parameter or return values should be int, size_t or ssize_t. I know that "novem-64bit-sizes" is not ready yet to be merged to trunk, I thrust dkf to decide when it is. Still it would be good to discuss and vote for TIP #115 before doing the merge. As long as signatures don't change, there is no need to change the TCL_STUBS_MAGIC value ( I was a little bit premature doing that in "novem" already..) Regards, Jan Nijtmans ---------------------------------------------------------------------- Comment By: Donal K. Fellows (dkf) Date: 2012-11-29 01:10 Message: Re the 64 bit work, it works on my machine (i.e., clean build, clean test suite pass) but I could believe the existence of problems on other systems. There are a lot of signed/unsigned hazards in the code (which gcc only flags with extra warnings turned on) but many of those are problems in Tcl 8.* too; working through them all is a huge job. But before I work on those (a boring boring process) I'd like to know whether what I'm proposing is the right thing to try to do. I guess that requires me to update TIP #115 and get talk going in tcl-core… ---------------------------------------------------------------------- Comment By: Don Porter (dgp) Date: 2012-11-28 18:20 Message: I think the answer to that question depends on how Jan and I resolve our difference about whether the TCL_STUBS_MAGIC value needs to change on the 8->9 transition. If it does not change, then the scenario in the original report can still happen. Since I'd prefer not to change the TCL_STUBS_MAGIC value, I'd want to have a redesigned stubs initialization in Tcl 8.6 for protection against that scenario. And yes, it offers much nicer error message too. I would not make the change for Tcl 8.5.14, however, because that would create the situation where extensions compiled against some 8.5 header files would not be able to link against some 8.5 stubs libraries, and that's just too messy to explain. Extensions compiling against 8.5 and older headers will just continue to suffer from this broken design. The redesign would neatly arrive with Tcl 8.6. ---------------------------------------------------------------------- Comment By: Joe English (jenglish) Date: 2012-11-28 17:06 Message: Is there a need to change the 8.6.* stub mechanism at all? No matter what else, if a novem tclsh tries to [load] an extension buitl against 8.4.19 or 8.5.13, or conversely if an 8.4.19 or 8.5.13 tclsh tries to [load] a novem-compatible extension, it must fail gracefully. "Fails gracefully with a slightly better error message" might be a worthwhile goal for 8.6.0, but I don't think it justifies a whole lot of effort or additional complexity. ---------------------------------------------------------------------- Comment By: Don Porter (dgp) Date: 2012-11-28 15:03 Message: On other matters, Jan, the value of checking TCL_STUBS_MAGIC is dawning on me. I'll comment after updating the bug-3588687 branch. ---------------------------------------------------------------------- Comment By: Don Porter (dgp) Date: 2012-11-28 14:59 Message: Although the action on novem is a lot freer than on trunk, it's still quite useful to let novem be a branch with some stability. Best if checkins there are build-able and don't panic attempting the test suite, which novem-64bit-sizes did the last I checked. ---------------------------------------------------------------------- Comment By: Don Porter (dgp) Date: 2012-11-28 14:57 Message: Please not yet. ---------------------------------------------------------------------- Comment By: Donal K. Fellows (dkf) Date: 2012-11-28 14:37 Message: If someone wants to merge novem-64bit-sizes back to novem, I wouldn't object in principle, but we probably need to take some decisions about whether what I've done there is The Right Thing before we do. That's orthogonal to the stub table stuff or Tcl_PkgRequireEx; they are parts we don't need to touch for 64-bit fixing (if the stub tables ever get so they have as many as 1M entries - which will still fit in an int just fine - we've got *much* bigger problems). After all, 64-bit architectures already have 64-bit pointers; it is their defining feature. ---------------------------------------------------------------------- Comment By: Jan Nijtmans (nijtmans) Date: 2012-11-28 13:21 Message: >I don't see the need to change the TCL_STUBS_MAGIC value >to make the 8->9 transition As soon as your work and dkf's is on "novem" I will construct a test-case which proves that changing TCL_STUBS_MAGIC is necessary for 64-bit platforms (not for 32-bit platforms) > I believe that > the dlopen() or equivalent routine will detect that dlopen will detect the difference between a 32-bit and 64-bit platform on the same machine, e.g. win32/win64. It will not detect whether Tcl_NewStringObj() has a size_t or an int as last argument..... But let's concentrate on the things we agree on. The change from Tcl_InitStubs to a macro which calls TclInitStubs which has additinal aguments, that we agree on. The desire for an unversioned stub library, that we agree on. What checks the TclInitStubs() function should do can be decided later, after there is more clarity on the stub changes, so let's put that aside for now. ---------------------------------------------------------------------- Comment By: Don Porter (dgp) Date: 2012-11-28 10:04 Message: Ah, thanks. Something else to clarify. I'm aiming this patch at bridging the transition to the novem-remove-string-result branch. I'm not aiming at the stubs changes that are currently on the tip of the novem branch. I don't see the need to change the TCL_STUBS_MAGIC value to make the 8->9 transition. The functioning of the Tcl_PkgRequireEx() routine in the first stubs slot is enough to manage the version checking. The constraint that Tcl 9 has to keep that entry, and keep it first in the stubs table is a reasonable one, I think. I also see no need to have a TCL_STUBS_MAGIC value that depends on sizeof(size_t). If a [load]ing program and a [load]ed extension don't agree at that level, I believe that the dlopen() or equivalent routine will detect that and fail long before our stubs checking mechanisms are ever called on. That said, yes I agree that if we keep a Tcl_InitStubs() function to offer linkability of all libtclstubs8*.a libraries, we should use a literal value 8 for the argument instead of TCL_MAJOR_VERSION. ---------------------------------------------------------------------- Comment By: Jan Nijtmans (nijtmans) Date: 2012-11-28 00:21 Message: I only have two additional suggestions: >Suggest simply leaving Tcl_InitStubs out of novem tclStubLib.c. That way >the extension will get a linker error if there is a mismatch My suggestion would be to leave Tcl_InitStubs in there, but don't use TCL_MAJOR_VERSION, but simply hardcoded '8': We already know that Tcl 9 doesn't use Tcl_InitStubs, and in Tcl 7 it didn't even exist yet, so we know it must be 8. Then we again have no reason for versioning libstubs.a I would pass TCL_STUB_MAGIC as argument as well. For Tcl 8 we know it's a fixed number, but then we could keep the signature for TclInitStubs the same in Tcl8 and Tcl9. Another little bit of more compatibility. Thanks! ---------------------------------------------------------------------- Comment By: Don Porter (dgp) Date: 2012-11-27 18:49 Message: The branch/patch is off the trunk, not off novem, This is a fix proposed to go into libtclstub8.6.a. When I merge it forward to the novem branch, I would do just what you suggest, and omit the Tcl_InitStubs() call completely. An extension compiled against 8.* headers will simply refuse to link to a Tcl 9 stubs library. Ok, so the patch would be improved if we just force that linking error to show up at the 8.5->8.6 transition and start using the runtime error detection right away. I shied away from that because you've observed many times that we version the libstubs.a file for no apparent reason, and now we would at last have one. ---------------------------------------------------------------------- Comment By: Joe English (jenglish) Date: 2012-11-27 16:52 Message: I don't think that this part does what it claims to do: 90 /* 91 * Detect whether the extension and the stubs library were built 92 * against Tcl header files from different major versions. That's 93 * seriously broken. 94 */ If an extension was compiled against 8.5.13 headers with -DUSE_TCL_STUBS (where Tcl_InitStubs is declared as a symbol with external linkage) but mistakenly linked against novem libtclstub.a, it'll get novem's definition of Tcl_InitStubs(), which passes novem's TCL_VERSION_MAJOR to the worker routine TclInitStubs(). Suggest simply leaving Tcl_InitStubs out of novem tclStubLib.c. That way the extension will get a linker error if there is a mismatch (old headers+new libtclstub => Tcl_InitStubs not found; new headers+old libtclstub => TclInitStubs not found). The change could be backported to core-8-5-branch and core-8-6-branch, but that would just change a link-time error into a runtime one -- probably not worth doing (if there's a mismatch, it's better to detect it earlier). ---------------------------------------------------------------------- Comment By: Don Porter (dgp) Date: 2012-11-27 13:11 Message: Branch bug-3588687 has a patch to address these problems. Would appreciate your comments if you can spare the time. I think it's not as radical a change as you might choose, but I think it fixes the worst problem identified above. ---------------------------------------------------------------------- Comment By: Donal K. Fellows (dkf) Date: 2012-11-20 12:01 Message: To further concur, the 9.0 series does not need to load any extension compiled against 8.* at all *and* the stub table can be considered to be wholly in a state of flux right now. For sure, there won't be stabilization until *at least* 9.0a1 is released, and probably not until 9.0.0 (with things slushy after 9.0b1). Now is not the time to consider the impact on extensions. Now is not the time to put in lots of compatibility goop. ---------------------------------------------------------------------- Comment By: Andreas Kupries (andreas_kupries) Date: 2012-11-20 10:07 Message: Regarding "This means that 8.x extensions cannot be used in a 9.0 interpreter." IMHO this is a good thing, and should be kept. 8 and 9 are different major versions, and we expect incompatibility at C API level, i.e. stubs tables. We expect that an extension must be re-compiled against 9.x to pick up on these changes and be loadable in 9.x. A clean break between the versions of the core. Putting code into 9.x to shoehorn an extension build against 8.x into it, that is a source of complexity in runtime checks and/or ifdeffery we can, should!, do without, at this point. We are currently in _compatibility breaking_ mode, where we are not even have fully worked out what we are breaking and how. But we know that we want to be free of constraints in our thinking, and 8.x compat work is such a constraint. IOW trying to re-create/shoehorn compatibility now is IMHO much to early in the dev cycle for 9.x. Given the moving target that is 9.x it could possibly be called a 'fool's errand' also (with apologies if that is considered insulting). ---------------------------------------------------------------------- Comment By: Jan Nijtmans (nijtmans) Date: 2012-11-20 00:43 Message: Please consider: <http://core.tcl.tk/tcl/info/16299cb2af> I think 2 of the 3 points you raised are solved now in the "novem" branch. Feedback welcome! ---------------------------------------------------------------------- Comment By: Jan Nijtmans (nijtmans) Date: 2012-11-20 00:10 Message: Thanks, Don, for this clear description. Regarding the second point, the TCL_STUBS_MAGIC value built-in the stubs library, that's completely right. I already did work on this in the "novem" branch which IMHO corrects that, but feedback is welcome. >What's important is that the table in iPtr->stubTable has a compatible >layout to the one that was compiled into the extension. But the >mechanism in place cannot check this Yes, that's a problem. So, let's put a mechanism in place to check this. Your last point, the version argument, the problem now is that Tcl_InitStubs(interp, "9.0", 0) returns false when called in a "8.6" interpreter. This means that 8.x extensions cannot be used in a 9.0 interpreter. I don't have an immediate solution for that. Easiest is to let extesions use Tcl_InitStubs(interp, "8.6+", 0) if they want to indicate that they can run under both 8.6 and 9.0, but that could - at best - only work on 32-bit platforms. 64-bit platforms should have a different MAGIC value, given the big reform that is being worked on now by dkf. Regards, Jan Nijtmans ---------------------------------------------------------------------- You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=110894&aid=3588687&group_id=10894 |