Learn how easy it is to sync an existing GitHub or Google Code repo to a SourceForge project! See Demo

Close

#5132 Tcl_InitStubs() Broken As Designed

open
Don Porter
8
2014-11-30
2012-11-20
Don Porter
No

A stubs-enabled extension encounters Tcl in three distinct ways.

1) It compiles against a tcl.h file from some release of Tcl. With
USE_TCL_STUBS active, the particular tcl.h file determines the
layout of the stubs table that the extension is compiled to assume.

2) It links against some libtclstub.a file. That libtclstub.a file in turn
was compiled against some tcl.h file and from it got a value of
TCL_STUBS_MAGIC compiled into it. As far as I can tell there is
nothing in place to make sure that the two tcl.h files so far mentioned
are the same file, or at a minimum define the same TCL_STUBS_MAGIC
value. Until now it's never been an issue since we've never changed
that value.

3) The extenson [load]s into some Tcl interp. The extension calls
Tcl_InitStubs() passing in the interp value. T_IS pulls a table pointer
out of the iPtr->stubTable field of that interp for vetting for suitability.

What's important is that the table in iPtr->stubTable has a compatible
layout to the one that was compiled into the extension. But the
mechanism in place cannot check this. To illustrate this, consider
compiling an extension containing the call

Tcl_InitStubs(interp, "9.0", 0);

but compiling it against a Tcl 8 header file, so all of it's other calls
into the Tcl library are coded based on the assumptions of the Tcl 8
stub table layout.

Next imagine linking this against libtclstubs9.a so that the TCL_STUBS_MAGIC
value the Tcl_InitStubs() call will be checking will be the one for the Tcl 9 table.

Then [load] in a Tcl 9 interp. All of Tcl_InitStubs() checks will pass, but we'll
be left with an extension compiled to Tcl 8 layout expectations making use
of a Tcl 9 stub table, and that's gonna crash hard.

Now, if the extension writer would have just stuck with the recommended
convention of

Tcl_InitStubs(interp, TCL_VERSION, 0)

he could not get in this trouble. But that is our fault for giving him that
knob to tweak in the first place. Tcl_InitStubs ought not have
a "version" argument for the coder to get wrong.

Discussion

1 2 3 > >> (Page 1 of 3)
  • Jan Nijtmans
    Jan Nijtmans
    2012-11-20

    Thanks, Don, for this clear description.

    Regarding the second point, the TCL_STUBS_MAGIC value built-in
    the stubs library, that's completely right. I already did work on
    this in the "novem" branch which IMHO corrects that, but feedback
    is welcome.

    >What's important is that the table in iPtr->stubTable has a compatible
    >layout to the one that was compiled into the extension. But the
    >mechanism in place cannot check this
    Yes, that's a problem. So, let's put a mechanism in place to check this.

    Your last point, the version argument, the problem now is that
    Tcl_InitStubs(interp, "9.0", 0) returns false when called in a "8.6"
    interpreter. This means that 8.x extensions cannot be used
    in a 9.0 interpreter. I don't have an immediate solution for that.
    Easiest is to let extesions use Tcl_InitStubs(interp, "8.6+", 0)
    if they want to indicate that they can run under both 8.6
    and 9.0, but that could - at best - only work on 32-bit platforms.
    64-bit platforms should have a different MAGIC value,
    given the big reform that is being worked on now by dkf.

    Regards,
    Jan Nijtmans

     
  • Jan Nijtmans
    Jan Nijtmans
    2012-11-20

    Please consider:

    <http://core.tcl.tk/tcl/info/16299cb2af>

    I think 2 of the 3 points you raised are solved now
    in the "novem" branch.

    Feedback welcome!

     
  • Regarding "This means that 8.x extensions cannot be used in a 9.0 interpreter."

    IMHO this is a good thing, and should be kept.

    8 and 9 are different major versions, and we expect incompatibility at C API level, i.e. stubs tables.

    We expect that an extension must be re-compiled against 9.x to pick up on these changes and be loadable in 9.x.

    A clean break between the versions of the core.

    Putting code into 9.x to shoehorn an extension build against 8.x into it, that is a source of complexity in runtime checks and/or ifdeffery we can, should!, do without, at this point.

    We are currently in _compatibility breaking_ mode, where we are not even have fully worked out what we are breaking and how. But we know that we want to be free of constraints in our thinking, and 8.x compat work is such a constraint. IOW trying to re-create/shoehorn compatibility now is IMHO much to early in the dev cycle for 9.x. Given the moving target that is 9.x it could possibly be called a 'fool's errand' also (with apologies if that is considered insulting).

     
  • To further concur, the 9.0 series does not need to load any extension compiled against 8.* at all *and* the stub table can be considered to be wholly in a state of flux right now. For sure, there won't be stabilization until *at least* 9.0a1 is released, and probably not until 9.0.0 (with things slushy after 9.0b1).

    Now is not the time to consider the impact on extensions. Now is not the time to put in lots of compatibility goop.

     
  • Don Porter
    Don Porter
    2012-11-27

    Branch bug-3588687 has a patch to address these
    problems. Would appreciate your comments if you
    can spare the time. I think it's not as radical a change
    as you might choose, but I think it fixes the worst
    problem identified above.

     
  • Don Porter
    Don Porter
    2012-11-27

    • assigned_to: dgp --> jenglish
     
  • Joe English
    Joe English
    2012-11-28

    I don't think that this part does what it claims to do:

    90 /*
    91 * Detect whether the extension and the stubs library were built
    92 * against Tcl header files from different major versions. That's
    93 * seriously broken.
    94 */

    If an extension was compiled against 8.5.13 headers with -DUSE_TCL_STUBS (where Tcl_InitStubs is declared as a symbol with external linkage) but mistakenly linked against novem libtclstub.a, it'll get novem's definition of Tcl_InitStubs(), which passes novem's TCL_VERSION_MAJOR to the worker routine TclInitStubs().

    Suggest simply leaving Tcl_InitStubs out of novem tclStubLib.c. That way the extension will get a linker error if there is a mismatch (old headers+new libtclstub => Tcl_InitStubs not found; new headers+old libtclstub => TclInitStubs not found).

    The change could be backported to core-8-5-branch and core-8-6-branch, but that would just change a link-time error into a runtime one -- probably not worth doing (if there's a mismatch, it's better to detect it earlier).

     
  • Don Porter
    Don Porter
    2012-11-28

    The branch/patch is off the trunk, not off novem,
    This is a fix proposed to go into libtclstub8.6.a.

    When I merge it forward to the novem branch, I
    would do just what you suggest, and omit the
    Tcl_InitStubs() call completely. An extension
    compiled against 8.* headers will simply refuse to
    link to a Tcl 9 stubs library.

    Ok, so the patch would be improved if we just
    force that linking error to show up at the 8.5->8.6
    transition and start using the runtime error detection
    right away. I shied away from that because you've
    observed many times that we version the libstubs.a
    file for no apparent reason, and now we would at last
    have one.

     
  • Jan Nijtmans
    Jan Nijtmans
    2012-11-28

    I only have two additional suggestions:
    >Suggest simply leaving Tcl_InitStubs out of novem tclStubLib.c. That way
    >the extension will get a linker error if there is a mismatch
    My suggestion would be to leave Tcl_InitStubs in there, but don't use
    TCL_MAJOR_VERSION, but simply hardcoded '8': We already know
    that Tcl 9 doesn't use Tcl_InitStubs, and in Tcl 7 it didn't even
    exist yet, so we know it must be 8. Then we again have no
    reason for versioning libstubs.a

    I would pass TCL_STUB_MAGIC as argument as well.
    For Tcl 8 we know it's a fixed number, but then we could
    keep the signature for TclInitStubs the same in Tcl8
    and Tcl9. Another little bit of more compatibility.

    Thanks!

     
  • Don Porter
    Don Porter
    2012-11-28

    Ah, thanks. Something else to clarify.

    I'm aiming this patch at bridging the transition to
    the novem-remove-string-result branch. I'm not aiming
    at the stubs changes that are currently on the tip of
    the novem branch.

    I don't see the need to change the TCL_STUBS_MAGIC value
    to make the 8->9 transition. The functioning of the
    Tcl_PkgRequireEx() routine in the first stubs slot is
    enough to manage the version checking. The constraint
    that Tcl 9 has to keep that entry, and keep it first in the
    stubs table is a reasonable one, I think.

    I also see no need to have a TCL_STUBS_MAGIC value
    that depends on sizeof(size_t). If a [load]ing program and
    a [load]ed extension don't agree at that level, I believe that
    the dlopen() or equivalent routine will detect that and fail long
    before our stubs checking mechanisms are ever called on.

    That said, yes I agree that if we keep a Tcl_InitStubs() function
    to offer linkability of all libtclstubs8*.a libraries, we should use
    a literal value 8 for the argument instead of TCL_MAJOR_VERSION.

     
  • Jan Nijtmans
    Jan Nijtmans
    2012-11-28

    >I don't see the need to change the TCL_STUBS_MAGIC value
    >to make the 8->9 transition
    As soon as your work and dkf's is on "novem" I will construct
    a test-case which proves that changing TCL_STUBS_MAGIC
    is necessary for 64-bit platforms (not for 32-bit platforms)

    > I believe that
    > the dlopen() or equivalent routine will detect that
    dlopen will detect the difference between a 32-bit
    and 64-bit platform on the same machine, e.g.
    win32/win64. It will not detect whether
    Tcl_NewStringObj() has a size_t or an int as last
    argument.....

    But let's concentrate on the things we agree
    on. The change from Tcl_InitStubs to a
    macro which calls TclInitStubs which has
    additinal aguments, that we agree on.
    The desire for an unversioned stub library,
    that we agree on.
    What checks the TclInitStubs() function
    should do can be decided later, after
    there is more clarity on the stub changes,
    so let's put that aside for now.

     
  • If someone wants to merge novem-64bit-sizes back to novem, I wouldn't object in principle, but we probably need to take some decisions about whether what I've done there is The Right Thing before we do.

    That's orthogonal to the stub table stuff or Tcl_PkgRequireEx; they are parts we don't need to touch for 64-bit fixing (if the stub tables ever get so they have as many as 1M entries - which will still fit in an int just fine - we've got *much* bigger problems). After all, 64-bit architectures already have 64-bit pointers; it is their defining feature.

     
  • Don Porter
    Don Porter
    2012-11-28

    Please not yet.

     
  • Don Porter
    Don Porter
    2012-11-28

    Although the action on novem is a lot freer than on trunk,
    it's still quite useful to let novem be a branch with some stability.
    Best if checkins there are build-able and don't panic attempting
    the test suite, which novem-64bit-sizes did the last I checked.

     
  • Don Porter
    Don Porter
    2012-11-28

    On other matters, Jan, the value of checking TCL_STUBS_MAGIC
    is dawning on me. I'll comment after updating the bug-3588687 branch.

     
  • Joe English
    Joe English
    2012-11-29

    Is there a need to change the 8.6.* stub mechanism at all?

    No matter what else, if a novem tclsh tries to [load] an extension buitl against 8.4.19 or 8.5.13, or conversely if an 8.4.19 or 8.5.13 tclsh tries to [load] a novem-compatible extension, it must fail gracefully. "Fails gracefully with a slightly better error message" might be a worthwhile goal for 8.6.0, but I don't think it justifies a whole lot of effort or additional complexity.

     
  • Don Porter
    Don Porter
    2012-11-29

    I think the answer to that question depends on how
    Jan and I resolve our difference about whether the
    TCL_STUBS_MAGIC value needs to change on the
    8->9 transition. If it does not change, then the scenario
    in the original report can still happen. Since I'd prefer not
    to change the TCL_STUBS_MAGIC value, I'd want to have
    a redesigned stubs initialization in Tcl 8.6 for protection
    against that scenario. And yes, it offers much nicer error
    message too.

    I would not make the change for Tcl 8.5.14, however, because
    that would create the situation where extensions compiled against
    some 8.5 header files would not be able to link against some 8.5
    stubs libraries, and that's just too messy to explain. Extensions
    compiling against 8.5 and older headers will just continue to
    suffer from this broken design. The redesign would neatly arrive
    with Tcl 8.6.

     
  • Re the 64 bit work, it works on my machine (i.e., clean build, clean test suite pass) but I could believe the existence of problems on other systems. There are a lot of signed/unsigned hazards in the code (which gcc only flags with extra warnings turned on) but many of those are problems in Tcl 8.* too; working through them all is a huge job.

    But before I work on those (a boring boring process) I'd like to know whether what I'm proposing is the right thing to try to do. I guess that requires me to update TIP #115 and get talk going in tcl-core…

     
  • Jan Nijtmans
    Jan Nijtmans
    2012-11-29

    Joe English wrote:
    >Is there a need to change the 8.6.* stub mechanism at all?
    No, I don't see the need now, I would prefer to wait until
    Tcl 9.0a1 is out (or just before that, with the Tcl 8.6.x
    release just before that) so the changes can be
    tested against the 9.0 signature changes. Now we are
    just guessing.....

    > But before I work on those (a boring boring process) I'd like to know
    > whether what I'm proposing is the right thing to try to do
    I think it is the right thing to do. We might discuss about
    individual functions, whether some parameter or return
    values should be int, size_t or ssize_t. I know that
    "novem-64bit-sizes" is not ready yet to be merged
    to trunk, I thrust dkf to decide when it is. Still
    it would be good to discuss and vote for
    TIP #115 before doing the merge.

    As long as signatures don't change, there is no need to
    change the TCL_STUBS_MAGIC value ( I was a
    little bit premature doing that in "novem" already..)

    Regards,
    Jan Nijtmans

     
  • Jan Nijtmans
    Jan Nijtmans
    2012-11-29

    B.T.W.: The current error message in tclStubLib.c is:
    "This interpreter does not support stubs-enabled extensions."
    Why don't we simply change that? Suggestion:
    "Detected incompatible Tcl version: expected 8.x"
    That can still be done in 8.6.0 without any risk.

     
  • Don Porter
    Don Porter
    2012-11-29

    Correcting the misleading error message is
    an obviously correct and risk-free thing to
    do in all active branches. The stubs-failure
    world will soon be larger than 8.0 interps,
    and the message should reflect that. Committing....

     
  • Don Porter
    Don Porter
    2012-11-29

    Update committed to branch. Now I have TclInitStubs()
    checking both TCL_MAJOR_VERSION and TCL_STUB_MAGIC
    agreement between extension and stubs library.

    I had been thinking about TCL_STUB_MAGIC as a value that flagged
    incompatible stubs mechanisms over time as new Tcl major versions
    would arrive. In that line of thought, it seems to duplicate what's already
    known from TCL_MAJOR_VERSION. However, the TCL_STUB_MAGIC
    value is also good for flagging incompatibilities that are not temporal
    in nature. In particular, should any forks arise in Tcl development that get
    reflected in incompatible stub tables, we'll want this value to make those
    distinctions. Such forks might well be temporary as we undertake various
    radical experiments in separate fossil branches, that only get resolved
    when merged. Another possibility that comes to mind is the ability to
    flag our known problems with needing a TCL_MEM_DEBUG interp to
    successfully [load] a TCL_MEM_DEBUG extension.

     
  • Don Porter
    Don Porter
    2012-11-29

    I've created a branch "novem-review" that rolls back
    the 'premature' change to the TCL_STUB_MAGIC value.
    If this is acceptable for merging to novem, please do so.
    Then I will be able to merge the novem-remove-string-result
    branch in and declare that project complete.

    If the 64 bit work, or something else, ends up requiring that
    we change the TCL_STUB_MAGIC value, then we can do
    that when the need arises. I think the revised Tcl_InitStubs()
    routines in novem, and pending for 8.6 will handle that
    transition if/when it happens.

     
  • Don Porter
    Don Porter
    2012-11-29

    • priority: 5 --> 8
    • assigned_to: jenglish --> nijtmans
     
  • Jan Nijtmans
    Jan Nijtmans
    2012-11-30

    • assigned_to: nijtmans --> dgp
     
1 2 3 > >> (Page 1 of 3)