From: <apn...@ya...> - 2025-01-16 18:03:18
|
Sigh..another long repetitive post (sorry Christian for the tedium, but I do not enjoy this either. Really!). But this time I have sample test cases that purportedly illustrate the problem. Folks can skip the middle if you want and go directly to the test cases later. Jan, Perhaps I was not clear enough in the issues I was raising. The topic is complex enough which is why I linked to the blogs and source code in my previous post. Everything I know comes from there. Explicit illustrative test cases are detailed below but first, here's why your test cases do not address my concerns *at all*. These are the two *simplest* scenarios that are potential TLS issues. - Loading of two DLL's within a single thread. Your test loads one DLL into two threads. That's not the same thing. I've added a test for this which illustrates the data corruption. - Loading of a DLL *after* a thread is running from a different thread. Your test loads the extension *before* the thread is created. I have not bothered to test this. I'm not sure how to until the two test cases below are fixed. The above do not even involve the more complex scenarios when the TLS grows enough to need reallocation, the use of constructors to initialize TLS variables and so on. Your explanation with respect to exception handling also misses the mark. The issue does not have to with Tcl's interfaces at all but strictly apply to the extension's internal implementation. *If* the extension uses nested exceptions, the loader needs to translate the table mapping address ranges to corresponding exception handlers. That is the issue, nothing to do with how exceptions are passed to and from Tcl. And as I stated, I do not know if MemoryModule does this properly even with PR #91. I do know the update MemoryModulePP does a whole lot more in this regard than MemoryModule and that has me wondering why. You assume it works until proven otherwise. I assume it doesn't work unless proven otherwise. That's the fundamental disconnect we have. I don't see ASLR tables and CFG tables to be handled at all. I don't if this means something will fail or just that the benefits of those features will be disabled. As far as the LUCK tests go, I do not know what was tested but just loading packages is completely meaningless. Or as Christian himself phrased it, it means zilch. It's like saying Tcl multithreading is race-free because packages load! TEST CASES But onto the main purpose of this post - the illustrative test case which is extremely simple - just load two extensions that have implicit thread variables. In the apn-tip-709 branch, I've added test memorymodule-3.0 to memorymodule.test with a slight modification to your memorymoduletest extension to build two variants. To run it, build two versions of your dltest/memorymoduletest extension as follows: ------- cd win nmake /f makefile.vc OPTS=memorymodule (builds Tcl with load from memory enabled) nmake /f makefile.vc clean nmake /f makefile.vc (builds memorymodule extension) nmake -f makefile.vc BUILD2=1 (build memorymodule2 extension) zip memorymoduletest.zip *.dll pkgIndex.tcl cd .. Release_AMD64_VC1939\tclsh90.exe ..\tests\memorymodule.test -constraints memorymoduletest -match *-3.0 ------ Output is ------ ---- Result was: 15 16 17 ---- Result should have been (exact matching): 15 15 16 ==== memorymodule-3.0 FAILED ------ illustrating that the two DLL's stomp on each other. Of course, tests can have bugs as well so do review my changes. But nevertheless, building Tcl with memory loading disabled, ----- rmdir/s/q Release_AMD64_VC1939 nmake /f makefile.vc (build Tcl with load from memory disabled) Release_AMD64_VC1939\tclsh90.exe ..\tests\memorymodule.test -constraints memorymoduletest -match *-3.0 ----- shows ------ memorymodule.test: Total 5 Passed 1 Skipped 4 Failed 0 ------ which is a strong indicator the test case is not the one at fault. A second test which isolates the cause is memorymoduletest-3.1. This retrieves the _tls_index (see http://www.nynaeve.net/?p=185) which must be different for every DLL loaded. If we try this an interactive shell without memory loading, ----- % zipfs mount dltest/memorymoduletest.zip [file join [zipfs root] memorymoduletest] //zipfs:/memorymoduletest % lappend auto_path //zipfs:/memorymoduletest C:/Tcl/magic/lib/tcl9.0 C:/Tcl/magic/lib //zipfs:/memorymoduletest % package require memorymoduletest 1.0.0 % package require memorymoduletest2 1.0.0 % memorymoduletest2::GetTlsIndex 5 % % memorymoduletest::GetTlsIndex 4 ----- As you see the two modules have different and non-0 _tls_index values. With memory module loading enabled however, both Dll's have the *same* _tls_index value. ----- c:\src\tcltk\tip-709\tcl\win>Release_AMD64_VC1939\tcltest90.exe % zipfs mount dltest/memorymoduletest.zip [file join [zipfs root] memorymoduletest] //zipfs:/memorymoduletest % lappend auto_path //zipfs:/memorymoduletest //zipfs:/lib/tcl/tcl_library //zipfs:/lib/tcl c:/src/tcltk/tip-709/tcl/win/lib //zipfs:/memorymoduletest % package require memorymoduletest 1.0.0 % package require memorymoduletest2 1.0.0 % memorymoduletest::GetTlsIndex 0 % memorymoduletest2::GetTlsIndex 0 ----- What's more they are both 0 as MemoryModule has not initialized them at all! It is in fact very possible _tls_index 0 is allocated to some other module, possibly the main application and these DLL's are stomping on its data! I am pretty much done with TIP 709 review unless there are significant changes. It absolutely needs a more filled out test suite no matter what TCT decides but I am not going to be trying to "prove" an implementation is broken by constructing basic tests. Given there is no test group, that onus should be on the developer, not the reviewers, to gain confidence in the implementation. The tests I wrote could have and should have been constructed by anyone reading that blog. Perhaps these failures can be ignored with the "assurance" that Tcl extensions do not make use of implicit threads or the other features I mentioned. I'm sure it will all work. Most of the time. I don't support this posture myself but I understand others might differ. Note however that use of declspec(thread) is much more common now that (a) XP is dead and (b) all common compilers support the equivalent. Funnily enough, future changes by Microsoft are less of a worry because if my test above is valid, MemoryModule anyways does not match the implementation they currently have! I'll worry about future implementations once MemoryModule works with the current one. PS For anyone interested, https://godbolt.org/z/Yxs43dc3e shows the assembler generated for a two line program using declspec(thread). |