|
From: Jan N. <jan...@gm...> - 2025-12-04 08:38:41
|
Attn. TCT members,
This is a CFV for TIP 709:
https://core.tcl-lang.org/tips/doc/trunk/tip/709.md
I'm aware of TIP #741, and Ashoks desire to vote on it
together with this TIP. However, I don't think TIP #741 is
going anywhere, but i'll leave that up to Ashok to handle it.
I also agree with Sergey's remark that "dynamic or static link
on compilation phase is much better than any attempt to put
DLLs into tcl_library". This TIP is meant for use-cases where
you cannot do that. Christian Werner's LUCK framework
is one of those.
Please send your votes to this list until Friday, Dec 19, 12:00 gmt,
or [clock format 1766145600]
Best regards,
Jan Nijtmans
|
|
From: <apn...@ya...> - 2025-12-05 08:15:33
|
TIP #709: No While I have expressed my dissent many times, for the record, I want to detail the specific problems I have discovered on code review and testing in the last couple of evenings. I would ask TCT members to read through before voting, given that the TIP was put forward as a request for a license approval for a “bug fix” and the TIP itself has zero details, a single line in the Specification, something rather disturbing in my mind for such a consequential change, whether seen as a bug fix or otherwise. Disclaimer: my findings have not been peer-reviewed 😊 Check for yourself or trust Jan’s assertions and vote accordingly. My objections can be broken into a. Philosophical opposition to the use of undocumented Windows data structures and interfaces for very little gain. I have said this before but below I add evidence that Jan’s assertions about stability between releases is false. b. Reservations about quality of implementation based on observed failure cases for even simple Win32 API calls in DLL’s loaded from memory, inadequacy of test coverage, and review of the MemoryModule library. First, with respect to (a), For the benefit of those new to this debate, MemoryModule reimplements parts of the Windows loader using undocumented structures and algorithms. There is no guarantee its behavior matches the real loader across current Windows versions leading to API failures. Examples below. Further, because these internals change between Windows releases—and even between minor updates—any apparent robustness today provides no assurance of correctness tomorrow. On this last point, I have to counter Jan’s confident assertion <https://sourceforge.net/p/tcl/mailman/message/59118711/> that, quoting, The future risk in Windows Changes: I don't see that happen. The changes in Vista were probably either for security reasons, or for extension to x64, both are being used a long time now in various Windows versions. I don't see the need for more changes coming. Next change in the Windows kernel is probably the switch to the Linux kernel. I don’t know the basis for Jan’s comments but a simple Google search or comparison of SDK releases would have shown LDR_DATA_TABLE_ENTRY which is the fundamental structure tracking loaded DLL's in Windows has changed in size and offsets between Vista, Win7, Win8 and then again at least two Win 10. So much for stability of internals. The fact that MemoryModule is a thirdparty library that was basically orphaned in 2015, coincidentally right around the time Win 10 was released, means the onus to track and maintain falls on the Tcl core team. Good luck with that. Since Jan dismissed my concerns around undocumented interfaces as exaggerations <https://sourceforge.net/p/tcl/mailman/message/59116993/> , and will likely do so again, I’ll move on to (b) concrete issues around the implementation and test suite. First the implementation, In my testing of TIP 709 builds, I found even the basic functions GetModuleHandleExW and EnumProcessModules did not work correctly when loading from memory was in effect. They of course work fine with trunk, 9.0 or TIP 741. Jan can verify for himself that it was not pilot error on my part. I had no interest in trying other functions like loading resources if even these basic functions did not work. (There is one function that does work – GetModuleFileName – that I will revisit below) Looking at the code (debugging fails because debugger is oblivious to DLL’s loaded in MemoryModule), the root cause is non-trivial. It does not appear that MemoryModule integrates with the Windows loader structures at all. In particular, my trusty Windows Internals reference tells me DLL’s are tracked in Windows through three linked lists hanging off the process’ PEB. MemoryModule makes no attempt to update these (with good reason probably, access needs to be synchronized with the kernel using locks that are not exposed). The result is failure in Windows functions that cannot locate handles created via MemoryModule. As an aside, this hiding from Windows may be difficulty of implementation or intentional. MemoryModule’s primary use seems to be in malware to hide DLL’s from Windows security tools. Sufficiently widespread that CCCS has a YARA malware rule for MemoryModule on its front page <https://github.com/CybercentreCanada/CCCS-Yara/blob/master/README.md> ! There was one anomaly that aroused my curiosity – the GetModuleFileName{A,W} functions worked, which was unexpected. It turns out that the reason they work is that TIP 709 (i.e. Tcl, not MemoryModule) effectively hooks these two Windows calls via the GetProcAddress callbacks and if the pointer matches its internal handle, returns the corresponding file path. The implications are that to work correctly, Tcl has to similarly hook the entire Windows API that uses DLL module handles!!! And implement their functionality! A monumental task considering that even the very simple GetModuleFileName function implementations in 709 are faulty (they return paths using forward slashes, and in the case of the *A variant return the path in Tcl’s internal TUTF-8 encoding instead of the caller’s ANSI locale). Do we really want Tcl to be in the business of intercepting Windows API’s ? (Note the need to intercept Windows API’s and the lack of integration with Windows loader are two separate issues that need to be solved.) For added amusement, we have the situation that the result of GetModuleFileName is not usable by other Windows API’s. PathFileExists(GetModuleFileName) will return false leaving some programmer head-banging in frustration (not our problem though, they will blame Bill Gates). But why is GetModuleFileName singled out for hooking? That question brings us to the test suite … Considering MemoryModule is taking on the considerable task of the Windows loader, a reasonable test suite must be mandated. I’d rather not have a repeat of the zipfs fiasco <https://core.tcl-lang.org/tcl/rptview/28> . At a very minimum the 15-20 lower level DLL related API must be tested. Instead, the only function tested is …drum roll... GetModuleFileName! Don’t look behind the curtain! The only comment I have on this is that it vaguely reminds me of the old SPEC compiler benchmark rigging 😊 Outside of the public API itself, there is considerable loader functionality behind the scenes that I had listed way back in January. I do not see that being tested. Jan has repeatedly said “extensive tests are currently done”, “all problems have been fixed” and similar. But outside of a reference to LUCK, there is no detail as to what those tests were. Given the above issues, and similar statements being made about TLS issues earlier that were not fixed (by excluding TLS using DLL’s from loading from memory) until I supplied the actual code demonstrating breakage, forgive me for being skeptical. A non-exhaustive list … * DLL related API’s as mentioned earlier. How does “extensive testing” miss basic API calls? * x86 architecture since there are significant differences in PE formats, exception handling mechanisms and the like. I do not see this tested at all (test suite fails but I believe that is because there is no 32-bit test DLL). Ditto for ARM if we officially support ARM. Or the TIP should explicitly say those architectures are not supported. * Exception handling testing is (like the original TLS test case) testing the wrong scenario. It’s loading thrower.dll from disk, which means the exception tables are set up by the Windows loader. The test case needs to be adapted so it is MemoryModule that is responsible for setting up the exception tables. I’ll ignore the fact that the exception test case is very simple. * Exception handling only tests SEH, not C++ exceptions which use a different mechanism. * ASLR, DEP, delay load dependencies… Look at the loader flags in winternl.h for what might break * nested load scenarios (import resolution will potentially load libraries which may themselves load dependencies from zipfs) * multithread scenarios – I believe there is a race condition here. While Tcl’s own structures are under a lock, the call to MemoryModule load function is not. Since MemoryModule itself does not seem to do any synchronization, I suspect concurrent calls from multiple threads can result in the DLL being loaded twice. Which is a no-no. I understand that is a lot. But so is the Windows functionality 709 chooses to re-implement. I anticipate a question from Jan, as he has asked before in this and other instances in one form or another – “But show me which Tcl extensions use these facilities”. I am not going to do that. Not my approach to software quality. Jan did say in his CFV warning: I don't think it's possible to convince Ashok: I answered on all his remarks, even created testcases especially for those situations. Still he keeps having his reservations. No-one is implementing an alternative. Christian Werner is using it, no-one ever filed a bug-report to him. It is in fact true, I cannot be convinced at this point. I take issue with the rest of the comment but that is immaterial. No further development will convince me this is worth doing and have my support in the future. I will not even review, too much time has been spent already. Someone else from the TCT will have to supply an additional Yes vote. Sorry /Ashok |
|
From: Harald O. <har...@el...> - 2025-12-05 09:55:36
Attachments:
OpenPGP_signature.asc
|
Dear Jan, dear Ashok, thanks for the debate. I appreciate all moves here. I personally have a lot of experiences with Windows and internal changes between minor releases, mostly be security issues. An example is the DLL load by .net, which suddenly did not work after a security patch, if the "from internet" flag was set to the dll. I searched one week for the reason. And it only showed up on client side, as they have received the dll by a download from our web-site. The dll load mechanism was often hit by security issues and is constantly changing. Perhaps, the "copy to temp and execute" will not be possible any more in future. It might be prohibitted to store dlls in the temp folder. Or only "installed" DLLs may be executed. It might happen, that DLLs have to reside outside of the executable and be correctly installed in future. I can understand that. Due to that, the "memory execution" was neat avoiding this issue. So, it is a decision between two alternatives with issues. We need reliabillity and I am ready to sacrifice a neat feature for it. The post by Ashok convinced me, that the risk of memory module in the core is to high. Due to that, please allow me to vote "No" on TIP 709. Thanks for your understanding, Harald |
|
From: Donal F. <don...@ma...> - 2025-12-05 10:22:44
|
A significant part of the problem is that exactly what is allowed and prohibited varies from deployment to deployment, according to the whims of the administrators of the domain, and Windows Defender (the usual tool for enforcing these things) tends to set great store by file names and paths (as well as increasingly by signatures). This situation is "not great" to say the least, especially on development machines. (It usually manifested for me as a persistent failure to be able to run tcltest because "Computer Says No...") A mechanism that doesn't require writing out to disk at all is definitely valuable. Donal. ________________________________ From: Harald Oehlmann <har...@el...> Sent: Friday, December 05, 2025 09:55 To: tcl...@li... <tcl...@li...> Subject: Re: [TCLCORE] CFV: TIP 709: MPL Licence for MemoryModule The dll load mechanism was often hit by security issues and is constantly changing. Perhaps, the "copy to temp and execute" will not be possible any more in future. It might be prohibitted to store dlls in the temp folder. Or only "installed" DLLs may be executed. It might happen, that DLLs have to reside outside of the executable and be correctly installed in future. I can understand that. Due to that, the "memory execution" was neat avoiding this issue. |
|
From: <apn...@ya...> - 2025-12-05 10:50:20
|
I don't think anyone has ever questioned the value of loading a DLL without writing to disk. My vote was based on the implementation and the shortcomings thereof, as perceived by *me*. From: Donal Fellows <don...@ma...> A mechanism that doesn't require writing out to disk at all is definitely valuable. |
|
From: Kevin W. <kw...@co...> - 2025-12-05 11:49:02
|
Question: what problem does this TIP solve, besides messiness? Tempdir litter is not pleasing, but doesn’t the occasional reboot fix that? > On Dec 5, 2025, at 5:23 AM, Donal Fellows <don...@ma...> wrote: > > > A significant part of the problem is that exactly what is allowed and prohibited varies from deployment to deployment, according to the whims of the administrators of the domain, and Windows Defender (the usual tool for enforcing these things) tends to set great store by file names and paths (as well as increasingly by signatures). This situation is "not great" to say the least, especially on development machines. (It usually manifested for me as a persistent failure to be able to run tcltest because "Computer Says No...") > > A mechanism that doesn't require writing out to disk at all is definitely valuable. > > Donal. > > From: Harald Oehlmann <har...@el...> > Sent: Friday, December 05, 2025 09:55 > To: tcl...@li... <tcl...@li...> > Subject: Re: [TCLCORE] CFV: TIP 709: MPL Licence for MemoryModule > > The dll load mechanism was often hit by security issues and is > constantly changing. Perhaps, the "copy to temp and execute" will not be > possible any more in future. It might be prohibitted to store dlls in > the temp folder. Or only "installed" DLLs may be executed. > It might happen, that DLLs have to reside outside of the executable and > be correctly installed in future. I can understand that. Due to that, > the "memory execution" was neat avoiding this issue. > > _______________________________________________ > Tcl-Core mailing list > Tcl...@li... > https://lists.sourceforge.net/lists/listinfo/tcl-core |
|
From: Harald O. <har...@el...> - 2025-12-05 11:54:39
Attachments:
OpenPGP_signature.asc
|
Am 05.12.2025 um 12:48 schrieb Kevin Walzer: > Question: what problem does this TIP solve, besides messiness? Tempdir > litter is not pleasing, but doesn’t the occasional reboot fix that? Each start of the application makes a new copy of all included dlls to the temp dir. The standard windows wish.exe has two of them for the registry and the tk send command. Those files are currently never removed. The code comes from the starkit area which did it the same way. My personal solution is to remove on startup the dlls of the last run. Harald |
|
From: Kevin W. <kw...@co...> - 2025-12-05 15:01:49
|
> On Dec 5, 2025, at 6:54 AM, Harald Oehlmann <har...@el...> wrote: > >> Am 05.12.2025 um 12:48 schrieb Kevin Walzer: >> Question: what problem does this TIP solve, besides messiness? Tempdir litter is not pleasing, but doesn’t the occasional reboot fix that? > > Each start of the application makes a new copy of all included dlls to the temp dir. The standard windows wish.exe has two of them for the registry and the tk send command. > > Those files are currently never removed. So, in other words: a mess. I don’t see this as a problem that requires a solution, especially if the solution has as many potential issues as this one. TIP 709: NO |
|
From: Marc C. <cul...@gm...> - 2025-12-05 14:46:03
|
TIP #709: No - Marc On Thu, Dec 4, 2025 at 2:38 AM Jan Nijtmans <jan...@gm...> wrote: > Attn. TCT members, > > This is a CFV for TIP 709: > https://core.tcl-lang.org/tips/doc/trunk/tip/709.md > > I'm aware of TIP #741, and Ashoks desire to vote on it > together with this TIP. However, I don't think TIP #741 is > going anywhere, but i'll leave that up to Ashok to handle it. > > I also agree with Sergey's remark that "dynamic or static link > on compilation phase is much better than any attempt to put > DLLs into tcl_library". This TIP is meant for use-cases where > you cannot do that. Christian Werner's LUCK framework > is one of those. > > Please send your votes to this list until Friday, Dec 19, 12:00 gmt, > or [clock format 1766145600] > > Best regards, > Jan Nijtmans > > > _______________________________________________ > Tcl-Core mailing list > Tcl...@li... > https://lists.sourceforge.net/lists/listinfo/tcl-core > |
|
From: Steve L. <st...@di...> - 2025-12-08 00:41:14
|
TIP #709: NO -- Steve On 4 Dec 2025 at 4:39 PM +0800, Jan Nijtmans <jan...@gm...>, wrote: > Attn. TCT members, > > This is a CFV for TIP 709: > https://core.tcl-lang.org/tips/doc/trunk/tip/709.md > > I'm aware of TIP #741, and Ashoks desire to vote on it > together with this TIP. However, I don't think TIP #741 is > going anywhere, but i'll leave that up to Ashok to handle it. > > I also agree with Sergey's remark that "dynamic or static link > on compilation phase is much better than any attempt to put > DLLs into tcl_library". This TIP is meant for use-cases where > you cannot do that. Christian Werner's LUCK framework > is one of those. > > Please send your votes to this list until Friday, Dec 19, 12:00 gmt, > or [clock format 1766145600] > > Best regards, > Jan Nijtmans > > > _______________________________________________ > Tcl-Core mailing list > Tcl...@li... > https://lists.sourceforge.net/lists/listinfo/tcl-core > |