From: <apn...@ya...> - 2025-09-25 10:49:16
Tx. In any case, I have limited time to work on Tcl until Nov so no hurries.
/Ashok
From: Donald Porter <d.g...@co...>
Sent: Thursday, September 25, 2025 2:51 PM
To: apn...@ya...
Cc: tcl...@li...
Subject: Re: [TCLCORE] Questions about Tcl{Get, Set}ProcessGlobalValue functions
I have answers, but may I please offer them to you next month?
DGP
On Sep 25, 2025, at 5:06 AM, apnmbx-public--- via Tcl-Core <tcl...@li...> wrote:
I have a question about the TclGetProcessGlobalValue / TclSetProcessGlobalValue pair of functions that I hope someone can answer. These functions are supposed to store values or settings that are shared across all threads in the process.
TL;DR why do the above functions get/set values using the *system* encoding?
As currently implemented, TclSetProcessGlobalValue encodes the passed-in Tcl_Obj value using the current system encoding and stores it in a global C struct. It also stores the *original* passed-in Tcl_Obj in a thread-local cache so that its internal representation is not lost for that thread’s usage. The use of epochs ensures that stale values are not used.
When TclGetProcessGlobalValue is called, the encoded value in the global C struct is decoded using the system encoding and the result is passed back to the caller in a new Tcl_Obj which is also stored in that thread’s cache. If this function is called without TclSetProcessGlobalValue having previously set the value, an initializer function is called which returns the initial value along with the encoding used.
The code accounts for the fact that system encoding may change (generally only during initialization when all encodings are not immediately available) by tracking the encodings used and converting appropriately as needed.
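To make the epoch mechanism described above concrete, here is a minimal sketch of a shared value with a per-thread cache that is refreshed when the epoch changes. It is not the Tcl source: the names SharedValue, ThreadCache, SharedSet and CachedGet are hypothetical, the encoding step is left out entirely, and there is no locking, so the two caches merely stand in for two threads.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the process-wide struct. */
typedef struct {
    unsigned epoch;   /* bumped on every set */
    char *bytes;      /* the stored value, NUL-terminated */
} SharedValue;

/* Hypothetical stand-in for one thread's cache entry. */
typedef struct {
    unsigned epoch;   /* epoch at which 'copy' was taken */
    char *copy;       /* this thread's private copy, or NULL */
} ThreadCache;

/* Replace the shared value and invalidate all caches by bumping the epoch.
 * A real multi-threaded version would need to hold a mutex here. */
static void SharedSet(SharedValue *sv, const char *value)
{
    size_t len = strlen(value) + 1;
    char *copy = malloc(len);
    assert(copy != NULL);
    memcpy(copy, value, len);
    free(sv->bytes);
    sv->bytes = copy;
    sv->epoch++;
}

/* Return the cached copy, refreshing it first if the epoch is stale.
 * The epoch comparison is unsynchronized in this sketch. */
static const char *CachedGet(const SharedValue *sv, ThreadCache *tc)
{
    assert(sv->bytes != NULL);
    if (tc->copy == NULL || tc->epoch != sv->epoch) {
        size_t len = strlen(sv->bytes) + 1;
        char *copy = malloc(len);
        assert(copy != NULL);
        memcpy(copy, sv->bytes, len);
        free(tc->copy);
        tc->copy = copy;
        tc->epoch = sv->epoch;
    }
    return tc->copy;
}
```

In this sketch every cache sees the same bytes after a set, because each one refreshes from the shared copy. The behavior described in the message differs in that the setting thread keeps its original Tcl_Obj rather than a copy of what was stored.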
My question is - what is the purpose of this encoding / decoding pair when storing and retrieving values? The passed-in values are (effectively) internal modified UTF-8 strings. Why not just store and return those? This is not just a question of efficiency but of correctness. There are several issues with the current implementation:
* A value being stored may not be representable using the current system encoding. Since the encoding is done using TCL_ENCODING_PROFILE_TCL8, essentially a “corrupted” value is stored in the global struct and returned by
* Likewise, there is potential for further corruption for similar reasons when the system encoding changes and the new system encoding does not support additional characters.
* Further, because the *original* Tcl_Obj remains in the thread that called TclSetProcessGlobalValue, that thread’s perception of the “global” value differs from all other threads (which see the “corrupted” value).
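The first issue in the list above can be illustrated with a small standalone sketch. It does not use the Tcl API. ISO-8859-1 stands in for the system encoding, and '?' is assumed as the replacement for characters the encoding cannot represent. The function names are hypothetical, and the input is assumed to be valid UTF-8.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Encode valid UTF-8 as ISO-8859-1; code points above U+00FF become '?'.
 * This models a lossy "encode to system encoding" step. */
static size_t Utf8ToLatin1Lossy(const char *src, unsigned char *dst, size_t cap)
{
    const unsigned char *p = (const unsigned char *)src;
    size_t n = 0;
    assert(cap > 0);
    while (*p != 0 && n + 1 < cap) {
        unsigned long cp;
        int extra;
        if (*p < 0x80)      { cp = *p;        extra = 0; }
        else if (*p < 0xE0) { cp = *p & 0x1F; extra = 1; }
        else if (*p < 0xF0) { cp = *p & 0x0F; extra = 2; }
        else                { cp = *p & 0x07; extra = 3; }
        p++;
        /* Fold in the continuation bytes of this sequence. */
        while (extra-- > 0 && (*p & 0xC0) == 0x80) {
            cp = (cp << 6) | (*p & 0x3F);
            p++;
        }
        dst[n++] = (cp <= 0xFF) ? (unsigned char)cp : (unsigned char)'?';
    }
    dst[n] = 0;
    return n;
}

/* Decode ISO-8859-1 back to UTF-8. This direction is never lossy. */
static size_t Latin1ToUtf8(const unsigned char *src, char *dst, size_t cap)
{
    size_t n = 0;
    assert(cap > 0);
    for (; *src != 0; src++) {
        if (*src < 0x80) {
            if (n + 1 >= cap) break;
            dst[n++] = (char)*src;
        } else {
            if (n + 2 >= cap) break;
            dst[n++] = (char)(0xC0 | (*src >> 6));
            dst[n++] = (char)(0x80 | (*src & 0x3F));
        }
    }
    dst[n] = 0;
    return n;
}

/* Return 1 if a set-then-get round trip gives back the same string. */
static int RoundTripPreserves(const char *utf8)
{
    unsigned char encoded[256];
    char decoded[512];
    Utf8ToLatin1Lossy(utf8, encoded, sizeof encoded);
    Latin1ToUtf8(encoded, decoded, sizeof decoded);
    return strcmp(utf8, decoded) == 0;
}
```

With these stand-ins, a path containing only ASCII or Latin-1 characters survives the round trip, while one containing a character such as U+20AC comes back altered. Only a thread that kept the original string would still see the unaltered value.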
This seems broken to me if the whole purpose was to have global values shared across threads. It neither preserves values, nor shares them correctly. From my perspective, the global value should directly reflect the string representation of the Tcl_Obj passed in. It is the responsibility of the caller to ensure the value is correct. Once in Tcl’s internal representation, changes in system encoding should not matter.
And yet, because there is all this additional explicit machinery for encoding / decoding that has been added, I believe there was some purpose behind it. If so, what was it?
As an aside, I think there are bugs in the sequence of encoding operations as well, e.g. it assumes single-byte nul terminators, and epochs are checked without any thread synchronization, but those are secondary to the questions above.
Anybody know the answer to the above?
/Ashok
PS The context for all this is TIP 732 – trying to fully understand Tcl initialization.
_______________________________________________
Tcl-Core mailing list
Tcl...@li...
https://lists.sourceforge.net/lists/listinfo/tcl-core