From: Andy H. <and...@gm...> - 2005-10-10 16:18:22
The UText abstract text access API needs to be revised. In brief, the proposed changes are:

1. Use 64 bit indexes into the text, rather than 32.
2. Add functionality equivalent to the new Freezable interface.
3. Eliminate struct UTextChunk and fold its data into UText itself.

These are breaking API changes from the UText of today. This is OK: in ICU 3.4, UText is described as a technology preview, with API changes expected.

The switch to 64 bit indexing is being made because UText can easily be made to work with the contents of files or with sparse arrays, either of which can exceed the limits of 32 bit indexing. We will use ICU's int64_t type for index values. There is no standard C type that can be relied upon to be 64 bits (or more) on all platforms. Note that 64 bit UText indexing is needed even on platforms with 32 bit memory addressing, so something like ptrdiff_t is not useful at all.

===

Freezable is the proposed new ICU interface for disabling writing to (modifying) an object, that is, for making it immutable. We can't just make UText implement Freezable as it is, because 1) UText is in C, not Java, and 2) UText already has some related functions, so simply adding the functions from Freezable wouldn't fit all that cleanly.

Here is what I am proposing:

    utext_freeze(UText *ut);

This function will freeze a UText. Once a UText is frozen, utext_isWritable() will return false, and functions that attempt to modify the text will fail. This will disable changes made via this specific UText wrapper only; it will not have any effect on the ability to directly modify the underlying text by bypassing UText. (Such backdoor modifications are always an error while UText access is occurring, because the underlying text can get out of sync with UText's buffering.)

UBool utext_isWritable() is an existing function.
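To make the intended freeze behavior concrete, here is a small toy model. It is only a sketch and not ICU code: the FreezeDemo struct and the demo_* functions are hypothetical names invented for illustration. It assumes that a wrapper is writable only if the underlying text supports writing and the wrapper has not been frozen, and that freezing is one-way.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for a UText wrapper; NOT the real ICU struct. */
typedef struct {
    char   *text;             /* underlying storage, owned by the caller   */
    int64_t length;           /* 64 bit length, as in the indexing change  */
    bool    storageWritable;  /* does the underlying text support writes?  */
    bool    frozen;           /* set by demo_freeze(), never cleared       */
} FreezeDemo;

/* Freeze the wrapper. There is deliberately no "unfreeze" operation. */
void demo_freeze(FreezeDemo *t) {
    t->frozen = true;
}

/* Writable only if the storage allows it AND the wrapper is not frozen. */
bool demo_isWritable(const FreezeDemo *t) {
    return t->storageWritable && !t->frozen;
}

/* Overwrite one character through the wrapper.
 * Returns false (and changes nothing) if the wrapper is not writable
 * or the index is out of range. */
bool demo_setChar(FreezeDemo *t, int64_t index, char c) {
    if (!demo_isWritable(t) || index < 0 || index >= t->length) {
        return false;
    }
    t->text[index] = c;
    return true;
}
```

Note that freezing only affects writes made through the wrapper. Code that still holds the raw buffer can modify it directly, which is the "backdoor" case described above.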
In addition to UTexts that are non-writable because the underlying text does not support it, there will now be UTexts that are read only because the UText has been frozen.

===

    utext_clone(..., UBool deep, ...

The meaning of the "deep" flag to the existing utext_clone() function will be extended as follows:

A shallow clone will preserve the freeze state of the source. With a writable source UText, the clone will also be writable. Shallow clones share the same underlying text storage.

A deep clone will be made writable whenever possible. Deep clones copy the underlying text itself in addition to the UText wrapper. The only reason for doing a deep clone is in preparation for modifying one copy without having the changes be visible in the other, so having the clone become writable seems like the right thing to do.

====

UTextChunk

A UText operates on a buffer or chunk of the source text. In the earlier development versions of UText, such a buffer was described by a struct UTextChunk. UTextChunks were standalone, and could exist outside of a UText. As the design evolved, UTextChunk and UText became more tightly coupled, until by the time ICU 3.4 was done a UTextChunk could not meaningfully exist separately from its associated UText.

The proposed change is to remove the struct UTextChunk altogether, and add the fields that are still needed directly to UText. The end result will be cleaner and simpler.

The only reason this needs to be discussed at all is that UTextChunk is officially part of the UText API for 3.4. If it were implementation only, it would just silently disappear.

--
Andy Heninger
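The shallow versus deep clone rules proposed above can be modeled in a few lines. This is a toy sketch, not ICU code: CloneDemo and the demo_* functions are hypothetical names, and the sketch assumes that a deep clone's private heap copy is always writable (the "whenever possible" case).

```c
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a UText wrapper; NOT the real ICU struct. */
typedef struct {
    char   *text;             /* underlying storage                        */
    int64_t length;           /* number of chars in text                   */
    bool    storageWritable;  /* does the underlying text support writes?  */
    bool    frozen;           /* has this wrapper been frozen?             */
    bool    ownsText;         /* true if demo_close() must free text       */
} CloneDemo;

bool demo_isWritable(const CloneDemo *t) {
    return t->storageWritable && !t->frozen;
}

/* Clone src into dest. Returns false if a deep clone cannot allocate,
 * in which case dest is left untouched. */
bool demo_clone(CloneDemo *dest, const CloneDemo *src, bool deep) {
    if (!deep) {
        /* Shallow: share the storage and preserve the freeze state. */
        *dest = *src;
        dest->ownsText = false;
        return true;
    }
    /* Deep: copy the text itself, then make the copy writable. */
    char *copy = malloc((size_t)src->length + 1);
    if (copy == NULL) {
        return false;
    }
    memcpy(copy, src->text, (size_t)src->length);
    copy[src->length] = '\0';
    dest->text = copy;
    dest->length = src->length;
    dest->storageWritable = true;  /* assumption: private copy is writable */
    dest->frozen = false;
    dest->ownsText = true;
    return true;
}

/* Release storage owned by a deep clone; a no-op for shallow clones. */
void demo_close(CloneDemo *t) {
    if (t->ownsText) {
        free(t->text);
    }
    t->text = NULL;
    t->ownsText = false;
}
```

With a frozen source, a shallow clone stays frozen and points at the same buffer, while a deep clone gets its own buffer and can be modified without the change being visible in the source.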