From: Mark D. <mar...@ic...> - 2006-03-22 18:29:23
There are a couple of possible consistent modes.

1. The client doesn't want to see gaps. In that case, I agree that the best policy is that each returned index is always the start index of a native character (or the offset at the very end). If, however, the client wants to refer to just the character part, then s/he needs a way to get the end offset of the character. I think this model is slightly more work for the providers.

2. The client does generally want to see gaps. In that case, the best policy is to return the start of the native character or of the gap. I think this model is slightly more work for the clients.

Mark

Andy Heninger wrote:
> One more thing to consider in thinking about handling text with gaps.
>
> In UText as it currently exists, any incoming native index parameter
> that is not on a code point boundary is treated as if it referred to
> the first boundary immediately preceding the specified position. Or,
> considered another way, a native index parameter to UText can, when
> referring to a multi-unit character, refer to any part of the
> character.
>
> Native indexes returned from UText functions always refer to the first
> unit of a code point, which is the same thing as the boundary
> preceding the code point.
>
> Gaps complicate this picture. If a user does utext_char32at(some
> index in the middle of a gap), we need to decide what to do.
>
> I think that the cleanest answer is to just say that gaps logically
> belong to the character that precedes them. As long as the
> application is working through UText APIs, gaps completely disappear.
> If the app extracts a native substring based on indexes obtained from
> UText iteration positions, that substring may have gaps in its
> interior or at its end, assuming a gappy native format. Native string
> functions for a gappy format had better know how to recognize and deal
> with strings containing gaps.
>
> The thing I want to avoid is forcing applications that have zero
> interest in gaps to be aware of the possibility of their existence
> in order to understand and correctly use UText.
>
> -- Andy
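
To make the two index policies discussed above concrete, here is a minimal sketch in C. It does not use the UText API at all. The native format (UTF-16 code units with a GAP sentinel marking unused slots) and the names GAP, snap_hide_gaps, char_end, and snap_show_gaps are all hypothetical, chosen only for illustration. A gap at the very start of the text has no preceding character to belong to, and the sketch leaves that case unresolved.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: a made-up "gappy" native format. Units are UTF-16 code
   units, and the value GAP marks an unused slot. Not an ICU format. */
#define GAP 0xFFFFu

static int is_gap(uint16_t u)   { return u == GAP; }
static int is_lead(uint16_t u)  { return u >= 0xD800u && u <= 0xDBFFu; }
static int is_trail(uint16_t u) { return u >= 0xDC00u && u <= 0xDFFFu; }

/* Policy 1 (gaps hidden): a gap belongs to the character preceding it,
   so any index inside a character or its trailing gap maps to the first
   unit of that character. Indexes at or past the end map to len. */
static size_t snap_hide_gaps(const uint16_t *s, size_t len, size_t i) {
    if (i >= len) return len;
    while (i > 0 && is_gap(s[i])) --i;                    /* back out of the gap */
    if (i > 0 && is_trail(s[i]) && is_lead(s[i - 1])) --i; /* trail -> lead */
    return i;
}

/* End offset of the character part only, excluding any trailing gap.
   Precondition: start is a character start (not a gap unit). */
static size_t char_end(const uint16_t *s, size_t len, size_t start) {
    if (start >= len) return len;
    assert(!is_gap(s[start]));
    if (is_lead(s[start]) && start + 1 < len && is_trail(s[start + 1]))
        return start + 2;                                 /* surrogate pair */
    return start + 1;
}

/* Policy 2 (gaps visible): an index inside a gap maps to the start of
   that gap run; any other index maps to the start of its character. */
static size_t snap_show_gaps(const uint16_t *s, size_t len, size_t i) {
    if (i >= len) return len;
    if (is_gap(s[i])) {
        while (i > 0 && is_gap(s[i - 1])) --i;            /* start of gap run */
        return i;
    }
    if (i > 0 && is_trail(s[i]) && is_lead(s[i - 1])) --i;
    return i;
}
```

For the units { 'a', GAP, GAP, 0xD83D, 0xDE00, GAP, 'b' }, index 5 (the gap after the surrogate pair) maps to 3 under the first policy and stays at 5 under the second, and char_end at 3 gives 5. That last call is the extra piece a client needs under policy 1 to refer to just the character part; under policy 2 the client instead has to be prepared to see indexes that point at gaps.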