From: Ivan P. <Iva...@se...> - 2004-06-24 15:55:03
> A) (and very important from my POV):
> For the majority of applications, no-case no-accent collations don't make
> sense, and the 'normal' multi-level collations would fulfill all
> requirements, if only they are used. As you are mentioning in your parallel
> post, in STARTING WITH and LIKE 'foo%'
>
> STARTING WITH 'ABC' should select 'ABCDE' and 'abcde', when a multilevel
> collation is used.

Yes, but it (STARTING WITH) will become accent-insensitive at the same
time; some people may want that, others will not.
Nearly everybody (I believe) is content with the sorting capabilities
provided by multi-level collations; it is the searching capabilities that
cause trouble. This is why I said that we need more operators/functions,
not collations (e.g. a new case-insensitive STARTING, instead of changing
its current behaviour).

In other words, ordering is a property of the data, but
case/accent-insensitivity should be a property of the operation on that
data (at least in some cases). Of course =, <, > must match the ordering,
but I see no reason why the user should not be able to choose between
case/accent-sensitive and -insensitive variants of CONTAINING.

As Peter suggested, a combination of expression indexes and a Noaccent()
variant of Upper() can satisfy such needs too.

>
> B) The existing multi-level collations are painfully wasteful on key
> storage space, limiting the maximally indexable buffer size to a third of
> the general limit. For this reasons, triplicating them into no-case and
> no-case / no-accent variants is -at least to me- wasted effort.

Yes, and I noticed that many existing collations are in fact two-level
only, yet they require three bytes.

> D) Only with FB1.5.1 it has become an option to consistently use
> connection charset NONE (and be it only for the lack of a better solution)

(I probably already asked this, but forgot the answer:)
Does this change apply to client-server communication only,
or is it consistent everywhere (e.g.
Update Tab Set Iso8859Column = NoneColumn;)?

> H) Whoever needs no-case/no-accent very bad, can try my pj_colkit LOADABLE
> collation.
> http://www.jodelpeter.de/i18n/fbarch/loadable.txt
> http://www.jodelpeter.de/i18n/fbarch/
>
> J) As rdb$collations and rdb$character_sets are hardcoded into the engine's
> code, instead of being deferred from the DLLs, it's unnecessarily
> complicated to add character sets and collations (and all tools will ignore
> them)

I tried to write some external collation drivers too (using Delphi),
and it is not very difficult, but such an approach has its drawbacks too:
- more complicated installation
- even custom collations can't solve many requirements
- narrow range for external collation_ids (250..254).
  These Charset_id/Collation_id numeric values are used directly,
  e.g. in SP/trigger BLR, and there is no additional mapping
  (e.g. through system tables) between external and internal ids,
  so it is dangerous to assume that nobody else has already taken
  these numbers for a different driver.

Ivan
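
To illustrate the point that case/accent-insensitivity can be a property
of the operation rather than of the collation, here is a minimal sketch.
It is plain Python, not Firebird code; the names strip_accents, fold and
starting_with are hypothetical, and the accent removal (Unicode NFD
decomposition, then dropping combining marks) is only an assumed stand-in
for whatever a Noaccent() function in the engine would do.

```python
import unicodedata


def strip_accents(text: str) -> str:
    """Remove accents: decompose to NFD, then drop the combining marks."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def fold(text: str, ignore_case: bool, ignore_accents: bool) -> str:
    """Apply only the foldings that the caller asked for."""
    if ignore_accents:
        text = strip_accents(text)
    if ignore_case:
        # Simple upper-casing, analogous to comparing Upper(a) with Upper(b).
        text = text.upper()
    return text


def starting_with(value: str, prefix: str, *,
                  ignore_case: bool = False,
                  ignore_accents: bool = False) -> bool:
    """Prefix test whose sensitivity is chosen per call, not per column."""
    return fold(value, ignore_case, ignore_accents).startswith(
        fold(prefix, ignore_case, ignore_accents)
    )


if __name__ == "__main__":
    for value in ("ABCDE", "abcde", "\u00c1bcde"):
        print(
            value,
            starting_with(value, "ABC"),
            starting_with(value, "ABC", ignore_case=True),
            starting_with(value, "ABC", ignore_case=True, ignore_accents=True),
        )
```

The stored data and its ordering are untouched here; the caller decides
per search whether case, accents, both or neither should be ignored, which
is the choice a single multi-level collation cannot offer.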