Re[4]: [Firebird-devel] Additional Collate for Firebird (Case/Accent Insensitive)


Hello, Peter!

>> Things are not so simple. The German letter b (written like Greek beta)
>> collates the same way as the sequence "ss".

> As a German I can tell you, that this is about 25% true on a scale of 0% to
> 100% true-ity. And if this is the only case holding back the
> implementation, we can argue to further decrease the true-ity level.

This is not the only case. Think of DOCBOOK collation implemented as
an example in Dave's collkit.

>> There are many other artefacts
>> like this. The correct solution is to preprocess both patterns and the source
>> string in a way similar to the transformation used for indexing. But this
>> requires some changes to the INTL interface.

> Can you elaborate? I would like to see this work in some way, but for the
> multi-level collations the sortkey returned consists of 2-4 parts and so it
> won't be of any direct use for string searching:

> E.g.: Caféteria will return CAFETERIA333433333211111111
> and Café will return CAFE33342111,
> and for obvious reasons the latter isn't a substring of former one.

> There is already an unused (?) but designed interface in INTL to return
> only the primary differences ('partial'); then
> Caféteria will return CAFETERIA and Café will return CAFE,
> so that would fit the bill for nocase/noaccent substring searching.
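
To make the sort key layout above concrete, here is a minimal Python sketch. It is not the Firebird INTL interface: the function name `sort_key`, the `partial` flag, and the weight digits are assumptions chosen only to reproduce the keys in the example (primary letters, then accent weights 3/4, then case weights 1/2).

```python
import unicodedata

def sort_key(text, partial=False):
    """Toy three-level sort key (hypothetical, for illustration only)."""
    primary, secondary, tertiary = [], [], []
    for ch in text:
        # NFD splits "é" into the base letter "e" plus a combining accent.
        decomposed = unicodedata.normalize("NFD", ch)
        primary.append(decomposed[0].upper())                    # level 1: base letter
        secondary.append("4" if len(decomposed) > 1 else "3")    # level 2: accent
        tertiary.append("2" if ch.isupper() else "1")            # level 3: case
    if partial:
        # Primary differences only, as in the 'partial' mode described above.
        return "".join(primary)
    return "".join(primary + secondary + tertiary)

full_long = sort_key("Caf\u00e9teria")    # "Caféteria"
full_short = sort_key("Caf\u00e9")        # "Café"
print(full_short in full_long)            # full keys: no substring relation
print(sort_key("Caf\u00e9", partial=True) in sort_key("Caf\u00e9teria", partial=True))
```

With the full key, the secondary and tertiary weights of the shorter string land where the longer string still has primary letters, so a substring test fails; with the primary-only key it succeeds.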

The needed transformation should return a canonical representation of the
string in terms of string equality. For example, if our string is "Café":

1) If the collation is case-sensitive and accent-sensitive, it should return
"Café".
2) If it is case-sensitive and accent-insensitive, it should return
"Cafe".
3) If it is case-insensitive and accent-insensitive, it should return
"CAFE".

If the collation treats German "b" as "ss", it should return "strasse" for
the string "strabe", etc.

Got the idea? We need a transformer of string data to a canonical
representation that can be used for pattern matching.
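
As a rough illustration of such a transformation, here is a minimal Python sketch. It is not a proposal for the actual INTL interface: the names `canonical` and `contains`, the keyword flags, and the `expansions` table are all hypothetical, and accent stripping via Unicode decomposition is only a stand-in for what a real collation driver would do.

```python
import unicodedata

def canonical(text, case_sensitive=True, accent_sensitive=True, expansions=None):
    """Map text to a canonical form under the given equality rules (hypothetical)."""
    # Collation-specific expansions first, e.g. sharp s -> "ss".
    for src, dst in (expansions or {}).items():
        text = text.replace(src, dst)
    if not accent_sensitive:
        # Decompose, then drop the combining marks (the accents).
        decomposed = unicodedata.normalize("NFD", text)
        text = "".join(c for c in decomposed if not unicodedata.combining(c))
    if not case_sensitive:
        text = text.upper()
    return text

def contains(haystack, needle, **rules):
    """Substring test done on canonical forms of both pattern and source."""
    return canonical(needle, **rules) in canonical(haystack, **rules)

word = "Caf\u00e9"  # "Café"
print(canonical(word))                                                # case 1
print(canonical(word, accent_sensitive=False))                        # case 2
print(canonical(word, case_sensitive=False, accent_sensitive=False))  # case 3
# "\u00df" is the sharp s, written as "b" in the text above.
print(canonical("stra\u00dfe", expansions={"\u00df": "ss"}))
print(contains("Caf\u00e9teria", "CAFE", case_sensitive=False, accent_sensitive=False))
```

Both the pattern and the source string go through the same transformation, so ordinary byte-wise matching can run on the results. Note that a match position found this way refers to the canonical string, not the original one.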

> Peter Jacobi

-- 
Nickolay Samofatov                       mailto:sk...@bs...
