q-lang-users Mailing List for Q - Equational Programming Language (Page 51)
From: John.Cowan <jc...@re...> - 2005-07-28 18:35:56
Albert Graef scripsit:

> > John, is there a quick and dirty way to convert either a utf-8 multibyte
> > char or a unicode point represented as a long int to a wchar_t? It
> > appears that on Linux and Windows I should be able to simply cast the
> > long value to wchar_t, but I guess that this breaks on other systems? I
> > don't want to use iconv for that, to avoid the overhead, if there is a
> > simpler way...
>
> Hmm, taking another look at the glibc documentation, it seems that if
> __STDC_ISO_10646__ is #define'd, then casting the unicode char number
> (if it's below WCHAR_MAX) to wchar_t should work, no?

I don't actually know. wchar_t is one of the badly underspecified parts of C: ISO C only guarantees that sizeof(wchar_t) >= sizeof(char), yes, greater than or equal to. Windows makes it 16 bits, most Unix systems make it 32 bits. I stay away from it.

One technique worth mentioning for implementing binary predicates about Unicode: create a table of (start, stop+1) pairs covering the ranges for which the predicate is true, and then binary-search the table. If the answer is even, the predicate is true, otherwise it's false. For example, the predicate for "is a Latin letter" is represented thus:

    [0041, 005B, 0061, 007B, 00C0, 00D7, 00D8, 00F7, 00F8, 0220,
     0222, 0234, 1E00, 1E9C, 1EA0, 1EFA, FF21, FF3B, FF41, FF5B]

(This hasn't been updated to the very latest Unicode tables, and may miss some.) If you unroll the binary search into a (hideously ugly, but provably correct) tree of conditionals, the speed is pretty fast and the space almost nil.

--
John Cowan  jc...@re...  www.reutershealth.com  www.ccil.org/~cowan
If a traveler were informed that such a man [as Lord John Russell] was
leader of the House of Commons, he may well begin to comprehend how the
Egyptians worshiped an insect.  --Benjamin Disraeli
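A minimal C sketch of the range-table technique Cowan describes, using the Latin-letter data from his message (the table may lag current Unicode, and the function name is invented here):

    #include <stddef.h>

    /* (start, stop+1) pairs from the message above. */
    static const unsigned ranges[] = {
        0x0041, 0x005B, 0x0061, 0x007B, 0x00C0, 0x00D7, 0x00D8, 0x00F7,
        0x00F8, 0x0220, 0x0222, 0x0234, 0x1E00, 0x1E9C, 0x1EA0, 0x1EFA,
        0xFF21, 0xFF3B, 0xFF41, 0xFF5B
    };

    /* Binary-search for the number of entries <= c. If the last entry
       <= c sits at an even index, c lies inside a [start, stop) range,
       so the predicate holds. */
    int is_latin_letter(unsigned c)
    {
        size_t lo = 0, hi = sizeof ranges / sizeof ranges[0];
        while (lo < hi) {
            size_t mid = (lo + hi) / 2;
            if (ranges[mid] <= c)
                lo = mid + 1;
            else
                hi = mid;
        }
        return lo & 1;  /* lo odd <=> last entry <= c has an even index */
    }

For instance, is_latin_letter(0x50) ('P') finds one entry <= 0x50 (0x0041, index 0, even) and returns 1, while is_latin_letter(0x5B) ('[') finds two and returns 0.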
From: Albert G. <Dr....@t-...> - 2005-07-28 17:49:25
/me wrote:

> John, is there a quick and dirty way to convert either a utf-8 multibyte
> char or a unicode point represented as a long int to a wchar_t? It
> appears that on Linux and Windows I should be able to simply cast the
> long value to wchar_t, but I guess that this breaks on other systems? I
> don't want to use iconv for that, to avoid the overhead, if there is a
> simpler way...

Hmm, taking another look at the glibc documentation, it seems that if __STDC_ISO_10646__ is #define'd, then casting the unicode char number (if it's below WCHAR_MAX) to wchar_t should work, no?

--
Dr. Albert Gr"af
Dept. of Music-Informatics, University of Mainz, Germany
Email: Dr....@t-..., ag...@mu...
WWW: http://www.musikwissenschaft.uni-mainz.de/~ag
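The cast in question, as a hedged C sketch (the helper name is made up; glibc does define __STDC_ISO_10646__, but ISO C does not require it, which is exactly the portability worry raised here):

    #include <wchar.h>

    /* Convert a Unicode code point to wchar_t, if the implementation
       promises that wchar_t values are ISO 10646 code points. */
    wint_t codepoint_to_wchar(unsigned long cp)
    {
    #ifdef __STDC_ISO_10646__
        if (cp <= (unsigned long)WCHAR_MAX)
            return (wint_t)(wchar_t)cp;
        return WEOF;            /* doesn't fit (e.g. 16-bit wchar_t) */
    #else
        (void)cp;
        return WEOF;            /* no guarantee; fall back to iconv() */
    #endif
    }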
From: Albert G. <Dr....@t-...> - 2005-07-28 17:29:32
John.Cowan wrote:

> I think people who pick up a new programming language expect it to handle
> Unicode as the native type nowadays. You have the opportunity to switch
> your basic string type to Unicode without getting huge complaints
> about backward compatibility. I urge you to take it.

Ok, I decided to give it a shot. An experimental new version of the interpreter is now in cvs which has native utf-8 encoded unicode strings. There's no automatic conversion of encodings yet, but if you're on a system with utf-8 as the system encoding (check your LC_CTYPE) then most things should just work transparently, i.e., string operations treat utf-8 encoded multibyte chars as single characters. E.g. (warning: some Korean characters ahead):

    ==> chars "abc 안녕하세요 \n"
    ["a","b","c"," ","안","녕","하","세","요"," ","\n"]

    ==> sub "abc 안녕하세요 \n" 4 6
    "안녕하"

    ==> ["안".."앋"]
    ["안","앉","않","앋"]

    ==> "\0xc548"
    "안"

There are some things which still need work, in particular, the isxxx char predicates and tolower/toupper in clib still need to be fixed. It would be nice if you could all try it and let me know how it works, bug reports are welcome. ;-)

John, is there a quick and dirty way to convert either a utf-8 multibyte char or a unicode point represented as a long int to a wchar_t? It appears that on Linux and Windows I should be able to simply cast the long value to wchar_t, but I guess that this breaks on other systems? I don't want to use iconv for that, to avoid the overhead, if there is a simpler way...

Albert

--
Dr. Albert Gr"af
Dept. of Music-Informatics, University of Mainz, Germany
Email: Dr....@t-..., ag...@mu...
WWW: http://www.musikwissenschaft.uni-mainz.de/~ag
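What "treat utf-8 encoded multibyte chars as single characters" amounts to at the C level, as a small sketch (the function name is hypothetical, not clib's): counting characters means counting non-continuation bytes.

    #include <stddef.h>

    /* Length of a UTF-8 string in characters rather than bytes.
       Continuation bytes have the form 10xxxxxx; assumes the input
       is well-formed UTF-8. */
    size_t u8_length(const char *s)
    {
        size_t n = 0;
        for (; *s; s++)
            if (((unsigned char)*s & 0xC0) != 0x80)
                n++;
        return n;
    }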
From: Albert G. <Dr....@t-...> - 2005-07-25 06:16:40
John.Cowan wrote:

> POSIX "ctype.h" knows but two cases, whereas Unicode knows
> three. In POSIX, only European Arabic digits can pass "isdigit",
> whereas Unicode has many sets of digits, all putatively equal. In
> POSIX "ctype.h", that which is "alnum" but not "alpha" must be a
> "digit", but Unicode is aware that not all numbers are digits,
> nor are all letters alphabetic. Unicode groks spacing and
> non-spacing marks, but POSIX comprehends them not.

Well, amendment 1 to ISO C90 has classification functions for wide characters (iswalpha and friends). These are available in glibc and AFAIK also on Windows, couldn't we just wrap these?

--
Dr. Albert Gr"af
Dept. of Music-Informatics, University of Mainz, Germany
Email: Dr....@t-..., ag...@mu...
WWW: http://www.musikwissenschaft.uni-mainz.de/~ag
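A sketch of such a wrapper (the name q_isalpha is invented for illustration). Note that the result is locale-dependent, and the cast from code point to wint_t is only sound under __STDC_ISO_10646__, tying back to the wchar_t discussion above:

    #include <wctype.h>
    #include <locale.h>

    /* Classify a Unicode code point via the C90 Amendment 1 wide
       classifiers. Call setlocale(LC_CTYPE, "") once at startup so a
       UTF-8 locale's tables are in effect. */
    int q_isalpha(unsigned long cp)
    {
        return iswalpha((wint_t)cp) != 0;
    }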
From: Albert G. <Dr....@t-...> - 2005-07-25 06:06:52
Albert Graef wrote:

> specialized set of operations to deal with UTF-8 strings on the
> multibyte-char level (length calculation, indexing, slicing and
> substring position should be all that is needed)

plus unicode character predicates and transformations, of course.

--
Dr. Albert Gr"af
Dept. of Music-Informatics, University of Mainz, Germany
Email: Dr....@t-..., ag...@mu...
WWW: http://www.musikwissenschaft.uni-mainz.de/~ag
From: Albert G. <Dr....@t-...> - 2005-07-25 05:43:04
John.Cowan wrote:

> I think people who pick up a new programming language expect it to handle
> Unicode as the native type nowadays. You have the opportunity to switch
> your basic string type to Unicode without getting huge complaints
> about backward compatibility. I urge you to take it.

Well, backward compatibility is only one consideration (no minor one for me, though, with the amount of Q code lying around on my harddisk ;-). Here are some others that come to my mind:

- Q (more precisely, clib) uses GNU regex for its regular expression stuff, which isn't unicode-aware either; this probably holds for glob, too. So there's the issue of legacy components which cannot be replaced easily.

- The performance issues with string processing; UTF-8 isn't ideal for stuff like char indexing.

- File positions are still measured in bytes, and using fseek() to reposition the file pointer is a sure way to break any on-the-fly conversion of the encoding.

- Formatted output via printf counts field widths in bytes, not wide characters. OTOH, wprintf cannot be used on a stream without breaking byte-oriented I/O (fread/fwrite).

So I'm not convinced that completely abstracting away the byte structure of strings is a good idea. What do others think? Would having a specialized set of operations to deal with UTF-8 strings on the multibyte-char level (length calculation, indexing, slicing and substring position should be all that is needed) be such a chore? Or would it be a reasonable compromise?

Albert

--
Dr. Albert Gr"af
Dept. of Music-Informatics, University of Mainz, Germany
Email: Dr....@t-..., ag...@mu...
WWW: http://www.musikwissenschaft.uni-mainz.de/~ag
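The char-indexing cost raised in the second point, made concrete as a C sketch (hypothetical helper name): with UTF-8 there is no O(1) way to reach the i-th character, you have to scan from the start.

    #include <stddef.h>

    /* Return a pointer to the i-th character (0-based) of a UTF-8
       string, or NULL if the string is shorter than that. O(n) in the
       byte length, which is the performance issue with char indexing. */
    const char *u8_index(const char *s, size_t i)
    {
        for (; *s; s++)
            if (((unsigned char)*s & 0xC0) != 0x80) {  /* char start */
                if (i == 0)
                    return s;
                i--;
            }
        return NULL;
    }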
From: John.Cowan <jc...@re...> - 2005-07-24 23:43:30
Albert Graef scripsit:

> > It's important to be aware that the POSIX model of the isxxx predicates
> > is inadequate for Unicode. Details on request.
>
> I'm interested in the details, could you please elaborate?

Quoting myself from the Unicode FAQ:

    POSIX "ctype.h" knows but two cases, whereas Unicode knows
    three. In POSIX, only European Arabic digits can pass "isdigit",
    whereas Unicode has many sets of digits, all putatively equal. In
    POSIX "ctype.h", that which is "alnum" but not "alpha" must be a
    "digit", but Unicode is aware that not all numbers are digits,
    nor are all letters alphabetic. Unicode groks spacing and
    non-spacing marks, but POSIX comprehends them not.

IMHO the most important Unicode character categories are those lumped as the General Category, which divides the entire codepoint space into 30 categories, themselves grouped into 7 supercategories (letter, number, mark, punctuation, symbol, whitespace, other). Relevant Unicode transformations are uppercasing, lowercasing, titlecasing, and case folding, plus the four Unicode normalizations: decomposed, composed, compatibility decomposed, and compatibility composed. The numeric value of Unicode characters that are numbers is also significant. See http://www.unicode.org/versions/Unicode4.0.0/ch04.pdf for information.

The ICU library (http://icu.sf.net) is the gold standard C/C++ implementation library for everything Unicode, and I recommend it. It's big, but it's modularizable.

> Yes, that's ugly. The more I think about this the more I abhor the idea
> to load the language itself with such complexities. Maybe it's better to
> keep the language encoding-agnostic and push all unicode handling into
> the library (just the way that it is done in C/C++, as opposed to Java/Tcl).

That's because C and C++ are stuck with the "character = byte" assumption and have to build higher-level strings (unsigned short or long arrays, typically). You already have separate notions of "[character] string" and "byte string". So use byte strings for applications where you don't care what the encoding is, and regular strings where you do.

> To these ends, clib would provide its own set of primitive operations
> (say, u8length, u8chars, u8sub, etc.) for handling proper utf-8 encoded
> strings. Then we could add a standard library module unicode.q on top of
> that, with types UFile and UString and the corresponding operations,
> which would handle the necessary conversions automatically and
> transparently, as you suggested.

I think people who pick up a new programming language expect it to handle Unicode as the native type nowadays. You have the opportunity to switch your basic string type to Unicode without getting huge complaints about backward compatibility. I urge you to take it.

What do others think?

--
And through this revolting graveyard of the universe the muffled, maddening
beating of drums, and thin, monotonous whine of blasphemous flutes from
inconceivable, unlighted chambers beyond Time; the detestable pounding and
piping whereunto dance slowly, awkwardly, and absurdly the gigantic tenebrous
ultimate gods -- the blind, voiceless, mindless gargoyles whose soul is
Nyarlathotep. (Lovecraft)
John Cowan|jc...@re...|ccil.org/~cowan
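How the ICU library Cowan recommends exposes this, in a tiny C sketch (ICU4C; typically compiled with -licuuc, though the exact build flags depend on your installation). It illustrates the digit distinction from the FAQ quote: U+0665 ARABIC-INDIC DIGIT FIVE fails POSIX isdigit but is a Unicode decimal digit with numeric value 5.

    #include <unicode/uchar.h>
    #include <stdio.h>

    int main(void)
    {
        UChar32 c = 0x0665;  /* ARABIC-INDIC DIGIT FIVE */
        printf("general category = %d\n", (int)u_charType(c));
        printf("is digit         = %d\n", (int)u_isdigit(c));
        printf("numeric value    = %g\n", u_getNumericValue(c));
        return 0;
    }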
From: Albert G. <Dr....@t-...> - 2005-07-24 13:46:39
John.Cowan wrote:

> It's important to be aware that the POSIX model of the isxxx predicates
> is inadequate for Unicode. Details on request.

I'm interested in the details, could you please elaborate?

> In addition, toupper and tolower have to operate at the string level,
> not merely the character level: they are not mere mappings in Unicode
> ("Maße" -> "MASSE", for one example).

That's no problem, as in Q these functions already work on strings of arbitrary lengths anyway.

> The difficulty with this scheme is that you have to make the string-level
> operators work correctly, for some sense of "correctly", even with
> arbitrarily malformed UTF-8.

Yes, that's ugly. The more I think about this the more I abhor the idea to load the language itself with such complexities. Maybe it's better to keep the language encoding-agnostic and push all unicode handling into the library (just the way that it is done in C/C++, as opposed to Java/Tcl).

To these ends, clib would provide its own set of primitive operations (say, u8length, u8chars, u8sub, etc.) for handling proper utf-8 encoded strings. Then we could add a standard library module unicode.q on top of that, with types UFile and UString and the corresponding operations, which would handle the necessary conversions automatically and transparently, as you suggested.

The only actual change to the language itself would then be new escape sequences (\uXXXX) as shortcuts for utf-8 multibyte chars, and maybe a few related fixes in the string printing routine of the interpreter. Of course you could also use utf-8 encoded string literals in a source script (you already can, but they will print correctly only on a system which uses utf-8 as its native encoding).

What do you all think about this? I think that this might be the cleanest (if not most convenient) solution, also from a backward compatibility POV.

Albert

--
Dr. Albert Gr"af
Dept. of Music-Informatics, University of Mainz, Germany
Email: Dr....@t-..., ag...@mu...
WWW: http://www.musikwissenschaft.uni-mainz.de/~ag
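What the proposed \uXXXX escapes would have to emit, per RFC 3629, as a self-contained C sketch (the function name is invented here):

    /* Encode a code point as UTF-8; writes up to 4 bytes to buf and
       returns the byte count, or 0 for values beyond U+10FFFF. */
    int u8_encode(unsigned long cp, unsigned char *buf)
    {
        if (cp < 0x80) {
            buf[0] = (unsigned char)cp;
            return 1;
        } else if (cp < 0x800) {
            buf[0] = (unsigned char)(0xC0 | (cp >> 6));
            buf[1] = (unsigned char)(0x80 | (cp & 0x3F));
            return 2;
        } else if (cp < 0x10000) {
            buf[0] = (unsigned char)(0xE0 | (cp >> 12));
            buf[1] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
            buf[2] = (unsigned char)(0x80 | (cp & 0x3F));
            return 3;
        } else if (cp < 0x110000) {
            buf[0] = (unsigned char)(0xF0 | (cp >> 18));
            buf[1] = (unsigned char)(0x80 | ((cp >> 12) & 0x3F));
            buf[2] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
            buf[3] = (unsigned char)(0x80 | (cp & 0x3F));
            return 4;
        }
        return 0;  /* not a valid Unicode code point */
    }

For example, u8_encode(0xC548, buf) yields the three bytes EC 95 88, the character 안 from the interpreter session above.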
From: Albert G. <Dr....@t-...> - 2005-07-24 12:52:08
Peter Minten wrote:

> type StrictException : Exception = special const strict_exception A;

Yes, that's indeed the politically correct way of doing it, my code was correct but a bit sloppy there. ;-)

> @-1
> marshal _ = error otherwise;

Yes, that's the right way to do it, if you want to supply a default rule that's supposed to be overridden by client modules. You might even consider using a larger negative value (@-0x8000000 is the minimum iirc) just in case the client module plays its own priority tricks.

But note that this behaviour hasn't changed since priorities were introduced into the language, so I find it strange that the bug only surfaced with Q 6.2.

Albert

--
Dr. Albert Gr"af
Dept. of Music-Informatics, University of Mainz, Germany
Email: Dr....@t-..., ag...@mu...
WWW: http://www.musikwissenschaft.uni-mainz.de/~ag
From: Peter M. <pet...@wa...> - 2005-07-24 08:38:04
Albert Graef wrote:

> Peter Minten wrote:
>
>> So what strict does is throw an error when it receives an application.
>
> No need to go to C to achieve this, the following Q definition does more
> or less the same:
>
> strict (F X) = throw "oops";
> strict X = X otherwise;
>
> (The only difference is that the exception is "oops" here, rather than
> syserr 9.)
>
>> Of course calling strict that way is rather irritating as it's likely to
>> be a much used function. Easier would be something like this:
>>
>> foo A B = bar A $ %baz B;
>
> How about something like:
>
> public (strict) X @5;

Thanks, I used:

"""
type StrictException : Exception = special const strict_exception A;

public (strict) X @5;

strict (F X) = throw $ strict_exception (F X);
strict X = X otherwise;
"""

Works perfectly.

>> PS: By the way, Smash was updated to version 0.2.1, fixed a small
>> priority bug that apparently was overlooked by the compiler prior to 6.1
>> and that destroyed the extensibility of marshal/unmarshal.
>
> Oops, what bug was that?

Well first something like:

    marshal A:Array = ...;
    marshal _ = error otherwise;
    marshal D:Dict = ...;

Worked for a dict. Now I have to do:

    marshal A:Array = ...;
    @-1
    marshal _ = error otherwise;
    @0
    marshal D:Dict = ...;

To make the otherwise clause be considered after all the normal rules.

Greetings, Peter Minten
From: John.Cowan <jc...@re...> - 2005-07-24 07:45:21
Albert Graef scripsit:

> - Fix the standard library functions chars and split, as well as the
> isxxx character predicates and toupper/tolower in clib.

It's important to be aware that the POSIX model of the isxxx predicates is inadequate for Unicode. Details on request.

In addition, toupper and tolower have to operate at the string level, not merely the character level: they are not mere mappings in Unicode ("Maße" -> "MASSE", for one example).

> (2) Leave it up to the programmer. The interpreter just assumes that all
> string data already is UTF-8 encoded and by itself doesn't touch the
> string data it reads/writes. When dealing with text data in other
> encodings, the programmer would have to use clib::iconv to do the
> conversion explicitly.

The difficulty with this scheme is that you have to make the string-level operators work correctly, for some sense of "correctly", even with arbitrarily malformed UTF-8. If you do the mapping (which is really quite cheap if amortized across a large object) you can guarantee that you always have well-formed UTF-8 in the internals.

--
What asininity could I have uttered     John Cowan <jcowan@reutershealth.com>
that they applaud me thus?              http://www.reutershealth.com
    --Phocion, Greek orator             http://www.ccil.org/~cowan
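The well-formedness guarantee Cowan argues for, sketched as a C validator (RFC 3629 rules: no overlong forms, no surrogates, nothing above U+10FFFF; the function name is invented here):

    #include <stddef.h>

    int u8_valid(const unsigned char *s, size_t n)
    {
        size_t i = 0;
        while (i < n) {
            unsigned char b = s[i];
            size_t len, j;
            unsigned long cp;
            if (b < 0x80)      { i++; continue; }
            else if (b < 0xC2) return 0;           /* stray continuation or overlong lead */
            else if (b < 0xE0) { len = 2; cp = b & 0x1F; }
            else if (b < 0xF0) { len = 3; cp = b & 0x0F; }
            else if (b < 0xF5) { len = 4; cp = b & 0x07; }
            else return 0;                          /* would exceed U+10FFFF */
            if (i + len > n) return 0;              /* truncated sequence */
            for (j = 1; j < len; j++) {
                if ((s[i + j] & 0xC0) != 0x80) return 0;
                cp = (cp << 6) | (s[i + j] & 0x3F);
            }
            if (len == 3 && (cp < 0x800 || (cp >= 0xD800 && cp <= 0xDFFF)))
                return 0;                           /* overlong or UTF-16 surrogate */
            if (len == 4 && (cp < 0x10000 || cp > 0x10FFFF))
                return 0;                           /* overlong or out of range */
            i += len;
        }
        return 1;
    }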
From: Albert G. <Dr....@t-...> - 2005-07-24 05:11:29
Peter Minten wrote:

> So what strict does is throw an error when it receives an application.
> This is useful for debugging. Take for example the following piece of code:
>
> fred = ... ; //calls foo somewhere
> foo A B = bar A $ strict $ baz B;
>
> Now say baz fails. Normally you'd get a pretty big unresolved
> expression. Now strict causes the program to crash with a message that
> baz B didn't resolve. Another option would be to throw an exception and
> give the program a chance to recover.

I should remark here that if an external function returns __ERROR, that actually raises an exception (syserr 9), which can be handled with catch as usual.

Also, to achieve the same effect, you don't need a specialized function like "strict" in your example, instead you could just use an "error rule" like:

    baz B = ...;                          // real definition of baz
    baz B = throw "baz goofed" otherwise; // error rule

Albert

--
Dr. Albert Gr"af
Dept. of Music-Informatics, University of Mainz, Germany
Email: Dr....@t-..., ag...@mu...
WWW: http://www.musikwissenschaft.uni-mainz.de/~ag
From: Albert G. <Dr....@t-...> - 2005-07-23 18:16:36
Peter Minten wrote:

> So what strict does is throw an error when it receives an application.

No need to go to C to achieve this, the following Q definition does more or less the same:

    strict (F X) = throw "oops";
    strict X = X otherwise;

(The only difference is that the exception is "oops" here, rather than syserr 9.)

> Of course calling strict that way is rather irritating as it's likely to
> be a much used function. Easier would be something like this:
>
> foo A B = bar A $ %baz B;

How about something like:

    public (strict) X @5;

> PS: By the way, Smash was updated to version 0.2.1, fixed a small
> priority bug that apparently was overlooked by the compiler prior to 6.1
> and that destroyed the extensibility of marshal/unmarshal.

Oops, what bug was that?

--
Dr. Albert Gr"af
Dept. of Music-Informatics, University of Mainz, Germany
Email: Dr....@t-..., ag...@mu...
WWW: http://www.musikwissenschaft.uni-mainz.de/~ag
From: Peter M. <pet...@wa...> - 2005-07-23 16:29:21
Hi all,

As most of us know Q's rewrite model doesn't allow for much error checking. However there is a simple trick for finding errors that might be used to make this a little better.

The trick is that when a reduction fails you get a Q value, which I believe is called an application. Applications can also be given back on purpose which is why they're not suited for general failure finding, however the programmer can know if an application should be returned or not. If one defines a simple function "strict" it should be possible to catch applications. On the C level that can be done like this (leaving out the obvious checks and improvements):

    FUNCTION(strict,strict,argc,argv)
    {
      expr a, b;  /* throwaway: receive the parts of the application */
      if (isapp(argv[0], &a, &b))
        /* print out an error message somehow */
        return __ERROR;
      else
        return argv[0];
    }

So what strict does is throw an error when it receives an application. This is useful for debugging. Take for example the following piece of code:

    fred = ... ; //calls foo somewhere
    foo A B = bar A $ strict $ baz B;

Now say baz fails. Normally you'd get a pretty big unresolved expression. Now strict causes the program to crash with a message that baz B didn't resolve. Another option would be to throw an exception and give the program a chance to recover.

Of course calling strict that way is rather irritating as it's likely to be a much used function. Easier would be something like this:

    foo A B = bar A $ %baz B;

Function application beats any operator so the expression would be something like: bar A (% (baz B)). Just an idea :-).

Greetings, Peter Minten

PS: By the way, Smash was updated to version 0.2.1, fixed a small priority bug that apparently was overlooked by the compiler prior to 6.1 and that destroyed the extensibility of marshal/unmarshal.
From: Albert G. <Dr....@t-...> - 2005-07-23 13:57:24
Ok, I finally decided that it's time to do something about unicode support, so I read the relevant docs (thanks, John, for your pointers, they were quite useful).

John Cowan wrote:

> There will be five places where Unicode has to be addressed: in pulling
> substrings out of strings, in reading, in writing, in converting from
> strings to byte strings, and in converting from byte strings to strings.
> In the last two cases, it is desirable (but not necessary) to provide
> a method of overriding the system standard external encoding such as
> Latin-1 which is generated or interpreted respectively.

AFAICS, here's what would be needed to get at least halfway-decent support for unicode/utf-8 in Q:

- Add UTF-8/multibyte character support to the interpreter. This affects, in particular, runtime string typing (since Char objects might consist of more than one byte) and marshalling (printing string objects in the interpreter), as well as the builtins (#), (!), sub, substr, pos, ord, char, succ, pred, enum.

- Fix the standard library functions chars and split, as well as the isxxx character predicates and toupper/tolower in clib.

- Add the usual localization stuff to clib: setlocale/localeconv/nl_langinfo, strfmon, strcoll/strxfrm, iconv, gettext and friends.

That should be all that is needed to have unicode just working when running on a system which has UTF-8 as the default encoding. But, as John pointed out, on systems using a different encoding there is still the issue of converting strings passed to or obtained from the system (including string constants in the source script and on the command line). There are basically two ways to deal with this:

(1) Add automatic conversions to all operations which read/write strings from/to the system and byte string data. This is what John proposed, and is certainly the most convenient for the programmer. But is this really desirable? Doing the conversions automatically means that you always have to pay for it, even if you use fget to slurp in big 7 bit ascii files. And if you read or write a file which happens to be in an encoding different from the system encoding then the builtin conversion would garble the string data.

(2) Leave it up to the programmer. The interpreter just assumes that all string data already is UTF-8 encoded and by itself doesn't touch the string data it reads/writes. When dealing with text data in other encodings, the programmer would have to use clib::iconv to do the conversion explicitly.

I'm actually leaning towards solution (2) since it gives greater freedom to the programmer and avoids the conversion overhead when it's not needed. Also, it is more in line with the current implementation which doesn't mangle string data behind the scenes either. And it has the added benefit that scripts containing string constants with extended characters would be portable across platforms (which is not the case if you implicitly assume the system encoding, so that scripts always have to be written in the local encoding).

Opinions?

Cheers, Albert

--
Dr. Albert Gr"af
Dept. of Music-Informatics, University of Mainz, Germany
Email: Dr....@t-..., ag...@mu...
WWW: http://www.musikwissenschaft.uni-mainz.de/~ag
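The explicit conversion that option (2) puts on the programmer, sketched with POSIX iconv(3) at the C level (clib::iconv presumably wraps something like this; error handling is abbreviated and the helper name is invented):

    #include <iconv.h>
    #include <string.h>

    /* Recode a NUL-terminated Latin-1 string into the UTF-8 buffer out;
       returns the number of bytes produced, or (size_t)-1 on error. */
    size_t latin1_to_utf8(const char *in, char *out, size_t outsize)
    {
        iconv_t cd = iconv_open("UTF-8", "ISO-8859-1");
        char *inp = (char *)in, *outp = out;
        size_t inleft = strlen(in), outleft = outsize;
        if (cd == (iconv_t)-1)
            return (size_t)-1;
        if (iconv(cd, &inp, &inleft, &outp, &outleft) == (size_t)-1) {
            iconv_close(cd);
            return (size_t)-1;
        }
        iconv_close(cd);
        return (size_t)(outp - out);
    }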
From: Albert G. <Dr....@t-...> - 2005-07-23 12:49:13
se...@ge... wrote:

> qcwrap consumes this with ease, and the dynamic version compiles and runs
> beautifully. The statically compiled version produced by:
>
> gcc -o fact_static fact.c -static -lqint -lq -lgmp -lpthread -lm -ldl
>
> segfaults when I run it.

Yes, this is a bug. You could try to rebuild libqint after enabling the call to LTDL_SET_PRELOADED_SYMBOLS in libmain() (q.c, line 2211), but then you might have to link your qcwrapped program using libtool. If you try that please let me know how it works.

Albert

--
Dr. Albert Gr"af
Dept. of Music-Informatics, University of Mainz, Germany
Email: Dr....@t-..., ag...@mu...
WWW: http://www.musikwissenschaft.uni-mainz.de/~ag
From: <se...@ge...> - 2005-07-22 15:12:27
First, thank you for the responses, Herr Dr. Graef.

I've been playing with qcwrap, and am having a couple of problems. Here's a little Q script that I'm compiling:

    #!/usr/bin/env q
    #! -cmain

    fact N = subfact N 1;
    subfact N A = subfact (N-1) (N*A) if N > 0;
                = -1 if N < 0;
                = A otherwise;

    main = writes $ (str $ fact $ val $ ARGS!1) ++ "\n";

I was first working with a script that generated a bitset of a given number, but it ran too quickly to be useful in timing comparisons. A sample run of this:

    $ ./fact.q 5
    120
    ()

qcwrap consumes this with ease, and the dynamic version compiles and runs beautifully. The statically compiled version produced by:

    gcc -o fact_static fact.c -static -lqint -lq -lgmp -lpthread -lm -ldl

segfaults when I run it. This may not be related to Q, but rather my compilation options -- I don't statically compile programs much, and need to make sure I'm calling gcc properly.

Not too surprisingly, compiling doesn't improve execution times much, although for very short scripts, the overhead of loading the Q interpreter is much more of a factor:

    ./fact_dynamic 5      0.00s user 0.01s system 60% cpu 0.017 total
    ./fact.q 5            0.05s user 0.01s system 24% cpu 0.256 total
    ./fact_dynamic 50005  4.12s user 0.06s system 93% cpu 4.495 total
    ./fact.q 50005        4.11s user 0.08s system 72% cpu 5.757 total

--- SER
From: Albert G. <Dr....@t-...> - 2005-07-21 17:28:57
se...@ge... wrote:

> 64-bit support is, for me, rather important.

I hope that I can fix this during summer. I need to, because I'm giving a Q course in autumn, and one of the machines in our lab now has a 64 bit cpu.

> I really liked Haskell, until I hit the lazy-IO/monad hill, after which
> every non-trivial project seems like a chore.

Yes, monads look very nice on paper, but in practice they are a major *%&!. The awkward squad continues to strive ... ;-)

--
Dr. Albert Gr"af
Dept. of Music-Informatics, University of Mainz, Germany
Email: Dr....@t-..., ag...@mu...
WWW: http://www.musikwissenschaft.uni-mainz.de/~ag
From: Albert G. <Dr....@t-...> - 2005-07-21 09:27:41
se...@ge... wrote:

> I agree. Compiling to an intermediate language, such as Objective-C,
> would be the way to go.

Or use the gcc backend directly. That gives you more control over code generation and optimization, and having to go through a C compile stage slows things down quite a bit.

> I didn't mean "dependency" as "another bothersome thing to compile"; I
> meant "dependency" as in "something else to break".

Ah ok, I see. You certainly have a point there.

> Yes; I hadn't seen qcwrap. If the executables didn't link to libq or
> libqint -- now, there's a thought... I suppose I could just statically
> link to libq and libqint; then I'd only have to worry about peripheral
> libraries. Is there a way to make qcwrap "pull in" all .q dependencies
> for an application?

qcwrap doesn't wrap the source script, it wraps the byte code, so you don't have to worry about .q dependencies when using it. Instead of statically linking to libq/libqint, you could also ship the shared libs with your application and rely on library versions. Hmm, that reminds me that I forgot to increment the libqint version in Q 6.1/6.2, I'll have to fix that in the next release... ;-)

> Ouch. Yeah, I forgot that. Hrm. I've always liked type inference, and
> have often wondered why dynamically typed languages couldn't implement
> type inference based on duck typing. Especially in the case of Q, which
> does perform a compilation step. But, this is Yet Another Feature.

Well, the amount of ad-hoc polymorphism in Q is essentially unlimited, and you can't decide statically whether a given expression is able to "respond" to a given "method" (function). That means that "duck typing" becomes undecidable.

Albert

--
Dr. Albert Gr"af
Dept. of Music-Informatics, University of Mainz, Germany
Email: Dr....@t-..., ag...@mu...
WWW: http://www.musikwissenschaft.uni-mainz.de/~ag
From: <se...@ge...> - 2005-07-20 22:16:22
> Sean E. Russell wrote:
...
>> Gräf, you chose to compile to bytecode; have you considered compiling to
>> native code, or to some other intermediate language that could be
...
> Yes, of course I have considered this, but there are some things in the
> language, especially the runtime special forms, which makes this
> difficult and I can't say what the actual speed improvements would be.

Those are hard to tell; since Q's performance is similar to interpreted Haskell and Q is, itself, similar to Haskell, you might expect similar performance gains as one sees between ghci and ghc. The extrapolation may be entirely incorrect, but one can hope. Furthermore, as I said earlier, one would hope that since Q has eager evaluation, it may actually exceed compiled Haskell, which would be nice.

> The key here is to find an existing backend (maybe gcc) which already

I agree. Compiling to an intermediate language, such as Objective-C, would be the way to go.

>> another dependency failure point. That is, yet one more dependency that
...
> Well, leaving endianness and 64 bit issues aside (which would be pretty
> much the same, whether you have a bytecode compiler or a native compiler
> with associated runtime), the core of an interpreter is usually very
> portable anyway. In my experience, portability issues almost always

I didn't mean "dependency" as "another bothersome thing to compile"; I meant "dependency" as in "something else to break". If I have an application that *only* links against libc, then my application will only break if libc breaks compatibility. The more things I link against, the more risk I have of my application breaking when I upgrade my system.

This is a maintenance issue, not an installation issue. I want to be able to write a program that does a job, and not have to worry about it suddenly not working just because I upgraded my installation of Q. This problem isn't entirely avoidable; there are always dependencies. However, as I reduce the number of dependencies, the reliability of my applications improves. In my experience, interpreters are a disproportionately large source of incompatibilities. Interpreters are great for development, but nothing beats native compilation for deployment.

> Concerning the 2nd point (ability to make executables): We already have
> that, it's called qcwrap.

Yes; I hadn't seen qcwrap. If the executables didn't link to libq or libqint -- now, there's a thought... I suppose I could just statically link to libq and libqint; then I'd only have to worry about peripheral libraries. Is there a way to make qcwrap "pull in" all .q dependencies for an application?

> Don't forget static vs. dynamic typing. That's what gives both ML and

Ouch. Yeah, I forgot that. Hrm. I've always liked type inference, and have often wondered why dynamically typed languages couldn't implement type inference based on duck typing. Especially in the case of Q, which does perform a compilation step. But, this is Yet Another Feature.

Thanks for the response. Tschuss.

--- SER
From: <se...@ge...> - 2005-07-20 21:53:42
> Have you taken a look at the "core" page? Quite a few modules are
> already included in the core. Together with the addons page, this lists
> everything that is available right now.

Ok, that's what I wanted to know.

> SQLite looks nice, because it doesn't need any setup, which can be

Yes; I like it primarily because it doesn't require a *server*, per se. It is an embedded database much like BerkeleyDB or gdbm, but with SQL support. In any case, as I said, I wasn't asking _specifically_ about SQLite -- I was just curious if there was another centralized list of bindings.

> See, there's only so much that a single developer -- myself, with

I wasn't criticizing Q for a lack of bindings :-) I was simply curious whether there was another resource I should be trolling when looking for libraries, before I start writing my own bindings.

> finished, I want to work on Q 6.3 which will have 64 bit and unicode
> support. So yes, I could use some help with the modules. ;-)

64-bit support is, for me, rather important. I'm playing with Q at work when I can justify it, but until it has 64-bit support, I can't use it at home :-(. Anyway, I'm not pressuring you; I have my own open source projects with people clamoring for features that I don't have time to write, so I know how it goes.

Incidentally, I wanted to mention that -- so far -- I really like Q. I really liked Haskell, until I hit the lazy-IO/monad hill, after which every non-trivial project seems like a chore. So, thanks for Q. It is keeping me busy :-)

--- SER
From: Albert G. <Dr....@t-...> - 2005-07-20 21:32:17
Sean E. Russell wrote:

> 1) You knew some annoying person was bound to ask this, sooner or later:
> Dr. Gräf, you chose to compile to bytecode; have you considered compiling
> to native code, or to some other intermediate language that could be
> compiled to native code?

Yes, of course I have considered this, but there are some things in the language, especially the runtime special forms, which makes this difficult and I can't say what the actual speed improvements would be. The key here is to find an existing backend (maybe gcc) which already supports a lot of architectures and is suitable to handle the special requirements of Q; otherwise I'd end up having to support all those different architectures myself, a nightmare. That's one of the key points which speaks in favour of an interpreter: it's much easier to make it portable, at least nowadays where a decent C compiler is available for virtually every platform under the sun (no pun intended ;-).

> In addition to the two common reasons for wanting this -- speed and the
> ability to make executables -- there is a third reason: interpreters add
> another dependency failure point. That is, yet one more dependency that
> could change and cause an application to break. In particular, interpreters
> are often more brittle than other dependencies (like libc), since they tend
> to change more radically and more often.

Well, leaving endianness and 64 bit issues aside (which would be pretty much the same, whether you have a bytecode compiler or a native compiler with associated runtime), the core of an interpreter is usually very portable anyway. In my experience, portability issues almost always arise in the addon libraries. (At least that's true now that we have C compilers+libraries adhering to a standard everywhere. It used to be a lot different in the "bad old days.")

Concerning the 2nd point (ability to make executables): We already have that, it's called qcwrap.

> 2) Exactly why is Q's speed so similar to Haskell's? The consensus appears
> to be that Haskell's lazy evaluation is (by and large) what kills it in
> performance comparisons. Q, however, is by default eager. The interpreted
> nature of Q (and evaluated Haskell) probably masks issues such as lazy/eager
> performance issues, but any thoughts on this are welcome.

Don't forget static vs. dynamic typing. That's what gives both ML and Haskell a big edge in computation speed over more dynamic languages. Concerning lazy vs. eager, the performance issues with the former only become apparent when large intermediate terms have to be constructed. With simple tail-recursive definitions, even interpreted Haskell can be very fast, and then it may well be that lazy but unboxed evaluation is faster than an eager but boxed computation. OTOH, I found that with plain heavy-duty recursion on simple objects, such as the tak benchmark, the Q interpreter indeed is faster than Hugs; so it all depends on the type of benchmark.

Cheers, Albert

--
Dr. Albert Gr"af
Dept. of Music-Informatics, University of Mainz, Germany
Email: Dr....@t-..., ag...@mu...
WWW: http://www.musikwissenschaft.uni-mainz.de/~ag
From: Albert G. <Dr....@t-...> - 2005-07-20 19:30:31
Sean E. Russell wrote:

> 3) Is the addons page a comprehensive source of libraries for Q? For
> example, one of the first things I look for a library for any language I
> research is a binding to SQLite, since I use it so much in a variety of
> projects. Note that I'm not *specifically* asking about SQLite bindings;
> I'm asking whether there's an "RAA" (Ruby Application Archive,
> http://raa.ruby-lang.org/index.html) for Q.

Have you taken a look at the "core" page? Quite a few modules are already included in the core. Together with the addons page, this lists everything that is available right now.

SQLite looks nice, because it doesn't need any setup, which can be awkward with other databases. If the C API is simple enough, you could probably write a SWIG wrapper for this in an afternoon. For the time being, there's an ODBC module in the core.

See, there's only so much that a single developer -- myself, with occasional help from my interns -- can do. We've been very busy lately bringing Q's multimedia library to a state where it compares quite favourably to anything else available in the FPL world (and even to some traditional scripting languages). Now that this is more or less finished, I want to work on Q 6.3 which will have 64 bit and unicode support. So yes, I could use some help with the modules. ;-)

Albert

--
Dr. Albert Gr"af
Dept. of Music-Informatics, University of Mainz, Germany
Email: Dr....@t-..., ag...@mu...
WWW: http://www.musikwissenschaft.uni-mainz.de/~ag
From: Albert G. <Dr....@t-...> - 2005-07-20 16:17:12
Hi all,

I've just released a new module which lets you interface to OpenAL (http://www.openal.org). Thanks to Jonas Joebges for his work on this. A bugfix release of the Q-OpenGL module is also available.

Enjoy :)

Albert

--
Dr. Albert Gr"af
Dept. of Music-Informatics, University of Mainz, Germany
Email: Dr....@t-..., ag...@mu...
WWW: http://www.musikwissenschaft.uni-mainz.de/~ag
From: Sean E. R. <se...@ge...> - 2005-07-20 12:30:24
Hello again, I have a few more questions that haven't been asked in the mailing list yet.

1) You knew some annoying person was bound to ask this, sooner or later: Dr. Gräf, you chose to compile to bytecode; have you considered compiling to native code, or to some other intermediate language that could be compiled to native code?

In addition to the two common reasons for wanting this -- speed and the ability to make executables -- there is a third reason: interpreters add another dependency failure point. That is, yet one more dependency that could change and cause an application to break. In particular, interpreters are often more brittle than other dependencies (like libc), since they tend to change more radically and more often.

2) Exactly why is Q's speed so similar to Haskell's? The consensus appears to be that Haskell's lazy evaluation is (by and large) what kills it in performance comparisons. Q, however, is by default eager. The interpreted nature of Q (and evaluated Haskell) probably masks issues such as lazy/eager performance issues, but any thoughts on this are welcome.

3) Is the addons page a comprehensive source of libraries for Q? For example, one of the first things I look for a library for any language I research is a binding to SQLite, since I use it so much in a variety of projects. Note that I'm not *specifically* asking about SQLite bindings; I'm asking whether there's an "RAA" (Ruby Application Archive, http://raa.ruby-lang.org/index.html) for Q.

Thank you.

--
--- SER
"As democracy is perfected, the office of president represents, more and
more closely, the inner soul of the people. On some great and glorious day
the plain folks of the land will reach their heart's desire at last and the
White House will be adorned by a downright moron." - H.L. Mencken (1880 - 1956)