Re: [q-lang-users] More Unicode queries.
From: John C. <co...@cc...> - 2008-01-14 22:03:06
Rob Hubbard scripsit:

> There's something similar in XML: character entities such as
>     &iuml;
>     &mdash;
> defined in the DTD for XHTML
>     <!ENTITY iuml "ï" >
>     <!ENTITY mdash "—" >
> so I suppose these could then be used in XSLT if you count that as a
> programming language.

I think this is a reasonable compromise, as opposed to having either
no names at all or the complete verbose Unicode official names, like,
say, ARABIC LIGATURE UIGHUR KIRGHIZ YEH WITH HAMZA ABOVE WITH ALEF
MAKSURA ISOLATED FORM (U+FBF9).

In particular, the W3C has just released a draft set of unified
character entities from XHTML, MathML, and the ISO sets: see the draft
at http://www.w3.org/TR/2007/WD-xml-entity-names-20071214/ and the
unified list at http://www.w3.org/2003/entities/2007/w3centities-f.ent .

Once you have stripped comments and entities with more than one
character in them, you have a list of 2114 short, plausible names for
1509 useful Unicode characters. There are duplicates for historical
reasons, like ContourIntegral and conint -- longer dupes could be
stripped if you saw fit.

--
John Cowan <co...@cc...> http://www.ccil.org/~cowan
.e'osai ko sarji la lojban.
Please support Lojban! http://www.lojban.org
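For anyone who wants to try the stripping step described above, here is a minimal sketch in Python. It is not the procedure that produced the 2114/1509 figures, only an illustration under stated assumptions: that the file consists of `<!ENTITY name "value" >` declarations whose values are numeric character references, and that one expansion pass is enough. The function names and the SAMPLE excerpt are hypothetical, made up for this example.

```python
import re

# Assumption: declarations look like <!ENTITY name "value" > and comments
# use the usual <!-- ... --> form.
COMMENT = re.compile(r"<!--.*?-->", re.DOTALL)
ENTITY = re.compile(r'<!ENTITY\s+(\S+)\s+"([^"]*)"\s*>')
CHARREF = re.compile(r"&#(x[0-9A-Fa-f]+|[0-9]+);")

# Hypothetical excerpt written in the style of an entity file.
SAMPLE = """
<!-- a comment that should be ignored -->
<!ENTITY iuml "&#x000EF;" ><!--LATIN SMALL LETTER I WITH DIAERESIS -->
<!ENTITY mdash "&#x02014;" ><!--EM DASH -->
<!ENTITY conint "&#x0222E;" ><!--CONTOUR INTEGRAL -->
<!ENTITY ContourIntegral "&#x0222E;" ><!--same character, longer name -->
<!ENTITY acE "&#x0223E;&#x00333;" ><!--two characters, so it is dropped -->
"""


def expand(match):
    """Turn one numeric character reference into the character it names."""
    ref = match.group(1)
    return chr(int(ref[1:], 16) if ref.startswith("x") else int(ref))


def single_char_names(dtd_text):
    """Map entity name -> character, keeping only one-character entities."""
    names = {}
    for name, value in ENTITY.findall(COMMENT.sub("", dtd_text)):
        # One expansion pass only: values that are themselves escaped
        # (as the XHTML DTD does for lt and amp) would need a second pass.
        text = CHARREF.sub(expand, value)
        if len(text) == 1:
            names[name] = text
    return names


def shortest_per_char(names):
    """Strip the longer duplicates: keep the shortest name per character."""
    best = {}
    for name, ch in names.items():
        if ch not in best or len(name) < len(best[ch]):
            best[ch] = name
    return best


names = single_char_names(SAMPLE)
print(len(names), "names for", len(set(names.values())), "characters")
```

Run against the real entity file instead of SAMPLE, the two counts printed at the end should be comparable to the figures quoted above, though the exact numbers depend on how escaped values and later revisions of the draft are handled.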