Re: [q-lang-users] More Unicode queries.


Rob Hubbard scripsit:

> There's something similar in XML: character entities such as
>     &iuml;
>     &mdash;
> defined in the DTD for XHTML
>     <!ENTITY iuml "&#239;" >
>     <!ENTITY mdash "&#8212;" >
> so I suppose these could then be used in XSLT if you count that as a
> programming language.

I think this is a reasonable compromise, as opposed to having either no
names at all or the complete verbose Unicode official names, like, say,
ARABIC LIGATURE UIGHUR KIRGHIZ YEH WITH HAMZA ABOVE WITH ALEF MAKSURA
ISOLATED FORM (U+FBF9).

In particular, the W3C has just released a draft set of unified
character entities from XHTML, MathML, and the ISO sets: see the draft at
http://www.w3.org/TR/2007/WD-xml-entity-names-20071214/ and the unified
list at http://www.w3.org/2003/entities/2007/w3centities-f.ent .

Once you have stripped the comments and the entities whose replacement
text is more than one character long, you have a list of 2114 short,
plausible names for 1509 useful Unicode characters.  There are duplicates
for historical reasons, like ContourIntegral and conint -- the longer
dupes could be stripped if you saw fit.
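
For what it's worth, the stripping described above could be sketched
roughly as follows in Python.  This is only an illustration, not a tested
tool: it assumes each declaration has the form <!ENTITY name "value" >
with numeric character references in the value, as in the XHTML DTD lines
quoted above.  The function names, the SAMPLE text, and the entity name
"twochar" are all made up for the example and are not taken from the W3C
file.

```python
import re
from collections import defaultdict

# Assumed declaration shape: <!ENTITY name "value" >
COMMENT = re.compile(r"<!--.*?-->", re.DOTALL)
ENTITY = re.compile(r'<!ENTITY\s+(\S+)\s+"([^"]*)"\s*>')
CHAR_REF = re.compile(r"&#(x[0-9A-Fa-f]+|[0-9]+);")


def decode(value):
    """Expand numeric character references (one pass only)."""
    def expand(match):
        ref = match.group(1)
        code = int(ref[1:], 16) if ref.startswith("x") else int(ref)
        return chr(code)
    return CHAR_REF.sub(expand, value)


def single_char_entities(ent_text):
    """Map each character to the names of the entities that expand to it.

    Comments are removed first; entities whose expansion is not exactly
    one character are dropped.
    """
    names_by_char = defaultdict(list)
    for name, value in ENTITY.findall(COMMENT.sub("", ent_text)):
        text = decode(value)
        if len(text) == 1:
            names_by_char[text].append(name)
    return dict(names_by_char)


def shortest_names(names_by_char):
    """Keep one name per character: the shortest, first listed on a tie."""
    return {ch: min(names, key=len) for ch, names in names_by_char.items()}


# Hand-written sample for illustration; "twochar" is a hypothetical name.
SAMPLE = """
<!-- this comment is stripped -->
<!ENTITY iuml "&#239;" >
<!ENTITY mdash "&#8212;" >
<!ENTITY conint "&#x0222E;" >
<!ENTITY ContourIntegral "&#x0222E;" >
<!ENTITY twochar "&#x0003D;&#x020E5;" >
"""

if __name__ == "__main__":
    table = single_char_entities(SAMPLE)
    for ch, name in sorted(shortest_names(table).items()):
        print(f"U+{ord(ch):04X} {name}")
```

Whether to keep every name or only the shortest one per character is a
design choice; shortest_names() shows the second option.  Note that
decode() makes a single pass, so a doubly escaped value such as
"&#38;#60;" comes out longer than one character and is dropped; a real
tool would need to decide how to treat those.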

-- 
John Cowan                              <co...@cc...>
            http://www.ccil.org/~cowan
                .e'osai ko sarji la lojban.
                Please support Lojban!          http://www.lojban.org