Thread: [q-lang-users] More Unicode queries.
From: Rob H. <hub...@gm...> - 2008-01-13 16:58:02
|
Dear Albert,

Thanks for all the info in reply to my email back in November.

Still on the subject of Unicode, I wondered if the documentation on Unicode strings could be clarified a little. In §3.5 you describe how to encode characters. Your method is subtly different from that in the other languages I know. I was expecting something like "\U1234" for a Unicode character. Also, in some languages, the character encodings are of the form "\0377" and "\xFF" (of fixed length), and so it's easy to misread the Q documentation if familiar with such other languages.

So the differences are: hex codes begin "\0x" rather than just "\x", and the numerals following the escape are read "hungrily", hence the need for parentheses sometimes.

Perhaps you could emphasise these differences, and also give some Unicode examples, such as:

  "Gr\(0x00E4)f" // although this is in Latin-1 too
  "Gr\(0xE4)f"
  "Infinity = \(0x221E)"
  "\(0x2202)f/\(0x2202)x" // partial "df/dx"

(or some better ones).

I prefer your method, by the way, to that in other languages. I particularly like the parentheses. I think "\(0x1B)" looks very clear even when not required.

Finally, can the Unicode characters also be escaped by name?

Thanks, Rob. |
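To make the "hungry" reading concrete, here is a minimal Python sketch of a decoder for the two hex forms mentioned in the message above. It is not Q's actual lexer: the function name `decode_hex_escapes` is hypothetical, and only the bare `\0x...` and parenthesized `\(0x...)` hex forms are handled.

```python
import re

# Hypothetical helper, not part of Q. It handles only the hex escapes
# discussed above: bare \0xHHHH (digits consumed greedily) and the
# parenthesized form \(0xHHHH), which delimits the digits explicitly.
_HEX_ESCAPE = re.compile(r"\\(?:\(0x([0-9A-Fa-f]+)\)|0x([0-9A-Fa-f]+))")


def decode_hex_escapes(text):
    """Replace each hex escape in text by the character it denotes."""
    def repl(match):
        digits = match.group(1) or match.group(2)
        return chr(int(digits, 16))
    return _HEX_ESCAPE.sub(repl, text)


# Parenthesized: the escape ends at ")", so the trailing "f" stays a letter.
print(ascii(decode_hex_escapes(r"Gr\(0xE4)f")))
# Bare: "f" is a valid hex digit, so it is swallowed into the code 0xE4F.
print(ascii(decode_hex_escapes(r"Gr\0xE4f")))
```

In this sketch the second call yields a single character U+0E4F after "Gr" rather than "ä" followed by "f", which is the kind of misreading the parentheses prevent.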
From: Albert G. <Dr....@t-...> - 2008-01-13 20:48:33
|
Rob Hubbard wrote: > So the differences are: hex codes begin "\0x" rather than just "\x", > and the numerals following the escape are read "hungrily", hence the > need for parenthesis sometimes. That's right. Maybe a note for C/Python/whatever language users is in order to emphasize these differences. > Perhaps you could emphasise these differences, and also give some > Unicode examples, such as: There are examples for all kinds of escapes in that section. However, I'll gladly accept patches to the text (preferably in texi format, though). > Finally, can the Unicode characters also be escaped by name? No. Are there other languages which offer this? Cheers, Albert -- Dr. Albert Gr"af Dept. of Music-Informatics, University of Mainz, Germany Email: Dr....@t-..., ag...@mu... WWW: http://www.musikinformatik.uni-mainz.de/ag |
From: Rob H. <hub...@gm...> - 2008-01-14 16:57:26
|
> > Finally, can the Unicode characters also be escaped by name?
>
> No. Are there other languages which offer this?

Python:
  \N{name}
e.g.
  \N{LATIN SMALL LETTER I WITH DIAERESIS}
or
  \N{EM DASH}
e.g. in
  str = u'Encyclop\N{LATIN SMALL LETTER AE}dia'
[I believe Perl is similar.]

There's something similar in XML: character entities such as
  &iuml;
  &mdash;
defined in the DTD for XHTML
  <!ENTITY iuml "&#239;" >
  <!ENTITY mdash "&#8212;" >
so I suppose these could then be used in XSLT if you count that as a programming language.

I'm not sure there's sufficient need for something similar in Q.

(Originally, I just wondered whether it might be available "for free" through a library used to implement Unicode support in Q.)

Thanks, Rob. |
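For readers who have not seen these two naming schemes side by side, the following small Python sketch compares them. It uses only the standard library (`unicodedata` and `html.entities`); the variable names are illustrative.

```python
import unicodedata
from html.entities import name2codepoint

# Python's \N{...} escape takes the official Unicode character name.
by_name = "Encyclop\N{LATIN SMALL LETTER AE}dia"
by_code = "Encyclop\u00e6dia"
same_string = by_name == by_code

# The same lookup is available at runtime.
em_dash = unicodedata.lookup("EM DASH")

# The short XHTML entity names map to the same code points.
iuml = chr(name2codepoint["iuml"])
mdash = chr(name2codepoint["mdash"])

print(same_string, hex(ord(em_dash)), hex(ord(iuml)), hex(ord(mdash)))
```

The entity names are much shorter than the Unicode names, but they cover only a fixed subset of characters.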
From: John C. <co...@cc...> - 2008-01-14 22:03:06
|
Rob Hubbard scripsit:

> There's something similar in XML: character entities such as
> &iuml;
> &mdash;
> defined in the DTD for XHTML
> <!ENTITY iuml "&#239;" >
> <!ENTITY mdash "&#8212;" >
> so I suppose these could then be used in XSLT if you count that as a
> programming language.

I think this is a reasonable compromise, as opposed to having either no names at all or the complete verbose Unicode official names, like, say, ARABIC LIGATURE UIGHUR KIRGHIZ YEH WITH HAMZA ABOVE WITH ALEF MAKSURA ISOLATED FORM (U+FBF9).

In particular, the W3C has just released a draft set of unified character entities from XHTML, MathML, and the ISO sets: see the draft at http://www.w3.org/TR/2007/WD-xml-entity-names-20071214/ and the unified list at http://www.w3.org/2003/entities/2007/w3centities-f.ent .

Once you have stripped comments and entities with more than one character in them, you have a list of 2114 short, plausible names for 1509 useful Unicode characters. There are duplicates for historical reasons, like ContourIntegral and conint -- longer dupes could be stripped if you saw fit.

-- John Cowan <co...@cc...> http://www.ccil.org/~cowan .e'osai ko sarji la lojban. Please support Lojban! http://www.lojban.org |
From: Albert G. <Dr....@t-...> - 2008-01-15 03:17:10
|
John Cowan wrote:

> In particular, the W3C has just released a draft set of unified
> character entities from XHTML, MathML, and the ISO sets: see the draft at
> http://www.w3.org/TR/2007/WD-xml-entity-names-20071214/ and the unified
> list at http://www.w3.org/2003/entities/2007/w3centities-f.ent .
>
> Once you have stripped comments and entities with more than one character
> in them, you have a list of 2114 short, plausible names for 1509 useful
> Unicode characters. There are duplicates for historical reasons, like
> ContourIntegral and conint -- longer dupes could be stripped if you
> saw fit.

I think that this is an excellent idea! Would everyone be happy with using the entity names instead of the unwieldy Unicode names?

If so, then the remaining question is which syntax are we going to use in strings for that? I suggest something like \&phi;, as in:

  "The greek letter \&phi; is the 21st letter in the Greek alphabet."

Any other suggestions? I'm looking forward to a nice lexical syntax flamefest. ;-)

Albert

-- Dr. Albert Gr"af Dept. of Music-Informatics, University of Mainz, Germany Email: Dr....@t-..., ag...@mu... WWW: http://www.musikinformatik.uni-mainz.de/ag |
From: Rob H. <hub...@gm...> - 2008-01-15 09:22:13
|
> > [John] Once you have stripped comments and entities with more than one character
> > in them, you have a list of 2114 short, plausible names for 1509 useful
> > Unicode characters. There are duplicates for historical reasons, like
> > ContourIntegral and conint -- longer dupes could be stripped if you
> > saw fit.

I'd strip the historical duplicates.

I think it's okay for an entity to have more than one character.

> [Albert] I think that this is an excellent idea! Would everyone be happy with
> using the entity names instead of the unwieldy Unicode names?
>
> If so, then the remaining question is which syntax are we going to use
> in strings for that? I suggest something like \&phi;, as in:
>
> "The greek letter \&phi; is the 21st letter in the Greek alphabet."

I was thinking in terms of \N{...}, but I like \&...; too. Perhaps that would be less confusing for Python (and other languages that I can't bring myself to mention again :-) ) users.

> Any other suggestions? I'm looking forward to a nice lexical syntax
> flamefest. ;-)

I think the user should be able to define his own, but I'm not sure how this should be expressed. Then if a user particularly needs a character outside that set, he can define his own.

I can't think of a way to do it without introducing a new keyword such as "entity".

Rob. |
From: Albert G. <Dr....@t-...> - 2008-01-15 15:07:55
|
Rob Hubbard wrote: > I think the user should be able to define his own, but I'm not sure > how this should be expressed. Then if a user particularly needs a > character outside that set, he can define his own. I don't think that this should be part of the language. Q doesn't want to be XML after all. Q strings are constants; there's no direct way to rewrite strings, as you suggested as a mechanism to add your own entity definitions. The escape mechanism just provides an alternative way to specify certain character literals, and I think it should be kept that way. If you really need more then you're always free to write your own Q module which provides the necessary operations to perform any kind of string substitutions that you want (albeit at runtime). I agree, however, that it's useful to have symbolic escapes for a fixed set of special characters such as extended Latin, math symbols, arrows, block graphics and the like. The W3C set has some 2200 of these, which AFAICS provides you with plenty of special symbols for almost any purpose. The downside is that the W3C set doesn't include most foreign scripts, but why would anyone want to write, say, Kanji using escape sequences? It's much easier to type these directly using the appropriate input methods and a Unicode-capable text editor. D'accord? Albert -- Dr. Albert Gr"af Dept. of Music-Informatics, University of Mainz, Germany Email: Dr....@t-..., ag...@mu... WWW: http://www.musikinformatik.uni-mainz.de/ag |
From: Albert G. <Dr....@t-...> - 2008-01-17 07:14:28
|
John Cowan wrote: > Once you have stripped comments and entities with more than one character > in them, you have a list of 2114 short, plausible names for 1509 useful > Unicode characters. This is what is implemented now. BTW, John, thanks for spotting this. That W3C draft just came out, what a lucky coincidence. ;-) If you happen to keep an eye on this, it would be nice if you could let me know when the draft gets revised, so that the support in Q can be updated accordingly. (I wrote a little Q script to generate the C code in src/w3centities.c automatically from the .ent file, which makes this easy. The script isn't included in the sources right now, but if anyone wants to have it, just let me know.) Rob Hubbard wrote: > I'd strip the historical duplicates. I left them in. The full list of names is just some 15KB now, not a big deal even on embedded devices nowadays. > I think its okay for an entity to have more than one character. I only included the single-char entities for now. This simplifies the implementation, and is also consistent with the other escapes which all represent single Unicode characters. If this is a problem then please let me know. Albert -- Dr. Albert Gr"af Dept. of Music-Informatics, University of Mainz, Germany Email: Dr....@t-..., ag...@mu... WWW: http://www.musikinformatik.uni-mainz.de/ag |
From: Albert G. <Dr....@t-...> - 2008-01-17 08:32:41
|
This is unrelated, but I took the opportunity to also update the uchar properties table to the latest from ICU 3.8. (Note that this is used to implement the Unicode char type predicates like isalpha.) New tarball at http://sourceforge.net/project/showfiles.php?group_id=96881&package_id=188958 -- Dr. Albert Gr"af Dept. of Music-Informatics, University of Mainz, Germany Email: Dr....@t-..., ag...@mu... WWW: http://www.musikinformatik.uni-mainz.de/ag |
From: John C. <co...@cc...> - 2008-01-17 09:35:55
|
Albert Graef scripsit: > BTW, John, thanks for spotting this. That W3C draft just came out, > what a lucky coincidence. ;-) Indeed. Someone's blog pointed me to it, I'm not sure who, and then I incorporated it into the latest release of my TagSoup parser, a SAX parser written in Java that processes arbitrary HTML rather than XML. (plug: see http://tagsoup.info ). > If you happen to keep an eye on this, it would be nice if you could > let me know when the draft gets revised, so that the support in Q can > be updated accordingly. I'll let you know, as I'll be updating TagSoup as well. > (I wrote a little Q script to generate the C code in src/w3centities.c > automatically from the .ent file, which makes this easy. The script > isn't included in the sources right now, but if anyone wants to have > it, just let me know.) Just what I did, except that being in a hurry I wrote it in Perl. > Rob Hubbard wrote: > I'd strip the historical duplicates. > > I left them in. The full list of names is just some 15KB now, not a > big deal even on embedded devices nowadays. > > > I think its okay for an entity to have more than one character. > > I only included the single-char entities for now. This simplifies the > implementation, and is also consistent with the other escapes which > all represent single Unicode characters. If this is a problem then > please let me know. I made the same decisions. -- John Cowan http://www.ccil.org/~cowan co...@cc... Please leave your values Check your assumptions. In fact, at the front desk. check your assumptions at the door. --sign in Paris hotel --Cordelia Vorkosigan |
From: Albert G. <Dr....@t-...> - 2008-01-17 17:58:37
Attachments:
w3centities.q
|
John Cowan wrote: > I'll let you know, as I'll be updating TagSoup as well. Great, many thanks! > Just what I did, except that being in a hurry I wrote it in Perl. I've attached my Q script. It expects the w3centities.ent file in the current dir, output is written to w3centities.c. Could be interesting to compare the two scripts, if you're willing to share your Perl solution. Cheers, Albert -- Dr. Albert Gr"af Dept. of Music-Informatics, University of Mainz, Germany Email: Dr....@t-..., ag...@mu... WWW: http://www.musikinformatik.uni-mainz.de/ag |
From: John C. <co...@cc...> - 2008-01-18 00:52:58
|
Albert Graef scripsit:

> I've attached my Q script. It expects the w3centities.ent file in the
> current dir, output is written to w3centities.c. Could be interesting to
> compare the two scripts, if you're willing to share your Perl solution.

Sure. Note that & and < must be special-cased, because the definition of an entity may not contain an explicit & or <.

    #!/usr/bin/perl -w
    # Process W3 .ent file into tssl style
    # Sample input:
    # <!ENTITY AElig "&#x000C6;" ><!--LATIN CAPITAL LETTER AE -->
    # Sample output:
    # <entity name='AElig' codepoint='00C6'/>

    use strict;

    while (<>) {
        chomp;
        # Whitespace-split: keyword, entity name, quoted replacement text
        my ($entity, $name, $string) = split;
        next unless defined($entity);
        next unless $entity eq "<!ENTITY";   # reject cruft
        next if $name eq "%";                # sample declaration
        next unless length($string) == 11;   # reject non-singletons
        # The five hex digits sit between the leading "&#x and the ;"
        my $codepoint = substr($string, 4, 5);
        $codepoint = substr($codepoint, 1, 4) if substr($codepoint, 0, 1) eq "0";
        $codepoint = "0026" if $name eq "amp";
        $codepoint = "003C" if $name eq "lt";
        print " <entity name='$name' codepoint='$codepoint'/>\n";
    }

-- John Cowan http://www.ccil.org/~cowan co...@cc...
A mosquito cried out in his pain,
"A chemist has poisoned my brain!"
The cause of his sorrow
Was para-dichloro-
Diphenyltrichloroethane. (aka DDT) |
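For comparison with the Perl filter above, here is a rough Python port of the same logic. It is a sketch, not the script used for TagSoup or Q: the function name `convert_ent_lines` is made up, and it assumes each single-character entity is written as five hex digits in the form `"&#xHHHHH;"`, as in the sample input.

```python
def convert_ent_lines(lines):
    """Turn <!ENTITY ...> declarations into <entity .../> lines.

    Mirrors the Perl filter: whitespace-split each line, keep only
    declarations whose quoted replacement text is exactly 11 characters
    long (a single "&#xHHHHH;" reference), and special-case amp and lt.
    """
    out = []
    for line in lines:
        fields = line.split()
        # Unlike the Perl version, lines with fewer than three fields are
        # skipped up front instead of relying on undefined values.
        if len(fields) < 3 or fields[0] != "<!ENTITY":
            continue
        name, string = fields[1], fields[2]
        if name == "%":
            continue
        if len(string) != 11:  # reject non-singletons
            continue
        codepoint = string[4:9]  # the five hex digits
        if codepoint.startswith("0"):
            codepoint = codepoint[1:5]
        if name == "amp":
            codepoint = "0026"
        if name == "lt":
            codepoint = "003C"
        out.append(f" <entity name='{name}' codepoint='{codepoint}'/>")
    return out


sample = ['<!ENTITY AElig "&#x000C6;" ><!--LATIN CAPITAL LETTER AE -->']
print(convert_ent_lines(sample))
```

A declaration whose replacement text starts with a space, such as the DotDot entity discussed elsewhere in the thread, is rejected here as well, because the whitespace split leaves only the opening quote in the third field.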
From: Albert G. <Dr....@t-...> - 2008-01-18 02:43:35
Attachments:
w3centities.q
|
John Cowan wrote: > Sure. Note that & and < must be special-cased, because the definition of an > entity may not contain an explicit & or <. Ok, the corrected script is attached. I also updated cvs accordingly and uploaded a new tarball (in testing). I get 2111 singlechar entities now. Does that sound right? Albert -- Dr. Albert Gr"af Dept. of Music-Informatics, University of Mainz, Germany Email: Dr....@t-..., ag...@mu... WWW: http://www.musikinformatik.uni-mainz.de/ag |
From: John C. <co...@cc...> - 2008-01-18 06:30:45
|
Albert Graef scripsit: > I get 2111 singlechar entities now. Does that sound right? That's what I get. I hope it's correct. That's a beautiful script you have there. -- When I'm stuck in something boring John Cowan where reading would be impossible or (who loves Asimov too) rude, I often set up math problems for co...@cc... myself and solve them as a way to pass http://www.ccil.org/~cowan the time. --John Jenkins |
From: Albert G. <Dr....@t-...> - 2008-01-18 07:00:51
|
John Cowan wrote: >> I get 2111 singlechar entities now. Does that sound right? > That's what I get. I hope it's correct. Ok, great. Otherwise we'll blame the W3C. :) > That's a beautiful script you have there. Welcome to Q, the better Perl. ;-) Seriously, Q does cover 100% of my scripting needs these days. Albert -- Dr. Albert Gr"af Dept. of Music-Informatics, University of Mainz, Germany Email: Dr....@t-..., ag...@mu... WWW: http://www.musikinformatik.uni-mainz.de/ag |
From: John C. <co...@cc...> - 2008-01-18 07:14:51
|
Albert Graef scripsit: > Welcome to Q, the better Perl. ;-) Seriously, Q does cover 100% of my > scripting needs these days. You should probably write a manual section, web page, and/or book chapter on "Q scripting", loosely defined as Q programs without any rewrite rules. -- "But I am the real Strider, fortunately," John Cowan he said, looking down at them with his face co...@cc... softened by a sudden smile. "I am Aragorn son http://www.ccil.org/~cowan of Arathorn, and if by life or death I can save you, I will." --LotR Book I Chapter 10 |
From: Albert G. <Dr....@t-...> - 2008-01-18 12:32:43
|
John Cowan wrote: > You should probably write a manual section, web page, and/or book chapter on > "Q scripting", loosely defined as Q programs without any rewrite rules. I take it that by "without any rewrite rules" you actually mean "basic stuff we usually do with scripting languages". ;-) Stuff like traversing directories, batch processing of text or images (using ImageMagick), basic web programming etc. If anyone has some ideas which concrete examples should go in there (basic stuff, no 3 manyear projects please ;-), or maybe has some short but instructive Perl/Python/Ruby examples to be ported, I'll have a look, as time permits. As always, any help is appreciated. :) Albert -- Dr. Albert Gr"af Dept. of Music-Informatics, University of Mainz, Germany Email: Dr....@t-..., ag...@mu... WWW: http://www.musikinformatik.uni-mainz.de/ag |
From: Albert G. <Dr....@t-...> - 2008-01-16 09:45:53
|
John Cowan wrote:

> In particular, the W3C has just released a draft set of unified
> character entities from XHTML, MathML, and the ISO sets: see the draft at
> http://www.w3.org/TR/2007/WD-xml-entity-names-20071214/ and the unified
> list at http://www.w3.org/2003/entities/2007/w3centities-f.ent .

Ok, this is in cvs now. I also made available a tarball (snapshot of current cvs) in testing: http://sourceforge.net/project/showfiles.php?group_id=96881&package_id=188958

Here's the blurb from the manual:

As of version 7.11 and later, the interpreter also supports symbolic character escapes of the form `\&NAME;', where NAME is any of the XML single character entity names specified in the "XML Entity definitions for Characters", see `http://www.w3.org/TR/xml-entity-names/'. Note that, at the time of this writing, this is still a W3C working draft, so the supported entity names may be subject to change until the final specification comes out; the currently supported entities are described in the draft from 14 December 2007, see `http://www.w3.org/TR/2007/WD-xml-entity-names-20071214/'. Also note that multi-character entities are _not_ supported in this implementation.

Examples (make sure you set your email client to UTF-8 encoding if this comes out garbled):

  ==> "Gr\&auml;f"
  "Gräf"

  ==> "Gr\&junk;f"
  ! Invalid character escape in string constant
  >>> "Gr\&junk;f"
      ^

  ==> puts "The greek letter \&phgr; is the 21st letter in the Greek alphabet.\n"
  The greek letter φ is the 21st letter in the Greek alphabet.
  ()

Enjoy, and please let me know if there's anything that doesn't appear to work right.

Cheers, Albert

-- Dr. Albert Gr"af Dept. of Music-Informatics, University of Mainz, Germany Email: Dr....@t-..., ag...@mu... WWW: http://www.musikinformatik.uni-mainz.de/ag |
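The behaviour described in the manual blurb above can be imitated outside Q. The following Python sketch expands `\&NAME;` escapes and rejects unknown names. It is only an illustration: the function name `expand_entity_escapes` is hypothetical, and the lookup table is Python's built-in HTML5 entity list restricted to single-character entities, which is a stand-in and not the W3C draft list that Q compiles in (so ISO names such as `phgr` may be missing).

```python
import re
from html.entities import html5

# Stand-in table: HTML5 entity names that map to exactly one character.
# This is NOT the table Q uses; it only approximates the W3C draft set.
SINGLE_CHAR = {
    name[:-1]: value
    for name, value in html5.items()
    if name.endswith(";") and len(value) == 1
}

# Entity names are case-sensitive and use letters, digits and full stop.
_ENTITY_ESCAPE = re.compile(r"\\&([A-Za-z0-9.]+);")


def expand_entity_escapes(text):
    """Expand \\&NAME; escapes, raising ValueError for unknown names."""
    def repl(match):
        name = match.group(1)
        if name not in SINGLE_CHAR:
            raise ValueError(f"invalid character escape: \\&{name};")
        return SINGLE_CHAR[name]
    return _ENTITY_ESCAPE.sub(repl, text)


print(ascii(expand_entity_escapes(r"Gr\&auml;f")))
```

Restricting the table to single-character entities mirrors the decision described in the thread, where every escape stands for exactly one Unicode character.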
From: Albert G. <Dr....@t-...> - 2008-01-16 23:18:07
|
I'm resending this in latin1, so that it doesn't end up in junk mail folders. ;-) BTW, does anyone know why Thunderbird 1.5 converts messages sent as utf-8 to base64? That's rather inconvenient.

John Cowan wrote:

> In particular, the W3C has just released a draft set of unified
> character entities from XHTML, MathML, and the ISO sets: see the draft at
> http://www.w3.org/TR/2007/WD-xml-entity-names-20071214/ and the unified
> list at http://www.w3.org/2003/entities/2007/w3centities-f.ent .

Ok, this is in cvs now. I also made available a tarball (snapshot of current cvs) in testing: http://sourceforge.net/project/showfiles.php?group_id=96881&package_id=188958

Here's the blurb from the manual:

As of version 7.11 and later, the interpreter also supports symbolic character escapes of the form `\&NAME;', where NAME is any of the XML single character entity names specified in the "XML Entity definitions for Characters", see `http://www.w3.org/TR/xml-entity-names/'. Note that, at the time of this writing, this is still a W3C working draft, so the supported entity names may be subject to change until the final specification comes out; the currently supported entities are described in the draft from 14 December 2007, see `http://www.w3.org/TR/2007/WD-xml-entity-names-20071214/'. Also note that multi-character entities are _not_ supported in this implementation.

Examples:

  ==> "Gr\&auml;f"
  "Gräf"

  ==> "Gr\&junk;f"
  ! Invalid character escape in string constant
  >>> "Gr\&junk;f"
      ^

  ==> puts "The letter \&phgr; is the 21st letter in the Greek alphabet.\n"
  The letter ? is the 21st letter in the Greek alphabet.
  ()

Enjoy, and please let me know if there's anything that doesn't appear to work right.

Cheers, Albert

-- Dr. Albert Gr"af Dept. of Music-Informatics, University of Mainz, Germany Email: Dr....@t-..., ag...@mu... WWW: http://www.musikinformatik.uni-mainz.de/ag |
From: Albert G. <Dr....@t-...> - 2008-01-18 02:09:02
|
John Cowan wrote:

> Sure. Note that & and < must be special-cased, because the definition of an
> entity may not contain an explicit & or <.

Ah yes, thanks for pointing that out.

I also noticed a few entities like the following:

  <!ENTITY DotDot " &#x020DC;" ><!--COMBINING FOUR DOTS ABOVE -->

Is this really supposed to be a two-character combination? Because all I get from " \0x020DC" is a blank followed by the "four dots above" character. It seems rather odd to define that as an entity, no?

Thanks, Albert

-- Dr. Albert Gr"af Dept. of Music-Informatics, University of Mainz, Germany Email: Dr....@t-..., ag...@mu... WWW: http://www.musikinformatik.uni-mainz.de/ag |
From: John C. <co...@cc...> - 2008-01-18 06:24:58
|
Albert Graef scripsit:

> <!ENTITY DotDot " &#x020DC;" ><!--COMBINING FOUR DOTS ABOVE -->
>
> Is this really supposed to be a two-character combination?

Yes, it is. It is the character SPACE (U+0020) followed by a combining character, one which is nonspacing and normally sits above, below, left of, or right of another character called its base character. By convention, a nonspacing character placed on a SPACE character becomes the corresponding spacing character in appearance. (Unicode encodes both spacing and nonspacing versions of certain diacritics for backward compatibility; for example, there is both ^ and a COMBINING CIRCUMFLEX.)

> Because all I get from " \0x020DC" is a blank followed by the "four
> dots above" character.

That is either a font problem or a font rendering problem on your system, more probably the latter. Linux is considerably behind both Windows and OS X in getting basic i18n correct, although it provides more localizations (particularly into languages considered non-commercial by the others).

-- Evolutionary psychology is the theory John Cowan that men are nothing but horn-dogs, http://www.ccil.org/~cowan and that women only want them for their money. co...@cc... --Susan McCarthy (adapted) |
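The distinction between spacing and nonspacing characters drawn in the message above can be checked against the Unicode character database. This short Python sketch uses the standard `unicodedata` module; the variable names are illustrative only.

```python
import unicodedata

four_dots = "\u20dc"

# Name, general category and canonical combining class of U+20DC.
# Category "Mn" means "Mark, nonspacing"; a nonzero combining class
# marks a character that attaches to a preceding base character.
four_dots_info = (
    unicodedata.name(four_dots),
    unicodedata.category(four_dots),
    unicodedata.combining(four_dots),
)

# The DotDot entity: SPACE as the base character plus the combining mark.
dotdot = " " + four_dots

# Spacing and nonspacing circumflex, as mentioned above.
spacing_circumflex = unicodedata.name("^")
combining_circumflex = unicodedata.name("\u0302")

print(four_dots_info, len(dotdot), spacing_circumflex, combining_circumflex)
```

Since `dotdot` consists of two code points, it falls outside a table that is limited to single-character entities.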
From: Albert G. <Dr....@t-...> - 2008-01-14 18:41:14
|
Rob Hubbard wrote:

>> No. Are there other languages which offer this?
>
> Python:
> \N{name}

Ok, I see. Well, I could probably extract the necessary tables from ICU, but that would add quite a lot of static string data to the interpreter. So, before I put this on the TODO list, let me ask whether anybody really wants/needs this?

> (Originally, I just wondered whether it might be available "for free"
> through a library used to implement Unicode support in Q.)

No, I didn't like the idea to add a huge dependency like ICU to the interpreter. So I wrote the code for parsing utf8 myself, and other than that Q only pilfers a few bits from ICU for the character predicates, and uses libiconv for encoding conversions.

Albert

-- Dr. Albert Gr"af Dept. of Music-Informatics, University of Mainz, Germany Email: Dr....@t-..., ag...@mu... WWW: http://www.musikinformatik.uni-mainz.de/ag |
From: Rob H. <hub...@gm...> - 2008-01-15 09:11:24
|
Ah, I wrote this before I saw John's reply. Here it is anyway...

> >> No. Are there other languages which offer this?
> >
> > Python:
> > \N{name}
>
> Ok, I see. Well, I could probably extract the necessary tables from ICU,
> but that would add quite a lot of static string data to the
> interpreter. So, before I put this on the TODO list, let me ask whether
> anybody really wants/needs this?

I see. Well in that case, ironically, my vote is "no". I don't think the price is worth paying. Apart from anything else, the Unicode names are truly horrible.

However, I wonder whether a mechanism like XML's entities could be used for Q strings. I can't think of a Q-like way to do it. Allowing \N{name} within a string to be subject to rewrite rules would be the sort of thing that would be useful. However, it ought to be done when the program is parsed rather than run. [XML does not restrict an entity to a single character.]

I don't think this would upset the Q style, as currently a Q file can already affect the way a program is parsed, with the advent of user-defined infix symbolic operators.

A user would be free to define his string entities in a Unicode-like or XHTML-like or TeX-like style, according to taste. Thus he could define an em dash as \N{EM DASH} or \N{mdash} or \N{---}.

In this way, it might be possible to supply a handful of <entity.q> files containing names for at least the most commonly used symbols, perhaps in varying styles and to varying extents.

Rob. |
From: John C. <co...@cc...> - 2008-01-15 03:57:01
|
Albert Graef scripsit:

> If so, then the remaining question is which syntax are we going to use
> in strings for that? I suggest something like \&phi;, as in:

I like that. Be aware that entity names are case-sensitive (unlike Unicode names which are all upper case) and that valid characters in names are A-Z, a-z, 0-9, and full stop. In principle, minus sign and underscore are also allowed, but are not in fact used.

> "The greek letter \&phi; is the 21st letter in the Greek alphabet."

Will these be allowed in all places (identifiers as well as strings)? I would say yes:

  def \&pi; = 3.141592653

-- But the next day there came no dawn, John Cowan and the Grey Company passed on into the co...@cc... darkness of the Storm of Mordor and were http://www.ccil.org/~cowan lost to mortal sight; but the Dead followed them. --"The Passing of the Grey Company" |
From: Albert G. <Dr....@t-...> - 2008-01-15 09:01:35
|
John Cowan wrote:

> Will these be allowed in all places (identifiers as well as strings)?
> I would say yes: def \&pi; = 3.141592653

I would say no, as they go against readability. Just have a look at ASCII'ified Fortress code; I think that it looks horrible. In Q, a notation like this really looks more like a conglomerate of operators and identifiers at first glance. Therefore, if you want Unicode letters in identifiers, I think it's better to just use a Unicode-capable editor and type them "as is".

Character escapes in strings are a different kind of thing, however. I've put the entity character escapes for strings on my TODO list for the next release.

Cheers, Albert

-- Dr. Albert Gr"af Dept. of Music-Informatics, University of Mainz, Germany Email: Dr....@t-..., ag...@mu... WWW: http://www.musikinformatik.uni-mainz.de/ag |