Re: [q-lang-users] New stuff in cvs: multichar ops, views
From: Albert G. <Dr....@t-...> - 2007-05-31 12:37:39
Hi Rob,

Rob Hubbard wrote:
> I'm very happy to see multi-character operators introduced. (Does Q
> also allow Unicode operators?)

Yes, Unicode all the way through. :) Just as you can have arbitrary Unicode letters in identifiers, you can have arbitrary Unicode punctuation in operator symbols.

(BTW, I'd appreciate it very much if our non-Western-locale users could check that those Russian/Japanese/whatever identifiers and operators still work. For me, the unicode.q example works ok, but I don't have many scripts using non-ASCII characters to test, so please let me know if you find any bugs there. Alexander? Keith? Anyone else?)

> I presume that tokens will be delimited according to which set their
> constituent characters belong to: 'alphanumeric' or 'other' (although
> an 'alphanumeric' token must begin with an alphabetical character, of
> course). Is white-space a special case, i.e. a third class of
> character?
>
> Which characters are in the 'other' set for operators - are any (such
> as quote and parentheses) excluded?

Ok, I've attached a little description of the lexical operator symbol syntax I wrote while working on these things, to be included in the manual later.

> Does this also mean that multi-other-character function names are also
> supported? That is, can I now define a non-operator function called
> '--'? (I suppose that includes the secondary question: would 'other'
> characters count as lower case?)

No, that would make the syntax too confusing IMHO. Function symbols must now be legal identifiers; punctuation is only allowed in operator symbols.

> It seems that now, introducing a new symbol will affect the way that
> code is parsed. This is something I find a little worrying.

You're right. Right now the lexer inspects the symbol table to partition punctuation symbols. I agree that this is a bad idea since it makes the syntax depend on the declared operator symbols. I will fix that right away.
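
To make the difference concrete, here is a minimal Python sketch of the two strategies. It is not Q's actual lexer: the token pattern, the function names `split_by_table` and `lex`, and the rule that every maximal run of non-alphanumeric, non-whitespace characters forms one candidate operator are all assumptions made for illustration. The real rules are the ones in the attached description and may well differ (for example, in how parentheses and quotes are treated).

```python
import re

# Hypothetical token pattern (an assumption, not Q's grammar): a number,
# an identifier, or a maximal run of "other" characters, i.e. anything
# that is neither alphanumeric/underscore nor whitespace.
TOKEN = re.compile(r"\d+|[^\W\d]\w*|[^\w\s]+")


def split_by_table(run, declared):
    """Table-driven splitting: partition a punctuation run by greedy
    longest match against the set of declared operator symbols."""
    out, i = [], 0
    while i < len(run):
        for j in range(len(run), i, -1):
            if run[i:j] in declared:
                out.append(run[i:j])
                i = j
                break
        else:
            raise SyntaxError(f"unknown operator at {run[i:]!r}")
    return out


def lex(src, declared, table_driven):
    """Tokenize src. With table_driven=True the token boundaries depend
    on the declared operators; with table_driven=False each punctuation
    run is one token and must itself be a declared operator."""
    tokens = []
    for tok in TOKEN.findall(src):
        if tok[0].isalnum() or tok[0] == "_":
            tokens.append(tok)              # number or identifier
        elif table_driven:
            tokens.extend(split_by_table(tok, declared))
        elif tok in declared:
            tokens.append(tok)
        else:
            raise SyntaxError(f"undeclared operator {tok!r}")
    return tokens
```

With only `-` declared, `lex("5--3", {"-"}, table_driven=True)` yields `['5', '-', '-', '3']`, but declaring `--` as well changes the result for the same input to `['5', '--', '3']`. With `table_driven=False` the token boundaries never move: `--` is always one token, and it raises `SyntaxError` unless `--` has been declared.
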
Of course this means that 5--3 won't be legal any more (unless you've declared a (--) operator). But I think that this is a minor issue, and anyway the compiler will catch it if you've written anything like that in your scripts.

> Would it be better to have, e.g. '--' always as an atomic token,
> producing a normal form unless '--' is defined? That is, would it be
> better to break backwards compatibility? Or would that be too painful?

I think that, as pointed out above, '--' should actually be an error if you haven't declared it as an operator. Implicit declaration of operators is a bad idea, IMHO. It's much too easy to mistype them. The compiler would then just silently munge almost any arbitrary line noise; it might even happily parse many Perl scripts. ;-)

> Is the protection offered by the module system thought to be enough?

Hmm, I'm not sure what you have in mind here?

> [Can of worms! Sorry.]

No need to feel sorry, I'm glad you opened it! I want to fix all those quirks before release. ;-)

Thanks,
Albert

--
Dr. Albert Gr"af
Dept. of Music-Informatics, University of Mainz, Germany
Email: Dr....@t-..., ag...@mu...
WWW: http://www.musikinformatik.uni-mainz.de/ag