From: Bill Y <ws...@me...> - 2007-08-26 22:28:28
From: "Gerrit E.G. Hobbelt" <Ger...@be...>

> This behaviour (5) is due to the 'cooperation' between the preprocessor,
> which is 'winging it' in regard to escaped chars, and the subtleties in
> the compiler/line parser, which are, ahem, alleviated by the current
> behaviour of zexpand(). Funny thing is that, given the working (4) above,
> THIS does not produce the expected 'ab':
>
>   (7) ./crm114 -e '-{ output /a/ output /b/ }'

Of course. You left out a semicolon; only the first slash-param should be
seen. Don't think of it as "regex slashes"; think of it as "indirect
object" slashes. And yes, maybe that wasn't such a great idea. Slashes get
used far more often for other things than they deserve to be (among them,
in directory names). Ugh... It's far uglier than I had originally expected.

> ...while this is completely silent, due to the 'output' being consumed as
> a 'noop' argument section:
>
>   (8) ./crm114 -e '-{ noop output /b/ }'
>
> which should have been written like this to be at least somewhere near
> legal -- but then I'd expect an error report for (8) _instead_ (which, of
> course, doesn't happen):
>
>   (9) ./crm114 -e '-{ noop ; output /b/ }'

Correct. The right fix is to change the JIT so that the 2nd-through-Nth
un-declined (no quotes) command on a line is an auto-error.

> ... And all this, just because I'm trying to fix that asymmetric
> escaping. :-)

It gets worse yet. The other quotes (<>, [], (), etc.) are all
antisymmetric: the close quote differs from the open quote. Slashes (and
_only_ slashes) are different. Maybe it should have been slash to start,
and backslash to end?

------

> Since those :....: constructs can be nested MAX_EVAL_ITERATIONS times (at
> least in 'eval'), I'm considering changing both the preprocessor and the
> line parser into 'proper' stack-based lexers instead of adding extra muck
> to the existing multivar code.

Good luck, but it's far more difficult than you'd expect, because there's
no fixed order among the <>, (), [], //, etc. I tried a yacc-style grammar
once before, and gave up.
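A minimal sketch of why the paired quotes are easier than slashes (toy
Python, NOT CRM114's actual lexer): antisymmetric pairs nest cleanly with
a stack, because the scanner can push the close character it will need,
while a symmetric delimiter like / can only toggle a flag, so nothing
inside a /.../ region can nest.

```python
# Toy illustration, NOT CRM114's actual lexer: a stack-based scanner.
# Paired delimiters (<>, [], (), {}) nest because the required close
# character can be pushed on a stack; a symmetric delimiter like /
# can only toggle a boolean, so /.../ regions cannot nest.
PAIRS = {'<': '>', '[': ']', '(': ')', '{': '}'}
CLOSES = set(PAIRS.values())

def balanced(text):
    """True if paired delimiters nest correctly; text inside a
    /.../ region is treated as literal (slashes swallow brackets)."""
    stack = []
    in_slash = False
    for ch in text:
        if ch == '/':
            in_slash = not in_slash      # open and close look identical
        elif in_slash:
            continue                     # everything literal inside /.../
        elif ch in PAIRS:
            stack.append(PAIRS[ch])      # remember which close we need
        elif ch in CLOSES:
            if not stack or ch != stack.pop():
                return False             # wrong or unmatched close
    return not stack and not in_slash

# Brackets nest; a stray ']' inside /.../ is harmless:
print(balanced("-{ output /a/ ; output /b/ }"))   # True
print(balanced("[ /]/ ]"))                        # True
print(balanced("(<)>"))                           # False: crossed pairs
```

The stack recovers structure for the asymmetric pairs; the symmetric slash
discards it, which is exactly why the preprocessor ends up 'winging it' on
escapes.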
It's really nontrivial. PEGs (Bryan Ford's methodology) are actually much
more powerful, and I got a little way into writing a tool to use them
before the latest flurry of classifier creation (SVM, SKS, FSCM, NN) put
that on the back burner.

> My current code can handle (2) through (6) and has somewhat stricter
> language checking, though anything like (3) still gets through, while
> things like (7) are now corrected on the fly (instead of silently
> skipping the second 'output') -- but it barfs on (1), which is a bug in
> my own code.

So far, so good.

> Updates regarding this will be posted in the dev ML. ETA: somewhere
> around next weekend, probably.

I went ahead and refactored the bit-entropy code according to what Ger
found out about "finding the best alternative"; the result is a halving of
the error rate in the final 10% of the corpus, up to 99.81% accuracy. It's
still not as good as an SSTTR hyperspace, with a final-10%-of-corpus error
rate of _one_ mistake in 9400 documents, but it's getting better.

That, and a better error routine set (one that tells you what C file, line
in file, and function you were in) have now been rsynced up and are in the
wget-set.

Next: Joe Langeway's neural net classifier...

- Bill Yerazunis