Thanks for your response.
I added your implementation with two minor fixes, tested it,
and checked it in for now. One concern I have with your
implementation is that it adds a handler call which does a
regexp match for every token generated by semantic-flex!
Once the correctness of the python parser is verified and we
get to the optimizing stage, then I would like to see how
much run-time overhead this introduces.
What I tried to do before your response was to add a wisent
handler for specific keyword tokens, which I thought might be
more efficient. I'm not sure whether this is a crazy idea
or not. Shows my ignorance, I guess.
>>>>> "DP" == David Ponce <david@...> writes:
DP> Hi Richard,
>> The next hurdle is the following unusual syntax:
>> r'some string or regexp'
DP> I don't know python at all and I am not sure I clearly understand the
DP> problem. Does it concern handling of "raw-strings" of the form
DP> Maybe could you provide some code snippets?
DP> To handle raw-strings I (quickly ;-) wrote the following
DP> `semantic-flex-extensions' that seems to work:
DP> (defconst wisent-python-raw-string-re "\\<[r]\""
DP> "Regexp matching python raw string prefix.")
I used a single quote rather than double quote, because that
is what I see in /usr/lib/python2.2/*.py files.
(defconst wisent-python-raw-string-re "\\<[r]\'"
"Regexp matching python raw string prefix.")
DP> (defconst wisent-python-flex-extensions
DP> (list (cons wisent-python-raw-string-re
DP> "`semantic-flex-extensions' to recognize python raw strings.")
DP> (defun wisent-python-flex-raw-string ()
DP> "`semantic-flex' extension to handle python raw-strings.
DP> Return a 'raw-string syntactic token."
DP> (let* ((b (point))
DP> (e (condition-case nil
DP> (forward-char) ;; skip 'r'
DP> (forward-sexp 1)
DP> ;; This case makes flex
DP> ;; robust to broken strings.
DP> start end))
DP> (cons 'raw-string (cons b e))))
DP> Then 'raw-string tokens can be easily handled by a specific
DP> `wisent-flex' handler like this (untested):
DP> ;; %token <raw-string> RAW_STRING_LITERAL
DP> ;; %put raw-string handler 'wisent-python-raw-string-handler
I deleted the single quote. With the quote,
(wisent-wy-update-outputfile) hung! It would have been nice
to see an error message rather than a hang, given that
adding the single quote could be a common mistake.
%put raw-string handler wisent-python-raw-string-handler
DP> (defun wisent-python-raw-string-handler ()
DP> "`wisent-flex' handler of 'raw-string syntactic tokens."
DP> (let* ((stok (car wisent-flex-istream))
DP> (wisent-flex-istream (cdr wisent-flex-istream)))
DP> (cons 'RAW_STRING_LITERAL
DP> ;; Remove r" and " delimiters
DP> (substring (semantic-flex-text stok) 2 -1)
DP> (cdr stok)))))
DP> I think it can be interesting to differentiate normal strings
DP> (their value should be `read' to remove '\') and raw strings (their
DP> value is the text between r" and the final ", and '\' aren't removed).
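The distinction between normal and raw strings can be checked
with a small Python sketch. It is only an illustration of the
language behaviour, not part of the parser; the names `normal`,
`raw`, and `describe` are made up for the example.

```python
# Normal literal: the escape sequence \t is processed into one TAB character.
normal = 'a\tb'

# Raw literal: the backslash and the 't' are kept as two separate characters.
raw = r'a\tb'


def describe(s):
    """Return the length of S and its characters as a list."""
    return len(s), list(s)


normal_info = describe(normal)
raw_info = describe(raw)
print(normal_info)
print(raw_info)

# Python accepts either quote character and an upper-case prefix as well,
# and all of these spell the same four-character value.
same = (r'a\tb' == r"a\tb" == R'a\tb' == 'a\\tb')
print(same)
```

So a raw-string token value can be taken verbatim from the text
between the delimiters, while a normal string value still needs
its escapes processed. Since the r"..." and R'...' forms are
also legal, the prefix regexp may eventually need to cover them.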
DP> I hope I helped ;-)
It sure did.
With the fix, the parser is able to parse two more files in
the /usr/lib/python2.2 directory, bringing the total to four:
I notice some problems with the next file Cookie.py.
I'll investigate it next.
I'm just testing all /usr/lib/python2.2/*.py files
alphabetically to find and fix the problems that I run into.
Thanks again for your help.