David,

Thanks for your response.

I added your implementation with two minor fixes, tested it,
and checked it in for now. One concern I have is that it adds
a handler call that does a regexp match for every token
generated by semantic-flex! Once the correctness of the python
parser is verified and we get to the optimizing stage, I would
like to see how much run-time overhead this introduces.

What I tried to do before your response was to add a wisent
handler for specific keyword tokens, which I thought might be
more efficient. I'm not sure whether this is a crazy idea
or not. Shows my ignorance, I guess.

>>>>> "DP" == David Ponce <david@...> writes:
DP>
DP> Hi Richard,
DP> [...]
>> The next hurdle is the following unusual syntax:
>>
>> r'some string or regexp'
DP> [...]
DP>
DP> I don't know python at all and I am not sure I clearly understand the
DP> problem. Does it concern handling of "raw-strings" of the form
DP> r"..."?
DP>
DP> Maybe could you provide some code snippets?
DP>
DP> To handle raw-strings I (quickly ;-) wrote the following
DP> `semantic-flex-extensions' that seems to work:
DP>
DP> (defconst wisent-python-raw-string-re "\\<[r]\""
DP>   "Regexp matching python raw string prefix.")
I used a single quote rather than a double quote, because
that is what I see in the /usr/lib/python2.2/*.py files.
(defconst wisent-python-raw-string-re "\\<[r]\'"
  "Regexp matching python raw string prefix.")
DP> (defconst wisent-python-flex-extensions
DP>   (list (cons wisent-python-raw-string-re
DP>               'wisent-python-flex-raw-string))
DP>   "`semantic-flex-extensions' to recognize python raw strings.")
DP>
DP> (defun wisent-python-flex-raw-string ()
DP>   "`semantic-flex' extension to handle python raw-strings.
DP> Return a 'raw-string syntactic token."
DP>   (let* ((b (point))
DP>          (e (condition-case nil
DP>                 (progn
DP>                   (forward-char) ;; skip 'r'
DP>                   (forward-sexp 1)
DP>                   (point))
DP>               ;; This case makes flex
DP>               ;; robust to broken strings.
DP>               (error
DP>                (progn
DP>                  (goto-char
DP>                   (funcall
DP>                    semantic-flex-unterminated-syntax-end-function
DP>                    'string
DP>                    start end))
DP>                  (point))))))
DP>     (cons 'raw-string (cons b e))))
DP>
DP> Then 'raw-string tokens can be easily handled by a specific
DP> `wisent-flex' handler like this (untested):
DP>
DP> ;; %token <raw-string> RAW_STRING_LITERAL
DP> ;; %put raw-string handler 'wisent-python-raw-string-handler
I deleted the single quote. With the quote in place,
(wisent-wy-update-outputfile) hung! Since adding the quote
could be a common mistake, it would have been nice to see an
error message instead of a hang.
%put raw-string handler wisent-python-raw-string-handler
DP> ;;
DP> (defun wisent-python-raw-string-handler ()
DP>   "`wisent-flex' handler of 'raw-string syntactic tokens."
DP>   (let* ((stok (car wisent-flex-istream))
DP>          (wisent-flex-istream (cdr wisent-flex-istream)))
DP>     (cons 'RAW_STRING_LITERAL
DP>           (cons
DP>            ;; Remove r" and " delimiters
DP>            (substring (semantic-flex-text stok) 2 -1)
DP>            (cdr stok)))))
DP>
DP> I think it can be interesting to differentiate normal strings
DP> (their value should be `read' to remove '\') and raw strings (their
DP> value is the text between r" and the final ", and '\' aren't removed).
DP>
DP> I hope I helped ;-)
It sure did.
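
Since you asked for snippets, here is a small Python sketch of
the difference as I understand it. The sample strings are made
up for illustration and are not taken from the library files.
It also shows that Python accepts the r prefix with either
quote character, so the regexp above may eventually need to
match both.

```python
# Made-up sample strings, for illustration only.

# A normal string: the backslash escape is processed, so \n becomes
# one newline character.
normal = 'a\nb'

# A raw string: the backslash and the n are kept as two characters.
raw_single = r'a\nb'
raw_double = r"a\nb"   # same value, written with double quotes

# The characters of a raw-string token as they appear in a source file.
token_text = "r'a\\nb'"

# Mirror of (substring ... 2 -1) in the handler: drop the two-character
# prefix (r plus opening quote) and the closing quote.
value = token_text[2:-1]

print(len(normal), len(raw_single))   # the lengths differ: 3 and 4
print(raw_single == raw_double)       # True
print(value == raw_single)            # True
```

So the value of a raw-string token is just the text between
the delimiters, while a normal string still needs its escapes
interpreted.
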
With the fix, the parser can now parse two more files in the
/usr/lib/python2.2 directory, bringing the total to four:

  BaseHTTPServer.py
  Bastion.py
  CGIHTTPServer.py
  ConfigParser.py

I notice some problems with the next file, Cookie.py, and
will investigate it next. I'm just testing all
/usr/lib/python2.2/*.py files alphabetically to find and fix
the problems that I encounter.

Thanks again for your help.