Thread: Re: [cedet-semantic] JavaScript support
From: Eric M. L. <er...@si...> - 2011-03-21 12:06:58
On 03/21/2011 04:29 AM, Mihai Călin Bazon wrote:
> Hi folks,
>
> I've spent my weekend with CEDET and must say it's amazing; if only I'd
> understand it better. :-) My goal was to add proper support for JavaScript
> (sorry but the existing parser doesn't cut it for real world code). I've
> started it from scratch, to better understand how to write parsers, but I
> didn't get far.

It's always nice to have some new folks trying things out.

> The Semantic/Wisent manuals are quite good, yet I've had trouble getting
> started and doing simple things. I think a step-by-step HOWTO on adding
> support for a simple language (with nested structures) would be very
> welcome!

That's a good idea. There are some skeleton files around, but I don't
think they go too deep into anything like that.

> So anyway, I'm attaching my (highly incomplete) work so far and hope for
> some advice on how to continue. Questions:

I will attempt to answer given the brief amount of time I have this AM.

> - I don't seem to be able to parse more than one statement. Presumably
> because the return value of the `statement' rule is wrong. Generally I
> couldn't figure out how to return the proper values.

Each nonterminal you define with a %start pragma should return one
production. The entire grammar you create is called iteratively, and the
automatic value passing of the wisent parser generator framework is set
up for this. The iterative nature makes error recovery very simple. The
grammar just "fails", and the upper level iterative parser skips over
the bad semantics and moves on. Thus, it returns only one thing because
that is all it can do.

If you change statlist to only return statement, then after it finds the
first statement, it should get called a second time and return the
second statement, and the parser framework will keep track of it all for
you.
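As a rough illustration of the single-result style described above, here
is a sketch of a start nonterminal that matches exactly one statement and
returns one tag, leaving the repetition to the framework. This is not
taken from the attached grammar: the nonterminal name and the token names
(VAR, FUNCTION, SYMBOL, SEMICOLON) are hypothetical, and the %token and
%keyword declarations they need are assumed to exist elsewhere in the
file.

```
;; Sketch only. `statement' and the token names VAR, FUNCTION, SYMBOL
;; and SEMICOLON are hypothetical; their declarations are omitted.
%start statement

statement
  : VAR SYMBOL SEMICOLON
    ;; one variable tag: name, no type, no default value
    (VARIABLE-TAG $2 nil nil)
  | FUNCTION SYMBOL PAREN_BLOCK BRACE_BLOCK
    ;; one function tag: name, no type, argument list left empty here
    (FUNCTION-TAG $2 nil nil)
  ;
```

There is deliberately no recursive `statlist : statlist statement` rule
in this sketch; the idea is that the outer iterative parser invokes
`statement` again for each statement that follows.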
> - I tried to use PAREN_BLOCK and the `iterative style' to parse variable
> declarations, did that exactly as in other existing parsers and as
> documented, but it wouldn't work... I know "it doesn't work" is not good
> information, but that's all I can say, the blocks simply didn't parse. So
> I switched to the recursive style and collecting (EXPANDTAG (VARIABLE-TAG
> ...)). (btw, perhaps something similar should go in `statlist'?).

EXPANDFULL will use the same iterative nature as I describe above inside
a parent block. The nonterminal symbol passed to EXPANDFULL should have
rules about (, ), and some variable declaration. If you use EXPANDTAG,
you need to create your own rule that parses ( varlist ), and the varlist
will need to cons all the found variables together itself. It is much
easier to use EXPANDFULL, as it handles bad syntax easily.

> - That seems to parse: a function declaration with an argument list:
>
> function foo(a, b) {
> }
>
> (semantic-fetch-tags) returns the function tag and the variables are
> there. However, if I complicate that a bit:
>
> function foo(a, b) {
>     function bar() {
>     }
> }
>
> only the outer function is returned. Inner functions are ubiquitous in
> JS and they need to be parsed correctly to provide useful functionality
> (BTW, the existing JS parser distributed with Wisent fails here too).

The semantic lexer skips over { } and ( ) blocks and does not go into
them unless a rule action explicitly calls EXPANDTAG or EXPANDFULL on the
value returned from the PAREN_BLOCK. In your nonterminal for a function,
the BRACE_BLOCK part of the rule will need to be passed into EXPANDFULL,
which will iteratively parse your function body looking for more
functions. Code will show up as bad syntax unless you write rules for all
that too.

> - what is EXPANDTAG? and is it related to the value of
> semantic-tag-expand-function?
> (I just copied the expander from the existing JS mode for now, but I'd
> like to understand why is it useful, what argument it receives and what
> it should return). Didn't put too much time into this yet, but from the
> docs I'm not clear.

EXPANDTAG and EXPANDFULL let you look inside some _BLOCK with a new
nonterminal start. For each rule you pass to EXPAND* you need to add a
%start pragma. The output of EXPAND* will be (presumably) some tag or tag
list. EXPANDFULL will return a tag list and handle expanding and the data
needed for "cooking" the tags so they are bound into the buffer with
overlays.

In your support file, you will need to write an overload of
semantic-tag-components if you do anything besides function arguments or
type members. For function args and type members, you just need to put
the tag lists into the correct tag attributes.

> - generally, how do you debug a grammar?

You can debug a rule in wisent, but not how the wisent grammar parses.
The grammar debugger was never ported to wisent. :(

> * * *
>
> I have in mind a few things for now:
>
> - be able to detect the local variables around the cursor. For example if I
> place the cursor on a variable, it should highlight the occurrences of
> that name in the enclosing scope. I already did something like this for
> js2-mode [1], but I'd like to get rid of that setup.

If you visit http://cedet.sourceforge.net/addlang.shtml step 4 is about
context parsing.

> - having done the above, it should be easy to provide some keybindings to
> quickly move through such occurrences, and a keybinding to rename the
> variable (again, my js2-mode setup supports that).

semantic-symref output (using idutils, GNU Global, or other) has features
like that which may "just work" for you.

> - use the knowledge from the parser to indent var properly:
>
> var foo = 1,
>     bar = 2;

Many folks have wanted to do this, but as far as I know, no one has built
a framework for it.
> Then of course I know that much functionality would come for free from
> existing Semantic applications.
>
> JavaScript is quickly taking over the world (it's the most popular language
> on GitHub right now) and it's a pity not to support it properly. I have
> some previous knowledge on parsing JavaScript [2] and I use Emacs for 12
> years now; though I'm not very skilled at Elisp, I do Common Lisp at my day
> job and have good knowledge of it. I'm willing to invest the time to write
> this parser for Semantic, just need some help! :-)
>
> Thanks in advance!
> -Mihai
>
> [1] http://mihai.bazon.net/projects/editing-javascript-with-emacs-js2-mode/js2-highlight-vars-mode
> [2] https://github.com/mishoo/UglifyJS
>
> PS: By the way, don't you guys consider switching to GitHub? SourceForge
> is... uhm... better not say it.

I've been too lazy to look into reasons to move anything. Right now we're
just trying to get a good method for keeping synchronized with Emacs. The
tactic is good, but it takes a lot of time.

Eric