From: Jonathan R. <re...@gm...> - 2021-10-17 18:18:07
Jeff, with all due respect, you seem to be missing the key point in my message regarding the Python parser generated by JavaCC 21. *It is already done!* Really, it is. Just try this:

    git clone https://github.com/javacc21/javacc21.git
    cd javacc21/examples/python
    ant

Then you can launch the test harness on one or more files with:

    java PyTest <files or directory root>

If you feed it a single file, it dumps a flat representation of the tree that it built.

Throughout your message, you keep saying: "one would need to do this"... "one would need to do that"... *This is done!* And I'm pretty sure it's directly usable.

Now, it's true that the tree API that it generates out-of-the-box may not be exactly what you would want for Jython, but it is possible to annotate the grammar to generate a somewhat different tree hierarchy and also JavaCC 21 has an INJECT feature allowing you to inject code into the various node types.

So, let's see, I guess I'll intersperse some comments...

On Sat, Oct 16, 2021 at 9:48 PM Jeff Allen <ja...@fa...> wrote:

> Jonathan, Vinay:
>
> thanks for thinking of us. I played with JavaCC a long time ago and
> thought it useful.
>
> I'm working on some fundamental things in the core of Jython 3 at the
> moment. Having (as it were) the engine in parts scattered over the garage
> floor, a compiler and "jump starting" anything, seems a distant gleam. :)
> But obviously we will need a compiler as the whole goes back together.
>
> It is true that maintaining a grammar of Python, as the language evolves,
> is work. In the past, this has meant evolving an ANTLR grammar. Vinay
> would be familiar with PEP 617 I guess. The PSF will maintain a grammar of
> Python directly in the form the PEG parser reads. My working assumption is
> that the compiler for Jython 3 will be generated by the PEP 617 PEG parser,
> adapted to generate Java.
>

Yeah, "will be", future tense.
Again, this Python parser already exists, it generates Java classes that correspond to all the syntactic/lexical elements in the Python language. Given that this is already done and works, wouldn't it make sense to take a good look at it before deciding to do something else yourself?

> Our work then is in the action routines, when the grammar needs new ones.
> In fact there has been some promising work, not mine, in just this
> direction (private correspondence).
>

Well, in terms of converting a Python source file into a tree of Java objects with an API for traversing the tree and all that, what I'm telling you about is *finished work.*

> Of course, there are two layers to this. The parser in Python of the PEG
> language needs to emit a parser for Python in Java.
>

Well, fine, but that's future tense again. The work you're talking about "needing to be done"... *we've done it*.

> And that parser then has to emit AST nodes that are objects in Java, but
> also in Python.
>

I'm getting repetitive now, but we've done it. The parser that is generated from the grammar emits AST nodes that are objects in Java. Well, true, above you say "also in Python". Well, look at Vinay's blog post from a couple of months ago. That side of things is more experimental admittedly, but it is also working. This also is not future tense...

> To use JavaCC21 one presumably has to transform the PSF's grammar to LL(k)
> in JavaCC21 notation.
>

This has already been done. Here it is:

https://github.com/javacc21/javacc21/blob/master/examples/python/Python.javacc

> I know one can do this, removing left recursion at the expense of
> additional rules. But it might be worse. The PEG parser allows unbounded
> backtracking. One motivation for PEG is to be free of the constraints the
> previous relaxed LL(1) parser imposes. Do you not think that in the future,
> the grammar of Python will develop in ways that *only* the PEG parser can
> deal with, and no LL(k) parser could?
>

No, I don't think that is a very real concern at all. The likelihood that there will be new features in the Python language coming along that we cannot handle is essentially zero. It is much more likely that I will step in front of a bus. Or just lose interest, I suppose.

I don't want to get into a whole conversation about the capabilities of PEG vs. JavaCC21. I think that overall JavaCC 21 is more elegant and powerful, since it has unlimited syntactic lookahead. Generally speaking, I think that any grammar written with JavaCC 21 will tend to be more elegant and maintainable than the equivalent one based on the PEG parsing. Now, that could be a question of taste. However, you can just take the taste test. Compare the JavaCC21 Python grammar I link above with the one here:

https://docs.python.org/3/reference/grammar.html

But again, I don't really know how to put this any more delicately. All of your discourse so far is in the future or conditional maybe. We've done the work. This is an up-to-date Python parser that generates a tree of Java objects. And all that is in the NOW mode. This exists and works. If you think there is some great benefit to replicating this work, by all means, I guess, but maybe I didn't express myself clearly enough. All the stuff you are talking about wanting to do at some point in the future... well, we did it, it's done, and it's directly usable.

I guess I'm repeating myself, but I have to get out the door to meet somebody for dinner and I wanted to send this message.

Regards,

Jon Revusky
lead developer, JavaCC 21 project

> Jeff Allen
>
> On 16/10/2021 14:19, Jonathan Revusky wrote:
>
> ...
> First of all, JavaCC 21 <https://javacc.com/> is a continuation of
> development of the venerable JavaCC parser generator released by Sun
> Microsystems back in the 90's. One could call it a "fork" but in reality,
> at this point, it is basically a complete rewrite.
> For the last few months,
> my main collaborator on the project has been Vinay Sajip, who is a
> committer on the CPython project. His main focus has been on reworking the
> JavaCC 21 codebase to be able to support generating parsers in Python -- a
> PythonCC if you will. And, actually, that subproject is in quite an
> advanced state and Vinay blogged about it here:
>
> https://blog.red-dove.com/posts/parsing-in-python/
>
> ...
>
>
> Or, in other words, you guys could just rely on us to handle the parsing
> side and you can concentrate on the actual run-time internals and bridging
> the Java and Python object models basically.