[SrcML] CFG support
Status: Beta
Brought to you by:
crashchaos
From: Frank R. <fra...@in...> - 2005-03-15 15:07:46
|
As we will be spreading development to hopefully several students in about a month I decided to move our discussions and progress reports to this list for later reference. This list will also be used to coordinate design decisions and for general exchange of ideas. I'm currently working on a Control Flow Graph (CFG) support for the API and it's coming along quite nice. There will be three levels of detail: full, expr and block. Full detail might only be suitable in case we add an actual compiler backend, as I don't see many other reasons to dissect the control flow within an addition or similar expressions. The expression detail level might become the most used version as it creates a new node for every expression, but abstracts from the details of the expression. For even further abstraction the block level will combine several expressions following each other without any branching into one node. The first application which will make use of these CFGs will probably be the source code presenter. It will be a great aid in teaching programming when you can easily get a visualization of the control flow for a section of your program (f.ex. to talk about the possible flows in a for-loop including break, continue or return statements). However there is no guarantee that the CFG support will be fully available at the start of the semester. As for the presentation of the CFG this will not be part of the API. The API will only support creation of the graph and will provide several helper methods in order to work with the created graph. Right now a proof of concept visualization is done using a jython script which outputs a dot-compatible graph representation. Using the dot tool we can get a simple .ps or what might proof to be more useful in the future a .svg file for the graph. Using the ViewPlatform we can get a highly customizable visualization of the graph, as it allows us to easily render the expression a node is representing as plain source code. An example of a graph visualized like this can be found in the attached .ps file. Now for the remaining problems: - We need a more formal way of defining what exactly will be a new node in the graph and which parts of the DOM tree can share a node (actually three definitions for the different levels are needed). - The CFG can easily be determined for a method on its own, but what about a whole class? The current implementation is only dissecting the class into its methods and creates a forest of rooted graphs for all those methods. We could take a closer look at a main method though and create an actual flow with links between those methods (also see the next point for this). - How are method calls treated within an expression? Obviously this is a form of branching, however for the block and expression level detail the recursion would normally just treat the whole expression as one node. Of course that expression may consist of any number of method calls. Will it be enough to just find those calls and add edges to the respective methods' CFG trees (and back again)? - how can we create useful unit tests for the CFG creation? I can only think of preselected sample source files with known should-be CFGs with which to compare the results. That would at least catch some problems when the CFG creation process is modified and returning different results. Please feel free to join in on the discussion. Especially comments and ideas on the open problems are very much appreciated. -- Raiser, Frank Student @ University of Ulm (www.uni-ulm.de) The difficult is what takes a little time; the impossible is what takes a little longer. (Fridtjof Nansen) |