
Parsing and semantic search across multiple COBOL files

Help
mahesh
2015-01-19
2015-01-20
  • mahesh

    mahesh - 2015-01-19

    Kris,

    The current sample UI in Koopa.java parses one file at a time. I would like to build my own GUI or command-line utility to allow mass parsing and searching across multiple COBOL files in a directory.

    Inputs will be COBOL program directories, COBOL copybook directories, settings for options like koopa.xml.include_positioning, and either global search criteria (e.g. EXEC CICS, EXEC SQL, CALL etc.) or a request to parse and dump ASTs in XML for further processing.

    What is the best way to do this? I essentially need to set up the necessary parameters and parse COBOL programs one by one to get an in-memory AST in XML format. Then I can use an XPath engine like Jaxen, Saxon, VTD-XML etc. to carry out semantic searches on the AST.

    I had used koopa.app.cli.ToXml earlier to get an AST dump in XML format for a single COBOL program, without pre-processing support etc. When I looked at the current version of ToXml.java, it does NOT seem to be in sync with how Koopa has evolved. For example:
    1. koopa.app.Koopa uses the ParsingCoordinator approach, which is not used in ToXml.java.
    2. The usage help in ToXml.java says "GetASTAsXML", indicating that the code is probably stale?

    Do we have an SSCCE for batch parsing of multiple COBOL programs, somewhere?
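
For readers following along, the "XPath over an in-memory XML AST" step described in this post can be sketched with the XPath engine that ships with the JDK (javax.xml.xpath) rather than Jaxen, Saxon or VTD-XML. This is only a minimal sketch: the class name AstXPathSearch and the XML element and attribute names in main are hypothetical stand-ins, not the format Koopa actually emits.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class AstXPathSearch {

    /**
     * Evaluates an XPath expression against an XML document held in memory
     * and returns the trimmed text content of every matching node.
     */
    public static List<String> search(String xml, String expression) {
        try {
            XPath xpath = XPathFactory.newInstance().newXPath();
            NodeList nodes = (NodeList) xpath.evaluate(
                    expression,
                    new InputSource(new StringReader(xml)),
                    XPathConstants.NODESET);
            List<String> matches = new ArrayList<>();
            for (int i = 0; i < nodes.getLength(); i++) {
                matches.add(nodes.item(i).getTextContent().trim());
            }
            return matches;
        } catch (XPathExpressionException e) {
            // Covers both a malformed expression and unparsable XML input.
            throw new IllegalArgumentException("XPath evaluation failed: " + expression, e);
        }
    }

    public static void main(String[] args) {
        // Hypothetical AST shape, for illustration only (not Koopa's real output).
        String xml = "<program>"
                + "<statement kind=\"call\">CALL 'SUBPROG'</statement>"
                + "<statement kind=\"display\">DISPLAY 'HI'</statement>"
                + "</program>";
        for (String match : search(xml, "//statement[@kind='call']")) {
            System.out.println(match);
        }
    }
}
```

In a batch tool, the xml argument would be the AST dump produced for each program, and the expression would come from the global search criteria.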

     
  • KrisDS

    KrisDS - 2015-01-19

    You were right, the ToXml class was somewhat out of date. I fixed that just now.

    As for doing that in bulk, the best I can point you to for now is the Cobol85RegressionTest class. But it just boils down to searching the file system for Cobol source files and then asking the ParsingCoordinator to parse them.

    If that's not sufficient I can always take some time and set up an SSCCE for you.

    Kris
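
The "search the file system for Cobol source files" part of this reply might look roughly like the sketch below, using only java.nio.file. The class name CobolFileFinder and the extension list (.cbl, .cob) are assumptions made for illustration; the parsing step is left as a placeholder because the exact ParsingCoordinator calls are not shown in this thread.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;
import java.util.Locale;
import java.util.Set;
import java.util.stream.Collectors;
import java.util.stream.Stream;

public class CobolFileFinder {

    // Assumed source extensions; adjust to whatever the code base uses.
    private static final Set<String> EXTENSIONS = Set.of(".cbl", ".cob");

    /** Returns true if the file name ends in one of the assumed COBOL extensions (case-insensitive). */
    public static boolean isCobolSource(Path file) {
        String name = file.getFileName().toString().toLowerCase(Locale.ROOT);
        return EXTENSIONS.stream().anyMatch(name::endsWith);
    }

    /** Recursively collects all regular files under root that look like COBOL sources, in sorted order. */
    public static List<Path> find(Path root) {
        try (Stream<Path> paths = Files.walk(root)) {
            return paths.filter(Files::isRegularFile)
                    .filter(CobolFileFinder::isCobolSource)
                    .sorted()
                    .collect(Collectors.toList());
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        Path root = Path.of(args.length > 0 ? args[0] : ".");
        for (Path file : find(root)) {
            // Placeholder: hand each file to the parser here.
            System.out.println(file);
        }
    }
}
```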

     
  • KrisDS

    KrisDS - 2015-01-19

    So I quickly extended ToXml with a batch mode. Instead of an input and output file, you can now also specify an input and output folder. It will search the input folder for Cobol files, parse each one, convert it to XML, and store that XML in a matching location in the output folder.

    If you combine that with code from the Jaxen example you can set something up which will do bulk XPath querying. Or you can take the AST of each file and do whatever you want.

    Hope that helped.

    Kris
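
The "matching location in the output folder" idea from this reply can be illustrated with plain java.nio.file path arithmetic. This is a sketch of one possible mapping, not the code in ToXml: the class name OutputMirror is hypothetical, and replacing the source extension with .xml is an assumption.

```java
import java.nio.file.Path;

public class OutputMirror {

    /**
     * Maps a source file under inputRoot to the mirrored location under
     * outputRoot, swapping the file extension for ".xml".
     */
    public static Path targetFor(Path inputRoot, Path source, Path outputRoot) {
        // Path of the source relative to the input folder, e.g. batch/PAY.cbl
        Path relative = inputRoot.relativize(source);
        String name = relative.getFileName().toString();
        int dot = name.lastIndexOf('.');
        String base = dot > 0 ? name.substring(0, dot) : name;
        // Same sub-folders under the output folder, new file name.
        return outputRoot.resolve(relative).resolveSibling(base + ".xml");
    }

    public static void main(String[] args) {
        Path target = targetFor(
                Path.of("in"),
                Path.of("in", "batch", "PAY.cbl"),
                Path.of("out"));
        System.out.println(target);
    }
}
```

Before writing to the computed target, the parent folders would still need to be created, for example with Files.createDirectories(target.getParent()).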

     
  • mahesh

    mahesh - 2015-01-20

    Thanks a lot Kris, you are really lightning fast!

    Regards,
    Mahesh

     
