File               Date        Author             Commit
grammar_rule_gen   2014-04-19  Danielfar          [650016] Commit 2
parser_example     2014-04-19  Danielfar          [650016] Commit 2
php_arrays         2014-04-19  Danielfar          [650016] Commit 2
ric_sql            2014-04-19  Danielfar          [650016] Commit 2
src                2014-04-19  Danielfar          [650016] Commit 2
.gitattributes     2014-04-11  Danielfar          [7a4a1f] Commit 1
.gitignore         2014-04-11  Danielfar          [7a4a1f] Commit 1
LICENSE            2014-04-19  Danielfar          [650016] Commit 2
README             2014-04-19  Farinotti Daniele  [2030f1] Update README

Read Me

Cheope-php-parser-generator
===========================

RECURSIVE DESCENT PARSER GENERATOR FOR PHP.

The parser generator is in the 'grammar_rule_gen' directory.
The main module is 'parser_gen_exec.php'.

Usage:
php parser_gen_exec.php |prjName| -l |logFileName|

where |prjName| is the project name; it must match the name of the
grammar definition file, |prjName|.xml.

Example:
php parser_gen_exec.php example_1 -l log.txt
or (the -l option is optional)
php parser_gen_exec.php example_1

The module can also be called from a browser. The default log file name
is 'log.txt'.
When calling it from a browser, set the project name by editing the
'grammar_rules_gen_def.php' file and changing the PRJ_NAME constant to the
right project name.

The grammar definition file can contain any number of grammars in the
'grammar_rules' section.
Each grammar must have a 'grammar_rule' main section with a non-empty 'name'
attribute.
The main section contains 3 sub-sections:
'tokens_def', which contains the tokens' regular expression definitions ('token' tags);
'tokens_attributes', which contains the tokens' attribute definitions ('Attr' tags);
'productions', which contains all the productions of the grammar ('production' tags).

A production has a very simple syntax: it is composed of
terminals (in uppercase), nonterminals (in lowercase), the equals sign (=),
which separates the left and right sides of the production,
and the logical OR symbol (|).
The terminal 'epsilon' is the terminal that is always true.
The terminal 'ws' stands for white space; its exact definition (its regular
expression) must be included in the tokens definition section.
The code for the nonterminal SPACE is generated automatically, and its production
(implicitly present) is SPACE = ws | epsilon. The nonterminal SPACE can be used
freely in the other productions.
The right side is typically composed of sequences of terminals and nonterminals
separated by logical OR symbols; within each sequence the symbols are combined
by logical AND.
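
As an illustration of how "OR of AND sequences" maps onto recursive descent,
here is a minimal hand-written sketch. It is not the code that the generator
produces; the class name ToyParser, the toy grammar and the token patterns are
all invented for this example. Only 'epsilon', 'ws' and SPACE follow the
definitions above.

```php
<?php
// Hypothetical illustration only: NOT the code generated by Cheope.
// Toy grammar, written in the notation described above:
//   list  = NUM SPACE COMMA SPACE list | NUM
//   SPACE = ws | epsilon

class ToyParser
{
    private $text;
    private $pos = 0;

    // Terminal definitions (regular expressions); names and patterns are invented.
    private $tokens = array(
        'NUM'   => '[0-9]+',
        'COMMA' => ',',
        'ws'    => '[ \t\r\n]+',
    );

    public function __construct($text)
    {
        $this->text = $text;
    }

    // Match one terminal at the current position and advance on success.
    private function terminal($type)
    {
        if ($type === 'epsilon') {
            return true; // always true, consumes nothing
        }
        // \G anchors the match at the offset passed to preg_match().
        $regex = '/\G(?:' . $this->tokens[$type] . ')/';
        if (preg_match($regex, $this->text, $m, 0, $this->pos) === 1) {
            $this->pos += strlen($m[0]);
            return true;
        }
        return false;
    }

    // SPACE = ws | epsilon
    private function space()
    {
        return $this->terminal('ws') || $this->terminal('epsilon');
    }

    // list = NUM SPACE COMMA SPACE list | NUM
    private function listRule()
    {
        $start = $this->pos;
        // First alternative: every symbol must match, in order (logical AND).
        if ($this->terminal('NUM') && $this->space() && $this->terminal('COMMA')
            && $this->space() && $this->listRule()) {
            return true;
        }
        // Backtrack, then try the next alternative (logical OR).
        $this->pos = $start;
        return $this->terminal('NUM');
    }

    // True only if the whole text is derived from 'list'.
    public function parse()
    {
        $this->pos = 0;
        return $this->listRule() && $this->pos === strlen($this->text);
    }
}
```

With this sketch, parsing the text '1, 2,3' succeeds, while '1,' and '1 2'
are rejected because part of the input is left unconsumed.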

Each token is defined by a 'token' XML tag that has two mandatory attributes:
'type' and 'val'.
The precedence of the token definitions is important.
See 'example_1.xml' for an example of a grammar definition file.
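
To see why precedence can matter, consider the sketch below. It assumes that
token definitions are tried in the order in which they are written and that the
first match wins; the README does not spell this out, and the real lexer may
resolve precedence differently. The function firstMatchingToken and the IF/ID
tokens are invented for the example; only the 'type' and 'val' names come from
the description above.

```php
<?php
// Hypothetical first-match tokenizer; NOT Cheope's lexer.
// Each definition mirrors the two mandatory attributes: 'type' and 'val'.
function firstMatchingToken(array $tokenDefs, $text, $pos = 0)
{
    foreach ($tokenDefs as $def) {
        // \G anchors the match at the offset passed to preg_match().
        $regex = '/\G(?:' . $def['val'] . ')/';
        if (preg_match($regex, $text, $m, 0, $pos) === 1) {
            return array('type' => $def['type'], 'text' => $m[0]);
        }
    }
    return null; // no definition matches at this position
}

// Same two definitions, in two different orders.
$keywordFirst = array(
    array('type' => 'IF', 'val' => 'if\b'),
    array('type' => 'ID', 'val' => '[a-z]+'),
);
$identifierFirst = array_reverse($keywordFirst);

$a = firstMatchingToken($keywordFirst, 'if');    // classified as IF
$b = firstMatchingToken($identifierFirst, 'if'); // classified as ID
```

Under that assumption, the more specific definition has to come before the
more general one, otherwise the general one hides it.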

Attributes are optional, but the 'tokens_attributes' section must be present.
See the 'ric_sql' directory and the files 'ric_sql_def.php' and
'Ric_sql_parser_grammar_rules.php' for an example of how to use them.
 
The generator produces two other modules: |prjName|_def.php and
|prjName|_parser_grammar_rules.php.
The first contains constants and the definitions of the lexical rule and
grammar rule object instances. The second contains the recursive parser
engine.
The output is appended to these files on every run, so before creating a new
parser you have to flush the old files.
The generated files must then be copied to the destination directory.
The 'parser_example' directory can serve as an example destination; it
contains the other parser files.
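
Because the generator adds to its output files on every call, stale files have
to be cleared before a new run. A minimal sketch of one way to do that follows.
The helper removeGeneratedFiles is hypothetical and not part of the project; it
assumes that deleting the two generated files is an acceptable way to flush
them, and that you pass the directory where the generator wrote them.

```php
<?php
// Hypothetical helper, NOT part of Cheope: deletes the two generated files
// of a project so that the next generator run starts from scratch.
function removeGeneratedFiles($prjName, $dir)
{
    $removed = array();
    $names = array(
        $prjName . '_def.php',
        $prjName . '_parser_grammar_rules.php',
    );
    foreach ($names as $name) {
        $path = rtrim($dir, '/\\') . DIRECTORY_SEPARATOR . $name;
        // Skip files that do not exist; record the ones actually deleted.
        if (is_file($path) && unlink($path)) {
            $removed[] = $path;
        }
    }
    return $removed;
}
```

For example, removeGeneratedFiles('example_1', '.') would clear the output of
the 'example_1' project in the current directory before the generator is run
again.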
The generic parser engine is in the 'Parser.php' file, which contains the main
PHP Parser class.
The entry point of this class is the public method 'exec', which tries all the
grammars and returns true if the text to be parsed satisfies at least one of them.
It fills the '$results' array with the execution status of each grammar.
See the comments in the 'Parser.php' file.
 
In the 'parser_example' directory there is the 'parser_exec.php' module, which
calls the parser engine.

Usage:
php parser_exec.php |textToBeParsedFileName|

where |textToBeParsedFileName| is the file that contains the text to be parsed.

Example:
php parser_exec.php example.txt

The 'parser_exec.php' module acts as a validator and returns 'Ok.' if it succeeds,
or the current error if it does not.
In addition, when it is called from a browser, it displays the symbol table.

The applications 'php_arrays' and 'ric_sql' are under development.

This application was developed with PHP version 5.5.7, but I expect it to run
with many earlier versions, since it uses no particularly advanced techniques
apart from classes.