conpar-general Mailing List for ContextParser

Hi all,

Here are my thoughts on what parsing is and how to parse.

There are three levels of parsing:
1. Tokenization
2. Syntactic parsing
3. Conversion, or syntactic action

What we actually have are the Parsing Tree (PT), the Parsing Sequence 
(PS) and the Parsing Machine (PM).

Tokenization is the process of mapping the PS at a given position to a 
token. In general, the final result of tokenization is a bit mask for 
each token, with 1 where the token is present at a given position and 0 
otherwise.
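
To make the bit mask idea concrete, here is a minimal sketch in Python. 
The token names, the regular expressions and the function token_masks 
are assumptions made up for illustration, not part of ContextParser. It 
assumes each token is described by a regular expression and builds, for 
each token, a list with 1 at every position of the PS where that token 
starts and 0 elsewhere.

```python
import re

# Hypothetical token set, for illustration only.
TOKENS = {
    "NUMBER": re.compile(r"\d+"),
    "IDENT": re.compile(r"[A-Za-z_]\w*"),
    "PLUS": re.compile(r"\+"),
}

def token_masks(ps):
    """Return, for each token, a 0/1 mask over the positions of `ps`.

    masks[name][pos] is 1 when the token `name` matches starting at `pos`.
    """
    masks = {}
    for name, pattern in TOKENS.items():
        masks[name] = [
            1 if pattern.match(ps, pos) else 0
            for pos in range(len(ps))
        ]
    return masks

if __name__ == "__main__":
    for name, mask in token_masks("a1+2").items():
        print(name, mask)
```

Several tokens may have a 1 at the same position. Choosing between them 
is left to the syntactic parsing level.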

Syntactic parsing is the process of choosing a sequence of tokens that 
fits both the PS and the PT.

A syntactic action is an action external to parsing which could change 
the following:
1. The PT
2. The PS
3. The context for the syntactic action

The goal is dynamic parsing with an efficiency determined by the task. 
Efficiency is mainly affected by changes to the parsing tree and the 
context made in the syntactic action, while tokenization and syntactic 
parsing themselves can be fast enough.

There are two basic methods of parsing: breadth-first and depth-first. 
In the first method we look for all possibilities and keep them in 
memory, while in the second we search sequentially through all possible 
variants, keeping backtracking information in memory. By going through 
the same parsing rules (PR) again and again, depth-first loses 
efficiency in time, while breadth-first (actually I mean bottom-up) 
examines cases which are not even of interest, so again we lose time and 
memory.
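
The difference between the two methods can be sketched on a toy problem: 
splitting a PS into tokens from a fixed set. The token set and both 
function names below are hypothetical, and the breadth-first function 
only illustrates keeping every alternative alive at once. It is not a 
real bottom-up parser.

```python
# Hypothetical token set, for illustration only.
TOKENS = ["a", "ab", "b", "bc", "c"]

def depth_first(ps, pos=0):
    """Try one alternative at a time and backtrack on failure.

    Returns the first token sequence covering ps[pos:], or None.
    """
    if pos == len(ps):
        return []
    for tok in TOKENS:
        if ps.startswith(tok, pos):
            rest = depth_first(ps, pos + len(tok))
            if rest is not None:
                return [tok] + rest
    return None  # the caller moves on to its next alternative

def breadth_first(ps):
    """Advance all partial parses together, keeping every one in memory.

    Returns every token sequence that covers the whole of ps.
    """
    frontier = [(0, [])]  # (position reached, tokens so far)
    complete = []
    while frontier:
        next_frontier = []
        for pos, toks in frontier:
            if pos == len(ps):
                complete.append(toks)
                continue
            for tok in TOKENS:
                if ps.startswith(tok, pos):
                    next_frontier.append((pos + len(tok), toks + [tok]))
        frontier = next_frontier
    return complete

if __name__ == "__main__":
    print(depth_first("abc"))
    print(breadth_first("abc"))
```

The depth-first version stops at the first fit but may test the same 
token at the same position many times on the way. The breadth-first 
version builds every fit, including those nobody asked for.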

What I want to suggest is to perform only the work that has to be done, 
while keeping the results of tests for future use for as long as they 
remain relevant.
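
One possible reading of this suggestion, as a minimal sketch: a 
backtracking search that stores the result of each (token, position) 
test and drops stored results once the PS changes. The class CachedTests, 
its method names and the token set are all assumptions made up for 
illustration. The actual design is still to be described.

```python
# Hypothetical token set, for illustration only.
TOKENS = ["a", "ab", "b"]

class CachedTests:
    """Remember the outcome of each (token, position) test on a PS."""

    def __init__(self, ps):
        self.ps = ps
        self.cache = {}
        self.requested = 0  # how many times a test result was asked for
        self.performed = 0  # how many tests were actually executed

    def test(self, tok, pos):
        self.requested += 1
        key = (tok, pos)
        if key not in self.cache:
            self.performed += 1
            self.cache[key] = self.ps.startswith(tok, pos)
        return self.cache[key]

    def invalidate_from(self, pos):
        """Forget results that looked at ps[pos:], e.g. after an action
        changed the PS from position `pos` onward."""
        self.cache = {
            (tok, p): result
            for (tok, p), result in self.cache.items()
            if p + len(tok) <= pos
        }

def fits(tests, pos=0):
    """Depth-first search: can ps[pos:] be covered by tokens?"""
    if pos == len(tests.ps):
        return True
    for tok in TOKENS:
        if tests.test(tok, pos) and fits(tests, pos + len(tok)):
            return True
    return False

if __name__ == "__main__":
    tests = CachedTests("ababx")
    print(fits(tests), tests.requested, tests.performed)
```

When the search backtracks and reaches the same position by another 
route, the stored result is reused, so the performed counter stays below 
the requested counter.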

To be continued...

conpar-general — General discussion about parsers and this project