conpar-general Mailing List for ContextParser
Status: Alpha
Brought to you by:
bergamot
From: Mikhail K. <ka...@wi...> - 2002-01-28 14:22:32
Hi all,

Here are my thoughts about what parsing is and how to parse.

There are 3 levels of parsing:
1. Tokenization
2. Syntactic parsing
3. Conversion, or syntactic action

What we actually have are the Parsing Tree (PT), the Parsing Sequence (PS) and the Parsing Machine (PM).

Tokenization is the process of mapping the PS at a given position to a token. In general, the final result of tokenization is a bit mask over the tokens, with 1 if a token is present at the given place and 0 otherwise.

Syntactic parsing is the process of choosing a sequence of tokens which fits both the PS and the PT.

A syntactic action is an action external to parsing which can change the following:
1. The PT
2. The PS
3. The context for syntactic actions

The goal is dynamic parsing with efficiency determined by the task. Efficiency is mainly affected by the changes to the parsing tree and the context made in syntactic actions, while tokenization and syntactic parsing themselves can be fast enough.

There are two basic methods of parsing: the first is breadth first and the second is depth first. In the first method we look for all possibilities and keep them in memory, while in the second method we search sequentially through all possible variants, keeping backtracking information in memory. In depth first, by going through the same parsing rules (PR) again and again, we lose time, while in breadth first (actually I mean bottom-up) we examine cases which are not even of interest, so again we lose time and memory.

What I want to suggest is to perform only the work that has to be done, while keeping the results of tests for future use for as long as they are relevant.

To be continued...
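
One possible reading of the suggestion above, as a minimal Python sketch rather than ContextParser's actual design: a depth-first parser tests tokens only at the positions it actually reaches, and caches every test result so that backtracking never repeats it. The token set, the toy grammar (`total := term PLUS total | term`, `term := NUM | ID`) and all function names are assumptions made up for this illustration. The per-position bit mask follows the description of tokenization given in the message.

```python
import re
from functools import lru_cache

# Hypothetical token set; the names and patterns are illustrative only.
TOKENS = {
    "NUM": re.compile(r"\d+"),
    "ID": re.compile(r"[A-Za-z_]\w*"),
    "PLUS": re.compile(r"\+"),
}
TOKEN_BITS = {name: 1 << i for i, name in enumerate(TOKENS)}


def parse(text):
    """Return (matched_whole_text, number_of_positions_tokenized)."""
    counter = {"token_tests": 0}

    @lru_cache(maxsize=None)
    def token_mask(pos):
        # Tokenization on demand: one bit per token that is present at pos,
        # plus the end offset of each match. Cached per position.
        counter["token_tests"] += 1
        mask, ends = 0, {}
        for name, pattern in TOKENS.items():
            match = pattern.match(text, pos)
            if match:
                mask |= TOKEN_BITS[name]
                ends[name] = match.end()
        return mask, ends

    def expect(name, pos):
        # End offset of token `name` at pos, or None if its bit is not set.
        mask, ends = token_mask(pos)
        return ends[name] if mask & TOKEN_BITS[name] else None

    @lru_cache(maxsize=None)
    def term(pos):
        # term := NUM | ID
        for name in ("NUM", "ID"):
            end = expect(name, pos)
            if end is not None:
                return end
        return None

    @lru_cache(maxsize=None)
    def total(pos):
        # Alternative 1: term PLUS total
        end = term(pos)
        if end is not None:
            plus_end = expect("PLUS", end)
            if plus_end is not None:
                rest = total(plus_end)
                if rest is not None:
                    return rest
        # Alternative 2 (after backtracking): term alone.
        # term(pos) is answered from the cache, not tested again.
        return term(pos)

    return total(0) == len(text), counter["token_tests"]
```

With this sketch, `parse("a+12+b")` returns `(True, 6)`: the input is accepted and only the six positions the parser actually visits are tokenized, each exactly once. The caches here live for a single call to `parse`; deciding when a cached result stops being relevant (for example after a syntactic action changes the PT or the PS) is left open, as in the message.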