Re: [SimpleParse] Having SimpleParser ignore whitespaces

At the moment, there's no such mechanism.  This is largely a historic 
artefact of the parsing mechanism that SimpleParse uses.  "Normal" EBNF 
tools are 2-stage processors.  You define a set of tokens, a set of 
punctuation, and a set of whitespace, then tokenise into discrete tokens 
using those low-level definitions (lexing).  You then just deal with the 
resulting tokens.  The definitions are a requirement for the engine and 
if they happen to make certain things more convenient for the user, 
well, that's okay, as long as they aren't too happy about it :) ;) .

Since SimpleParse never had the need to require those 3 definitions (it 
doesn't lex), it's never grown a way to specify or use them.
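
To make the two-stage idea concrete, here is a minimal sketch in plain Python using only the standard re module.  None of it is SimpleParse code: the token classes and the helper names tokenise and is_funcall are made up for illustration, and arglist is simplified to a single id.

```python
import re

# Stage 1 (lexing): whitespace is declared once, here, and thrown away,
# so the grammar stage never sees it.
TOKEN_RE = re.compile(r"""
    (?P<ws>[ \t\r\n]+)
  | (?P<id>[A-Za-z_][A-Za-z0-9_]*)
  | (?P<punct>[();,])
""", re.VERBOSE)


def tokenise(text):
    """Return a list of (kind, value) pairs with whitespace dropped."""
    tokens = []
    pos = 0
    while pos < len(text):
        match = TOKEN_RE.match(text, pos)
        if match is None:
            raise ValueError("unexpected character %r at %d" % (text[pos], pos))
        if match.lastgroup != "ws":
            tokens.append((match.lastgroup, match.group()))
        pos = match.end()
    return tokens


# Stage 2 (parsing): the rule only mentions real tokens.
# funcall := id, '(', id, ')', ';'
FUNCALL = [("id", None), ("punct", "("), ("id", None), ("punct", ")"), ("punct", ";")]


def is_funcall(tokens):
    """True if the token list has exactly the shape of FUNCALL."""
    if len(tokens) != len(FUNCALL):
        return False
    for (kind, value), (want_kind, want_value) in zip(tokens, FUNCALL):
        if kind != want_kind:
            return False
        if want_value is not None and value != want_value:
            return False
    return True
```

Here is_funcall(tokenise("foo ( bar )\n;")) and is_funcall(tokenise("foo(bar);")) behave identically, because the whitespace never reaches the second stage.  SimpleParse works directly on characters, so it has no equivalent place to drop it.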

If you wanted to add this functionality, you'd have lots of ways to do 
it (here are three off the top of my head):
	modify objectgenerator.Range and Literal to always add a generic 
"consume whitespace after parse" tag to the tag-table they produce.

	modify objectgenerator.SequentialGroup to insert a whitespace-consumer 
between each pair.

	add a new group-type to the SimpleParse EBNF format (e.g. using just a 
space between the element tokens) which defines a whitespace-separated 
group.  You will then need to make sure that "a := b c d := e" is 
unambiguous (basically, a name is now a name iff it is not followed by a 
:= token, so use a negative look-ahead check).
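
The look-ahead check for the third option can be sketched in plain Python.  This is only an illustration of the disambiguation rule, not a patch to the SimpleParse grammar; split_rules and the two regular expressions are hypothetical names, and element tokens are restricted to bare names to keep it short.

```python
import re

NAME = re.compile(r"[A-Za-z_]\w*")
DEFINES = re.compile(r"\s*:=")


def split_rules(source):
    """Split space-separated declarations into {rule name: [element names]}.

    A name followed by ':=' starts a new declaration; any other name is an
    element of the declaration currently being read.
    """
    rules = {}
    current = None
    pos = 0
    while True:
        # Skip the whitespace that now acts as the element separator.
        while pos < len(source) and source[pos].isspace():
            pos += 1
        if pos == len(source):
            break
        name = NAME.match(source, pos)
        if name is None:
            raise ValueError("expected a name at position %d" % pos)
        lookahead = DEFINES.match(source, name.end())
        if lookahead is not None:
            # Followed by ':=', so this name declares a new rule.
            current = rules.setdefault(name.group(), [])
            pos = lookahead.end()
        else:
            # Negative look-ahead succeeded: a plain reference.
            if current is None:
                raise ValueError("element %r appears before any ':='" % name.group())
            current.append(name.group())
            pos = name.end()
    return rules
```

With this rule, split_rules("a := b c d := e") reads d as the start of a second declaration and not as a third element of a.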

In all of those cases, you need to declare the composition of 
"whitespace" somewhere and make it available to your objectgenerator 
classes (likely in the generator object).  In all save the last, you'd 
need a way to differentiate when you do/do-not want the whitespace 
consumption.
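
If you don't want to touch objectgenerator at all, a cruder stand-in for the second option is to rewrite the grammar text before handing it to the generator.  The sketch below is plain string processing and uses no SimpleParse API; insert_whitespace, split_top_level and the lexical opt-out set are hypothetical names, and it assumes your grammar declares a production named ws (e.g. ws := [ \t\n\r]+) in one place.

```python
def split_top_level(body):
    """Split a production body on commas outside quotes, [...] and (...)."""
    parts = []
    depth = 0          # nesting depth of ( ... ) groups
    quote = None       # the open quote character, if inside a literal
    in_range = False   # True while inside a [ ... ] character range
    start = 0
    i = 0
    while i < len(body):
        ch = body[i]
        if quote or in_range:
            if ch == "\\":
                i += 1                      # skip the escaped character
            elif quote and ch == quote:
                quote = None
            elif in_range and ch == "]":
                in_range = False
        elif ch in "'\"":
            quote = ch
        elif ch == "[":
            in_range = True
        elif ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
        elif ch == "," and depth == 0:
            parts.append(body[start:i].strip())
            start = i + 1
        i += 1
    parts.append(body[start:].strip())
    return parts


def insert_whitespace(declaration, lexical=()):
    """Insert an optional ws element between the top-level elements.

    Declarations whose name is listed in `lexical` are returned unchanged;
    that is the switch for rules where whitespace must not be consumed.
    """
    name, body = declaration.split(":=", 1)
    name = name.strip()
    if name in lexical:
        return declaration
    return "%s := %s" % (name, ", ws?, ".join(split_top_level(body)))
```

For example, insert_whitespace("funcall := id, '(', arglist, ')', ';'") gives back the long-hand form with ws? between every pair, while insert_whitespace("id := [a-z], [a-z0-9]*", lexical={"id"}) leaves the token-level rule alone.  It only touches the top-level sequence of each declaration, so nested groups would need the same treatment.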

BTW: Manually altering the tag-tables SimpleParse produces is probably 
one of the most painful ways to have your brain explode :o) .  I don't 
even look at them 99% of the time I'm working with the system.  The only 
real reason to do it is to debug an error in SimpleParse or to try to 
optimise the tables it generates.

Feel free to shout if this was unclear,
Mike

Karl Trygve Kalleberg wrote:
> Hi fellow parsists.
> 
> I notice that all of the example grammars include whitespaces in the
> productions explicitly. Is there any simple way to tell SimpleParse that
> the charset "[ \t\n\r]+" is considered a generic token separator, as is
> customary with other EBNF tools ? 
> 
> funcall := id, '(', arglist, ')', ';'
> 
> is most definitely easier to read and reason about than
> 
> funcall := id, ws, '(', ws, arglist, ws, ')', ws, ';'
> 
> I tried modifying the resultant tuple returned by generator.buildParser
> thusly;
> 
> parser = generator.buildParser(decl).parserbyname('root')
> parser = ((None,TextTools.AllInSet,TextTools.set(' \r\n\t'),+1),) + parser
> pprint.pprint( TextTools.tag( input, parser ))
> 
> but that does not seem to have any effect.
> 
> Any suggestions/pointers to solutions are most welcome.
> 
> 
> Kind regards,
> 
> Karl T
> 
> 
> 
> 

-- 
_______________________________________
   Mike C. Fletcher
   Designer, VR Plumber, Coder
   http://members.rogers.com/mcfletch/