Carl Barron wrote:
>
> On Sep 29, 2005, at 2:15 AM, Joel de Guzman wrote:
>
>> Carl Barron wrote:
>>
>>> Do you process one item after another or do you need to store them all?
>>> Using a modification of the parser in
>>> .../libs/example/fundamental/matching_tags.cpp
>>> it is fairly easy to create a pair of strings from the starting tag.
>>> Another modification will allow
>>> <A>
>>> <b="c"> </b>
>>> <c="d"></c>
>>> </A>
>>> <B>
>>> <x = "x"></x>
>>> </B>
>>> etc.
>>> Do you need some indication of which item the pairing (yldl,",,,")
>>> belongs to?
>>> The actual parser is fairly easy to write with a matching_tag parser for
>>> the two cases denoted by case in the tag names above.
>>> More info means a more meaningful solution and the more specific
>>> probably the faster the run time should be. I have a general solution
>>> in mind for a file with the above syntax.
>>
>>
>> Good starting point. I also assume:
>>
>> 1) you'll allow a number of attributes for each tag.
>> e.g. <tag x="123" y="456">
>> 2) that there are only allowable attributes for
>> each tag (symbol table)
>> 3) you'll allow the syntax <tag />
>> 4) and comments <!-- blah blah blah -->
>> 5) allow strings between markups
>> e.g. <tag> some string </tag>
>>
>> I imagine these input params for the grammar:
>>
>> 1) A filled symbol table for the tags
>> 2) the data slot in the symbol table is another
>> symbol table of attributes
>>
>> I imagine the output as:
>>
>> a std::list of variant<tag, std::string> objects where tag is
>> something like:
>>
>> struct tag
>> {
>> std::string tag;
>> std::map<std::string, std::string> attributes;
>> std::list<variant<tag, std::string> > children;
>> };
>>
>> Such an example will be tremendously useful. I'd like to have
>> one in the examples/intermediate directory.
>>
> I think I have the grammar that allows strings or nested tags but not
> both.
> What I am missing is a fairly efficient non-greedy definition of inner:
> tagged_item = strt >> inner >> end;
> inner = +(tagged_item | *(anychar_p - end));
> strt is the start tag,
> end is the corresponding end tag.
Yeah! I think you are on the right track. That is why
it is a variant<tag, std::string>. It's either a tag
or a string but not both. Then, a tag *can* contain
tags *or* strings as its children.
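To make the shape of that output concrete, here is a rough sketch of the
structure in plain C++17. It is only an illustration, not Spirit code:
it assumes std::variant instead of boost::variant, uses std::vector
rather than std::list for the children so the element type may still be
incomplete, and the names Node, count_tags, collect_text and
make_example are hypothetical helpers invented for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <variant>
#include <vector>

struct Tag;                                   // forward declaration
using Node = std::variant<Tag, std::string>;  // a child is a tag OR a string

struct Tag {
    std::string name;
    std::map<std::string, std::string> attributes;
    // std::vector accepts a not-yet-complete element type (C++17),
    // which is what lets Tag contain Nodes that may themselves be Tags.
    std::vector<Node> children;
};

// Count the tags in a subtree, including the root (hypothetical helper).
std::size_t count_tags(const Tag& t) {
    std::size_t n = 1;
    for (const Node& child : t.children) {
        if (const Tag* sub = std::get_if<Tag>(&child)) {
            n += count_tags(*sub);
        }
    }
    return n;
}

// Concatenate all string children, depth first (hypothetical helper).
std::string collect_text(const Tag& t) {
    std::string out;
    for (const Node& child : t.children) {
        if (const std::string* s = std::get_if<std::string>(&child)) {
            out += *s;
        } else {
            out += collect_text(std::get<Tag>(child));
        }
    }
    return out;
}

// Builds the tree for: <A>lead <b x="123">some string</b></A>
Tag make_example() {
    Tag b;
    b.name = "b";
    b.attributes["x"] = "123";
    b.children.push_back(std::string("some string"));

    Tag a;
    a.name = "A";
    a.children.push_back(std::string("lead "));
    a.children.push_back(b);
    return a;
}
```

Each child holds exactly one alternative, so a visitor or std::get_if
decides per node whether to recurse into a tag or to treat it as text.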
I think that *(anychar_p - end) can simply be:
*(anychar_p - '<')
It is illegal to have '<' in strings anyway; you use &lt; instead.
(aside: be mindful of tags like <tag /> though which do not
need endtags)
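As a rough illustration of the "text runs up to the next '<'" idea, here
is a small hand-written scanner in plain C++. It is not Spirit code and
not the proposed grammar; Kind, Token and tokenize are hypothetical
names, and the sketch assumes no comments and no '>' inside attribute
values. It also shows where <tag /> has to be told apart from an
ordinary start tag.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

enum class Kind { Text, Open, Close, SelfClosing };

struct Token {
    Kind kind;
    std::string value;  // the text run, or what sits between '<' and '>'
};

// Split markup into text runs and tags (hypothetical helper).
std::vector<Token> tokenize(const std::string& in) {
    std::vector<Token> out;
    std::size_t pos = 0;
    while (pos < in.size()) {
        if (in[pos] != '<') {
            // Text: everything up to, but not including, the next '<'.
            std::size_t end = in.find('<', pos);
            if (end == std::string::npos) end = in.size();
            out.push_back({Kind::Text, in.substr(pos, end - pos)});
            pos = end;
            continue;
        }
        std::size_t close = in.find('>', pos);
        if (close == std::string::npos) {
            // Unterminated tag: keep the remainder as text and stop.
            out.push_back({Kind::Text, in.substr(pos)});
            break;
        }
        std::string body = in.substr(pos + 1, close - pos - 1);
        Kind kind = Kind::Open;
        if (!body.empty() && body.front() == '/') {
            kind = Kind::Close;        // </tag>
            body.erase(0, 1);
        } else if (!body.empty() && body.back() == '/') {
            kind = Kind::SelfClosing;  // <tag />, no end tag expected
            body.pop_back();
        }
        // Drop trailing blanks, e.g. the one left over from "br /".
        while (!body.empty() && body.back() == ' ') body.pop_back();
        out.push_back({kind, body});
        pos = close + 1;
    }
    return out;
}
```

For example, tokenize("<tag> some string <br /></tag>") yields an Open
token, a Text token, a SelfClosing token and a Close token, in that
order. Matching Open against Close is left to whatever sits on top.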
> How far along is Phoenix 2? I'd hate to write a lot of convoluted code that
> Phoenix 2 handles more easily...
There is a planned release by Oct 10. Unfortunately, it does not
have closure support which basically makes it unsuitable for
Spirit-1. Closures might go away in Spirit-2, replaced by true
rule local variables.
> What is the easiest way to store a pointer or reference to the inner
> symbol table? I am getting compiler errors on the obvious choices; the
> grammar takes references to input and output.
> Also, how do I pass closure variables to a grammar (as in passing the
> inner symbol table to an attribute list grammar), constructing this
> grammar in the rule at run time? I don't see a solution without
> closures, as it is recursive.
>
> Getting lost in details of what is compatible with what. :)
Sorry. You lost me here.
Regards,
--
Joel de Guzman
http://www.boost-consulting.com
http://spirit.sf.net