From: stephan b. <st...@wa...> - 2003-09-13 04:34:57
i've just completed the first go at a more complex XML flexer. We now have 3 supported dialects (not including the binary formats).

Sample input (intentionally a tad convoluted in places, to test the lexer):

<!DOCTYPE elib::simplexml>
<open class="XmlRoot" foo=bar bar="one two">"cdata more cdata. <![CDATA[tucked-away CDATA"]]>
<!-- comment -->
<two class="NonExistentTwo" attr=#ffaabb>
<!-- comment --></two>
<three one=two three=four class="ThreeAlmostExists" long_string="you won't like me when i'm angry!"/>
</open>

Deser'd, then reser'd, in 3 dialects:

<!DOCTYPE elib::simplexml>
<open class="XmlRoot" bar="one two" foo="bar"><![CDATA["cdata more cdata. tucked-away CDATA"]]>
    <two class="NonExistentTwo" attr="#ffaabb"></two>
    <three class="ThreeAlmostExists" long_string="you won't like me when i'm angry!" one="two" three="four"></three>
</open>

#SerialTree 1
open class=XmlRoot {
    CDATA "cdata more cdata. tucked-away CDATA"
    bar one two
    foo bar
    two class=NonExistentTwo {
        attr #ffaabb
    }
    three class=ThreeAlmostExists {
        long_string you won't like me when i'm angry!
        one two
        three four
    }
}

<!DOCTYPE SerialTree>
<open class="XmlRoot">
    <CDATA>"cdata more cdata. tucked-away CDATA"</CDATA>
    <bar>one two</bar>
    <foo>bar</foo>
    <two class="NonExistentTwo">
        <attr>#ffaabb</attr>
    </two>
    <three class="ThreeAlmostExists">
        <long_string>you won't like me when i'm angry!</long_string>
        <one>two</one>
        <three>four</three>
    </three>
</open>

(My favorite all-around is still fun-xml, because of its lack of the need to decide whether to store stuff as attributes or elements. i hate making that type of decision. That said, i mainly use fun-txt because it's easier to look at.)

The XML parser was /simple/ to implement - i had the whole thing done in about two hours, including integration into the sernode dom builder.
i've also added a helper lexer which looks at the first line of a file and guesses the proper parser needed to decode it.

Anyway... at some point i would like to put these lexers into the libfun tree so i can qt-free SerialTree-{txt,qxml}. There is one distribution-related consideration, and that is regarding the generated flex code. The generated code doesn't need any libraries, but the lexer sources do need flex to process them. If you don't want the flex dependency i can put the generated lexers in the tree instead.

-- 
----- st...@wa...
http://qub.sourceforge.net http://libfunutil.sourceforge.net
http://toc.sourceforge.net http://countermoves.sourceforge.net
http://stephan.rootonfire.org
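
For readers who want a concrete picture of the first-line guessing described above, here is a minimal sketch. It is written in Python for brevity, not in the flex/C++ the real lexers use, and it is not the actual helper lexer. The function name guess_dialect and the returned labels are hypothetical; the only things taken from the message are the three first-line markers visible in the sample output.

```python
# Hypothetical sketch of first-line dialect guessing.
# The labels below are placeholders chosen for illustration; they are
# not names from libfunutil. Only the first-line markers themselves
# come from the sample output in the message.

SIMPLEXML = "simplexml"
SERIALTREE_TXT = "SerialTree-txt"
SERIALTREE_XML = "SerialTree-xml"

# (marker that the first line must start with, label to return)
_MARKERS = (
    ("<!DOCTYPE elib::simplexml", SIMPLEXML),
    ("<!DOCTYPE SerialTree", SERIALTREE_XML),
    ("#SerialTree", SERIALTREE_TXT),
)


def guess_dialect(first_line):
    """Return a dialect label for the given first line, or None if unknown."""
    line = first_line.strip()
    for marker, label in _MARKERS:
        if line.startswith(marker):
            return label
    return None


def guess_dialect_of_file(path):
    """Read only the first line of a file and guess its dialect."""
    with open(path, "r", encoding="utf-8", errors="replace") as handle:
        return guess_dialect(handle.readline())
```

A caller would map the returned label to whichever parser handles that dialect, and treat None as "unknown format" instead of trying a parser blindly.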