Is there any mechanism to serialize out?

  • Benson Margulies

    I see that View Source in Lobo shows what was read in, not necessarily what resulted from the use of open/write or other DOM functions.

    Is there some code I'm not seeing for dumping out the HTML as text? Or would you recommend TrAX?
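
    For reference, the TrAX route mentioned here might look roughly like the sketch below. It uses only the JDK's javax.xml.transform classes and a plain W3C DOM parsed from well-formed markup. Nothing in it is Cobra-specific, and whether a Cobra-built DOM serializes cleanly this way is an assumption not confirmed in this thread. The class and method names (DomSerializer, toHtml, parseXml) are made up for illustration.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

// Hypothetical helper: serializes the *current* state of a DOM, not the original source text.
public class DomSerializer {

    // Serialize any W3C DOM node to text using the TrAX identity transformer.
    public static String toHtml(Node node) throws Exception {
        Transformer t = TransformerFactory.newInstance().newTransformer();
        t.setOutputProperty(OutputKeys.METHOD, "html");              // HTML output rules instead of XML
        t.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
        StringWriter out = new StringWriter();
        t.transform(new DOMSource(node), new StreamResult(out));
        return out.toString();
    }

    // Stand-in for a real HTML parser: this only accepts well-formed markup.
    public static Document parseXml(String xml) throws Exception {
        return DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
    }

    public static void main(String[] args) throws Exception {
        Document doc = parseXml("<html><body><p>read in</p></body></html>");

        // Modify the DOM after parsing, as a script calling DOM functions would.
        Element p = doc.createElement("p");
        p.setTextContent("added via DOM");
        doc.getElementsByTagName("body").item(0).appendChild(p);

        // The serialized text includes the added paragraph.
        System.out.println(toHtml(doc));
    }
}
```

    The point of the sketch is only that the output reflects the modified tree rather than the bytes that were read in, which is the difference from View Source described above.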

    • Lobo Project Lead

      If you parse the document with Cobra, you can call getOuterHTML() on the elements just below the document to get essentially that result.

      • Benson Margulies

        Could you offer any comparison between your parser and TagSoup in terms of tolerance of messy-looking inputs?

    • Lobo Project Lead

      I haven't used TagSoup, but if you find Cobra is unable to parse any HTML the way a browser would be expected to, I would ask you to post a bug report.

      • Benson Margulies

        I have to decide whether to move to you from TagSoup. TagSoup is focused on getting XML from sloppy HTML. That works well for me until the point where I might want HTML back instead of XML. The outerHTML approach hadn't occurred to me; I'll try it out at some point.

        • Lobo Project Lead

          If you want to get an idea of what kind of DOM Cobra might generate for any particular document, try the Cobra Test Tool or the Parser Test program. It gives you a TreeView representation of the HTML DOM.

