Hi there.
What's the right way to rebuild HTML from the parse tree?
The tag name and its content come from Node.tagName and Node.text, but what about the attributes (ideally in their original order) and everything else, if there is anything?
Thanks
It took me a while to get to the bottom of this one. The tree.hh documentation is quite good for working out how to solve the problem. All it needs is a function that recursively walks the child nodes of the tree:
This should get you going:
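#include <iostream>
#include <htmlcxx/html/ParserDom.h>
using namespace std;
using namespace htmlcxx;

// Print every child of *thisnode, prefixed with one "-" per nesting level,
// and recurse into tags so that their own children are printed too.
void parseNode(tree<HTML::Node>::sibling_iterator *thisnode, int level)
{
    tree<HTML::Node>::sibling_iterator sib = thisnode->begin();
    while (sib != thisnode->end())
    {
        for (int y = 0; y < level; y++) cout << "-";
        cout << "'" << sib->text() << "'" << endl;
        // Only tags can have children worth descending into.
        if (sib->isTag()) parseNode(&sib, level + 1);
        // closingText() is empty for nodes that have no closing tag.
        for (int y = 0; y < level; y++) cout << "-";
        cout << "'" << sib->closingText() << "'" << endl;
        ++sib;
    }
}

// Usage, inside your own function, once parser has parsed the page:
tree<HTML::Node> tr = parser.getTree();
tree<HTML::Node>::sibling_iterator sib = tr.begin();
parseNode(&sib, 1);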
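On the attributes part of the question: as far as I can tell, for a tag node text() holds the whole opening tag as it was written in the source, attributes included and in their original order, so rebuilding the HTML should mostly come down to concatenating text(), the children, and closingText(). Here is a minimal self-contained sketch of that idea. It uses a made-up SketchNode struct instead of the real HTML::Node and tree.hh types, and SketchNode, rebuildHtml and rebuildExample are names invented for illustration only:

```cpp
#include <cassert>
#include <iostream>
#include <string>
#include <vector>

// Hypothetical stand-in for a parsed node and its children (not htmlcxx API).
struct SketchNode {
    std::string text;                 // full opening tag as written, or raw text
    std::string closingText;          // e.g. "</b>", empty when nothing closes
    std::vector<SketchNode> children; // child nodes in document order
};

// Opening text first, then every child in order, then the closing text.
std::string rebuildHtml(const SketchNode &node)
{
    std::string out = node.text;
    for (const SketchNode &child : node.children)
        out += rebuildHtml(child);
    out += node.closingText;
    return out;
}

// Builds a tiny tree by hand and checks that it round-trips.
std::string rebuildExample()
{
    SketchNode bold{"<b>", "</b>", {SketchNode{"there", "", {}}}};
    SketchNode para{"<p class=\"x\">", "</p>",
                    {SketchNode{"Hi ", "", {}}, bold}};
    std::string html = rebuildHtml(para);
    assert(html == "<p class=\"x\">Hi <b>there</b></p>");
    return html;
}
```

With htmlcxx itself the same three steps would map onto sib->text(), a recursive call over the children, and sib->closingText(), appending to a string instead of writing to cout as the function above does. I have not checked how this behaves for malformed input, so treat it as a starting point.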
Hope this helps somebody