Given a URL, we can get the textual contents of a web page by using StringBean.
But how can I get the same result that StringBean produces when all I have is the HTML source itself?
Thanks.
If you have a String, call parser.setInputHTML (string), then use the StringBean as a visitor with parser.visitAllNodesWith (mystringbean) and read the text back with mystringbean.getStrings ().
If you have a stream, it's a little more complicated; use the Page class constructor:
public Page (InputStream stream, String charset)
Then pass the Page into the Lexer constructor, pass the Lexer into the Parser constructor, and visit it with the StringBean as above.
You are welcome to modify the StringBean so it accepts HTML source directly, and to donate the modifications back to the project.
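
For reference, the two approaches described above might look roughly like the sketch below. It assumes the HTMLParser jar (org.htmlparser) is on the classpath. The class name StringBeanFromSource, the helper names textFromString and textFromStream, and the sample HTML are made up for illustration and are not part of the library.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.io.UnsupportedEncodingException;
import java.nio.charset.StandardCharsets;

import org.htmlparser.Parser;
import org.htmlparser.beans.StringBean;
import org.htmlparser.lexer.Lexer;
import org.htmlparser.lexer.Page;
import org.htmlparser.util.ParserException;

// Hypothetical helper class, not part of HTMLParser.
public class StringBeanFromSource {

    // Case 1: the HTML source is already in a String.
    static String textFromString(String html) throws ParserException {
        Parser parser = new Parser();
        parser.setInputHTML(html);
        StringBean bean = new StringBean();
        // The bean is used purely as a visitor; no URL is set on it.
        parser.visitAllNodesWith(bean);
        return bean.getStrings();
    }

    // Case 2: the HTML source comes from a stream: Page -> Lexer -> Parser.
    static String textFromStream(InputStream stream, String charset)
            throws ParserException, UnsupportedEncodingException {
        Page page = new Page(stream, charset);
        Lexer lexer = new Lexer(page);
        Parser parser = new Parser(lexer);
        StringBean bean = new StringBean();
        parser.visitAllNodesWith(bean);
        return bean.getStrings();
    }

    public static void main(String[] args) throws Exception {
        // Sample input, for illustration only.
        String html = "<html><body><p>Hello world</p></body></html>";
        System.out.println(textFromString(html));

        InputStream in = new ByteArrayInputStream(html.getBytes(StandardCharsets.UTF_8));
        System.out.println(textFromStream(in, "UTF-8"));
    }
}
```

Both helpers use a fresh StringBean per call, so text from one document does not carry over into the next.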