Hello,
I have to say this is the best Java HTML parser, even better than Mozilla's mozzila2java.
I am using the SWT Browser and trying to display your structure in a JTree.
I have the HTML as a string, but how can I tell Source to parse it? You only have constructors for:
Source(final CharSequence text)
Source(final EncodingDetector encodingDetector)
Source(final Reader reader, final String encoding)
Source(final CharSequence sourceText, final StreamedParseText streame…
If I pass the URL, it works great.
Use the Source(CharSequence) constructor.
Cheers
Martin
I tried Source source = new Source(htmlText); where htmlText comes from org.eclipse.swt.browser.Browser.getText():
String org.eclipse.swt.browser.Browser.getText()
Returns a string with HTML that represents the content of the current page.
It does not work. I have tested with ebay.com and google.com. Both work if I get the content directly from the URL using
Source(final URL url), but when I try to get it from the string, it fails.
Thanks, Martin.
I don't see a getText() method in the Browser class.
What do you mean by "it doesn't work"? What error are you getting?
This method exists, but it is not correctly implemented:
https://bugs.eclipse.org/bugs/show_bug.cgi?id=107142
It is converting some < and > to &lt; and &gt;. I will have to see how to fix this problem.
A short question: do you thread Tbody tags? I did not see any such tag in my testing.
I'm not sure what you mean by threading Tbody tags.
Sorry about that. I meant: do you also parse Tbody tags? In my testing you skipped over them.
Thanks.
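Regarding the getText() problem described above, where some angle brackets arrive as &lt; and &gt;: one possible workaround is to turn those entities back into literal characters before handing the string to Source(CharSequence). The following is only a minimal sketch under that assumption. The class name AngleBracketFix and the method unescapeAngleBrackets are hypothetical and are not part of the parser or of SWT, and the sample input is made up for illustration.

```java
public class AngleBracketFix {

    // Hypothetical helper: replaces the entities &lt; and &gt; with literal
    // angle brackets. Assumption: every such entity in the input is an
    // artifact of the getText() bug and not intentionally escaped text.
    public static String unescapeAngleBrackets(String html) {
        return html.replace("&lt;", "<").replace("&gt;", ">");
    }

    public static void main(String[] args) {
        // Made-up sample imitating partially escaped markup.
        String fromBrowser = "&lt;table&gt;<tr><td>cell</td></tr>&lt;/table&gt;";
        String fixed = unescapeAngleBrackets(fromBrowser);
        System.out.println(fixed); // prints <table><tr><td>cell</td></tr></table>
        // The repaired string could then be passed to new Source(fixed).
    }
}
```

Note that this also converts entities that the page author escaped on purpose (for example, code samples shown as text), so it is better suited to confirming that the escaping is the cause of the failure than to being a permanent fix.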