-
Hi,
Is this project still ongoing?
Is there any API documentation?
2008-05-29 14:43:01 UTC by ashwin_ittoo
-
me too!
2008-04-03 10:30:20 UTC by nobody
-
In my case, it was an issue with the encoding of some characters in the HTML file I parsed.
It seems that HotSAX doesn't handle UTF-8 characters that are 2 bytes wide very well.
So I ran recode -d UTF-8..HTML on the file before using HotSAX, and then everything worked.
2008-03-16 01:23:22 UTC by nikobonnieure
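
For anyone who would rather stay inside Java than shell out to recode, here is a minimal sketch of the same idea: rewrite every non-ASCII character as a numeric character reference before handing the text to the parser. The class and method names (EntityEscaper, escapeNonAscii) are made up for illustration and are not part of HotSAX. It emits numeric references rather than the named entities recode may produce, and I have not confirmed it against HotSAX itself.

```java
// Hypothetical helper, not part of HotSAX: escapes non-ASCII characters so the
// parser only ever sees 7-bit input.
final class EntityEscaper {
    private EntityEscaper() {}

    /** Replaces every code point above 127 with a decimal numeric character reference. */
    static String escapeNonAscii(String html) {
        StringBuilder out = new StringBuilder(html.length());
        int i = 0;
        while (i < html.length()) {
            int cp = html.codePointAt(i);
            if (cp > 127) {
                // e.g. U+20AC becomes "&#8364;"
                out.append("&#").append(cp).append(';');
            } else {
                out.append((char) cp);
            }
            // Advance by one or two chars so surrogate pairs are handled as one code point.
            i += Character.charCount(cp);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(escapeNonAscii("price: 5 \u20AC")); // prints "price: 5 &#8364;"
    }
}
```

Note that 8364, the index in the ArrayIndexOutOfBoundsException reported for the Dartmouth page, is the decimal code point of the euro sign (U+20AC), which fits the idea that the lexer trips on characters outside some small range. That reading is a guess, not something confirmed from the HotSAX source.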
-
When parsing the page: http://www.cs.dartmouth.edu/~fabio/
I had the following error:
Exception in thread "main" java.lang.ArrayIndexOutOfBoundsException: 8364
at hotsax.html.sax.HtmlLexer.yylex(HtmlLexer.java:549)
at hotsax.html.sax.HtmlLexer._yylex(HtmlLexer.java:218)
at hotsax.html.sax.HtmlParser.yylex(HtmlParser.java:285)
at...
2008-03-13 04:26:57 UTC by nobody
-
Hi,
I'm using HotSAX2 and it seems to be very fast and accurate. Good work!
While trying things out, I found that it does not parse the elements inside <td align="right" valign="bottom" nowrap>
The valueless "nowrap" attribute causes the parser to skip nodes.
Is there a setting that makes it parse attributes like nowrap too?
Thanks in advance,
Gaurav.
2007-08-08 17:36:37 UTC by hencre
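
If no such setting exists, one possible workaround is to rewrite the bare attribute into its name="value" form before parsing. The sketch below is only that: NowrapExpander and expandNowrap are hypothetical names, the regular expression is deliberately simple (it can be fooled by the word "nowrap" inside a quoted attribute value), and it has not been tested against HotSAX.

```java
import java.util.regex.Pattern;

// Hypothetical pre-processing helper, not part of HotSAX.
final class NowrapExpander {
    private NowrapExpander() {}

    // Matches a tag opener and its attributes up to a whitespace character,
    // then a bare "nowrap" that is followed by whitespace, "/" or ">" and is
    // not followed by an "=" sign (so nowrap="..." is left alone).
    private static final Pattern BARE_NOWRAP = Pattern.compile(
            "(<[^<>]*\\s)nowrap(?=[\\s/>])(?!\\s*=)", Pattern.CASE_INSENSITIVE);

    /** Rewrites a valueless nowrap attribute as nowrap="nowrap". */
    static String expandNowrap(String html) {
        return BARE_NOWRAP.matcher(html).replaceAll("$1nowrap=\"nowrap\"");
    }

    public static void main(String[] args) {
        System.out.println(expandNowrap("<td align=\"right\" valign=\"bottom\" nowrap>"));
    }
}
```

The same pattern could be extended to other valueless attributes, but a regular expression is a blunt tool for HTML, so treat this as a stopgap rather than a fix.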
-
When I try to debug xhtmlMaker.java using an HTML page that contains Chinese characters, I get an ArraysOutOfBounds error!
It's quite frustrating!
I hope there is a solution.
If you know how to fix it, please email xiao7cn@126.com. Thanks.
2007-01-26 06:38:01 UTC by nobody
-
The parser throws a NullPointerException when I use the
following HTML as input for xhtmlMaker.java:
I am using the files from the distribution directory chapt06, version HotSAX-0.1.2c.
2006-11-14 08:03:29 UTC by brnrd
-
I get the exact same bug (HotSAX-0.1.2c.tar.gz) when passing
an InputSource created from File.toURI().toString(). The
file exists.
Pretty frustrating...
2006-10-14 21:51:05 UTC by nobody
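
One way to narrow this down is to hand the parser an already opened stream instead of letting it resolve the URI string itself. The sketch below uses only the standard org.xml.sax.InputSource API; InputSources and fromFile are hypothetical names, and whether HotSAX reads the byte stream or honors the declared encoding is an assumption that has not been checked.

```java
import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;
import org.xml.sax.InputSource;

// Hypothetical helper, not part of HotSAX.
final class InputSources {
    private InputSources() {}

    /**
     * Builds an InputSource backed by an open byte stream.
     * The caller is responsible for closing source.getByteStream() after parsing.
     */
    static InputSource fromFile(File file, String encoding) throws IOException {
        InputStream in = new FileInputStream(file);
        InputSource source = new InputSource(in);
        // Keep the system ID so relative references and error messages still make sense.
        source.setSystemId(file.toURI().toString());
        // Declare the encoding explicitly instead of relying on auto-detection.
        source.setEncoding(encoding);
        return source;
    }
}
```

If the NullPointerException disappears with a stream-backed source, the problem is probably in how the system ID is resolved; if it stays, the input content itself is the more likely trigger.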