User Activity

  • Posted a comment on discussion Open Discussion on Jericho HTML Parser

    OK, if you feel this way, I have deleted the GitHub project. I converted it to IntelliJ so that I could understand it better, fix bugs myself if I find any, trace what happens under the debugger, etc. I tend not to trust something I cannot build myself. And to me an IntelliJ IDEA build is simpler than Windows .bat files. I do, however, respect your wish, and I'm grateful for the excellent code you so generously provide. I'm still testing its usefulness for my purpose (splitting very long HTML files for processing...

  • Modified a comment on discussion Help on Jericho HTML Parser

    Or, I guess, I could use the StreamedSource(final Reader reader) constructor, where reader is an instance of InputStreamReader, which I can create with InputStreamReader(InputStream in, String charsetName)... I guess I will use this method, thanks! A few minutes later: OK, it worked. I created my source like this: streamedSource = new StreamedSource(new InputStreamReader(new FileInputStream(sourceFileName), "Windows-1251")); and all is fine. Now I need to connect my ICU encoding detector to this. Thank you again for...

  • Posted a comment on discussion Help on Jericho HTML Parser

    Or, I guess, I could use the StreamedSource(final Reader reader) constructor, where reader is an instance of InputStreamReader, which I can create with InputStreamReader(InputStream in, String charsetName)... I guess I will use this method, thanks!

  • Posted a comment on discussion Help on Jericho HTML Parser

    I have some HTML files with no encoding specification, written e.g. in the Russian Windows encoding (Windows-1251), for which the Jericho parser detects cp1252, so nothing in the processed output comes out correctly. All the Source and StreamedSource constructors that accept either an encoding or an EncodingDetector, as well as the setEncoding() method, are either private or package-private. My app actually uses the ICU library to detect encodings in text and HTML, so I would prefer either to provide encoding...

  • Posted a comment on discussion Open Discussion on Jericho HTML Parser

    I don't know if this will be of any interest to the author of Jericho HTML Parser or anyone else in this community, but while considering the Parser for my Android app project, I converted it into an IntelliJ IDEA project, with the main library built under Maven. I posted it on GitHub, see: https://github.com/gregko/jericho-html-parser If the author of this Parser would like to take over this conversion, and improve and maintain it, I will happily transfer it or remove my port and let the author post his....
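
    The workaround from the Help comments above (wrapping the input stream in an InputStreamReader with an explicit charset name) can be sketched with the JDK alone. This is a minimal sketch, not the poster's code: the class name CharsetReaderSketch and the helpers openReader and decode are hypothetical, and a ByteArrayInputStream stands in for the FileInputStream. Per the comment, the Reader returned by openReader is what would be handed to StreamedSource(Reader); Jericho itself is left out so the example runs without the library.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;

public class CharsetReaderSketch {

    // Hypothetical helper: wraps a byte stream in a Reader that decodes
    // with the charset chosen by the caller instead of an auto-detected one.
    static Reader openReader(InputStream in, String charsetName) throws IOException {
        return new InputStreamReader(in, charsetName);
    }

    // Hypothetical helper: reads everything from the decoded Reader into a String.
    static String decode(byte[] bytes, String charsetName) throws IOException {
        StringBuilder sb = new StringBuilder();
        try (Reader reader = openReader(new ByteArrayInputStream(bytes), charsetName)) {
            char[] buf = new char[1024];
            int n;
            while ((n = reader.read(buf)) != -1) {
                sb.append(buf, 0, n);
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) throws IOException {
        // The Russian word for "hello" encoded as Windows-1251 bytes.
        byte[] bytes = {(byte) 0xCF, (byte) 0xF0, (byte) 0xE8,
                        (byte) 0xE2, (byte) 0xE5, (byte) 0xF2};

        // Decoded with the right charset, the Cyrillic text is recovered.
        System.out.println(decode(bytes, "windows-1251"));

        // Decoded as cp1252 (the mis-detection described above), the same
        // bytes turn into accented Latin letters.
        System.out.println(decode(bytes, "windows-1252"));
    }
}
```

    The charset name is a plain string here, so the result of an external detector (such as the ICU detector mentioned in the comment) could be passed in the same way; that wiring is an assumption and is not shown.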


Personal Data

Username:
gregko
Joined:
2007-12-18 14:56:39

Projects

  • No projects to display.
