Lenient charset processing

Brought to you by: derrickoswald

Forum: Help

Creator: Lionel Capiez

Created: 2006-03-09

Updated: 2013-04-27

Lionel Capiez - 2006-03-09

Hello,

I'm trying to parse a page that advertises a bogus charset in its META data (iso-8859-1 when it's really windows-1252).

Is there a way to tell HTML Parser to be lenient about charset identification and ignore the charset specified in the META data? It would then simply accept whatever encoding is passed to parser.setEncoding().
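
To illustrate why the mismatch matters, here is a minimal sketch that uses only the standard JDK charset classes, not HTML Parser itself. The class name CharsetMismatch, the decode helper, and the sample bytes are made up for this example. It decodes the same bytes once with the advertised charset and once with the real one:

```java
import java.nio.charset.Charset;

class CharsetMismatch {
    // Hypothetical helper: decode raw bytes using the named charset.
    static String decode(byte[] raw, String charsetName) {
        return new String(raw, Charset.forName(charsetName));
    }

    public static void main(String[] args) {
        // 0x93 and 0x94 are curly double quotes in windows-1252,
        // but C1 control codes in ISO-8859-1.
        byte[] raw = { (byte) 0x93, 'h', 'i', (byte) 0x94 };

        String asAdvertised = decode(raw, "ISO-8859-1");
        String asActual = decode(raw, "windows-1252");

        // Print the code point of the first character under each decoding.
        System.out.printf("advertised: U+%04X%n", (int) asAdvertised.charAt(0)); // U+0093
        System.out.printf("actual:     U+%04X%n", (int) asActual.charAt(0));     // U+201C
    }
}
```

The two charsets agree on every byte except the range 0x80 to 0x9F, so the damage is limited to characters such as curly quotes, dashes and the euro sign.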

Thanks in advance.
Lionel
