
#41 Error in HTML-Parser-3.56 warns "Parsing of undecoded UTF-8

open
nobody
None
5
2007-09-04
2007-09-04
No

If you are parsing undecoded UTF-8, hparser.c:1861 issues the warning "Parsing of undecoded UTF-8 will give garbage when decoding entities..." whenever p_state->argspec_entity_decode is nonzero. That counter is incremented in HTML-Parser-3.56/hparser.c:729:

    if (a == ARG_ATTR || a == ARG_ATTRARR || a == ARG_DTEXT) {
        p_state->argspec_entity_decode++;
    }

But it seems to me that entities in attribute values are only really decoded if p_state->attr_encoded is false. So the setting of p_state->argspec_entity_decode should also somehow depend on p_state->attr_encoded, something along the lines of:

    if (((a == ARG_ATTR || a == ARG_ATTRARR) && !p_state->attr_encoded) || a == ARG_DTEXT) {
        p_state->argspec_entity_decode++;
    }

I'm not sure if p_state->attr_encoded is available at that moment, but you get the idea :-)
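
To make the intended behaviour concrete, here is a minimal standalone sketch of the proposed check. It is not the real hparser.c code: the enum, the struct p_state_sketch and the functions needs_entity_decode() and note_argspec() are hypothetical stand-ins that only borrow the names used above, and the sketch assumes attr_encoded is already known when the argspec is processed, which may not hold in the actual parser.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the argspec codes used in hparser.c;
 * the real enum has more members and different values. */
enum argcode { ARG_SELF, ARG_ATTR, ARG_ATTRARR, ARG_DTEXT, ARG_TEXT };

/* Hypothetical subset of the parser state: only the two fields
 * discussed in this report. */
struct p_state_sketch {
    bool attr_encoded;          /* true: attribute values are left encoded */
    int  argspec_entity_decode; /* nonzero: some argspec decodes entities */
};

/* Does this argspec item actually cause entities to be decoded?
 * Attribute arguments only count when attr_encoded is off;
 * dtext always counts. */
static bool needs_entity_decode(const struct p_state_sketch *st, enum argcode a)
{
    bool attr_arg = (a == ARG_ATTR || a == ARG_ATTRARR);
    return (attr_arg && !st->attr_encoded) || a == ARG_DTEXT;
}

/* Mirrors the increment from the report, guarded by the new check. */
static void note_argspec(struct p_state_sketch *st, enum argcode a)
{
    if (needs_entity_decode(st, a))
        st->argspec_entity_decode++;
}
```

With attr_encoded set, an attr or attrarr argspec leaves the counter untouched in this sketch, so the UTF-8 warning would only be triggered by dtext.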

