CyberNeko HTML Parser / Patches / #20 meta charset can cause UnsupportedOperationException

#20 meta charset can cause UnsupportedOperationException

Milestone: Unstable (example)

Status: closed-fixed

Owner: Marc Guillemot

Labels: None

Priority: 5

Updated: 2015-04-16

Created: 2014-06-10

Creator: Steve McKay

Private: No

When the charset named in a meta tag can decode but not encode, HTMLScanner.isEncodingCompatible() throws UnsupportedOperationException from String.getBytes(). The attached patch handles this case; a test case is included.

1 Attachment

patch

Discussion

Steve McKay - 2014-06-11

In the patch, I returned false after two UnsupportedOperationExceptions. Upon reflection, I don't think that's desirable. Both encodings are valid, or there would have been an UnsupportedEncodingException. Unless something further can be done to check whether the new encoding will work, I think the right thing is to trust the document about its content. ignore-specified-charset is already available if the client isn't willing to trust the document.
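
For readers following along, here is a minimal sketch of the failure and of the "trust the document" behaviour discussed above. It is not the attached patch and not the actual HTMLScanner code: the class DecodeOnlyCharsetDemo, the charset DecodeOnlyCharset and the helper probeCompatible() are hypothetical names invented for illustration. The sketch assumes a decode-only charset reports canEncode() == false and throws from newEncoder(), which is what makes String.getBytes() fail.

```java
import java.nio.charset.Charset;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.CharsetEncoder;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

class DecodeOnlyCharsetDemo {

    /** Hypothetical decode-only charset, used only to reproduce the failure. */
    public static final class DecodeOnlyCharset extends Charset {
        public DecodeOnlyCharset() {
            super("X-DECODE-ONLY-DEMO", null); // null means no aliases
        }
        @Override public boolean contains(Charset cs) { return false; }
        @Override public CharsetDecoder newDecoder() {
            // Borrow US-ASCII decoding; only the encoding side matters here.
            return StandardCharsets.US_ASCII.newDecoder();
        }
        @Override public CharsetEncoder newEncoder() {
            throw new UnsupportedOperationException("decode-only charset");
        }
        @Override public boolean canEncode() { return false; }
    }

    /**
     * Hypothetical stand-in for a compatibility check: compares how two
     * charsets encode a probe string. If either charset cannot encode,
     * the bytes cannot be compared, so the declared charset is trusted.
     */
    public static boolean probeCompatible(String probe, Charset current, Charset declared) {
        if (!current.canEncode() || !declared.canEncode()) {
            return true; // nothing to compare; trust the document
        }
        return Arrays.equals(probe.getBytes(current), probe.getBytes(declared));
    }

    public static void main(String[] args) {
        Charset decodeOnly = new DecodeOnlyCharset();
        try {
            "<meta".getBytes(decodeOnly); // unguarded call, as in the report
        } catch (UnsupportedOperationException e) {
            System.out.println("unguarded getBytes failed: " + e.getMessage());
        }
        System.out.println(probeCompatible("<meta", StandardCharsets.UTF_8, decodeOnly));
        System.out.println(probeCompatible("<meta", StandardCharsets.UTF_8, StandardCharsets.UTF_16));
    }
}
```

Checking canEncode() before calling getBytes() avoids relying on the exception for control flow; catching UnsupportedOperationException around the call would be an alternative with the same outcome.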

Marc Guillemot - 2015-04-16

status: open --> closed-fixed

assigned_to: Marc Guillemot

Marc Guillemot - 2015-04-16

Patch applied. Many thanks and sorry for the delay.
