The extractor does a
get.getHttpRecorder().getRecordedInput().getCharSequence()
which returns a char sequence that assumes a
single-byte encoding. Parsing will fail on multibyte
encodings that do not have single-byte ASCII as their
base.
It looks like the extractor needs to read the charset from
the HTML HEAD content-type meta tag. If the charset is
anything other than the current JVM encoding, it should
back up the stream and ask for a char sequence in the
encoding it found (support for this needs to be added to
ReplayCharSequence).
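
A rough sketch of the sniff-then-re-decode idea, written against a plain byte array rather than the crawler's own classes. The class name CharsetSniffer, its methods, and the 2048-byte sniff window are all hypothetical, made up for illustration. This is not the ReplayCharSequence API, which would still need the support described above.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of the crawler code base.
public class CharsetSniffer {
    // Matches charset=... inside a meta tag, e.g.
    // <meta http-equiv="Content-Type" content="text/html; charset=Shift_JIS">
    private static final Pattern META_CHARSET = Pattern.compile(
        "<meta[^>]*charset\\s*=\\s*[\"']?([A-Za-z0-9_.:-]+)",
        Pattern.CASE_INSENSITIVE);

    // Assumption: the meta tag appears within the first 2048 bytes.
    private static final int SNIFF_WINDOW = 2048;

    /** Returns the charset declared in the document head, or the fallback. */
    public static Charset sniff(byte[] body, Charset fallback) {
        int len = Math.min(body.length, SNIFF_WINDOW);
        // ISO-8859-1 maps every byte to exactly one char, so this first
        // pass never fails, whatever the real encoding is.
        String head = new String(body, 0, len, StandardCharsets.ISO_8859_1);
        Matcher m = META_CHARSET.matcher(head);
        if (m.find()) {
            try {
                return Charset.forName(m.group(1));
            } catch (IllegalArgumentException e) {
                // Unknown or malformed charset name: use the fallback.
            }
        }
        return fallback;
    }

    /** "Backs up" to the start of the bytes and decodes with the found charset. */
    public static String decode(byte[] body, Charset fallback) {
        return new String(body, sniff(body, fallback));
    }
}
```

Note that the first pass only finds the meta tag when the encoding is ASCII-compatible. Encodings such as UTF-16 would need some other signal first, for example a byte order mark or the HTTP Content-Type header, so this sketch does not cover the failing case described above on its own.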
Michael Stack
Disk I/O
None
Public
Date: 2007-03-14 00:07
Date: 2004-03-10 19:58
Date: 2004-02-27 20:09
Date: 2004-02-27 18:42
Date: 2004-02-19 20:44
Date: 2004-02-19 17:43

| Field | Old Value | Date | By |
|---|---|---|---|
| close_date | - | 2004-03-10 19:58 | stack-sf |
| resolution_id | None | 2004-03-10 19:58 | stack-sf |
| status_id | Open | 2004-03-10 19:58 | stack-sf |
| summary | ExtractorHTTP ignores charset | 2004-02-27 18:42 | stack-sf |
| assigned_to | nobody | 2004-02-17 22:32 | gojomo |