From: Coram, R. <Rog...@bl...> - 2011-07-20 10:39:39
We appear to have found a problem with the replay of some sites, an example of which is here:

http://www.webarchive.org.uk/wayback/archive/20110604080034/http://www.chilton-computing.org.uk/

After doing some digging, it seems that a UTF-16 BOM is added to the response. The culprit appears to be this line in the original site, which (incorrectly, I'm guessing) specifies the encoding:

<meta http-equiv="Content-type" content="text/html;charset=utf-16">

As a test, if we switch ArchivalUrlReplay.xml to exclusively use the 'identityreplayrenderer', this doesn't happen. Presumably, when Wayback needs to amend the response, it sets the encoding, which results in the above.

Has anyone else seen anything similar? Or does anyone know how to prevent it?

Thanks,

Roger G. Coram
Web Archiving Engineer
The British Library
T: +44 (0)1937 546607
F: +44 (0)1937 546872
E: rog...@bl...
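
To illustrate the suspected mechanism, here is a minimal Python sketch. It is not Wayback's code (Wayback is written in Java), and the function names and sample page are hypothetical. It assumes the rewriting renderer decodes the page, then re-encodes it using the charset named in the meta tag, while an identity renderer passes the original bytes through. A generic "utf-16" encoder prepends a BOM, which would match what we see.

```python
import codecs

# Hypothetical stand-in for the archived page: the bytes are plain ASCII,
# but the meta tag (wrongly) declares UTF-16.
ORIGINAL_BYTES = (
    b'<html><head>'
    b'<meta http-equiv="Content-type" content="text/html;charset=utf-16">'
    b'</head><body>Hello</body></html>'
)

UTF16_BOMS = (codecs.BOM_UTF16_LE, codecs.BOM_UTF16_BE)


def has_utf16_bom(data: bytes) -> bool:
    """Return True if the byte string starts with either UTF-16 BOM."""
    return data.startswith(UTF16_BOMS)


def rewrite_with_declared_charset(data: bytes, declared: str = "utf-16") -> bytes:
    """Assumed behaviour of a rewriting renderer (a guess, not Wayback's API).

    Decode leniently as Latin-1 (every byte maps to a character), then
    re-encode the text with the charset the page declares.
    """
    text = data.decode("latin-1")
    return text.encode(declared)


def passthrough(data: bytes) -> bytes:
    """Assumed behaviour of an identity renderer: bytes are left untouched."""
    return data


rewritten = rewrite_with_declared_charset(ORIGINAL_BYTES)
untouched = passthrough(ORIGINAL_BYTES)

print("rewritten starts with UTF-16 BOM:", has_utf16_bom(rewritten))
print("untouched starts with UTF-16 BOM:", has_utf16_bom(untouched))
```

If the assumption holds, a possible workaround would be to ignore or override a declared UTF-16 charset when the raw bytes carry no BOM, but that is untested against Wayback itself.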