From: SourceForge.net <no...@so...> - 2008-08-15 01:26:37
Bugs item #1351814, was opened at 2005-11-09 11:51
Message generated for change (Comment added) made by ian_little
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=577089&aid=1351814&group_id=85722

Please note that this message will contain a full copy of the comment thread, including the initial issue submission, for this request, not just the latest update.

Category: None
Group: None
Status: Open
Resolution: Fixed
>Priority: 2
Private: No
Submitted By: Liz Mac (liz_mac)
Assigned to: michael (mcarden)
Summary: XML tags incorrect in HTML normalisation

Initial Comment:
When normalising an HTML document, XML is including incorrect tags. This can cause underlining, bold, indent that is not in the original document.

----------------------------------------------------------------------

>Comment By: Ian Little (ian_little)
Date: 2008-08-15 11:26

Message:
Logged In: YES
user_id=2125694
Originator: NO

My mistake, didn't follow through to 'open in browser window'. The text is only incorrect in the NAA Package view. I'll leave the query open, but drop the priority.

Ian

----------------------------------------------------------------------

Comment By: Ian Little (ian_little)
Date: 2008-08-15 11:18

Message:
Logged In: YES
user_id=2125694
Originator: NO

Re-checking the original files. The bad characters are there in the originals, but they are successfully interpreted by Firefox, but not by Xena. Attached doc shows different versions of some text.

Ian
File Added: Corrupt html normalisation.odt

----------------------------------------------------------------------

Comment By: Justin Waddell (jwaddell)
Date: 2007-04-30 12:12

Message:
Logged In: YES
user_id=1417827
Originator: NO

This should now be fixed.
It should be noted that the HTML files causing the wacky bolding, underlining and indentation in fact contained dodgy HTML which we have to make a best guess at fixing; the problems in these files have been fixed, but it is likely that other cases of bad HTML will break when normalised - I don't think there's too much we can do about that.

It should also be noted that the strange characters are caused by the browser using the wrong encoding to display the HTML - the user can go to the View menu and change the encoding themselves, and this removes the strange characters (i.e. the problem is not in the HTML we produce). However, code has been added to produce an HTML tag that will force browsers to use the correct UTF-8 encoding, so it should be fixed either way.

Finally, some of the HTML files in the test directory for this bug contain the invalid characters themselves, so the output will also (correctly) contain them.

----------------------------------------------------------------------

Comment By: John (vombatus)
Date: 2006-12-21 16:04

Message:
Logged In: YES
user_id=1606691
Originator: NO

I noticed that in the normalised version of an HTML file, quote marks are rendered as ", rather than the &quot; entity. No idea if this is a problem or not.

It would be useful to have some sort of application to implement diff, so I can easily see the differences between the source file and the normalised version.

----------------------------------------------------------------------

Comment By: Liz Mac (liz_mac)
Date: 2006-05-19 11:15

Message:
Logged In: YES
user_id=1261891

This is still happening. Example HTML files are in the test directory under this bug number.

----------------------------------------------------------------------

Comment By: Justin Waddell (jwaddell)
Date: 2006-05-17 16:22

Message:
Logged In: YES
user_id=1417827

Can you please retest this? It's hard to say whether it would have been fixed or not.

----------------------------------------------------------------------

Comment By: Liz Mac (liz_mac)
Date: 2005-11-09 12:00

Message:
Logged In: YES
user_id=1261891

In some instances it also is putting in strange characters at the end of an HTML document.

----------------------------------------------------------------------
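
For readers following the encoding part of this thread: the 2007-04-30 comment says code was added to emit an HTML tag that forces browsers to use UTF-8. The thread does not show that code, so the following is only a minimal sketch of the general idea, written in Python for brevity rather than in the project's own language. The function name `ensure_utf8_meta` and the regex-based approach are assumptions made for illustration; they are not taken from the Xena source.

```python
import re

# Standard HTML 4 style charset declaration (the exact tag used by the
# project is not shown in the thread; this one is an assumption).
META_TAG = '<meta http-equiv="Content-Type" content="text/html; charset=UTF-8">'

# Any <meta ...> tag that already carries a charset declaration.
_CHARSET_RE = re.compile(r'<meta\b[^>]*\bcharset\s*=', re.IGNORECASE)
# The opening <head> tag, with or without attributes (does not match <header>).
_HEAD_RE = re.compile(r'<head\b[^>]*>', re.IGNORECASE)


def ensure_utf8_meta(html: str) -> str:
    """Return html with a UTF-8 charset declaration inside <head>.

    Hypothetical helper: it leaves the document untouched if a charset
    is already declared or if there is no <head> element to attach to.
    """
    if _CHARSET_RE.search(html):
        return html
    match = _HEAD_RE.search(html)
    if match is None:
        return html
    # Insert the declaration immediately after the opening <head> tag.
    return html[:match.end()] + META_TAG + html[match.end():]


if __name__ == "__main__":
    page = "<html><head><title>Test</title></head><body>caf\u00e9</body></html>"
    print(ensure_utf8_meta(page))
```

The declaration only helps if the file is actually written out as UTF-8, and a regex is a simplification; a real normaliser would more likely add the tag while building the output document tree.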