From: SourceForge.net <no...@so...> - 2009-12-11 17:40:32
Patches item #2912672, was opened at 2009-12-11 14:40
Message generated for change (Comment added) made by blueyed
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=547457&aid=2912672&group_id=76550

Please note that this message will contain a full copy of the comment thread,
including the initial issue submission, for this request,
not just the latest update.

Category: None
Group: None
Status: Open
Resolution: None
Priority: 5
Private: No
Submitted By: daniel hahler (blueyed)
Assigned to: Nobody/Anonymous (nobody)
Summary: Handle UTF-8 correctly in Simplepage::normalise

Initial Comment:
Simplepage::normalise normalizes HTML, but because it uses
"preg_replace('#\s+#')" it fails on a non-breaking space (U+00A0) encoded
in UTF-8. That character is the two bytes 194 and 160 (0xC2 0xA0), and
preg_replace removes only the second byte, which leaves an invalid
sequence behind.

The attached patch fixes this if the mbstring extension is available and
$text is valid UTF-8 (according to mb_check_encoding).

It also adds tests.

----------------------------------------------------------------------

>Comment By: daniel hahler (blueyed)
Date: 2009-12-11 18:40

Message:
These are really several patches. I've split them into several commits
locally and will provide a link to them later, if there's any interest
after all. ;)

----------------------------------------------------------------------
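
For readers without the attachment, here is a minimal sketch of the kind of fix the report describes. It is not the code from the patch: the function name normalise_whitespace is hypothetical, and the choice of the PCRE "u" modifier is an assumption about how the byte-splitting could be avoided. With "u", PCRE treats the subject as UTF-8 and matches whole code points, so the two bytes of U+00A0 are never separated. Whether the non-breaking space itself is then collapsed into a plain space depends on the PHP/PCRE version.

```php
<?php
// Hypothetical helper, NOT the code from the attached patch.
// Collapses runs of whitespace into a single space without
// splitting multi-byte UTF-8 sequences.
function normalise_whitespace($text)
{
    // Same guard the report mentions: mbstring must be loaded and
    // $text must be valid UTF-8.
    $is_utf8 = function_exists('mb_check_encoding')
        && mb_check_encoding($text, 'UTF-8');

    // "u" makes PCRE work on code points instead of single bytes.
    // Without the guard passing, fall back to the original pattern.
    $pattern = $is_utf8 ? '#\s+#u' : '#\s+#';

    return preg_replace($pattern, ' ', $text);
}

$nbsp = "\xC2\xA0"; // U+00A0 in UTF-8: bytes 194 and 160
$out  = normalise_whitespace("a \n\t b{$nbsp}c");

// preg_match with "u" returns 1 only for a valid UTF-8 subject,
// so this prints true when no stray byte was left behind.
var_dump(preg_match('//u', $out) === 1);
```

The fallback branch keeps the old byte-wise behaviour for input that is not valid UTF-8, which matches the condition stated in the report ("if mbstring is available and $text is valid UTF-8").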