Thanks a lot for this patch. I tried it, and it works perfectly, unlike the old code, which doesn't work at all. I can't imagine what I was thinking before; I vaguely remember having tested the old code, but that hardly seems possible. Anyway, a new version of Data Transfer, 0.3.4, has been released with your fix.


On Tue, Jul 28, 2009 at 4:43 AM, Patrick Nagel <mail@patrick-nagel.net> wrote:
Hi Yaron,

On 2009-07-28 12:20, Yaron Koren wrote:
> Okay, I released a new version of Data Transfer (0.3.3) that has the
> discussed dropdown - the file gets UTF-8-encoded only if it's not using
> UTF-8 already. Anyone who was having problems with this before can
> upgrade to the latest version, and let me know if that improves things.

I tested it, but unfortunately there is more to it than just calling
utf8_encode() on the CSV elements.

Let me try to explain the two problems I solved in the attached patch:

1) utf8_encode() can only be used on ISO-8859-1 encoded (non-Unicode) strings.
What you did was to use it on UTF-16 encoded strings when the new dropdown
was set to UTF-16 and the file was actually UTF-16 encoded. That will just
produce garbage. What really has to be done is a conversion from UTF-16 (one
Unicode encoding) to UTF-8 (another Unicode encoding). This can be done with
iconv() or mb_convert_encoding(). The latter doesn't seem to be as smart as
iconv() when it comes to BOMs, so I chose iconv().
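
To illustrate the difference described in 1), here is a minimal sketch in
Python rather than PHP, so it is not the code from the patch. The function
names are hypothetical. The first function mimics what utf8_encode() does
(it treats every input byte as an ISO-8859-1 character), the second performs
a real UTF-16 to UTF-8 conversion and relies on the BOM to pick the byte order.

```python
def latin1_to_utf8(raw: bytes) -> bytes:
    # Analogue of PHP's utf8_encode(): assumes the input is ISO-8859-1,
    # so each byte becomes one character, whatever the bytes really mean.
    return raw.decode("latin-1").encode("utf-8")


def utf16_to_utf8(raw: bytes) -> bytes:
    # Real transcoding: interpret the bytes as UTF-16 (the "utf-16" codec
    # consumes the BOM and uses it to determine byte order), then write
    # the same characters out as UTF-8.
    return raw.decode("utf-16").encode("utf-8")


# "utf-16" on encode writes a BOM followed by native-endian code units.
utf16_bytes = "Zürich".encode("utf-16")

wrong = latin1_to_utf8(utf16_bytes)
right = utf16_to_utf8(utf16_bytes)
```

In this sketch `right` holds the UTF-8 bytes for "Zürich", while `wrong`
contains the BOM and the interleaved zero bytes re-encoded as if they were
characters, which is the garbage mentioned above.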

2) fgetcsv() does not work correctly when presented with a UTF-16 encoded file; it can
only work with ISO-8859-1 or UTF-8. So the conversion has to be done *before*
fgetcsv() reads the CSV file.
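
The ordering in 2) can be sketched the same way, again in Python and not as
the patch itself. The helper name and its parameters are assumptions made
for illustration. The point is only that the raw bytes are converted first
and the CSV parser sees already-decoded text.

```python
import csv
import io


def read_csv_rows(data: bytes, encoding: str) -> list:
    # Step 1: convert the whole file content out of its source encoding
    # (for example "utf-16") before any CSV parsing happens.
    text = data.decode(encoding)
    # Step 2: only now hand the text to the CSV parser. newline="" leaves
    # line endings untouched so the csv module can handle them itself.
    reader = csv.reader(io.StringIO(text, newline=""))
    return [row for row in reader]


# Hypothetical sample file content, encoded as UTF-16 with a BOM.
sample = "name,city\r\nJörg,Zürich\r\n".encode("utf-16")
rows = read_csv_rows(sample, "utf-16")
```

Parsing the UTF-16 bytes directly would split on delimiter bytes that sit
next to zero bytes, which is why the conversion comes first.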


Key ID: 0x86E346D4            http://patrick-nagel.net/key.asc
Fingerprint: 7745 E1BE FA8B FBAD 76AB 2BFC C981 E686 86E3 46D4

Semediawiki-user mailing list