
#648 Export to CSV corrupt

Status: NeedInfo
Owner: nobody
Labels: None
Severity: Problematic
Type: Defect
Updated: 2008-08-01
Created: 2008-07-30
Creator: Anonymous
Private: No

Originally created by: martinkr...@gmail.com

What steps will reproduce the problem?
1. Query some database table
2. Export data from query output as .csv
3. Settings for the .csv output are:

Field separator: ;
Line terminator: \r\n

What is the expected output? What do you see instead?
See the attached .txt file with a HEX view from UltraEdit.
The file shown compares part of a .csv export done by HeidiSQL with the
same output as generated by MS Excel.
It seems that HeidiSQL is inserting illegal characters (hex 00) in between
valid characters.
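(Editorial illustration, not part of the original report: the interleaved 00 bytes described above are exactly what ASCII text looks like when encoded as UTF-16LE, where every character occupies two bytes. A minimal Python sketch:)

```python
# ASCII text encoded as UTF-16LE carries a 0x00 high byte after every
# character, which looks like "illegal characters" in a hex view.
line = "id;name"
ansi = line.encode("cp1252")      # one byte per character (an "ANSI" code page)
utf16 = line.encode("utf-16-le")  # two bytes per character: char + 00

print(ansi.hex(" "))   # 69 64 3b 6e 61 6d 65
print(utf16.hex(" "))  # 69 00 64 00 3b 00 6e 00 61 00 6d 00 65 00
```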

What version of the product are you using? On what operating system?
I'm using HeidiSQL rev 1603 at the moment.

Please provide any additional information below.
See attached file with comparison between HeidiSQL csv export and Excel csv
export of same data.

1 Attachment

Discussion

  • Anonymous

    Anonymous - 2008-07-31

    Originally posted by: spero.sh...@gmail.com

    This seems to have started in rev 1549, rev 1548 and earlier looks to be unaffected.

     
  • Anonymous

    Anonymous - 2008-07-31

    Originally posted by: a...@anse.de

    The attached file was saved as an ANSI file, while the original CSV output creates a
    UTF-16LE file. Do not open multibyte-charset files with an ANSI-only editor or the
    file will look corrupted.

    If we saved CSV files as ANSI, it would silently (!) convert critical characters
    to their representation in the ANSI charset. That's what [r1549] fixes. Now, please
    don't tell me Excel doesn't support Unicode files? What a shitty piece of software
    is that?
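    (Editorial sketch of the silent conversion described above; the sample text and code page are illustrative:)

```python
# Encoding Unicode text to an ANSI code page silently degrades characters
# the code page cannot represent -- the data loss that [r1549] avoids.
text = "Łódź"                                # contains characters outside cp1252
lossy = text.encode("cp1252", errors="replace")
print(lossy)                                 # b'?\xf3d?' -- 'Ł' and 'ź' are gone
```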

    Labels: -Priority-Medium Severity-Problematic
    Status: Accepted

     

    Related

    Commit: [r1549]

  • Anonymous

    Anonymous - 2008-07-31

    Originally posted by: a...@anse.de

    At least Excel 2007 loads Unicode CSV files correctly (even if I leave the ANSI
    option enabled while importing, which is funny, but anyway). So, do we have a problem
    here at all? The original report doesn't actually describe a real error, does it?

     
  • Anonymous

    Anonymous - 2008-08-01

    Originally posted by: rosenfie...@gmail.com

    There is not enough information in the original report to say whether there is a bug
    or not.  Specifically, there is no mention of whether the source data contains a NUL
    character or not.

    I'm closing this bug.  To reopen, please provide an SQL export of the original data,
    and we'd be happy to re-investigate.  (Make an SQL export with version 3.2 also,
    just so we're sure that it's pristine and unaffected by Unicode issues.)

    ...

    And on a more theoretical note, I don't think there is a formal CSV standard, so
    precisely how NUL characters should be handled is probably not documented anywhere,
    and will be handled differently by different applications.

    Since HeidiSQL aims to be user friendly, I guess it could pop up a warning after
    data export to tell the user that his/her data contains NUL characters, and that
    some applications may be unable to handle this.

    Perhaps such a dialog could even link to a help file or web page with advice on
    how to convert NUL to SPACE, for example with an SQL statement against the
    original data or a hex editor against the exported file.
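    (Such a cleanup step could be sketched like this; editorial example with a hypothetical helper, not part of HeidiSQL:)

```python
def replace_nuls(text: str) -> str:
    """Replace embedded NUL characters with spaces, as the dialog might advise."""
    return text.replace("\x00", " ")

print(replace_nuls("ab\x00cd"))  # ab cd
```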

    Status: NeedInfo

     
  • Anonymous

    Anonymous - 2008-08-01

    Originally posted by: martinkr...@gmail.com

    Thanks for the feedback gents.

    This problem was actually picked up when using data exported from HeidiSQL as .csv as
    input for a Perl script. The Perl script is "blind" to ANSI/UTF format settings, as
    it only uses split(';', $line) to break the data apart as needed by the script.
    Since Heidi rev 1549 I have to open the HeidiSQL .csv export in Excel and save it
    again as .csv to be able to use it in my Perl script. I can of course change the
    script source to "fix" the .csv file before using it, but since it worked before,
    I thought there might be a new "bug" in HeidiSQL.

    I did a comparison between the same data as exported from HeidiSQL and from Excel to
    try to debug the problem myself. I could only see that the header of the file exported
    by HeidiSQL contains hex FF FE, which Excel's doesn't, and also the hex 00 in between
    other characters as mentioned before. Perhaps the problem actually lies with the FF
    FE header and not the 00 bytes in between characters?
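    (An editorial note on the observation above: FF FE is the UTF-16LE byte-order mark, and the 00 bytes are the high bytes of each two-byte character, so both are two faces of the same encoding. A small Python sketch:)

```python
# ff fe is the UTF-16LE byte-order mark (BOM); the 00 bytes are the high
# bytes of each two-byte character, not separate stray characters.
raw = b"\xff\xfe" + "a;b".encode("utf-16-le")
print(raw.hex(" "))      # ff fe 61 00 3b 00 62 00

# A byte-wise split on ';' (what the Perl script effectively does)
# leaves the BOM and stray 00 bytes inside every field:
print(raw.split(b";"))   # [b'\xff\xfea\x00', b'\x00b\x00']
```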

    Attached the original data in SQL as well as .csv format (exported by HeidiSQL).

     
  • Anonymous

    Anonymous - 2008-08-01

    Originally posted by: rosenfie...@gmail.com

    Ok, so we've got two issues:

    1) The Perl script does not understand UTF-16 encoded input (the encoding HeidiSQL
    now exports).

    2) The Perl script chokes on NUL characters.

    Neither of which seems to be a bug in HeidiSQL.

    It's true that older versions of HeidiSQL used the local code page for exports
    (often iso-8859-1), but that's not a viable solution going forward, since it cannot
    express many international characters, and therefore did not work well for people
    operating in multilingual environments.

    My suggestion would be to fix the Perl script.  Either open the file with the right
    encoding (the export is UTF-16 with a BOM, which Perl's UTF-16 layer handles):

      open(FH, "<:encoding(UTF-16)", "mydata.csv") or die "open: $!";

    Or run an iconv/tr pair to do a lossy conversion to ASCII (or your local codepage)
    and remove NUL characters:

      system("iconv -f UTF-16LE -t ASCII --unicode-subst=' ' mydata.csv > mydata2.csv");
      system("tr -d '\\000' < mydata2.csv > mydata3.csv");
      # (open mydata3.csv, which is now ASCII and free of the NULs found in the original data)

    Hope this helps?
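
    (For readers following along in another language, the same fix can be sketched in
    Python; an editorial example, with the data and filenames illustrative: decode the
    UTF-16 export, which consumes the BOM, drop embedded NULs, then split on ';'.)

```python
import io

# Simulate a UTF-16 export: Python's "utf-16" codec writes a BOM, and the
# second row contains an embedded NUL as in the reported data.
raw = "col1;col2\r\nval\x001;val2\r\n".encode("utf-16")

with io.TextIOWrapper(io.BytesIO(raw), encoding="utf-16") as fh:  # BOM handled
    rows = [line.rstrip("\r\n").replace("\x00", " ").split(";") for line in fh]

print(rows)  # [['col1', 'col2'], ['val 1', 'val2']]
```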