csv2odf fails when converting a csv of 300.000 lines into xlsx

2017-02-03
2017-02-05
  • Javier Bazan

    Javier Bazan - 2017-02-03

    Hey Larry,

    I'm having an issue converting a CSV file with 300.000 lines into XLSX.

    This is what I'm doing:

    csv2odf -t 1 -c ";" "$FILE_CSV1_FINAL" "$FILE_XLSX_TEMPLATE" "$FILE_XLSX_FINAL"

    This is the error I get:

    Traceback (most recent call last):
      File "/export/nim/apps_tools/csv2odf/csv2odf-2.04/csv2odf", line 2699, in <module>
        zip.replace_file_object('xl/sharedStrings.xml', string_file.finish())
      File "/export/nim/apps_tools/csv2odf/csv2odf-2.04/csv2odf", line 2130, in finish
        self.out_file.write(self.intermediate_file.read())
      File "/export/nim/apps_tools/csv2odf/csv2odf-2.04/csv2odf", line 2252, in write
        self.ram_file.write(data)
    MemoryError

    The contents of the CSV file do not include any unusual characters, just plain text and numbers.

    Does csv2odf have a maximum number of lines it can convert?

    Thank you for this great tool.
    Best regards,

     
  • Larry Jordan

    Larry Jordan - 2017-02-03

    What is supposed to happen is that the output goes to a temporary file in memory, and once it reaches a certain size (I think 10 MB) it switches to a temporary file on disk. I will have to investigate why it is not switching to a disk file.
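
The memory-then-disk behaviour described here can be sketched with Python's standard `tempfile.SpooledTemporaryFile`, which buffers in RAM and rolls over to a real file on disk once its size passes `max_size`. This is only an illustration of the idea, not csv2odf's actual code: the helper name `write_rows` is hypothetical, and the 10 MB threshold is taken from the "I think 10MB" guess above.

```python
import tempfile

# Assumed threshold: the post only says "I think 10MB", so this is a guess.
MAX_IN_MEMORY = 10 * 1024 * 1024


def write_rows(rows, max_size=MAX_IN_MEMORY):
    """Write byte strings to a spooled temporary file and return it rewound.

    The data stays in memory until the file grows past max_size; after that
    the standard library moves it to a temporary file on disk.
    """
    spool = tempfile.SpooledTemporaryFile(max_size=max_size)
    for row in rows:
        spool.write(row)
    spool.seek(0)  # rewind so the caller can read from the start
    return spool
```

A caller would read the result back in pieces (for example `spool.read(65536)` in a loop) rather than all at once, otherwise the benefit of spilling to disk is lost.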

     
  • Larry Jordan

    Larry Jordan - 2017-02-03

    Hey, I noticed you are using version 2.04. Would you mind trying the latest version (2.06) to see whether it still has the problem? Thanks.

     
  • Javier Bazan

    Javier Bazan - 2017-02-04

    Thanks, Larry, for your quick reply. Yes, excellent point about the version; I will try it now to see if that does the trick.

    Regarding the size, that makes total sense: my CSV file with the 300.000 lines is 41 MB, so, as you mentioned, something funny is probably going on with the switch to the temporary file on disk.

     
  • Javier Bazan

    Javier Bazan - 2017-02-04

    OK, so I just tried v2.06 and got the same error :(

    csv2odf -t 1 -c ";" "$FILE_CSV1_FINAL" "$FILE_XLSX_TEMPLATE" "$FILE_XLSX_FINAL"

    Traceback (most recent call last):
      File "/csv2odf-2.06/csv2odf", line 2771, in <module>
        zip.replace_file_object('xl/sharedStrings.xml', string_file.finish())
      File "/csv2odf-2.06/csv2odf", line 2201, in finish
        self.out_file.write(self.intermediate_file.read().decode(self.app.ENCODING))
      File "/opt/freeware/lib/python2.7/encodings/utf_8.py", line 16, in decode
        return codecs.utf_8_decode(input, errors, True)
    MemoryError
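
The traceback shows `finish()` reading the whole intermediate file and decoding it in a single call, so the raw bytes and the decoded text both have to fit in memory at once. One possible way around that, sketched below, is to copy the data in fixed-size chunks with an incremental decoder. The thread does not say what the actual fix was, so this is not Larry's change; `copy_decoded` and `CHUNK_SIZE` are hypothetical names chosen for the illustration.

```python
import codecs

# Hypothetical chunk size; any moderate value keeps peak memory small.
CHUNK_SIZE = 64 * 1024


def copy_decoded(src, dst, encoding="utf-8", chunk_size=CHUNK_SIZE):
    """Copy bytes from src into the text stream dst, one chunk at a time.

    An incremental decoder is used so that a multi-byte character split
    across two chunks is still decoded correctly.
    """
    decoder = codecs.getincrementaldecoder(encoding)()
    while True:
        chunk = src.read(chunk_size)
        if not chunk:
            break
        dst.write(decoder.decode(chunk))
    # Flush the decoder; this raises if the input ended mid-character.
    dst.write(decoder.decode(b"", final=True))
```

With this approach only one chunk and its decoded form are held in memory at a time, regardless of how large the intermediate file is.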

     
  • Larry Jordan

    Larry Jordan - 2017-02-05

    Javier,
    Attached is a version you can try for the memory error.
    Thanks.
    Larry

     
  • Javier Bazan

    Javier Bazan - 2017-02-05

    It works now with the new version! Thanks Larry !!!

     
  • Larry Jordan

    Larry Jordan - 2017-02-05

    Great! That change will be in version 2.07.

    One extra tip: with large CSV files, PyPy will run several times faster than regular Python.

    Thanks!
    Larry

     

    Last edit: Larry Jordan 2017-02-06
