#4 xml instead of csv

Status: open
Owner: nobody
Labels: None
Priority: 5
Updated: 2015-07-09
Created: 2011-08-29
Creator: Anonymous
Private: No

Hi
How about allowing XML format instead of CSV? Most of my files are in XML format, which I find easier to use than CSV.

Thanks

Discussion

  • Sebastien Bracquemont

    XML is not a "generically" readable file format; it is just a syntax for presenting custom data in a structured way.
    CSV with headers gives much less freedom: the first line holds the column names, and every other line holds the data for each column.

    So, in XML:
    Each data provider has its own XML, with its own tags and its own way of expressing a product.
    An XML datasource could be done, but it would need many advanced parameters to be set up (such as an XPath for the product info container, plus a relative XPath from that container for each field value).
    So it won't be easy to set up for people who do not know XPath.
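To make the "container XPath plus relative XPath per field" idea concrete, here is a minimal sketch in Python (not Magmi code). The feed layout, the `MAPPING` structure and the `extract_rows` helper are all hypothetical, invented for illustration. Python's `xml.etree.ElementTree` only supports a limited XPath subset, so attribute lookups are handled separately here.

```python
import xml.etree.ElementTree as ET

# Hypothetical feed layout; a real provider's tags will differ.
SAMPLE_FEED = """
<catalog>
  <product sku="A-1">
    <title>Blue mug</title>
    <pricing><amount>4.50</amount></pricing>
  </product>
  <product sku="A-2">
    <title>Red mug</title>
    <pricing><amount>5.00</amount></pricing>
  </product>
</catalog>
"""

# Hypothetical mapping: one container path, then one relative path per field.
MAPPING = {
    "container": "./product",
    "fields": {
        "sku": "@sku",              # attribute on the container element
        "name": "title",            # direct child element
        "price": "pricing/amount",  # nested element
    },
}


def extract_rows(xml_text, mapping):
    """Turn each container element into a flat, CSV-like dict."""
    root = ET.fromstring(xml_text)
    rows = []
    for node in root.findall(mapping["container"]):
        row = {}
        for column, path in mapping["fields"].items():
            if path.startswith("@"):
                # ElementTree paths cannot select attributes directly.
                row[column] = node.get(path[1:])
            else:
                row[column] = node.findtext(path)
        rows.append(row)
    return rows


rows = extract_rows(SAMPLE_FEED, MAPPING)
```

Each provider would need its own mapping, which is the setup burden described above. The extraction code itself stays the same.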

     
  • Barry

    Barry - 2013-11-19

    How about some other import options, like 'preprocessors' or something? I would like an Excel (xls/xlsx) import, because those files are easier to create and share (MS Office doesn't let you change the delimiter, so you have to use LibreOffice and convert all your files, etc.).
    I could also see an XML parser being useful if you were able to supply some kind of attribute mapper (which could be shared for common formats).

     
  • Barry

    Barry - 2013-11-19

    However, I guess all of this should be possible by using the Datapump API, right?

     
  • Sebastien Bracquemont

    Hi, Excel support (xls & xlsx) is difficult to provide on a Linux-based server: most PHP libraries that do this will try "instantiating" a local Microsoft Office instance and exchanging data with it. Also, most xls & xlsx readers load all the data at once instead of reading line by line, which uses a lot of memory for big files.
    XML is NOT a generic format, in the sense that you can express the same thing in an infinite number of variations. You need a custom reader for each way of expressing your data in XML. Provider 1 will use a structure A in its XML; provider 2 may use structures B & C, as well as cross-referenced ids between B & C. So there are as many ways of deserializing XML files as there are XML files.

    Unless we "define" an XML format for Magmi, there is no easy way to adapt to existing XML formats.

    So I think the best option is to use your own parser and the Magmi Datapump API to import your specific deserialized products.

     
    Last edit: Sebastien Bracquemont 2013-11-19
  • Barry

    Barry - 2013-11-19

    Okay. I actually already created the Excel parser with https://github.com/PHPOffice/PHPExcel and a small function that transforms the Excel file into an array for Datapump. It seems to work for small files, but I haven't tested it with large files yet.
    PHPExcel does seem to have some support for chunked reading, so I'll see what is possible.

     
  • Sebastien Bracquemont

    Just some hints:

    The best way to use Datapump is not to build one big array of items, but
    to feed it items one at a time while iterating over the data source.

    Magmi is memory savvy and is designed to operate mostly in a "single row"
    scope.

    Loading a big array of items in PHP and then calling Datapump item by item
    in a loop over your memory behemoth is totally against the Magmi philosophy.

    Magmi is designed to import millions of items without needing much memory.

    Say your Excel file has 800k lines (or even 100k lines): it may take about
    1GB of RAM to hold the data structures needed to represent the Excel lines
    in PHP.

    Given that the PHP environment might be memory limited in ways outside your
    control (for example, an admin running shared hosting), your script won't
    run in such a memory-constrained environment. And with PHP you should
    always assume you are in a memory-constrained environment.
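The "single row" approach described above can be sketched as a generator that yields one item at a time, so that only the current item is handed to the importer. This is an illustration in Python, not Magmi or PHP code. The `iter_products` and `import_all` helpers and the feed layout are hypothetical, and `import_item` stands in for whatever per-item call the Datapump API provides.

```python
import io
import xml.etree.ElementTree as ET


def iter_products(source, tag="product"):
    """Yield one dict per <product> element while the file is being parsed."""
    for _event, elem in ET.iterparse(source, events=("end",)):
        if elem.tag == tag:
            yield {child.tag: child.text for child in elem}
            # Drop the element's children and text once the row is emitted.
            elem.clear()


def import_all(source, import_item):
    """Pass items to the importer one by one; never build a full list."""
    count = 0
    for item in iter_products(source):
        import_item(item)  # hypothetical per-item import call
        count += 1
    return count


# Tiny in-memory feed standing in for a large file on disk.
feed = io.BytesIO(
    b"<catalog>"
    b"<product><sku>A-1</sku><name>Blue mug</name></product>"
    b"<product><sku>A-2</sku><name>Red mug</name></product>"
    b"</catalog>"
)

seen = []
imported = import_all(feed, seen.append)
```

Here `seen.append` only collects the items for demonstration. A real importer would process each item and let it go, which is what keeps memory use roughly flat as the file grows.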

  • Topcat

    Topcat - 2014-04-28

    XML would be a useful format, as we have data fields with line breaks and these do not import well (if at all) with CSV.

     
  • fabio

    fabio - 2015-07-09

    XML is a good format, because sometimes I have files with special characters in the fields, such as pipes, double quotes and more, and these cause very big problems when parsing the CSV fields.
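For reference, standard CSV quoting can carry line breaks, pipes and double quotes inside a field, provided both the writer and the reader follow the same quoting rules. The sketch below uses Python's `csv` module purely as an illustration, with made-up sample data. Whether Magmi's own CSV reader accepts files quoted this way is an assumption that this thread does not confirm.

```python
import csv
import io

# Made-up sample data: the description holds a line break, double quotes,
# a comma and a pipe.
rows = [
    ["sku", "description"],
    ["A-1", 'Line one\nLine two with a "quoted" word, a comma and a | pipe'],
]

# Write with every field quoted; embedded double quotes are doubled.
buffer = io.StringIO(newline="")
csv.writer(buffer, quoting=csv.QUOTE_ALL).writerows(rows)
csv_text = buffer.getvalue()

# Read it back; the quoted field spans two physical lines in the file.
parsed = list(csv.reader(io.StringIO(csv_text, newline="")))
```

The practical problems usually come from exporters that do not quote such fields, or from readers configured with a different delimiter or enclosure character than the file actually uses.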

     
