
#1232 Slow import with REPLACE statement

Status: Fixed
Owner: nobody
Labels: Performance (9)
Severity: Problematic
Type: Defect
Updated: 2009-11-16
Created: 2009-06-18
Creator: Anonymous
Private: No

Originally created by: klchi... (code.google.com)@gmail.com

Hi,

I don't know whether this issue has been reported before.

I have a file of about 23 MB, or 215,000 lines.

The queries use the "REPLACE INTO" statement. Even though I had
dropped the table first, the import was very slow, taking about
1.5 hours to complete. Is there any way to improve this?

I tested that the speed improves a lot with the "INSERT INTO"
statement.

Regards,
KL Chin

Discussion

  • Anonymous

    Anonymous - 2009-06-18

    Originally posted by: rosenfie... (code.google.com)@gmail.com

    > anyway to improve this

    Definitely.  The first step would be to find a way to profile the code so that we
    actually identify the bottlenecks.  Off the top of my head, I'm not sure how to
    efficiently profile Delphi Pascal code.

    > Tested the speed will improve alot with "Insert Into" statement.

    Any improvement you see from doing that means it's the server that performs REPLACE
    poorly/slowly.  For example, I could imagine that it does a SELECT for every
    REPLACE, instead of keeping, say, hash tables of unique keys in memory to compare
    against on multiple REPLACEs.
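    As an illustration (my own sketch, not MySQL's or HeidiSQL's actual code), here is
    the difference in per-row work, using Python's sqlite3 module, whose INSERT OR
    REPLACE is the closest stand-in for MySQL's REPLACE INTO: a REPLACE must look up
    the unique key and delete any existing row before inserting, whereas a plain
    INSERT just writes.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id INTEGER PRIMARY KEY, val TEXT)")

# Plain INSERT: one write per row, no prior-row handling needed.
conn.execute("INSERT INTO t VALUES (1, 'first')")

# REPLACE semantics: check the unique key, delete the old row, then insert -
# roughly a lookup + delete + insert for every single statement.
conn.execute("INSERT OR REPLACE INTO t VALUES (1, 'second')")

rows = conn.execute("SELECT id, val FROM t").fetchall()
print(rows)  # [(1, 'second')] - the old row was replaced, not duplicated
```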

    Labels: Severity-Problematic Performance
    Status: Accepted

     
  • Anonymous

    Anonymous - 2009-06-19

    Originally posted by: klchi... (code.google.com)@gmail.com

    Hi Rosenfield Albert,

    IMHO, as a quick workaround: if the database has been dropped, or the script
    deletes the data first, why not change the REPLACE to INSERT? From what I
    noticed, the difference was about 10 minutes compared to 1.5 hours.

    If you need the SQL scripts, I can send them to you.

    Regards,
    KL Chin

     
  • Anonymous

    Anonymous - 2009-06-20

    Originally posted by: rosenfie... (code.google.com)@gmail.com

    Uhm, that doesn't make sense at all.  If you, in the export dialog, choose:

    - "replace data", then TRUNCATE + INSERT statements are generated.
    - "update existing data", then REPLACE statements are generated.

    Similarly if you choose to drop and recreate the entire table, then INSERT is always
    used.

    There's simply no possible way of doing something stupid such as what you suggest in
    comment #3, creating a combination of a drop or truncate and REPLACE, in the current
    export GUI.

    And even if there were, that wouldn't be the subject of this issue - this issue as
    originally submitted is about REPLACE INTO being slow.

     
  • Anonymous

    Anonymous - 2009-06-22

    Originally posted by: klchi... (code.google.com)@gmail.com

    I think the problem is how often people check the queries before starting to
    import them into the database.  Besides, sometimes the export file is produced
    by a novice (a client); if they can export it at all, we should be very happy
    about it.

    BTW, wouldn't it be nice to have a smarter import that auto-scans the script
    during import and indicates when the import file is too huge and will take
    some time, or suggests generating a new import script with the appropriate
    export dialog option?

    Regarding the subject line, I am not sure of the best wording, but it just
    helps me describe my problem/issue in general.

     
  • Anonymous

    Anonymous - 2009-06-22

    Originally posted by: rosenfie... (code.google.com)@gmail.com

    > sometime the export file was done by novice (client),
    > if they can export it we should be very happy it about it

    Guess so.

    The function could be renamed so that 'export as sql' is called 'script to sql'.
    That way it doesn't sound nearly as much like a pure model/data export.

    A larger improvement would be to add a pure model/data export.  The question is
    which data format to use.  CSV doesn't include the schema, plus the de facto
    specification (Microsoft Office) is quite ambiguous, e.g. NULL and empty string
    are equivalent, the header line is optional, etc.  frm/MYD is very convenient,
    at least for MySQL databases.  XML is quite wasteful, but could be used in
    gzipped format.  For either of these, an import function would be needed too,
    of course.
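    The CSV ambiguity is easy to demonstrate with Python's csv module (an
    illustrative sketch, not HeidiSQL code): a SQL NULL and an empty string
    serialize to the same bytes, so the distinction is lost on a round-trip.

```python
import csv
import io

buf = io.StringIO()
writer = csv.writer(buf)
writer.writerow(["a", None])  # second field is a SQL-style NULL
writer.writerow(["b", ""])    # second field is an empty string

# Both rows serialize identically - the NULL/empty distinction is gone.
lines = buf.getvalue().splitlines()
print(lines)  # ['a,', 'b,']

# Reading back, both fields come out as empty strings.
restored = list(csv.reader(io.StringIO(buf.getvalue())))
print(restored)  # [['a', ''], ['b', '']]
```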

    Anyway.

    > would it be nice to have a smarter import, where it will
    > auto scan the script during import,

    For sure.  Other intelligent add-ons could be:

    - switch method on import (REPLACE INTO --> INSERT INTO, or the other way around)
    - map to another target (e.g. a different database name) on import
    - skip some objects when more than one table is stuffed into the same SQL file
    - rewrite from one-row-per-INSERT-statement (ANSI SQL) to
      multiple-rows-per-INSERT-statement (MySQL) syntax
    - rewrite to use MySQL protocol-level prepared statements (skips sending query
      logic for each INSERT, only sends data; also skips quoting and unquoting
      values, and sends values in binary format instead of string format over the
      network)

    Lots of good ideas, mostly a question of finding someone who has time to flesh out
    the algorithms and codify them.
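    The one-row-per-INSERT to multiple-rows-per-INSERT rewrite could be sketched
    like this (a naive illustration; a real implementation would need a proper SQL
    tokenizer to cope with parentheses and quotes inside values):

```python
import re

def batch_inserts(statements, batch_size=1000):
    """Combine consecutive single-row INSERTs into multi-row MySQL syntax:
    INSERT INTO t VALUES (...), (...), ...;"""
    # Naive pattern: assumes values contain no tricky quoting.
    pat = re.compile(r"INSERT INTO (\S+) VALUES\s*(\(.*\));?$", re.IGNORECASE)
    batched, table, values = [], None, []

    def flush():
        if values:
            batched.append("INSERT INTO %s VALUES %s;" % (table, ", ".join(values)))
            values.clear()

    for stmt in statements:
        m = pat.match(stmt.strip())
        if not m:
            flush()
            batched.append(stmt)  # pass non-INSERT statements through unchanged
            continue
        if m.group(1) != table or len(values) >= batch_size:
            flush()
            table = m.group(1)
        values.append(m.group(2))
    flush()
    return batched

script = [
    "INSERT INTO t VALUES (1, 'a');",
    "INSERT INTO t VALUES (2, 'b');",
    "INSERT INTO t VALUES (3, 'c');",
]
print(batch_inserts(script))
# ["INSERT INTO t VALUES (1, 'a'), (2, 'b'), (3, 'c');"]
```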

    Patches are accepted graciously, by the way ;-).

     
  • Anonymous

    Anonymous - 2009-06-25

    Originally posted by: klchi... (code.google.com)@gmail.com

    Hi,

    With the existing import file, I replaced the keyword "REPLACE" with
    "INSERT", and the speed was slow too; the Queries and Affected Rows counters
    incremented one by one, and in 2 minutes only about 1,500 records were
    inserted.

    BTW, can the source code be compiled with Delphi 5?

     
  • Anonymous

    Anonymous - 2009-06-25

    Originally posted by: rosenfie... (code.google.com)@gmail.com

    > replacing the keyword "REPLACE"
    > with "INSERT", the speed was slow too

    Okay, in that case it sounds like the main problem is probably a large per-query
    overhead in HeidiSQL.

    > BTW, cannot the source code compile with Delphi 5?

    Hmm, no, looking at the source code repository, it seems that at the moment packages
    are only maintained for Delphi 11.

     
  • Anonymous

    Anonymous - 2009-06-25

    Originally posted by: klchi... (code.google.com)@gmail.com

    Hmm, any plans to make the code compile with Delphi 5?

     
  • Anonymous

    Anonymous - 2009-06-25

    Originally posted by: rosenfie... (code.google.com)@gmail.com

    Delphi 5 plans:  Not at all.  The project used to maintain packages for older
    compilers, but it needed constant decision making about what to do each time
    something was implemented that required a feature which an old version didn't have.
    Plus it was also a bit of a hassle to synchronize changes into the different project
    files manually.

    Ansgar managed to maintain the Subversion commit history when the project was moved
    to Google Code, so you can still find these old packages by doing "svn log" on each
    of the "packages" directories.  For example see [r1056].

    There used to be various versions of Delphi 11 available for download from the
    manufacturer on the Internet, as trial versions and various offers.  The product has
    changed owner a couple of times now, so I'm not sure what the current situation is,
    but you can probably still find a Torrent of some of the previously published
    downloads.

     

    Related

    Commit: [r1056]

  • Anonymous

    Anonymous - 2009-11-04

    Originally posted by: a...@anse.de

    klchin66, I guess the new database layer tuned performance here, so it might be
    worth trying the latest build to see if such an import is still as slow as you
    mentioned.

    On my machine I just started to import a 50 MB / 1 million row file, which was at
    17,000 rows after 1 minute. I don't think there is much room for further speed
    optimization, but we'll see.

    Status: NeedInfo

     
  • Anonymous

    Anonymous - 2009-11-04

    Originally posted by: klchi... (code.google.com)@gmail.com

    Hi Ansgar Becker,

    I have verified the new release 2717; it looks like the issue is solved.

    * New SQL export with the replace option is fast - OK (a 17 MB file takes about
      30 sec)

    * Using the old script with an "INSERT" for every row is still very slow
      (22 MB file)

    BTW, can the export performance be improved (17+ MB files take about 3-4 min)?

    TQ.

     
  • Anonymous

    Anonymous - 2009-11-04

    Originally posted by: a...@anse.de

    Thanks for the update!

    One INSERT per row is always slow; no wonder. There won't be much room to
    optimize that.

    The export dialog is currently being rewritten and tuned.
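    The per-row cost here is mostly per-statement overhead (parsing, dispatch,
    round-trips), which is why client APIs batch rows. A minimal sketch with
    Python's sqlite3 module (the same idea applies to any driver with an
    executemany-style call):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id INTEGER PRIMARY KEY, val TEXT)")

rows = [(i, "row %d" % i) for i in range(1000)]

# Slow pattern: one statement per row, paying the overhead 1000 times.
# for r in rows:
#     conn.execute("INSERT INTO t VALUES (?, ?)", r)

# Batched pattern: the statement is prepared once and reused for every row.
conn.executemany("INSERT INTO t VALUES (?, ?)", rows)
conn.commit()

count = conn.execute("SELECT COUNT(*) FROM t").fetchone()[0]
print(count)  # 1000
```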

    Status: Accepted

     
  • Anonymous

    Anonymous - 2009-11-11

    Originally posted by: a...@anse.de

    Export performance should be slightly faster in [r2735]. But there's still much room
    for optimization. I just tested the half ready new export dialog with different logic
    and simpler code, and it's ~4 times faster than the old one.

     

    Related

    Commit: [r2735]

  • Anonymous

    Anonymous - 2009-11-11

    Originally posted by: klchi... (code.google.com)@gmail.com

    Hi Ansgar Becker,

    I have downloaded [r2739], but I cannot see an improvement; it still takes
    about 3-4 min for 17+ MB files, 13 tables.

    Regards,
    KL Chin

     

    Related

    Commit: [r2739]

  • Anonymous

    Anonymous - 2009-11-12

    Originally posted by: a...@anse.de

    Testing with my half ready export stuff in the table tools dialog:

    exporting a 20 MB table, 7 columns, 100,000 rows:
    * New dialog: 4 seconds
    * Current dialog: 1:45 minutes

    exporting a tall 40 MB table, 3 columns, 1 million rows:
    * New dialog: 13 seconds
    * Current dialog: 2:10 minutes

    I hope adding various features won't break this good performance again, like it
    happened in the past.

     
  • Anonymous

    Anonymous - 2009-11-12

    Originally posted by: klchi... (code.google.com)@gmail.com

    Hi Ansgar Becker,

    Tested with [r2744]; it took about 3 min, which I think is a slight improvement.
    BTW, are there any logs that display the performance for all the tables?
    Is there any performance difference between storage engines, i.e. exporting
    from MyISAM vs. InnoDB? I am using MyISAM.

     

    Related

    Commit: [r2744]

  • Anonymous

    Anonymous - 2009-11-12

    Originally posted by: a...@anse.de

    No - InnoDB and MyISAM behave the same when doing a "SELECT * FROM tbl LIMIT x, y".
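    The chunked "SELECT * FROM tbl LIMIT x, y" pattern can be sketched as follows
    (illustrative Python over sqlite3, which accepts the same MySQL-style
    "LIMIT offset, count" syntax; note that large offsets get progressively slower
    on any engine, since the rows before the offset still have to be skipped):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tbl (id INTEGER PRIMARY KEY, val TEXT)")
conn.executemany("INSERT INTO tbl VALUES (?, ?)",
                 [(i, str(i)) for i in range(250)])

def export_chunks(conn, chunk_size=100):
    """Fetch a table in LIMIT offset, count chunks, as an exporter might."""
    offset, chunks = 0, []
    while True:
        rows = conn.execute("SELECT * FROM tbl LIMIT ?, ?",
                            (offset, chunk_size)).fetchall()
        if not rows:
            break
        chunks.append(rows)
        offset += chunk_size
    return chunks

chunks = export_chunks(conn)
print([len(c) for c in chunks])  # [100, 100, 50]
```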

     
  • Anonymous

    Anonymous - 2009-11-16

    Originally posted by: klchi... (code.google.com)@gmail.com

    Hi Ansgar Becker,

    Great, it's really fast now.
    The only problem is that the GUI changed too much.

    Regards,
    KL Chin

     
  • Anonymous

    Anonymous - 2009-11-16

    Originally posted by: a...@anse.de

    Thanks! There are at least two or three things which have to be fixed again,
    e.g. a memory leak and such.

    > the GUI changes were too much
    Could you be more specific?

     
  • Anonymous

    Anonymous - 2009-11-16

    Originally posted by: klchi... (code.google.com)@gmail.com

    Hi Ansgar Becker,

    What I meant was that the GUI is so different now that it is hard for some
    non-IT people to use.

     