Menu

How to INSERT 100's of blobs

2020-03-27
2022-02-09
  • Paul Higgins

    Paul Higgins - 2020-03-27

    I have 1000s of blobs that I want to INSERT into a table. Currently I do one INSERT for each blob which takes forever. Is there a way I can insert all the blobs with one INSERT? Thanks again.

     
  • Paul Higgins

    Paul Higgins - 2020-03-27

    I should add that each blob is a jpg in a file.

     
  • Paul Higgins

    Paul Higgins - 2020-03-27

    Jon,
    I am using the oorexx squlite interface that is why I posted here, Heere is my present INSERT:

    -- insert this blob
       sql = "INSERT INTO tiles (zoom_level,tile_column,tile_row,tile_data) VALUES (?, ?, ?, ?);"
       stmt = .ooSQLiteStmt~new(dest, sql)
       if stmt~initCode <> 0 then do 
          msg = 'Error: ooSQLiteStmt INSERT initialization error:'stmt~initCode stmt~lastErrMsg
          self~logmsg(msg)
          stmt~finalize
          dest~close
          signal returnit
       end    
       rc = stmt~~reset~~bindInt(1, zm)
       rc = stmt~~reset~~bindInt(2, column)
       rc = stmt~~reset~~bindInt(3, y)
       rc = stmt~~reset~~bindBlob(4, blob_data)
       stmt~step
       stmt~finalize
    
     
    • Jon Wolfers

      Jon Wolfers - 2020-03-28

      Thanks Paul,

      I've not used the sqlite package in ooRexx so I can't give you a definitive
      answer, but I wonder if you have sufficient memory whether building a
      single mega-sql statement in rexx and forgoing the binds might improve your
      performance. You might try something like this:

      values = .array~new
      do tile over tiles
      values~append("("tile~zm,tile~column,tile~y,"'"tile~blob_data"')")
      end

      sql = "INSERT INTO tiles (zoom_level,tile_column,tile_row,tile_data)
      VALUES" values~makestring('l',',')
      stmt = .ooSQLiteStmt~new(dest, sql)
      if stmt~initCode <> 0 then do
      msg = 'Error: ooSQLiteStmt INSERT initialization error:'stmt~initCode
      stmt~lastErrMsg
      self~logmsg(msg)
      stmt~finalize
      dest~close
      signal returnit
      end
      stmt~step --< not sure if you need this now
      stmt~finalize --< not sure if you need this now

      Of course the cost of this is if you have bad data everything fails and it
      might perform worse than binding the rows one at a time.

      Please let us know how you get on

      Jon

      On Fri, 27 Mar 2020 at 21:29, Paul Higgins phiggins-99@users.sourceforge.net wrote:

      Jon,
      I am using the oorexx squlite interface that is why I posted here, Heere
      is my present INSERT:

      -- insert this blob
      sql = "INSERT INTO tiles (zoom_level,tile_column,tile_row,tile_data) VALUES (?, ?, ?, ?);"
      stmt = .ooSQLiteStmt~new(dest, sql)
      if stmt~initCode <> 0 then do
      msg = 'Error: ooSQLiteStmt INSERT initialization error:'stmt~initCode stmt~lastErrMsg
      self~logmsg(msg)
      stmt~finalize
      dest~close
      signal returnit
      end
      rc = stmt~~reset~~bindInt(1, zm)
      rc = stmt~~reset~~bindInt(2, column)
      rc = stmt~~reset~~bindInt(3, y)
      rc = stmt~~reset~~bindBlob(4, blob_data)
      stmt~step
      stmt~finalize


      How to INSERT 100's of blobs
      https://sourceforge.net/p/oorexx/discussion/408477/thread/978d52ec75/?limit=25#1982


      Sent from sourceforge.net because you indicated interest in
      https://sourceforge.net/p/oorexx/discussion/408477/

      To unsubscribe from further messages, please visit
      https://sourceforge.net/auth/subscriptions/

       
  • Paul Higgins

    Paul Higgins - 2020-03-29

    Jon,

    Thanks for the suggestion. I already tried something similiar where I loaded all the blobs into a directory but as you warned this blew all the memory and caused the program to crash. There were about 50,000 blobs each reperesenting a small jpg.

    I was hoping there might be an INSERT that contained all the filespecs of the jpgs. But I guess this doesn't exist and it probably wouln't be much faster than getting each blob and doing an INSERT.

    What I am going to try next it do the SQL calls in C++ in a routine in my dll.

     
  • Paul Higgins

    Paul Higgins - 2020-03-29

    Just to let you know that doing the calls using C++ was a bit faster but still way to slow. I was still reading in the blobs one by one and inserting them.
    But this gives me an idea: I could beak up the list of blob filesepes into into smaller lists and the dispatch separate threads for each list.
    This brings up a question: can separate threads insert rows into the same table at the same time? Anyone know?

     
  • Paul Higgins

    Paul Higgins - 2020-03-29

    Thanks for the info Rony.
    My approach is now to break up the blob filesecs into into smaller lists and dispatch a thread for each list that creates a temporary sql db with the same table definition.
    Once all the threads have finished. I attach each of the temporary databases and then use the:
    INSERT INTO tiles SELECT * FROM source.tiles;
    to merge them all into the main database table. I found that insert to be very fast.
    This reduced the time to process 57,000 blobs from 13 minutes using one thread down to 4.5 minutes using 23 threads.
    Increasing the number of threads to 115 didn't improve the time.

     
  • turgut kalfaoglu

    Um, is your Rexx script and your database server close to each other? if not, you are probably waiting for the JPG upload to finish..
    Maybe you could have Rexx create the INSERT statments and the JPG's, and upload the whole thing to the server and run the script there? Might be faster.

     
  • Paul Higgins

    Paul Higgins - 2022-02-09

    I don't use servers. It just runs on a Windows PC.

     

Log in to post a comment.