[Cppcms-users] Fast bulk-insert into SQLITE3 using cppDB
Brought to you by:
artyom-beilis
From: Vizcayno T. <viz...@gm...> - 2011-08-20 22:19:41
|
**************** This replace to previous post because the display format is ugly *************************** Hello: I show you the main part of a C++ which loads a 540MB delimited ("|") text file into an SQLite3 db using last cppDB (trunk 1875): int main () { Timer t; cout << t.ShowStart() << endl; ifstream input("in1.txt"); const size_t size = 512 * 1024; char buffer[size]; string* f = new string[10]; cppdb::session sql("sqlite3:db=db.db; Page Size=65536; Cache Size=65536; Synchronous=Off; Journal Mode=Off;"); sql << "DROP TABLE IF EXISTS ttb" << cppdb::exec; sql << "CREATE TABLE ttb (alm varchar(4), alm2 varchar(4), \ mat varchar(18), ser varchar(20), fac varchar(18), almttb varchar(10), \ cant integer, sts varchar(1), ser_2_20 varchar(20), rowid_sap integer, \ stsmat varchar);" << cppdb::exec; cppdb::transaction guard(sql); cppdb::statement stat; stat = sql << "INSERT INTO ttb(alm,alm2,mat,ser,fac,almttb,cant,sts,ser_2_20,rowid_sap,stsmat) " "VALUES(?,?,?,?,?,?,?,?,?,?,?)"; int cont = 0; int regLei = 0; while (input) // Read a text file with fields separated by "|" { input.read(buffer, size); size_t readBytes = input.gcount(); for (size_t i = 0; i < readBytes; i++) { if (*(buffer+i) == '\n') // || *(buffer+i) == '\r') { if (regLei++ > 0) // Don't save first line, it's header { stat.reset(); stat.bind(f[0]); stat.bind(f[1]); stat.bind(f[2]); stat.bind(f[3]); stat.bind(f[4]); stat.bind(f[5]); stat.bind(f[7]); stat.bind_null(); stat.bind_null(); stat.bind_null(); stat.bind(f[9]); stat.exec(); } for (int j=0;j<10;f[j++]=""); // Reset string arrays of fields cont = 0; } else if (*(buffer+i) != '|') f[cont] += *(buffer+i); // Build each field char by char else cont++; // Build data on next string of the array f } } guard.commit(); t.Stop(); cout << t.ShowEnd() << endl; cout << "Executed in: " << t.ElapsedSeconds() << " seconds." << endl; input.close(); return 0; } Please, be patient with my C++ style. There is a part of the code that is trying to read the file as fast as possible (using a buffer), but I can't get to save the data into Sqlite at the same speed that I read. In the code you can see and attemp to accelerate the writing but, I am not sure if it's working because the journal file continues generating data, I am sure I am doing something wrong: cppdb::session sql("sqite3:db=db.db; Page Size=65536; Cache Size=65536; Synchronous=Off; Journal Mode=Off;"); My questions are: 1) What is bad in the cppdb::session syntax I wrote? 2) Checking the code, can you find a way to speed-up the writing of the data into the SQLite DB (I am sure that the reading is very fast), the command compilation in VS2010 is: cl /EHsc /Ox /Ob2 /Oi for both, the cppDB and my C++ program; i'm under windows 7 and program generation is for 32 bits. 3) I also wrote a program in C# and it is always 10-15 seconds faster than my C++ and, as a curiosity, the file size of the database running C# is 386MB and after running the C++ the result size is 408MB (one possible reason because C++ is slower than my C# program); the records and fields generated are the same in both databases and I'm breaking my head trying to figure out the reason of this difference in sizes; may be you know possible reasons? Many thanks for your attention. |