Menu

Update large file, 3-4 million records

Help
Sergei
2019-12-06
2019-12-07
  • Sergei

    Sergei - 2019-12-06

    Hey.
    Sorry for my English.
    Faced such a problem, I was looking for a solution for 2-3 days, but could not find.
    3-4 million records need to be updated.
    I tried through update, but after a long wait, it gives an error about out of memory.
    Code like this, in different variations
    Correct me somewhere

    Connection conn = DriverManager.getConnection
                    ("jdbc:ucanaccess://D:/Conv2/test.mdb;memory=false", null, null);
             String sql = "UPDATE items SET col = 3";
             System.out.println("conn rdy");
             PreparedStatement st = conn.prepareStatement(sql);
             System.out.println("st rdy");
             st.execute();
             conn.close();
    

    I also tried updating records by adding a column with a default value.

    ALTER TABLE items ADD col int DEFAULT 5
    

    The program runs for a long time and if it is interrupted, the data is added.
    Is there any way to update this number of records within 10 minutes.

    There is an idea of copying all the data to a new file, with the addition of a new field when overwriting, but I'm not sure that it will be faster.
    Thanks

     
  • Sergei

    Sergei - 2019-12-07

    Found a way, but it doesn’t suit me.
    I transfer data and add new data to the column.

    String columnName = "";
            String forSqlData = "";
            String sqlData = "";
            int i,y,columnDate=0;
            final int batchSize = 100000;
            int count = 0;
            PreparedStatement pst;
            // UCanAccess
            try { Connection conn = DriverManager.getConnection
                    ("jdbc:ucanaccess://D:/Conv2/test.mdb;memory=false", null, null);
                Statement stmt = conn.createStatement();
                ResultSet rs = stmt.executeQuery("SELECT * FROM items WHERE 0=1");
                ResultSetMetaData rsmd = rs.getMetaData();
                System.out.println("Column names as reported by ResultSetMetaData:");
                for (i = 1; i <= rsmd.getColumnCount(); i++) {
                    if (rsmd.getColumnName(i).equals("col")) {
                        columnDate=i;
                    }
                    columnName=columnName+rsmd.getColumnName(i)+",";
                    forSqlData = forSqlData+"?,";
                    }
                columnName = StringUtils.substring(columnName, 0, columnName.length() - 1);
                sqlData = "INSERT INTO items ("+columnName+") values("+forSqlData;
                sqlData = StringUtils.substring(sqlData, 0, sqlData.length() - 1);
                rs.close();
                String sql = "SELECT * FROM items";
                ResultSet rst = stmt.executeQuery(sql);
                conn = DriverManager.getConnection
                        ("jdbc:ucanaccess://D:/Conv2/test2.mdb;memory=false", null, null);
                pst = conn.prepareStatement(sqlData+")");
                while (rst.next()) {
                    y=1;
                    while(y<i) {
                        if (y==columnDate) {
                                pst.setString(y, "777");
                                } 
                                else
                                {
                                pst.setString(y, rst.getString(y));
                                }
                    y++;            
                    }
                    pst.addBatch();
                    if(++count % batchSize == 0) {
                        pst.executeBatch();
                    }
                }
                pst.executeBatch();
                rst.close();
                conn.close();
            } catch (SQLException e) {
                e.printStackTrace(System.out);
            }
    

    5 columns with 4.5 million rows + 1 new column, processed in 7-8 minutes,but it's a long time.
    If I do something wrong, I will be glad to know how to do it right and better :) Thanks

     

Log in to post a comment.

Want the latest updates on software, tech news, and AI?
Get latest updates about software, tech news, and AI from SourceForge directly in your inbox once a month.