On Tue, 2003-08-05 at 21:35, Dave Poirier wrote:
> I'm trying to import around 7 million rows from one database connection
> to another. Due to the enormous quantity of data, displaying some
> progress to the user is a basic requirement. I've tried various methods
> but it seems that SQLObject, even when using the _connection directly,
> fetches all the data from the database connection before returning from
> the execute statement.
>
> #! /usr/local/bin/python -u
>
> from myclasses import *
>
> connection = X_sqlobject._connection
> c = connection.getConnection()
> print "selecting..."
> c.execute('SELECT * FROM remote_table;')
> print "fetching..."
> print c.fetchone()
>
> I get the "selecting..." right away but it takes some serious time, with
> heavy network activity before "fetching..." ever appears. Is there any
> way to use the database connection so as to be able to display progress
> while the records are being fetched?
I presume from the example that it's actually the database driver that's
taking a long time for you -- or perhaps the underlying database takes a
long time to prepare the results (which is quite likely). Did Frank's
suggestion to batch (i.e., using limit/offset) help? If it's the
database that is the bottleneck, it might not help (and could even be
worse), since the database may take a long time to work out which
portion of the results to send. Anyway, curious how it turned out.
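
For reference, here is a rough sketch of the limit/offset batching idea
with progress reporting. It is only an illustration under assumptions:
it uses in-memory sqlite3 databases as stand-ins for the two real
connections, and the column names (id, payload), the local_table name,
and the copy_in_batches helper are all made up for the example. Your
driver's parameter style and SQL dialect may differ.

```python
import sqlite3


def copy_in_batches(src, dst, batch_size=1000, report=print):
    """Copy remote_table into local_table in LIMIT/OFFSET batches.

    Hypothetical helper: table and column names are assumptions.
    Calls report() once per batch so the user sees progress.
    """
    total = src.execute("SELECT COUNT(*) FROM remote_table").fetchone()[0]
    copied = 0
    while copied < total:
        # ORDER BY keeps the batches stable; without it, LIMIT/OFFSET
        # may return overlapping or missing rows between queries.
        rows = src.execute(
            "SELECT id, payload FROM remote_table "
            "ORDER BY id LIMIT ? OFFSET ?",
            (batch_size, copied),
        ).fetchall()
        if not rows:
            break  # table shrank while we were copying
        dst.executemany(
            "INSERT INTO local_table (id, payload) VALUES (?, ?)", rows
        )
        dst.commit()
        copied += len(rows)
        report("%d/%d rows copied" % (copied, total))
    return copied


# Stand-in databases so the sketch runs on its own.
src = sqlite3.connect(":memory:")
dst = sqlite3.connect(":memory:")
src.execute("CREATE TABLE remote_table (id INTEGER PRIMARY KEY, payload TEXT)")
src.executemany(
    "INSERT INTO remote_table VALUES (?, ?)",
    [(i, "row %d" % i) for i in range(2500)],
)
src.commit()
dst.execute("CREATE TABLE local_table (id INTEGER PRIMARY KEY, payload TEXT)")

messages = []
copied = copy_in_batches(src, dst, batch_size=1000, report=messages.append)
```

Note that each batch is a separate query, so if the database has to
rescan to find the offset, later batches get slower. That is the case
where batching may not help.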
Ian