Re: [cx-oracle-users] Help with fetchraw

SourceForge Headquarters 1320 Columbia Street Suite 310 San Diego, CA 92101 +1 (858) 422-6466

On 5/17/05, Chris Dunscombe <cdu...@ya...> wrote:
>=20
> --- Anthony Tuininga <ant...@gm...> wrote:
> > 1) You should be able to do something like this without the need to
> > descend into fetchraw().
> >
> > import cx_Oracle
> >
> > fromConnection =3D cx_Oracle.Connection("user/pw@tns")
> > toConnection =3D cx_Oracle.Connection("user/pw@tns")
> > fromCursor =3D fromConnection.cursor()
> > fromCursor.arraysize =3D 250 # set as desired
> > toCursor =3D toConnection.cursor()
> > toCursor.arraysize =3D 250 # set as desired, same as fromCursor
> >
> > fromCursor.execute("some query")
> > toCursor.setinputsizes() # set things up if nulls are allowed
> > while True:
> >     data =3D fromCursor.fetchmany()
> >     if not data:
> >         break
> >     toCursor.executemany("some insert statement", data)
> >     toConnection.commit() # if desired
> >
> > That should do it and should perform quite well. The fetchraw(),
> > bindvar.copy() and executemanyprepared() trio were created solely to
> > avoid the overhead of creating Python objects for the data. Its more
> > complicated than the above code but I can give that as an example as
> > well if you really want it.
> >
>=20
> I've done essentially the same as you've got above and yes it works fine =
with pretty good
> performance. I'm just trying to improve it further as there are cases whe=
n the app will be running
> a number of parallel streams each selecting and inserting millions of row=
s in a fixed window so
> saving 5% - 15% of the Python CPU time within this loop is definately wor=
th doing. I'd like a
> fetchraw example so I can benchmark it against my current approach. I'll =
then post the results so
> everyone can see the general sort of performance differences between the =
two.

The following is the code for using fetchraw(). My own preliminary
results show that in fact it is slower than the original method above
that I showed you and is more complicated to boot. The fact that the
data can be passed from the one cursor to the other directly and that
the bindvar.copy() command must be called for each bind variable for
each row makes the overhead of fetching Python objects less than the
overhead of calling the bindvar.copy() methods. The fetchraw() method
was considerably faster when lists of dictionaries needed to be passed
through to executemany().

import cx_Oracle

sourceConnection =3D cx_Oracle.Connection("user/pw@tns")
sourceCursor =3D sourceConnection.cursor()
sourceCursor.arraysize =3D 250
sourceVars =3D sourceCursor.execute("select * from testdata_source")

targetConnection =3D cx_Oracle.Connection("user/pw@tns")
targetCursor =3D targetConnection.cursor()
targetCursor.execute("truncate table testdata_copy")
targetCursor.arraysize =3D 250
targetVars =3D targetCursor.setinputsizes(cx_Oracle.NUMBER, cx_Oracle.DATET=
IME,
        cx_Oracle.NUMBER, cx_Oracle.STRING)
targetCursor.prepare("insert into testdata_copy values (:1, :2, :3, :4)")

vars =3D zip(sourceVars, targetVars)
indexes =3D range(sourceCursor.arraysize)
while True:
    rowsFetched =3D sourceCursor.fetchraw()
    print "fetched", rowsFetched, "rows"
    if not rowsFetched:
        break
    if rowsFetched !=3D sourceCursor.arraysize:
        indexes =3D range(rowsFetched)
    for index in indexes:
        for sourceVar, targetVar in vars:
            targetVar.copy(sourceVar, index, index)
    targetCursor.executemanyprepared(rowsFetched)
    targetConnection.commit()

Another method which also works but only when the exact data is being
replicated from one cursor to another is the following. It is faster
than both methods but limited in usefulness of course.

import cx_Oracle

sourceConnection =3D cx_Oracle.Connection("user/pw@tns")
sourceCursor =3D sourceConnection.cursor()
sourceCursor.arraysize =3D 250
sourceVars =3D sourceCursor.execute("select * from testdata_source")

targetConnection =3D cx_Oracle.Connection("user/pw@tns")
targetCursor =3D targetConnection.cursor()
targetCursor.execute("truncate table testdata_copy")
targetCursor.arraysize =3D 250
targetVars =3D targetCursor.setinputsizes(*sourceVars)
targetCursor.prepare("insert into testdata_copy values (:1, :2, :3, :4)")

while True:
    rowsFetched =3D sourceCursor.fetchraw()
    print "fetched", rowsFetched, "rows"
    if not rowsFetched:
        break
    targetCursor.executemanyprepared(rowsFetched)
    targetConnection.commit()

Hope this helps,

Anthony