While working with a largish (22,000 tuples, 2
columns) set of query results, I noticed that although
the query returns from postgres in a few seconds, it
was taking over a minute to populate the data.frame in
R. During that time R was using 97-99% of one
CPU.
This patch implements a new C function,
rpgsql_get_results, and an R function,
db.result.fetchall, which populates the data.frame
almost entirely within C.
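
As a rough illustration only (this is not the patch code), the idea of
building whole columns in a single pass in C can be sketched as below.
The mock_result type and the names mock_get_value, copy_string,
fetch_all_columns and free_columns are hypothetical stand-ins invented
for this sketch; the actual rpgsql_get_results is not reproduced here,
and the sketch uses plain C arrays rather than R's data structures.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a query result: cells are stored row by row.
   This is not the RPgSQL or libpq result type. */
typedef struct {
    int nrows;
    int ncols;
    const char **cells;   /* cells[row * ncols + col] */
} mock_result;

/* Hypothetical accessor: returns the text of one cell. */
static const char *mock_get_value(const mock_result *res, int row, int col)
{
    return res->cells[row * res->ncols + col];
}

/* Free a (possibly partially built) set of columns. */
static void free_columns(char ***cols, int nrows, int ncols)
{
    if (cols == NULL)
        return;
    for (int c = 0; c < ncols; c++) {
        if (cols[c] == NULL)
            continue;
        for (int r = 0; r < nrows; r++)
            free(cols[c][r]);          /* free(NULL) is a no-op */
        free(cols[c]);
    }
    free(cols);
}

/* Duplicate a NUL-terminated string; returns NULL on allocation failure. */
static char *copy_string(const char *s)
{
    size_t n = strlen(s) + 1;
    char *out = malloc(n);
    if (out != NULL)
        memcpy(out, s, n);
    return out;
}

/* Build one array of strings per column, entirely in C.
   Returns NULL on allocation failure; release with free_columns(). */
static char ***fetch_all_columns(const mock_result *res)
{
    assert(res != NULL && res->nrows >= 0 && res->ncols > 0);

    char ***cols = calloc((size_t)res->ncols, sizeof *cols);
    if (cols == NULL)
        return NULL;

    for (int c = 0; c < res->ncols; c++) {
        /* +1 so an empty result never asks for a zero-sized block */
        cols[c] = calloc((size_t)res->nrows + 1, sizeof *cols[c]);
        if (cols[c] == NULL)
            goto fail;
        for (int r = 0; r < res->nrows; r++) {
            cols[c][r] = copy_string(mock_get_value(res, r, c));
            if (cols[c][r] == NULL)
                goto fail;
        }
    }
    return cols;

fail:
    free_columns(cols, res->nrows, res->ncols);
    return NULL;
}
```

A caller would fill a mock_result, call fetch_all_columns once, and get
back one array per column, with cols[c][r] holding row r of column c.
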
On the slow machine described above, this made a
dramatic improvement: the whole R script now runs
in about 5 seconds. On a different (much faster)
machine I timed the difference for this same script
and data:

  db.result.fetchall   ~2 seconds
  db.fetch.result      ~9 seconds
This is my first attempt at an R extension, so please
take a good look.
Regards,
Joe Conway
joseph.conway@home.com