From: Colin S. <co...@ow...> - 2004-10-06 22:13:13
Hi Ian,

> > I've just started using SQLObject in a program that loads web server log
> > information into a MySQL database. The program consists of a tight loop
> > that reads a web server entry, parses the data, performs one SQL lookup,
> > and then inserts the data into a table.
> >
> > When I use the MySQL libraries directly the program handles ~290 rec/s.
> > When I use SQLObject to perform the lookup and SQL insert the rate drops
> > to 60 rec/s.
>
> For that use case, it's hard to say. Really the value of an ORM
> decreases when you are dealing with large datasets, especially when
> dealing with that data as a collection, like you would with web server
> logs. You're really going to want to deal with that data inside MySQL,
> not in Python -- e.g., to get a hit count, you'll want to run the
> appropriate SQL command, not load the rows and count them in Python.
> You can do a count in SQLObject, but there's lots of aggregate functions
> that you can't do, so you'll hit a wall.

Is it safe to use the DBAPI.getConnection() call to get the underlying
connection and use this mixed in with ordinary SQLObject calls? It would
be nice to be able to mix raw SQL and SQLObject as needed over the same
connection. I'm not sure how SQLObject's caching works, but there's
presumably a mechanism for invalidating results from individual tables?

> > Is it unrealistic to use SQLObject for DB interaction when handling
> > batch loads of data? I've done a quick profile of the code (top few
> > calls below) and nothing jumps out as being particularly easy to optimise...
> >
> >        ncalls  tottime  percall  cumtime  percall filename:lineno(function)
> >   15308/14042    2.500    0.000    4.430    0.000 converters.py:179(sqlrepr)
>
> It's interesting that sqlrepr is at the top. I'll have to think about
> how I'm using it. It's also been suggested that SQLObject rely on the
> database driver's quoting instead of doing its own. This may lend more
> weight to that opinion.

Using the db driver's quoting would also mean that the SQL can be cached
as prepared statements by the DB. For some (e.g. Oracle) that can lead
to a pretty significant speed up in itself.

Thanks for your reply,

Colin.
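
P.S. To make the driver-quoting point concrete, here is a minimal sketch of
what I mean. It is not SQLObject code: it uses the standard-library sqlite3
module as a stand-in for the MySQL driver so that it runs anywhere, and the
"hits" table and its columns are hypothetical. With MySQLdb the placeholder
would be %s rather than ?, but the idea is the same: the SQL text stays
constant and the driver binds the values, so nothing has to be quoted in
Python. The last queries also show the hit count being done in the database
rather than by loading rows.

```python
import sqlite3

# sqlite3 stands in for the MySQL driver here (an assumption, chosen so the
# sketch is self-contained); the table and column names are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE hits (host TEXT, path TEXT, status INTEGER)")

rows = [
    ("10.0.0.1", "/index.html", 200),
    ("10.0.0.2", "/it's-here.html", 404),  # embedded quote, no manual escaping
    ("10.0.0.1", "/about.html", 200),
]

# The SQL text is constant; the driver binds each row's values itself.
insert_sql = "INSERT INTO hits (host, path, status) VALUES (?, ?, ?)"
conn.executemany(insert_sql, rows)
conn.commit()

# Aggregate inside the database instead of counting rows in Python.
count_ok = conn.execute(
    "SELECT COUNT(*) FROM hits WHERE status = ?", (200,)
).fetchone()[0]

# The value with the embedded quote round-trips unchanged.
stored_path = conn.execute(
    "SELECT path FROM hits WHERE status = ?", (404,)
).fetchone()[0]

print(count_ok, stored_path)
```

Whether the server actually caches the statement depends on the driver and
database, so treat this as an illustration of the calling style only.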