Thread: [SQLObject] Re: SQL Object's scalability
SQLObject is a Python ORM.
Brought to you by:
ianbicking,
phd
From: Ian B. <ia...@co...> - 2003-11-12 20:19:59
|
On Nov 12, 2003, at 1:04 PM, Scott Chapman wrote: > Hi Ian, > A while back I asked how you get the inserted ID of a record in > PostgreSQL and Luke sent back: > > Depends on the database/dbdriver. You'll find the implementation in > SQlObject/DBConnection.py, _queryInsertID() for each > DBConnection-class. Some Python db interfaces provide a non-standard > option for getting this from the cursor (MySQL, SQLite), some require > a second query of the database (Postgres using OIDs, for instance). > > I recently saw a post on the Postgres mailing list on this so I put up > your code as I was asking for clarification. You might want to check > out the thread: > > http://archives.postgresql.org/pgsql-general/2003-11/msg00376.php > > especially the message: > > http://archives.postgresql.org/pgsql-general/2003-11/msg00565.php Hmm... Does anyone else have thoughts on this? Basically it's an issue that the query "SELECT %s FROM %s WHERE oid = %s" % (idName, table, c.lastoid()) may not be using an index on the oid= portion, which would be problematic. At one time SQLObject used nextval to pre-allocate the ID, then did the insert (which is what Firebird does now, and what Oracle and Sybase would do). The disadvantage of this is that you needed to know the name of the sequence, which would sometimes vary (and maybe there were truncation problems). The disadvantage with the current scheme (besides the index) is that you need to define your ID field as "DEFAULT nextval('somesequence')" (or do that implicitly through SERIAL). The current scheme is convenient, but I don't know if it's really worth it. Should we go back to preallocating an ID through a nextval(sequence)? Really, to facilitate non-integer IDs, it would be nice if we could define a generic ID generator (which may or may not access the database). But I'm not really clear about all the use cases for that, so it's probably not a good idea to generalize that yet. -- Ian Bicking | ia...@co... | http://blog.ianbicking.org |
From: Scott C. <sco...@mi...> - 2003-11-12 20:30:43
|
On Wednesday 12 November 2003 12:19, Ian Bicking wrote: > On Nov 12, 2003, at 1:04 PM, Scott Chapman wrote: > > Hi Ian, > > A while back I asked how you get the inserted ID of a record in > > PostgreSQL and Luke sent back: > > > > Depends on the database/dbdriver. You'll find the implementation in > > SQlObject/DBConnection.py, _queryInsertID() for each > > DBConnection-class. Some Python db interfaces provide a > > non-standard option for getting this from the cursor (MySQL, > > SQLite), some require a second query of the database (Postgres > > using OIDs, for instance). > > > > I recently saw a post on the Postgres mailing list on this so I put > > up your code as I was asking for clarification. You might want to > > check out the thread: > > > > http://archives.postgresql.org/pgsql-general/2003-11/msg00376.php > > > > especially the message: > > > > http://archives.postgresql.org/pgsql-general/2003-11/msg00565.php > > Hmm... Does anyone else have thoughts on this? Basically it's an > issue that the query "SELECT %s FROM %s WHERE oid = %s" % (idName, > table, c.lastoid()) may not be using an index on the oid= portion, > which would be problematic. It's also an issue where not all tables have OID's these days. I posted a followup question on the PostgreSQL mailing list about the need for the front end (SQLObject in this case) to know "too much" about the database's structure and asked if there's an easy way to fix this. So far no good answers appear. > At one time SQLObject used nextval to pre-allocate the ID, then did > the insert (which is what Firebird does now, and what Oracle and > Sybase would do). The disadvantage of this is that you needed to > know the name of the sequence, which would sometimes vary (and maybe > there were truncation problems). The disadvantage with the current > scheme (besides the index) is that you need to define your ID field > as "DEFAULT nextval('somesequence')" (or do that implicitly through > SERIAL). What do you mean "truncation problems"? > The current scheme is convenient, but I don't know if it's really > worth it. Should we go back to preallocating an ID through a > nextval(sequence)? I see this as the only way to do it effectively as things are now. This is not a very elegant solution but that's the way it goes. The convenience could be made the same for an end-user of the SQLObject, perhaps. You could also make it so the user had more than one option here. (I am very fond of user-togglable in cases like this). > Really, to facilitate non-integer IDs, it would be nice if we could > define a generic ID generator (which may or may not access the > database). But I'm not really clear about all the use cases for > that, so it's probably not a good idea to generalize that yet. A generic ID generator in the front end (not the database itself)? That sounds like a Bad Thing, in that you'd better be SURE that no other application inserts anything into the database because it won't have access to your generator to generate a unique ID. |
From: Ian B. <ia...@co...> - 2003-11-12 20:51:36
|
On Nov 12, 2003, at 2:30 PM, Scott Chapman wrote: >> Hmm... Does anyone else have thoughts on this? Basically it's an >> issue that the query "SELECT %s FROM %s WHERE oid = %s" % (idName, >> table, c.lastoid()) may not be using an index on the oid= portion, >> which would be problematic. > > It's also an issue where not all tables have OID's these days. > > I posted a followup question on the PostgreSQL mailing list about the > need for the front end (SQLObject in this case) to know "too much" > about the database's structure and asked if there's an easy way to fix > this. So far no good answers appear. > >> At one time SQLObject used nextval to pre-allocate the ID, then did >> the insert (which is what Firebird does now, and what Oracle and >> Sybase would do). The disadvantage of this is that you needed to >> know the name of the sequence, which would sometimes vary (and maybe >> there were truncation problems). The disadvantage with the current >> scheme (besides the index) is that you need to define your ID field >> as "DEFAULT nextval('somesequence')" (or do that implicitly through >> SERIAL). > > What do you mean "truncation problems"? That names get truncated to 32 characters, which can more easily hit with SERIAL's names (table_id_name_seq, e.g., long_table_name_long_table_name_id_seq). Things still mostly work as long as the first 32 characters are unique (which they generally are), but it can get awkward. So you might instead want to use a sequence name like long_table_name_seq, but then you can't as easily predict the sequence name. >> The current scheme is convenient, but I don't know if it's really >> worth it. Should we go back to preallocating an ID through a >> nextval(sequence)? > > I see this as the only way to do it effectively as things are now. > This > is not a very elegant solution but that's the way it goes. > > The convenience could be made the same for an end-user of the > SQLObject, > perhaps. You could also make it so the user had more than one option > here. (I am very fond of user-togglable in cases like this). At least there should be something like an _idSequence option, so you can identify the sequence you want to use. That's probably good enough -- easy to explain, and the only real requirement it places on the user is that there must be some sequence (unless you always give explicit IDs on row creation). >> Really, to facilitate non-integer IDs, it would be nice if we could >> define a generic ID generator (which may or may not access the >> database). But I'm not really clear about all the use cases for >> that, so it's probably not a good idea to generalize that yet. > > A generic ID generator in the front end (not the database itself)? > That > sounds like a Bad Thing, in that you'd better be SURE that no other > application inserts anything into the database because it won't have > access to your generator to generate a unique ID. Yeah, I'm not really sure how it might work, since I'm not sure about the use cases. If you just want to be explicit about the key (e.g., use username as the primary key) then that's not so big a deal, you just pass it in as part of new(). In other instances you might query something else -- like another sequence-like-object in the database, or a shared sequence, or an ID generator from another system. -- Ian Bicking | ia...@co... | http://blog.ianbicking.org |
From: Nick <ni...@dd...> - 2003-11-12 20:39:25
|
On Wed, 2003-11-12 at 14:19, Ian Bicking wrote: > The current scheme is convenient, but I don't know if it's really worth > it. Should we go back to preallocating an ID through a > nextval(sequence)? psycopg, at least, supports lastrowid for cursor objects now, which may help. Nick |