Thread: [SQLObject] Re: Bug, was: one to one relationship via foreign key--bug or dumbness? (Page 2)
SQLObject is a Python ORM.
Brought to you by: ianbicking, phd
From: Ian B. <ia...@co...> - 2003-05-31 02:01:40
|
On Thu, 2003-05-29 at 22:50, Edmund Lian wrote:
> Ian Bicking wrote:
> > Okay, my fix was just all wrong, I mixed up the argument order. CVS
> > fixed this time, maybe.
>
> Sorry, not really. My tests:

Okay, *this* time! Finally wrote the dumb unit test...
From: Edmund L. <el...@in...> - 2003-05-31 02:53:46
|
Ian Bicking wrote:
> Okay, *this* time! Finally wrote the dumb unit test...

Almost there! :-) I have one failure:

    >>> from test import *
    >>> cc = ComponentCategory.new(name="CAT")
    >>> cc
    <ComponentCategory 3 name='CAT' description=None sequenceNum=None>

Create component with default None for category:

    >>> c = Component.new(name="Mad")
    >>> c
    <Component 9 name='Mad' categoryId=None sequenceNum=None>
    >>> c.category

Try assigning instance of ComponentCategory to category attribute:

    >>> c.category = cc
    >>> c.category
    <ComponentCategory 3 name='CAT' description=None sequenceNum=None>

Now try creating component while passing in an instance of ComponentCategory:

    >>> c = Component.new(name="Mad", category=cc)
    >>> c
    <Component 10 name='Mad' categoryId='3' sequenceNum=None>
    >>> c.category
    Traceback (most recent call last):
      File "<stdin>", line 1, in ?
      File "/usr/lib/python2.2/site-packages/SQLObject/SQLObject.py", line 928, in __repr__
        ' '.join(['%s=%s' % (name, repr(value)) for name, value in self._reprItems()]))
    TypeError: an integer is required

...Edmund.
From: Ian B. <ia...@co...> - 2003-05-31 03:36:08
|
On Fri, 2003-05-30 at 21:53, Edmund Lian wrote:
> Ian Bicking wrote:
> > Okay, *this* time! Finally wrote the dumb unit test...
>
> Almost there! :-) I have one failure:
>
> >>> from test import *
> >>> cc = ComponentCategory.new(name="CAT")
> >>> cc
> <ComponentCategory 3 name='CAT' description=None sequenceNum=None>
>
> Create component with default None for category:
>
> >>> c = Component.new(name="Mad")
> >>> c
> <Component 9 name='Mad' categoryId=None sequenceNum=None>
> >>> c.category
>
> Try assigning instance of ComponentCategory to category attribute:
>
> >>> c.category = cc
> >>> c.category
> <ComponentCategory 3 name='CAT' description=None sequenceNum=None>
>
> Now try creating component while passing in an instance of
> ComponentCategory:
>
> >>> c = Component.new(name="Mad", category=cc)
> >>> c
> <Component 10 name='Mad' categoryId='3' sequenceNum=None>
> >>> c.category
> Traceback (most recent call last):
>   File "<stdin>", line 1, in ?
>   File "/usr/lib/python2.2/site-packages/SQLObject/SQLObject.py", line
> 928, in __repr__
>     ' '.join(['%s=%s' % (name, repr(value)) for name, value in
> self._reprItems()]))
> TypeError: an integer is required

Something weird is going on here. Note:

    <Component 10 name='Mad' categoryId='3' sequenceNum=None>

categoryId is '3', not the integer 3. Why is this happening? That seems to be the problem, but it shouldn't be possible to get a non-integer in there. Unless cc was already messed up before it got assigned, and its id was '3'. Component('3') would very possibly work... maybe there should be an assert somewhere to avoid this... I assume that happened in some step you're not showing.

Ian
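A minimal sketch of the failure mode Ian describes (plain Python, not SQLObject itself; format_row is a made-up stand-in for SQLObject's internal id formatting): once a string id like '3' sneaks in where an int is expected, old-style "%d" formatting raises exactly the "an integer is required" style of TypeError seen in the traceback.

```python
def format_row(obj_id):
    # Mimics internal formatting that assumes an integer id.
    # (Hypothetical helper, for illustration only.)
    return "<Component %d>" % obj_id

print(format_row(3))       # fine with a real integer id

try:
    format_row("3")        # a string id slipped in, as in categoryId='3'
except TypeError as exc:
    print("TypeError:", exc)
```

An assert such as `assert isinstance(obj_id, int)` at assignment time, as Ian suggests, would catch the bad value where it enters rather than deep inside `__repr__`.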
From: Edmund L. <el...@in...> - 2003-05-30 04:33:14
|
Ian,

I've been banging away at SQLObject all day, trying to jam a small portion of my data model into it, or else rewrite it so it would fit. Here are my thoughts so far, each worth $0.02, and in no particular order.

I have appended plus and minus signs to indicate strength of feeling. More of either = stronger positive or negative reaction. +- = no reaction.

1. Being able to autogenerate schema is nice, and this is how I've been using and testing SQLObject (SO). ++

2. Support for multiple DBs, particularly PostgreSQL, is good. +++

3. There needs to be an object.delete() method that does not require you to supply the object ID. When not supplied, it deletes the current object. There is a bug in this right now. When you do object.destroySelf(), the corresponding row is deleted from the DB, but the instance data is not invalidated. +-

4. Being able to supply a callable object for the default value of a column is good. +++

5. Being able to override the table name is important because when you have lots of tables floating around, you often want to control their names so that they group together in a specific way when the table names are sorted. ++++

6. Need to have some way of specifying unique constraints for single columns and groups of columns for auto schema generation. --- See (7) below.

7. I don't use enums at all. For table driven applications, I use primary keys coupled with foreign key constraints. This allows you to change the allowed values of a column by varying the contents of another table. This is one argument against using synthetic primary keys. But using synthetic primary keys is still OK so long as unique columns can be specified, and the foreign key constraint can take the name of a column.

8. There is no way to specify (at least not that I can see) outer joins. These are important. ---

9. The method of specifying joins seems a tad clumsy. Object Relational Membrane's method is cleaner.

10. There needs to be a way to specify a not null constraint.

11. As an extension of (7), it is clumsy to have to pass in an object (A) when creating another object (B) that has a foreign key constraint referring to a column in (A). Here's an example that assumes I can add unique (6) and not null constraints (10), and point a foreign key (7) at a named column:

    class Gender(SQLObject):
        _columns = [
            StringCol("name", unique=1, nullable=0)]

    class Person(SQLObject):
        _columns = [
            StringCol("name"),
            StringCol("gender", nullable=0, foreignKey="Gender.name")]

With this schema, one should be able to do:

    Gender.new(name="Male")    # Done once
    Gender.new(name="Female")  # Done once
    man = Person(name="Geoff", gender="Male")  # somewhere else in the code

If one has to pass in an instance of Gender, then it becomes very clumsy and slow, because you have to instantiate the object you want first. i.e.:

    male = Gender.select(Gender.q.name=="Male")
    man = Person(name="Geoff", gender=male)

A big yuk. Makes table driven applications awful. ----

12. Similarly, if a class/table is just being used for integrity checks, returning an object rather than the referenced column within the row is clumsy. Assuming the schema in (11), one would like:

    target = Person.select(Person.q.name=="Geoff")
    print target.gender
    "Male"

When an object is returned, one would have to know the name of the column and then dereference it:

    target = Person.select(Person.q.name=="Geoff")
    print target.gender.name
    "Male"

13. Being able to add and drop columns at runtime is great. This removes the need to regenerate the tables in a live database. +++

14. Modeling (which seems rather hard to get into) seems to have very expressive ways to get complex joins. I haven't tried it at all, but have browsed the docs. Might be worth stealing some ideas from them. But gee, theyReallyLikeLongMethodNames! Or maybe I'm sensitive to them due to RSI...

...Edmund.
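The lookup-table pattern in item 7 can be sketched in plain SQL via Python's sqlite3 module (table and column names here are illustrative, not anything SQLObject generates): the allowed values of person.gender come from the gender table's unique name column rather than from an enum.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # sqlite needs this opt-in
conn.execute("CREATE TABLE gender (name TEXT PRIMARY KEY)")
conn.execute("""CREATE TABLE person (
    name   TEXT NOT NULL,
    gender TEXT NOT NULL REFERENCES gender(name))""")

# Populate the lookup table once; changing it changes the allowed values.
conn.executemany("INSERT INTO gender VALUES (?)", [("Male",), ("Female",)])

conn.execute("INSERT INTO person VALUES (?, ?)", ("Geoff", "Male"))  # ok

try:
    conn.execute("INSERT INTO person VALUES (?, ?)", ("Anne", "Unknown"))
except sqlite3.IntegrityError as exc:
    print("rejected:", exc)  # value not present in the gender table
```

This is the relational half of the argument; the thread's question is how an ORM should surface it without forcing a Gender instance through the API.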
From: Bud P. B. <bu...@si...> - 2003-05-30 08:53:58
|
On Fri, 30 May 2003 00:33:00 -0400 Edmund Lian <el...@in...> wrote:
> 7. I don't use enums at all. For table driven applications, I use
> primary keys coupled with foreign key constraints. This allows you to
> change the allowed values of a column by varying the contents of another
> table. This is one argument against using synthetic primary keys. But
> using synthetic primary keys is still OK so long as unique columns can
> be specified, and the foreign key constraint can take the name of a column.

I've used the above pattern a lot and very successfully so. But I also believe it is a pattern that is limited strictly to the relational domain--so I've been pondering how to deal with this in a more object-oriented way. Here is what I've come up with so far: I believe the "lookup tables" or whatever you call them should behave as Python classes to the application programmer. They may of course still be stored in the dbms (class attributes instead of instance attributes...). If this sounds interesting, I'll keep you posted on my progress in this.
From: Edmund L. <el...@in...> - 2003-05-30 17:24:27
|
Bud P. Bruegger wrote:
> I've used the above pattern a lot and very successfully so. But I
> believe also that it is a pattern that is limited strictly to the
> relational domain--so I've been pondering how to deal with this in a
> more object-oriented way. And here is what I've come up with so far:
>
> I believe the "lookup tables" or whatever you call them should behave
> as Python Classes to the Application programmer. They may of course
> still be stored in the dbms (class attributes instead of instance
> attributes...). If this sounds interesting, I'll keep you posted on
> my progress in this..

But this is how they are behaving right now, and it seems awkward and is inefficient. Contents of the lookup table should be accessible as objects for updates, etc. But when it is being used to constrain the values of a column in another table, then that column should really be a first class attribute. What I think is more natural is to have two kinds of foreign key constraints, along the lines of:

    class LookUpTable(SQLObject):
        _columns = [
            StringCol("stuff", default=None),
            StringCol("name", unique=1)]

    class ConstrainedTable(SQLObject):
        _columns = [
            StringCol("moreStuff", default=None),
            StringCol("constrainedStuff", foreignKeyAttribute="LookUpTable.name")]

    class NormalTable(SQLObject):
        _columns = [
            StringCol("otherStuff", default=None),
            StringCol("objectStuff", foreignKey="LookUpTable.name")]

In use:

    # Add entries to LookUpTable, treating it as an object
    LookUpTable.new(name="Cat")
    LookUpTable.new(name="Dog")

    # Add entries to NormalTable, as per current SQLObject behavior
    obj = NormalTable.new(objectStuff=<instance of LookUpTable>)
    print obj.objectStuff
    <instance of LookUpTable>

    # Add entries to ConstrainedTable, where we have specified that
    # the lookup table should not be "objectified" since it is really
    # just a constraint table.
    obj = ConstrainedTable.new(constrainedStuff="Cat")
    print obj.constrainedStuff
    "Cat"

...Edmund.
From: Ian B. <ia...@co...> - 2003-05-31 03:33:34
|
On Thu, 2003-05-29 at 23:33, Edmund Lian wrote:
> Ian,
>
> I've been banging away at SQLObject all day, trying to jam a small
> portion of my data model into it, or else rewrite it so it would fit.
> Here are my thoughts so far, each worth $0.02, and in no particular order.
>
> I have appended + and minus signs to indicate strength of feeling. More
> of either = stronger positive or negative reaction. +- = no reaction.
>
> 1. Being able to autogenerate schema is nice, and this is how I've been
> using and testing SQLObject (SO). ++

Yes, very useful for testing, more so than I expected. Great for unit testing.

> 2. Support for multiple DBs, particularly PostgreSQL, is good. +++
>
> 3. There needs to be an object.delete() method that does not require you
> to supply the object ID. When not supplied, it deletes the current
> object. There is a bug in this right now. When you do
> object.destroySelf(), the corresponding row is deleted from the DB, but
> the instance data is not invalidated. +-

Invalidating instance data is annoying. destroySelf is annoying. It all reeks of manual memory management, though I suppose such is life with an RDBMS.

> 4. Being able to supply a callable object for the default value of a
> column is good. +++
>
> 5. Being able to override the table name is important because when you have
> lots of tables floating around, you often want to control their names so
> that they group together in a specific way when the table names are
> sorted. ++++
>
> 6. Need to have some way of specifying unique constraints for single and
> groups of columns for auto schema generation. --- See (7) below.

At some point SQLObject shouldn't do every detail of schema creation. Maybe there should be a hook for adding your own SQL that gets executed on createTable/dropTable, so you can use those methods without (or in addition to) the automatic table definition.

> 7. I don't use enums at all. For table driven applications, I use
> primary keys coupled with foreign key constraints. This allows you to
> change the allowed values of a column by varying the contents of another
> table. This is one argument against using synthetic primary keys. But
> using synthetic primary keys is still OK so long as unique columns can
> be specified, and the foreign key constraint can take the name of a column.
>
> 8. There is no way to specify (at least not that I can see) outer
> joins. These are important. ---

You'll have to give an example of what you want to do. I don't see the need.

> 9. The method of specifying joins seems a tad clumsy. Object Relational
> Membrane's method is cleaner.

Yes, needing to declare both forward and backward references is a tad awkward. There may yet be advantages to joins being in both locations -- for instance, being able to add ordering information. But eh... it's not perfect. Though when I look at ORM, it's not that much different from what I'm doing, and I even avoid the circular dependency problem. The names may be better in ORM -- I don't like my names. But then one2many -- who is the one, who is the many? That's what I find hard in naming.

> 10. There needs to be a way to specify a not null constraint.

notNull=True or notNone=True as a keyword argument to your column.

> 11. As an extension of (7), it is clumsy to have to pass in an object
> (A) when creating another object (B) that has a foreign key constraint
> referring to a column in (A). Here's an example that assumes I can add
> unique (6) and not null constraints (10), and point a foreign key (7)
> at a named column:
>
> class Gender(SQLObject):
>     _columns = [
>         StringCol("name", unique=1, nullable=0)]

    StringCol("name", alternateID=True)
    # or with the new syntax:
    name = StringCol(alternateID=True)

    male = Gender.byName('male')

> class Person(SQLObject):
>     _columns = [
>         StringCol("name"),
>         StringCol("gender", nullable=0, foreignKey="Gender.name")]

Hmmm... that's not so bad. But it does complicate the SQL a fair amount, at least if it wasn't going to involve two SQL queries for doing person.gender.

> With this schema, one should be able to do:
>
> Gender.new(name="Male")    # Done once
> Gender.new(name="Female")  # Done once
> man = Person(name="Geoff", gender="Male")  # somewhere else in the code

This, though... I dunno. I'm not sure about it at all.

> If one has to pass in an instance of Gender, then it becomes very clumsy
> and slow, because you have to instantiate the object you want first. i.e.:
>
> male = Gender.select(Gender.q.name=="Male")
> man = Person(name="Geoff", gender=male)

Also:

    male = Gender.selectBy(name="Male")[0]

alternateID is, of course, what's really intended in this situation.

> A big yuk. Makes table driven applications awful. ----
>
> 12. Similarly, if a class/table is just being used for integrity checks,
> returning an object rather than the referenced column within the row is
> clumsy. Assuming the schema in (11), one would like:
>
> target = Person.select(Person.q.name=="Geoff")
> print target.gender
> "Male"
>
> When an object is returned, one would have to know the name of the
> column and then dereference it:
>
> target = Person.select(Person.q.name=="Geoff")
> print target.gender.name
> "Male"

Yeah... but I'm still unsold. You can always do:

    class Person(SQLObject):

        def _get_gender(self):
            return self._SO_get_gender().name
        def _set_gender(self, value):
            self._SO_set_gender(Gender.byName(value))

It's not that bad to use properties and create your own attributes, if SQLObject's attributes don't suit you. People should do that more -- I went out of my way to make it possible (and to document it)! This is exactly the sort of thing it's there for.

> 13. Being able to add and drop columns at runtime is great. This removes
> the need to regenerate the tables in a live database. +++
>
> 14. Modeling (which seems rather hard to get into) seems to have very
> expressive ways to get complex joins. I haven't tried it at all, but
> have browsed the docs. Might be worth stealing some ideas from them. But
> gee, theyReallyLikeLongMethodNames! Or maybe I'm sensitive to them due
> to RSI...

I've only looked at Modeling a bit. But ouch... those functions and methods are painful to read, not to mention write. Like:

    from Modeling.FetchSpecification import FetchSpecification
    from Modeling.Qualifier import qualifierWithQualifierFormat
    qualifier=qualifierWithQualifierFormat('lastName=="Hugo"')
    fetchSpec=FetchSpecification(entityName='Writer', qualifier=qualifier)
    objects=ec.objectsWithFetchSpecification(fetchSpec)

That's a "simple" fetch. Hurts my head just having that on the page. Oh my:

    aBook.addObjectToBothSidesOfRelationshipWithKey(anAuthor, 'author')

Okay. I guess that's enough of that, it's not fair to pick on other ORMs. Besides joinSemantic, there's not a whole lot else that's not just standard infrastructure. joinSemantic has outer joins, but I don't know what they really mean to an ORM anyway. I mean, say you want all people, and you also want their picture if they have one...

    for person in Person.select():
        image = person.image
        # and do stuff, image may be None
        # and don't tell me an outer join is easier to understand... ;)

If you want to do it in one query, well, you can't. ORMs tend to throw out many more SQL queries. Some of them can probably be optimized -- you could do so using the selectResults keyword (for instance), but that does require careful reading of the source to use... maybe that could be made easier, but I don't think it can (or should) be made seamless. I think a lot of uses of outer joins are just fine as multiple queries; that's something you should optimize based on an empirical need.

Ian
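Ian's _get_gender/_set_gender override can be sketched in plain Python with a property (the Gender class and byName lookup here are simplified stand-ins, not SQLObject's machinery): the object reference is stored internally, but callers read and write the plain name string.

```python
class Gender:
    _by_name = {}                      # stand-in for the lookup table

    def __init__(self, name):
        self.name = name
        Gender._by_name[name] = self

    @classmethod
    def byName(cls, name):
        return cls._by_name[name]

class Person:
    def __init__(self, name, gender):
        self.name = name
        self.gender = gender           # routed through the property below

    @property
    def gender(self):
        return self._gender.name       # dereference to the plain string

    @gender.setter
    def gender(self, value):
        self._gender = Gender.byName(value)  # string -> lookup object

Gender("Male")
Gender("Female")
p = Person("Geoff", "Male")
print(p.gender)  # the string "Male", not a Gender instance
```

The same shape applies to SQLObject's _get_x/_set_x hooks: the underlying column keeps the foreign-key object while the public attribute exposes the referenced value.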
From: Edmund L. <el...@in...> - 2003-05-31 19:41:01
|
Ian Bicking wrote:
> > 6. Need to have some way of specifying unique constraints for single and
> > groups of columns for auto schema generation. --- See (7) below.
>
> At some point SQLObject shouldn't do every detail of schema creation.
> Maybe there should be a hook for adding your own SQL that gets executed
> on createTable/dropTable, so you can use those methods without (or in
> addition to) the automatic table definition.

True. However, if you're going to support schema generation, it makes sense to at least support the most common constraints: unique, not null, etc. As it stands, SQLObject nearly supports the common cases. Adding uniqueness support would allow SQLObject to cover most of them.

> > 7. I don't use enums at all. For table driven applications, I use
> > primary keys coupled with foreign key constraints. This allows you to
> > change the allowed values of a column by varying the contents of another
> > table. This is one argument against using synthetic primary keys. But
> > using synthetic primary keys is still OK so long as unique columns can
> > be specified, and the foreign key constraint can take the name of a column.

Speaking of enums, I notice that there's no explicit support for booleans. No big deal unless you consider schema generation, where the boolean column type should be generated.

> > 8. There is no way to specify (at least not that I can see) outer
> > joins. These are important. ---
>
> You'll have to give an example of what you want to do. I don't see the
> need.

Maybe it is just that I don't know how to do it in SQLObject? Here's what I'm trying to do... Schema:

    create table person (
        id serial,
        username varchar(30),
        --
        constraint person_key primary key (id),
        constraint person_uq unique (username)
    );

    create table car (
        id serial,
        owner_id integer,
        make varchar(30),
        --
        constraint car_key primary key (id),
        constraint car_owner_fk foreign key (owner_id) references person (id)
    );

What I want: a list of car makes and owners. In SQL:

    select p.username, c.make
    from person as p left join car as c on p.id = c.owner_id

If I don't use a left outer join, I will get a list of people who have cars, and the people who don't have cars won't appear in it. In SQLObject:

    class Person(SQLObject):
        _columns = [
            StringCol("username", length=30, notNone=True, alternateID=True)]

    class Car(SQLObject):
        _columns = [
            StringCol("make", length=30, notNone=True),
            IntCol("ownerId", foreignKey="Person")]

To get a list of all people: Person.select(). To get a list of all car makes: Car.select(). How do I achieve the same as a left outer join without writing a loop?

> notNull=True or notNone=True as a keyword argument to your column.

OK, it wasn't in the manual, or I'm going blind.

> Yeah... but I'm still unsold. You can always do:
>
>     class Person(SQLObject):
>
>         def _get_gender(self):
>             return self._SO_get_gender().name
>         def _set_gender(self, value):
>             self._SO_set_gender(Gender.byName(value))
>
> It's not that bad to use properties and create your own attributes, if
> SQLObject's attributes don't suit you. People should do that more -- I
> went out of my way to make it possible (and to document it)! This is
> exactly the sort of thing it's there for.

Sorry, I didn't understand the implications of this when I read the documentation. I'll give it a shot.

Regarding the _connection class attribute: how do I use it? Do I just pass it a connection object? This raises the bigger question of how to handle connections when multiple modules, each implementing SQLObjects, are used. What is the best way to do this? Should I just import the connection string from a global settings file and have each module open its own connection via __connection__? That seems wrong, since I might want to change the backend DB and don't want to dive into each module to edit the __connection__ arguments. I have a global connection pool, so perhaps I should just have each SQLObject-based module import the pool and fetch a connection. Is this what the _connection attribute is for? Can we not use the DBConnection object?

Oh, about orderBy... how do we specify whether it is ascending or descending?

...Edmund.
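The LEFT JOIN result Edmund wants can be emulated in Python with one pass over each table (a sketch only; plain dicts stand in for Person/Car rows, and the names are illustrative): index the cars by owner, then walk the people, so car-less people still appear.

```python
# Rows as an ORM might hand them back.
people = [{"id": 1, "username": "alice"},
          {"id": 2, "username": "bob"}]
cars = [{"owner_id": 1, "make": "Saab"}]

# Index car makes by owner id: one pass over cars.
makes_by_owner = {}
for car in cars:
    makes_by_owner.setdefault(car["owner_id"], []).append(car["make"])

# One pass over people keeps everyone in the result, like a LEFT JOIN:
# people without cars get [None], mirroring the SQL NULL.
result = [(p["username"], makes_by_owner.get(p["id"], [None]))
          for p in people]
print(result)
```

This is exactly the "multiple queries plus a loop" approach Ian argues is usually fine; the dict index keeps it O(people + cars) rather than a nested scan.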
From: Luke O. <lu...@me...> - 2003-06-01 03:02:56
|
> True, however, if you're going to support schema generation, it makes
> sense to at least support the most common constraints: unique, not null,
> etc. As it stands, SQLObject nearly supports the common cases. Adding
> uniqueness support would allow SQLObject to support most common cases.

Actually, notNull and unique are both currently supported by columns. Ian's a good documenter, there's just been a lot of changes since the last cut. :)

Note that uniqueness is implied by "alternateID" as well, which is what generates the byColName-type accesses... maybe unique and alternateID should become aliases? (The only downside I can see is extra method creation overhead if you don't actually care to access byName...)

> Speaking of enums, I notice that there's no explicit support for
> booleans. No big deal unless you consider schema generation, where the
> boolean column type should be generated.

Yep. The big problem holding this back is that any current implementation of a BooleanCol doesn't work in Postgres, because saying "obj.boolCol = True" fails during SQL conversion, since PG won't accept integer literals for boolean values. Grr.. (I'm a happy PG user, but this always bothers me...)

> If I don't use a left outer join, I will get a list of people who have
> cars, and the people who don't have cars won't appear in it.
...
> How do I achieve the same as a left outer join without writing a loop?

Hmm. I see your example, and from an object dispatch perspective I'm not sure how to solve it. I'm less interested from a performance viewpoint (it doesn't matter to me immediately whether SQLObject actually uses LEFT/RIGHT join SQL) than in how I would specify this in Python other than as a Set (list) operation... I suppose something explicit like Car.unionPerson / Person.unionCar could work... ahh, there's the reason this doesn't fit into SO's model: it specifies results that are neither Cars nor Persons. How would you propose to represent the return values from such a function as objects? I think in the O-R world there is no way to do this without loops or Set operations. But I'd happily be proven wrong.

> This raises the bigger question of how to handle connections when
> multiple modules, each implementing SQLObjects, are used. What is the
> best way to do this?

Here's how we do it: each group/registry defines a "CoreObject", an abstract superclass for all actual objects that defines _connection. One place to change it for an entire collection of objects.

> I have a global connection pool, so perhaps I should just have each
> SQLObject-based module import the pool and fetch a connection. Is this
> what the _connection attribute is for? Can we not use the DBConnection
> object?

We use a global pool too, so we have a simple subclass of PostgresConnection that imports the pool and overrides getConnection (return globalPool.getConnection()), releaseConnection (conn.close()), and makeConnection (a no-op). _connection (and DBConnection etc.) have a confusing name, as they define a lot more than just the connection... they also do all the SQL access generation, so you can't just set it to your current pool's connections. (Ian: thoughts on, in a version or two, renaming that to one of the other possibilities like _dbDriver or ...?)

> Oh, about orderBy... how do we specify whether it is ascending or descending?

You're the second (or third, if you count me :)) person who's asked this recently.. not currently supported. It would need to be added to SelectResults, and preferably an addition to _defaultOrder would also be made, although I didn't really like my initial naming _defaultOrderDirection, and I can't think of a way to easily make it simply part of the _defaultOrder var either (an optional tuple? urgh)..

Enough for now,

- Luke
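Luke's pool-backed connection class can be sketched as follows (everything here is a stand-in: FakePool and BaseConnection are hypothetical, not SQLObject's PostgresConnection or any real pool API): the subclass borrows from a global pool, returns connections to it, and makes makeConnection a no-op so the base class never opens its own.

```python
class FakePool:
    """Toy global pool; a real one would manage live DB connections."""
    def __init__(self):
        self._conns = ["conn-%d" % i for i in range(3)]
        self.released = []

    def getConnection(self):
        return self._conns.pop()

    def close(self, conn):
        self.released.append(conn)

globalPool = FakePool()

class BaseConnection:
    """Stand-in for the ORM's connection class."""
    def makeConnection(self):
        raise NotImplementedError("base would open its own connection")

class PooledConnection(BaseConnection):
    def getConnection(self):
        return globalPool.getConnection()   # borrow from the pool

    def releaseConnection(self, conn):
        globalPool.close(conn)              # hand it back

    def makeConnection(self):
        pass  # no-op: the pool owns connection creation

pc = PooledConnection()
c = pc.getConnection()
pc.releaseConnection(c)
```

As Luke notes, the real _connection object also does SQL generation, so in practice only the three connection-management methods are overridden; the rest is inherited untouched.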
From: Edmund L. <el...@in...> - 2003-06-01 06:46:17
|
It's me again. I've come across a conundrum... how to store attributes of a relationship. Here's the problem...

Suppose you have two entities: Person and Dog. You want to be able to model the fact that there is a relationship between any person and any dog. The way it would be done with SQLObject, I suppose, would be to use RelatedJoin as per:

    class Person(SQLObject):
        _columns = ["name"]
        _joins = [RelatedJoin("Dog")]

    class Dog(SQLObject):
        _columns = ["name"]
        _joins = [RelatedJoin("Person")]

When RelatedJoins are used, the relationship is stored as a row in a mapping table named Dog_Person. So far so good; this is how it is done in a traditional schema too.

Now the problem is that relationships have attributes too. For example, we might want to store the date the relationship started, its intensity, etc. These are attributes of the relationship, not of Person or Dog. In a traditional schema, these would be stored in the mapping table, or in some table related to the mapping table. How do we deal with this in SQLObject?

...Edmund.
From: Luke O. <lu...@me...> - 2003-06-01 07:10:54
|
Well, I have to admit to dealing with this problem, in not very nice ways, and not trying to find a better solution for SO yet. So this idea is not directly supported, but...

First, I define an SO class for the intersection, DogPerson or something more meaningful like DogOwners. Whatever. And define it as though it has foreignKeys to Dog and Person, and whatever other attrs. Then Person and Dog each have MultipleJoins to this class. aDog.owners or aPerson.ownedDogs gives access to the intersection objects. Each main class also gets a RelatedJoin to the other. This just facilitates shortcut access (aDog.persons == [x.person for x in aDog.owners]). I think that about covers it.

Something makes me think there are also cases where I've done a version of joinWhere (aDog.ownersWhere('otherAttr = x')), but I can't think of why now, and it probably was due to the more general desire for joinWhere than anything to do with intersection records.

Having written this up, I can't think of how I'd really clean this up to be more directly supported... Thoughts? The RelatedJoin shortcut might be able to be cleaned up, to be more of a generic multiple-level-join shortcut thing, but other than that... I guess I kind of like it.

- Luke

P.S. joinWhere... just a combination of select and join behaviors, but accessed more like a parameterized join. I ought to hurry up and make this code more generic, as I think other people have asked for similar functionality..
From: Edmund L. <el...@in...> - 2003-06-01 07:53:07
|
Luke Opperman wrote:
> Having written this up, I can't think of how I'd really clean this up to be
> more directly supported... Thoughts? The RelatedJoin shortcut might be able to
> be cleaned up, to be more of a generic multiple-level-join shortcut thing, but
> other than that... I guess I kind of like it.

I think the best way to handle this is to always make the mapping/intersection table a first class table/object, with a serial id column, etc. Kind of like what you did. Right now, it lurks around, buried in RelatedJoin declarations, does not have a serial id, etc. If it were a distinct, accessible object, then users could add other tables/objects with foreign keys pointing to the relevant mapping.

...Edmund.
From: Edmund L. <el...@in...> - 2003-06-01 17:22:04
|
I did an experiment this morning to see if the mapping table could be treated as a first class table. It can, so I don't think SQLObject needs to be modified. Here's what I did:

    class Person(SQLObject):
        _columns = [
            StringCol("firstName"),
            StringCol("middleInitial", default=None),
            StringCol("lastName")]
        _joins = [RelatedJoin("Role")]

    class PersonRole(SQLObject):
        _columns = [
            IntCol("personId"),
            IntCol("roleId"),
            StringCol("color"),
            StringCol("weight")]

    class Role(SQLObject):
        _columns = [
            StringCol("name")]
        _joins = [RelatedJoin("Person")]

Rather than rely on auto schema generation, I just created the mapping table myself, and included an id column as well as test attribute columns. With this, the many-to-many mapping via .addRole, .addPerson etc. works. And you can create a PersonRole object directly and manipulate its attributes:

    pr = PersonRole.new(...)
    pr.color = "Blah"

There is one thing though--without specifying a unique (person_id, role_id) constraint, it is possible to insert duplicate mappings into the table. Adding the constraint stops the redundant entry, but causes SQLObject to raise an exception (because psycopg raises an exception). I think the exception should be caught and silently ignored.

...Edmund.
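The duplicate-mapping issue above can be sketched with sqlite3 (the schema and add_mapping helper are illustrative, not SQLObject code): a UNIQUE (person_id, role_id) constraint on the first-class mapping table rejects redundant rows, and the caller decides whether to swallow the resulting IntegrityError.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE person_role (
    id        INTEGER PRIMARY KEY,
    person_id INTEGER,
    role_id   INTEGER,
    UNIQUE (person_id, role_id))""")

def add_mapping(person_id, role_id):
    """Insert a mapping row; report False instead of raising on duplicates."""
    try:
        conn.execute(
            "INSERT INTO person_role (person_id, role_id) VALUES (?, ?)",
            (person_id, role_id))
        return True
    except sqlite3.IntegrityError:
        return False  # duplicate mapping silently ignored, as Edmund suggests

print(add_mapping(1, 2))  # first mapping: inserted
print(add_mapping(1, 2))  # duplicate: rejected by the constraint
```

Whether the ORM should swallow this itself is debatable (Nick's follow-up point): silently ignoring IntegrityError is only safe if the driver distinguishes it from other failures.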
From: Nick <ni...@dd...> - 2003-06-01 18:28:40
|
Edmund Lian wrote:
> There is one thing though--without specifying a unique (person_id,
> role_id) constraint, it is possible to insert duplicate mappings into
> the table. Adding the constraint stops the redundant entry, but causes
> SQLObject to raise an exception (because psycopg raises an exception). I
> think the exception should be caught and silently ignored.

Actually, it would be nice if psycopg raised meaningful exceptions rather than OperationalError for every damn thing.

Nick
From: Edmund L. <el...@in...> - 2003-06-03 20:35:48
|
Nick wrote:
> Actually, it would be nice if psycopg raised meaningful
> exceptions rather than OperationalError for every damn thing.

Hear, hear. Gerhard has been putting in changes, and it does raise IntegrityErrors now too. But the real problem lies in the DB-API's silence on more detailed, cross-DB error messages. I ran into this problem a few months back when I was trying to write some reasonably DB-independent code. Using simple SQL would generally do the trick, but since there were no standard exceptions to give detailed errors (e.g., foreign key constraint violation versus duplicate key insertion), I crashed and burned.

...Edmund.
From: Ian B. <ia...@co...> - 2003-06-03 07:41:35
|
On Sun, 2003-06-01 at 01:46, Edmund Lian wrote:
> Now the problem is that relationships have attributes too. For example,
> we might want to store the date a relationship started, intensity, etc.
> These are attributes of the relationship, not of Person or Dog.
>
> In a traditional schema, this would be stored in the mapping table, or
> some table that is related to the mapping table. How do we deal with
> this in SQLObject?

It looks like you got this resolved, and that's probably the right resolution as well. After the next release I'm going to beef up the join stuff a bit more, and it will be possible to add extra values to the join. So you might do (in my as-yet-imaginary extension):

class Person(SQLObject):
    dogs = ExtensibleRelatedJoin('Dog', extraColumn='times_per_day')

class Dog(SQLObject):
    people = ExtensibleRelatedJoin('Person', extraColumn='times_per_day')

d = Dog.new()
p = Person.new()
d.addPerson(p, 1)   # p walks d once per day
p2 = Person.new()
d.addPerson(p2, 3)  # p2 walks d three times per day (lots of walking)

d.people
>>> [(<Person 1>, 1), (<Person 2>, 3)]
p.dogs
>>> [(<Dog 1>, 1)]

If the relationship starts getting more complicated, then full SQLObject classes for the relation table are probably called for.

Ian |
From: Ian B. <ia...@co...> - 2003-06-03 06:50:32
|
On Sat, 2003-05-31 at 21:47, Luke Opperman wrote:
> > True, however, if you're going to support schema generation, it makes
> > sense to at least support the most common constraints: unique, not null,
> > etc. As it stands, SQLObject nearly supports the common cases. Adding
> > uniqueness support would allow SQLObject to support most common cases.
>
> Actually, notNull and unique are both currently supported by columns. Ian's a
> good documenter, there's just been a lot of changes since the last cut. :)
>
> Note that uniqueness is implied by "alternateID" as well, which is what
> generates the byColName-type accesses... maybe unique and alternateID should
> become aliases? (only downside I can see is extra method creation overhead if
> you don't actually care to access byName...)

Yeah, maybe the two are rather redundant. I wouldn't worry about the method creation overhead, but more about having useless methods lying around. Maybe alternateID should be removed, and unique would imply alternateID. alternateMethodName would become fetchMethodName, and it would be required if you wanted what is now byColName.

> > If I don't use a left outer join, I will get a list of people who have
> > cars, and the people who don't have cars won't appear in it.
> ...
> > How do I achieve the same as a left outer join without writing a loop?
>
> Hmm. I see your example, and from an object dispatch perspective I'm not sure
> how to solve it. I'm less interested from a performance viewpoint (it doesn't
> matter to me immediately whether SQLObject actually uses LEFT/RIGHT join SQL),
> but how I would specify this in Python other than a Set (list) operation... I
> suppose something explicit like Car.unionPerson / Person.unionCar could
> work... ahh, there's the reason this doesn't fit into SO's model:
>
> It specifies results that are neither Cars nor Persons. How would you propose
> to represent the return values from such a function as objects? 
Exactly -- all the fancy SELECT statements create results that aren't a homogeneous list of database objects. *Some* statements could be turned into a list of tuples or dictionaries of database objects, like [(p, p.car) for p in Person.select()] -- and maybe being able to produce that same sort of result from a single SQL query would be useful in SQLObject (though a lot of work to implement, and usually not necessary)...

> > This raises the bigger question of how to handle connections when
> > multiple modules, each implementing SQLObjects, are used. What is the
> > best way to do this?
>
> Here's how we do it: Each group/registry defines a "CoreObject", an abstract
> superclass for all actual objects that defines _connection. One place to
> change it for an entire collection of objects.

Ah... SiteSQLObject ;) SitePage annoys me sometimes, though... should there be a better way to do this?

> > I have a global connection pool, so perhaps I should just have each
> > SQLObject-based module import the pool and fetch a connection. Is this
> > what the _connection attribute is for? Can we not use the DBConnection
> > object?
>
> We use a global pool too, so we have a simple subclass of PostgresConnection
> that imports the pool and overrides getConnection (return
> globalPool.getConnection()), releaseConnection (conn.close()), and
> makeConnection (a no-op). _connection (and DBConnection etc.) have a confusing
> name, as they define a lot more than just the connection... they also do all
> the SQL access generation, so you can't just set it to your current pool's
> connections.
>
> (Ian: thoughts on in a version or two renaming that to one of the other
> possibilities like _dbDriver or ...?)

I want to rename all the *Connection classes to be *Store after this next release. Then _connection becomes _store, or maybe __store__. Is that the renaming you are thinking about?

> > Oh, about orderBy... how do we specify whether it is ascending or descending? 
>
> You're the second (or third, if you count me :)) person who's asked this
> recently... not currently supported. It would need to be added to
> SelectResults, and preferably an addition to _defaultOrder would also be
> made, although I didn't really like my initial naming _defaultOrderDirection,
> but I can't think of a way to easily make it simply part of the _defaultOrder
> var either (an optional tuple? urgh)...

SQLBuilder could handle it with just a simple (but not yet existent) function, but most people aren't using SQLBuilder for orderBy (like Person.select(orderBy=(Person.q.lname, Person.q.fname))). I'll add an option to SelectResults, probably "descending", that will do this.

Ian |
From: Edmund L. <el...@in...> - 2003-06-01 03:05:11
|
Hey Ian, I had a quick glance at Col.py, and it appears that you do support specification of uniqueness (as well as notNull, which you mentioned)... ...Edmund. |
From: Ian B. <ia...@co...> - 2003-06-03 06:38:19
|
On Sat, 2003-05-31 at 14:40, Edmund Lian wrote:
> Ian Bicking wrote:
> >> 6. Need to have some way of specifying unique constraints for single and
> >> groups of columns for auto schema generation. --- See (7) below.
> >
> > At some point SQLObject shouldn't do every detail of schema creation.
> > Maybe there should be a hook for adding your own SQL that gets executed
> > on createTable/dropTable, so you can use those methods without (or in
> > addition to) the automatic table definition.
>
> True, however, if you're going to support schema generation, it makes
> sense to at least support the most common constraints: unique, not null,
> etc. As it stands, SQLObject nearly supports the common cases. Adding
> uniqueness support would allow SQLObject to support most common cases.

That would be fine. If you can enumerate all those, and suggest how they'd be specified in Python, I'm happy to add them.

> >> 7. I don't use enums at all. For table-driven applications, I use
> >> primary keys coupled with foreign key constraints. This allows you to
> >> change the allowed values of a column by varying the contents of another
> >> table. This is one argument against using synthetic primary keys. But
> >> using synthetic primary keys is still OK so long as unique columns can
> >> be specified, and the foreign key constraint can take the name of a column.
>
> Speaking of enums, I notice that there's no explicit support for
> booleans. No big deal unless you consider schema generation, where the
> boolean column type should be generated.

Yes, like Luke said, it is a problem with 't' and 'f'. You'd have to do:

someBool = Col(sqlType='BOOLEAN')

def _set_someBool(self, value):
    if value:
        self._SO_set_someBool('t')
    else:
        self._SO_set_someBool('f')

I should remember to put this recipe in the docs, since a fix won't come until the version after the upcoming version.

> What I want: a list of car makes and owners. 
In SQL:
>
> select p.username, c.make
> from person as p
> left join car as c on p.id = c.owner
>
> If I don't use a left outer join, I will get a list of people who have
> cars, and the people who don't have cars won't appear in it.
>
> In SQLObject:
>
> class Person(SQLObject):
>     _columns = [
>         StringCol("username", length=30, notNone=True,
>                   alternateID=True)]
>
> class Car(SQLObject):
>     _columns = [
>         StringCol("make", length=30, notNone=True),
>         IntCol("ownerId", foreignKey="Person")]
>
> To get a list of all people: Person.select()
> To get a list of all car makes: Car.select()
>
> How do I achieve the same as a left outer join without writing a loop?

Well, you would do a loop, but at least it's a pretty loop, thanks to list comprehensions:

[(p.username, p.car and p.car.make) for p in Person.select()]

You do more SQL queries in this version, but in most cases I doubt that will be a performance problem.

> > notNull=True or notNone=True as a keyword argument to your column.
>
> OK, it wasn't in the manual, or I'm going blind.

Mmm... yeah, I don't think I've documented any of the schema generation yet. I've been redoing the documentation just in the last week or so, updating for the next release.

> Regarding the _connection class attribute. How do I use it? Do I just
> pass it a connection object?
>
> This raises the bigger question of how to handle connections when
> multiple modules, each implementing SQLObjects, are used. What is the
> best way to do this?
>
> Should I just import the connection string from a global settings file
> and have each module open its own connection via __connection__? Seems
> wrong since I might want to change the backend DB, and don't want to
> dive into each module to edit the __connection__ arguments.

Mostly I'd use a single module where you define the connection, exporting a single "connection" variable, and then all the other classes would use that. 
I'd thought about naming connections before, so you'd say: _connection = "main" And somewhere else: PostgresConnection(name="main", ...) And then it'd be registered. Maybe that'd be useful to put back in. > I have a global connection pool, so perhaps I should just have each > SQLObject-based module import the pool and fetch a connection. Is this > what the _connection attribute is for? Can we not use the DBConnection > object? Luke gave a good solution to this. It should be generalized, though. > Oh about orderBy... how do we specify whether it is ascending or descending? Nowhere now. I should add an option to SelectResults and a function to SQLBuilder... I'll do that before I make the next release, since it should just be a small change. Ian |
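[Editor's note: the left-outer-join question in this exchange is easy to demonstrate concretely. A hedged sketch with an in-memory SQLite database; the table and column names follow the thread's example, not anything SQLObject generates.]

```python
import sqlite3

# People without cars must still appear in the result, with a NULL
# make -- which is what LEFT JOIN buys over a plain inner join.
conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE person (id INTEGER PRIMARY KEY, username TEXT)")
cur.execute("CREATE TABLE car (id INTEGER PRIMARY KEY, make TEXT, owner INTEGER)")
cur.executemany("INSERT INTO person (id, username) VALUES (?, ?)",
                [(1, "alice"), (2, "bob")])
cur.execute("INSERT INTO car (id, make, owner) VALUES (1, 'Saab', 1)")

cur.execute("""
    SELECT p.username, c.make
    FROM person AS p
    LEFT JOIN car AS c ON p.id = c.owner
    ORDER BY p.id
""")
rows = cur.fetchall()
print(rows)  # [('alice', 'Saab'), ('bob', None)]
```

Ian's loop version trades this single query for one query per person, as he notes; the result is the same list of (username, make-or-None) pairs.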
From: Bud P. B. <bu...@si...> - 2003-06-02 08:21:48
|
Apologies for falling behind the discussion (and it's probably going to get worse starting from tomorrow). Looked at Modeling too and have two impressions:

* Looking at other ORMs, I think we can't give enough credit to Ian for having found a way of doing things EASY and intuitive. Some of that stuff looks so complex and verbose that it would be great for Enterprise Java... Thanks, Ian! One really starts to appreciate this more and more when one looks at alternatives (which I haven't done much before...)

* Apart from that, I kind of like Modeling's simple dot-notation for specifying Where conditions on related objects. Will have to look into this more...

--bud

On 30 May 2003 21:48:21 -0500 Ian Bicking <ia...@co...> wrote:
> > 14. Modeling (which seems rather hard to get into) seems to have very
> > expressive ways to get complex joins. I haven't tried it at all, but
> > have browsed the docs. Might be worth stealing some ideas from them. But
> > gee, theyReallyLikeLongMethodNames! Or maybe I'm sensitive to them due
> > to RSI...
>
> I've only looked at Modeling a bit. But ouch... those functions and
> methods are painful to read, not to mention write. Like:
>
> from Modeling.FetchSpecification import FetchSpecification
> from Modeling.Qualifier import qualifierWithQualifierFormat
>
> qualifier=qualifierWithQualifierFormat('lastName=="Hugo"')
> fetchSpec=FetchSpecification(entityName='Writer',
>                              qualifier=qualifier)
> objects=ec.objectsWithFetchSpecification(fetchSpec)
>
> That's a "simple" fetch. Hurts my head just having that on the page. Oh my:
>
> aBook.addObjectToBothSidesOfRelationshipWithKey(anAuthor, 'author')

/-----------------------------------------------------------------
| Bud P. Bruegger, Ph.D.
| Sistema (www.sistema.it)
| Via U. Bassi, 54
| 58100 Grosseto, Italy
| +39-0564-411682 (voice and fax)
\----------------------------------------------------------------- |
From: Ian B. <ia...@co...> - 2003-06-03 06:44:15
|
On Fri, 2003-05-30 at 21:48, Ian Bicking wrote:
> I've only looked at Modeling a bit. But ouch... those functions and
> methods are painful to read, not to mention write. Like:
>
> from Modeling.FetchSpecification import FetchSpecification
> from Modeling.Qualifier import qualifierWithQualifierFormat
>
> qualifier=qualifierWithQualifierFormat('lastName=="Hugo"')
> fetchSpec=FetchSpecification(entityName='Writer', qualifier=qualifier)
> objects=ec.objectsWithFetchSpecification(fetchSpec)
>
> That's a "simple" fetch. Hurts my head just having that on the page. Oh my:
>
> aBook.addObjectToBothSidesOfRelationshipWithKey(anAuthor, 'author')

After a little thought, and in Modeling's defense, I have a feeling this is because the API was taken from an Objective-C ORM, where these method names don't look quite so odd. More like:

aBook addObjectToBothSidesOfRelationship: anAuthor withKey: "author"

OK... "addObjectToBothSidesOfRelationship" is still a little long, but this sort of method name is more common in that world. It still could stand to be translated into better Python -- "withX" shouldn't be used, for instance; instead it should be key='author' -- but at least it's more understandable.

Ian |
From: Edmund L. <el...@in...> - 2003-05-28 23:45:27
|
Bud P. Bruegger wrote:

Apologies for taking so long to reply. As I mentioned in another message, I've got a lot of things happening at home and work right now.

> I agree that "legacy" isn't a good term. What I meant is really a
> pre-existing data schema that has to be accessed (in contrast to one
> that is newly created following the needs of the middleware).

I was just being sensitive... :-) Ian's right--legacy is loaded with unnecessarily negative connotations in the tech industry.

> Do you see a strong need to live without such an OID? Is it possible
> to retrofit pre-existing schemas with such an OID field?

I'm going to try (see the message I just emailed to Luke). To be honest, I'd rather see synthetic primary keys banished. But, while PostgreSQL allows changes in primary keys to cascade automatically to foreign keys in other tables, Oracle and a few other databases do not. So, the need to be portable might well necessitate synthetic keys. I suppose that as long as one could still add integrity constraints to the real but non-functional primary keys, then the use of synthetic keys won't be so bad.

> Is this what you need or is there more to it?

Not sure yet. Since I'm in danger of being too theoretical, I'm going to try using SQLObject for a project, and will report where it comes up short.

> I would believe that a concrete example would be the best way of
> understanding each other. I admit to have difficulties to fully
> understand the situation so far. I tried to make it more concrete
> above what I have understood about constraints. I still have only a
> vague understanding of the multiple joins...

I'll write more as I find the problems... I'll use SQLObject from the viewpoint of a traditional data modeler--full constraints, relational integrity, etc.--and see where I run aground...

...Edmund. |
From: Magnus <ma...@th...> - 2003-05-29 02:11:06
|
At 19:45 2003-05-28 -0400, Edmund Lian wrote: >To be honest, I'd rather see synthetic primary keys banished. But, while >PostgreSQL allows changes in primary keys to cascade automatically to >foreign keys in other tables, Oracle and a few other databases do not. So, >the need to be portable might well necessitate synthetic keys. Does *any* database but PostgreSQL support cascaded key updates like that? Anyway, this is *not* the big problem with "natural" primary keys. The main problem is that the business rules change, and making big changes in large, actively used databases is very disruptive and expensive. We want to isolate each needed change as much as possible. It's far from unique that the set of columns that was the ideal key yesterday isn't that any longer. Yesterday it was a solid business rule that only one row in this table could be created per person each day, so personId + registryDate seemed to be the ideal combo. The business experts are so sure that they would bet their arm on it. Today things changed, and in a few cases, it might actually happen that there have to be two rows for one person on certain dates. With synthetic keys, this means that we drop a unique index. With natural keys, our primary key breaks, and with that maybe a whole tree of keys for detail and sub-detail tables that use these fields as parts of their primary keys. Yuk! Been there, done that... I solved it ugly, and cheated with the date in those rare cases, using the next day, if the current was used, but I'd rather do things cleanly. Using the primary key of one table as part of the primary key for another table just because there is a parent-child relationship between those tables mean that we have a much tighter coupling between the tables than we have with synthetic keys. 
While this evil, tight coupling can sometimes be avoided in the *primary* keys of dependent tables, we can't avoid having the entire primary key (which contains business information for another table) as foreign key fields in the dependent table. Not only the data model is hampered by this. The classes in the application logic, whether it's implemented in Python, Java or C++, will carry along attributes that really belong to another class! This also means that the cost of implementing a business rule change is multiplied. By using a uniform primary key type, we will also always have a uniform foreign key type, and it will be much easier to change the table structure. After all, we sometimes realize that the X objects aren't really attributes of the Y object, but rather of the Z objects. Natural keys create some kind of software structure cement, and make changes in business rules very hard. Often, these structures get cemented long before the product is even launched. It's one thing that it's difficult to migrate loads of important business data, but it *should* at least be swift to change the system if it isn't running yet. For a different problem that brought me to the same conclusion, see http://www.thinkware.se/cgi-bin/thinki.cgi/CaseStudyUnbrokenSeries

--
Magnus Lycka (It's really Lyckå), ma...@th...
Thinkware AB, Sweden, www.thinkware.se
I code Python ~ The shortest path from thought to working program |
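[Editor's note: Magnus's cost argument can be made concrete. In this sketch (SQLite, illustrative names), the "one row per person per day" rule lives in a unique index on a synthetically keyed table, so relaxing the rule is a one-statement change that touches no primary or foreign keys.]

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("""
    CREATE TABLE registration (
        id INTEGER PRIMARY KEY,       -- synthetic key, never changes
        person_id INTEGER NOT NULL,
        registry_date TEXT NOT NULL
    )
""")
# Yesterday's business rule, expressed as a unique index, not as the PK:
cur.execute("""
    CREATE UNIQUE INDEX one_per_day
    ON registration (person_id, registry_date)
""")
cur.execute("INSERT INTO registration (person_id, registry_date) "
            "VALUES (1, '2003-06-02')")

# Today the rule changes: two rows per person per day must be allowed.
cur.execute("DROP INDEX one_per_day")   # the entire schema change
cur.execute("INSERT INTO registration (person_id, registry_date) "
            "VALUES (1, '2003-06-02')")
cur.execute("SELECT COUNT(*) FROM registration")
print(cur.fetchone()[0])  # 2 -- no key tree to rebuild anywhere
```

Had (person_id, registry_date) been the primary key, the same rule change would have forced a new key here and in every dependent table referencing it.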
From: Bud P. B. <bu...@si...> - 2003-06-02 10:11:06
|
Magnus,

I very much agree with your approach to primary keys. This is why I suggested using a GUID or UUID in SQLObject some time ago and voted against the possibility of using multi-column primary (and foreign) keys. In my prototype ORM, I use a guid implementation that I found in an ASPN cookbook and it seems to work just fine. I believe I was originally convinced of this by a paper by Scott Ambler--but I couldn't find the reference anymore...

-b

On Thu, 29 May 2003 03:56:30 +0200 Magnus Lyckå <ma...@th...> wrote:
> At 19:45 2003-05-28 -0400, Edmund Lian wrote:
> >To be honest, I'd rather see synthetic primary keys banished. But, while
> >PostgreSQL allows changes in primary keys to cascade automatically to
> >foreign keys in other tables, Oracle and a few other databases do not. So,
> >the need to be portable might well necessitate synthetic keys.
>
> Does *any* database but PostgreSQL support cascaded key updates
> like that?
>
> Anyway, this is *not* the big problem with "natural" primary keys.
>
> The main problem is that the business rules change, and making big
> changes in large, actively used databases is very disruptive and
> expensive. We want to isolate each needed change as much as possible.
>
> It's far from unique that the set of columns that was the ideal key
> yesterday isn't that any longer. Yesterday it was a solid business
> rule that only one row in this table could be created per person
> each day, so personId + registryDate seemed to be the ideal combo.
> The business experts are so sure that they would bet their arm on it.
> Today things changed, and in a few cases, it might actually happen
> that there have to be two rows for one person on certain dates.
>
> With synthetic keys, this means that we drop a unique index. With
> natural keys, our primary key breaks, and with that maybe a whole
> tree of keys for detail and sub-detail tables that use these
> fields as parts of their primary keys. Yuk! Been there, done that... 
> I solved it ugly, and cheated with the date in those rare cases, > using the next day, if the current was used, but I'd rather do > things cleanly. > > Using the primary key of one table as part of the primary key for > another table just because there is a parent-child relationship > between those tables mean that we have a much tighter coupling > between the tables than we have with synthetic keys. While this > evil, tight coupling can sometimes be avoided in the *primary* > keys of dependent tables, we can't avoid having the entire > primary key (which contains business information for another table) > as foreign key fields in the dependent table. > > Not only the data model is hampered by this. The classes in the > application logic, whether it's implemented in Python, Java or > C++ will carry along attributes that really belong to another > class! This also means that the cost of implementing a business > rule change is multiplied. > > By using a uniform primary key type, we will also always have a > uniform foreign key type, and it will be much easier to change > the table structure. After all we sometimes realize the the X > objects aren't really atttibutes of the Y object, but rather of > the Z objects. Natural keys create some kind of software structure > cement, and makes changes in business rules very hard. Often, > these structures get cemented long before the product is even > launched. It's one thing that it's difficult to migrate loads of > important business data, but it *should* at least be swift to > change the system if it isn't running yet. > > For a different problem that brought me to the same conclusion, see > http://www.thinkware.se/cgi-bin/thinki.cgi/CaseStudyUnbrokenSeries > > > > -- > Magnus Lycka (It's really Lyckå), ma...@th... 
> Thinkware AB, Sweden, www.thinkware.se
> I code Python ~ The shortest path from thought to working program

/-----------------------------------------------------------------
| Bud P. Bruegger, Ph.D.
| Sistema (www.sistema.it)
| Via U. Bassi, 54
| 58100 Grosseto, Italy
| +39-0564-411682 (voice and fax)
\----------------------------------------------------------------- |
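[Editor's note: Bud's ASPN GUID recipe isn't shown in the thread; for reference, a minimal sketch of the same idea with Python's standard uuid module (added in 2.5, so after this thread was written). A uuid4 value gives the uniform, content-free key Magnus argues for.]

```python
import uuid

# A synthetic, globally unique primary key: no business data inside,
# so business-rule changes never force a key rebuild.
def new_row_id():
    """Return a fresh 36-character GUID string for use as a primary key."""
    return str(uuid.uuid4())

a, b = new_row_id(), new_row_id()
assert a != b       # collisions are vanishingly unlikely
assert len(a) == 36
print(a)            # e.g. '1b4e28ba-2fa1-41d2-883f-0016d3cca427'
```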
From: Magnus <ma...@th...> - 2003-06-02 10:43:49
|
At 12:09 2003-06-02 +0200, Bud P. Bruegger wrote:
>Magnus,
>
>I very much agree with your approach to primary keys. This is why I
>suggested to use GUID or UUID in SQLObject some time ago and voted
>against the possibility to use multi-column primary (and foreign)
>keys.

On the other hand, I can understand the need to integrate with legacy databases, so allowing multi-column keys could certainly be useful for *that*.

Another issue to consider is that SQL tables are much more limited than Python classes. If we have a class X with an attribute Y which is a dict mapping string => string, that can obviously not be a column in table X, but on the other hand, we would not assign an object identity in Python to each dict element. The most straightforward solution seems to me to be a table X_Y with columns x_id int, key varchar(...), value varchar(...). Now, it seems I need to either obscure X.Y as a pickle etc., or to give each dict element an object identity. One makes the database less clear and searchable; the other makes the db much heavier than the pure Python implementation.

>I believe I was originally convinced of this by a paper by Scott
>Ambler--but I didn't find the refernece anymore..

Me too. They are in my wiki, I think. http://www.thinkware.se/cgi-bin/thinki.cgi/DatabaseDesign

--
Magnus Lycka (It's really Lyckå), ma...@th...
Thinkware AB, Sweden, www.thinkware.se
I code Python ~ The shortest path from thought to working program |
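[Editor's note: Magnus's X_Y table, sketched concretely: persisting a dict-valued attribute as (x_id, key, value) rows rather than as an opaque pickle. All names are illustrative, not SQLObject API; SQLite stands in for the backend.]

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE x (id INTEGER PRIMARY KEY)")
cur.execute("""
    CREATE TABLE x_y (
        x_id INTEGER NOT NULL REFERENCES x(id),
        key TEXT NOT NULL,
        value TEXT NOT NULL,
        PRIMARY KEY (x_id, key)       -- one row per dict entry
    )
""")

def save_dict(x_id, d):
    # Each dict entry becomes a searchable row, not a pickle blob.
    cur.execute("INSERT OR IGNORE INTO x (id) VALUES (?)", (x_id,))
    cur.executemany(
        "INSERT OR REPLACE INTO x_y (x_id, key, value) VALUES (?, ?, ?)",
        [(x_id, k, v) for k, v in d.items()])

def load_dict(x_id):
    cur.execute("SELECT key, value FROM x_y WHERE x_id = ?", (x_id,))
    return dict(cur.fetchall())

save_dict(1, {"color": "red", "size": "large"})
print(load_dict(1) == {"color": "red", "size": "large"})  # True
```

This is the "clear and searchable" half of Magnus's trade-off; the cost, as he says, is one row (and effectively one object identity) per dict element.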