From: Ian B. <ia...@co...> - 2005-05-04 15:57:58
Ben Bangert wrote:
> On May 3, 2005, at 9:04 PM, Ian Bicking wrote:
>
>>> On May 3, 2005, at 8:07 PM, Ian Bicking wrote:
>>>
>>>> Well, it means factoring out some of the SELECT stuff. Which might
>>>> not be that hard; it's not like there are a lot of code points that
>>>> are actually generating the stuff. In this case, there'd have to
>>>> be some way to unpack multiple objects from a SELECT query, and a
>>>> way to get that SELECT to include multiple tables.
>
> Ok, trying to piece this together: the cache is used to speed up the
> joins. So if, during the join, there were some indicator to fetch
> related objects right there, and the objects were cached, would they be
> used during the lookup on them? Would that solve the problem?
>
> I.e., if I have a Person table and an Address table, and I want all the
> people and their addresses in one query, it'd be something like this:
>
>     people = list(Person.select(include == Address))  # or however the
>                       # syntax is worked out to include the other table
>     for person in people:
>         print person.firstName + ' lives in these cities: ' + \
>             ','.join([address.city for address in person.addresses])
>
> In the person.addresses fetch, it'd see the existing objects in the
> cache and not query the database?

No, a join like person.addresses always produces a query.
address.person would not produce a query if the person were in memory.
I've found it far too difficult to maintain cache consistency for joins
like this. So this will result in 1 + (number of people selected)
queries.

>>>> The obviousish way to do it would be by adding an option to
>>>> .select()/SelectResult that would eagerly include tables (either
>>>> the given tables, or all the tables in the join). Then you'd
>>>> construct the other objects, but throw them away, relying on the
>>>> cache. Which is a little iffy; presumably you really want to keep
>>>> them around, but right now joins are only optimized through
>>>> caching.
>>>> That is, an object only knows the id of a joined table, but
>>>> OtherClass.get(id) is fast when the object is in memory. But it
>>>> can be garbage collected if it goes out of the cache.

> This is what I'm hoping my example usage would do, assuming I
> understand how the caching is working. I'm also assuming I need to do
> list(), otherwise as I iterate through it would run a separate SELECT
> query each time, right?

Without list() the people are pulled in lazily from the connection, but
there is still just one query. If you turn on debugging in your
connection (?debug=t), all the queries will be printed on stdout.

>> When you do a select and fetch a bunch of rows, if some of those
>> rows describe objects already in memory, then those objects will be
>> reused. There's a method (um, not documented, and I can't remember
>> it) that will expire all objects, which you might want to do between
>> requests. You can't do that for a single thread unless you are using
>> per-thread connections; if you are doing process-wide connections in
>> a threaded environment, all threads will share the same set of
>> objects.

> Hmm, I'm running in Apache 2 with prefork, so each SQLObject cache is
> hanging out in its own process. So I won't get any benefit from the
> object caching beyond the current request, since a different process
> is going to pick up the next request. I'm guessing this means I should
> search for that method to wipe out the object cache, so I can clear
> out the last request's objects at the beginning of each request (plus
> I don't know if the db got updated in a different process).
>
> I don't suppose there's any way to stash the objects into a shared
> memory cache that another Apache child process can get to?

I do most of my programming in threads, so I don't really know. I'd
have to ask you: is there a way to stash objects into a shared memory
cache?

--
Ian Bicking  /  ia...@co...  /  http://blog.ianbicking.org
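Ian's answer above -- joins like person.addresses always query, while
OtherClass.get(id) is satisfied from the in-memory cache -- is the
classic 1+N pattern. A minimal, self-contained simulation (hypothetical
Person class and in-memory tables, not SQLObject's actual API) makes
the query count concrete:

```python
# Hypothetical in-memory stand-ins for a Person/Address schema; this is
# NOT SQLObject's API, just a sketch of the caching behavior described.
PEOPLE = {1: 'Alice', 2: 'Bob'}
ADDRESSES = {10: (1, 'Chicago'), 11: (1, 'Boston'), 12: (2, 'Denver')}

query_count = 0      # counts simulated SQL round-trips
_person_cache = {}   # identity map: id -> Person instance

class Person(object):
    def __init__(self, id, firstName):
        self.id, self.firstName = id, firstName

    @classmethod
    def get(cls, id):
        # Fast when the object is already in memory: no query at all.
        global query_count
        if id not in _person_cache:
            query_count += 1              # SELECT ... WHERE id = ?
            _person_cache[id] = cls(id, PEOPLE[id])
        return _person_cache[id]

    @classmethod
    def select(cls):
        global query_count
        query_count += 1                  # one SELECT for all rows
        result = []
        for id, name in sorted(PEOPLE.items()):
            if id not in _person_cache:   # rows describing cached objects are reused
                _person_cache[id] = cls(id, name)
            result.append(_person_cache[id])
        return result

    @property
    def addresses(self):
        # The one-to-many side always queries, even when rows are cached.
        global query_count
        query_count += 1                  # SELECT ... WHERE person_id = ?
        return [city for pid, city in ADDRESSES.values() if pid == self.id]

people = Person.select()                             # 1 query
cities = {p.firstName: p.addresses for p in people}  # + 1 query per person
assert query_count == 1 + len(people)                # the 1+N pattern
assert Person.get(1) is people[0]                    # cache hit: no new query
assert query_count == 1 + len(people)
```

The asserts at the end show exactly the cost Ian describes: one query
for the select, one more per person for the join, and none at all for a
get() that hits the cache.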
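The note that a cached object "can be garbage collected if it goes out
of the cache" suggests an identity map holding weak references. A
sketch using Python's weakref module, with a hypothetical Record class
(an illustration of the idea, not SQLObject's actual cache code):

```python
import gc
import weakref

class Record(object):
    # Identity map holding weak references: it hands back the same live
    # object for a given id, but does not keep objects alive by itself.
    _cache = weakref.WeakValueDictionary()

    def __init__(self, id):
        self.id = id

    @classmethod
    def get(cls, id):
        obj = cls._cache.get(id)
        if obj is None:
            obj = cls(id)            # simulated database fetch
            cls._cache[id] = obj
        return obj

r = Record.get(1)
assert Record.get(1) is r            # fast path: same in-memory object
del r                                # drop the last strong reference
gc.collect()                         # immediate under CPython refcounting anyway
assert 1 not in Record._cache        # collected; the next get(1) re-fetches
```

This is why get(id) is only fast *while* something else keeps the
object alive: once the last strong reference goes away, the weak entry
disappears and the next lookup goes back to the database.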