On Thu, 2002-04-04 at 13:07, Edmund Lian wrote:
> Ian wrote:
> >> 2. Support for transactions
> >I didn't do this, but it would be nice... what kind of interface do you
> >use for this? In a multi-threaded environment, I'd be unsure about how
> >to lump together transactions... unless, I suppose, you have a
> >connection object associated with each thread's accesses.
> Yes. This seems to be the only way to do it, since the DB API does
> transactions at the connection object level rather than, say, the cursor
> level.
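A minimal sketch of that per-thread arrangement (the names here are mine,
not from any particular driver -- you'd pass in your DB API module's
connect function):

```python
import threading

class ConnectionPerThread:
    """Hand each thread its own connection object, so each thread's
    transaction (commit/rollback) stays isolated from the others."""

    def __init__(self, connect_func):
        self._connect = connect_func     # e.g. your DB API module's connect()
        self._conns = {}
        self._lock = threading.Lock()

    def get(self):
        key = threading.get_ident()
        with self._lock:
            if key not in self._conns:
                self._conns[key] = self._connect()
        return self._conns[key]
```

Each thread that calls .get() keeps coming back to the same connection,
so commits group together naturally.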
> >I started with the thinnest of wrappers and have filled it out some as I
> >feel inspired... I haven't felt inspired enough to do joins, but I'm
> >getting closer. I'd want to do a lot of caching, but keeping the cache
> >valid is a significant problem. The many-to-many joins, or other
> >non-obvious joins, are a bit more difficult to properly abstract... but
> >then those joins maybe should be left up to the implementor.
> I think the caching problem is insurmountable in a DBMS-independent way.
> With PostgreSQL, you could have the table notify you when it is updated,
> but no such facility exists with, say, Oracle unless you use a trigger
> coupled to a stored procedure.
Yes, for the most part I've only dealt with the situation where my code is
the only code accessing the database -- in fact, where it is the only
process accessing the database. This is a significant limitation. If I
were to do large imports from another tool, I'd add something to lock
access and invalidate all caches until the external process was
finished. This is, of course, rather inefficient. I suppose it would
be sufficient to invalidate all caches when the external process is
finished, without locking access... but I'll try to avoid it.
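The invalidate-everything approach is at least easy to sketch (hypothetical
names; fetch_row stands in for whatever actually loads a row from the DB):

```python
class TableCache:
    """Row cache with wholesale invalidation -- e.g. call invalidate_all()
    after an external bulk import has touched the database behind our back."""

    def __init__(self, fetch_row):
        self._fetch_row = fetch_row   # hypothetical: loads one row from the DB
        self._cache = {}

    def get(self, row_id):
        if row_id not in self._cache:
            self._cache[row_id] = self._fetch_row(row_id)
        return self._cache[row_id]

    def invalidate_all(self):
        self._cache.clear()
```

Crude, but DBMS-independent: you give up fine-grained invalidation in
exchange for never serving a stale row after the import ends.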
This does make the normal schema less important, though. I still feel
like the database becomes a good, portable storage format -- nearly as
good as XML for archiving purposes, if not for casual information
interchange.
I'm starting to realize that I'm mirroring MK in many ways :) If it
hadn't been for the CSV database descriptions, maybe I'd never have headed
in this direction... well, I prefer my style of instantiation as well, but
again, it's probably a more aesthetic than structural distinction.
> After struggling with the semantics of a join, I've come to the conclusion
> that joins cannot be handled using the same kind of object as one
> representing a single table. In any case, an object representing two or
> more joined tables (really just a view) can't accept row modification or
> deletion methods since current DBMS products don't allow you to
> write-through views. Well, maybe Oracle EE does since it has materialized
> views, but who can afford this anyway?
> >Have you looked at the query-generation wrapper I wrote? I'm not
> >entirely sure how successful the idea was, but it could create queries
> >from Python expressions. You had to use & and | instead of "and" and
> >"or", since and/or can't be overloaded, but it was otherwise fairly
> >similar to regular Python expressions -- unfortunately, & and | don't
> >have the right order of precedence.
> Hmmm... never got the message containing the code, and SF doesn't store
> attachments (your post is here:
> Could you email them to me directly so I can look at them?
The code I wrote recently is at:
But the query generation is separate (I haven't actually used it, it was
just an experiment). It's at:
> >Is each row a distinct object, with accessors and all? How do you add
> >extra methods to these objects?
> No, in the examples I gave, x = DBTable('table_name'), which just
> instantiates an object to mediate between the table in question and my
> code. Whether the .delete(), .update() methods operate on an individual row
> or a bunch of rows depends on the specificity of the conditions named in
> the .update() and .delete() methods.
> I did it this way because if you treat each row as a distinct object with
> accessors and methods, you will devolve down to procedural code when you
> need to modify a bunch of similar rows (e.g., update user_table set
> group_name = 'apple' where group_name = 'orange'). When you do this, you
> lose the advantages of using declarative languages like SQL to begin with.
How do you add extra methods to your results? For instance, I often
keep extra information associated with a row in files -- image files,
say, which are inefficient to store in the database. However, I'd like
to treat that extra information the same way I treat the in-DB
information. All the wrappers that use dictionary-like access bugged me
because they made me make that distinction explicitly in the code.
I've forgone the advantages of grouped updates, but I don't think my
code would use a grouped update in most cases -- or, if it does, it's
only occasionally, and the groups are in the range of 2-5, so the
overhead of repetitive updates is relatively low. It's just so
incredibly convenient to be able to pass database objects around.
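The tradeoff looks something like this (hypothetical helpers that just
build the SQL text -- real code would escape values properly rather than
interpolate them):

```python
def grouped_update(table, set_col, new_value, where_col, old_value):
    # One declarative statement; the DBMS updates every matching row at once.
    return "UPDATE %s SET %s = '%s' WHERE %s = '%s'" % (
        table, set_col, new_value, where_col, old_value)

def per_row_updates(table, set_col, new_value, key_col, keys):
    # Row-at-a-time: more statements, but tolerable when the group
    # is only 2-5 rows, and each row stays a passable object.
    return ["UPDATE %s SET %s = '%s' WHERE %s = %s" % (
                table, set_col, new_value, key_col, key)
            for key in keys]
```

For Edmund's apple/orange example the grouped form is one statement; the
per-row form issues one statement per key, which is the overhead I'm
accepting.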
> >> # Return just col_one and col_two of a certain subset:
> >> x.columnList = ['col_one', 'col_two']
> >> x.select(where="id=1 and name='apple'")
> >Is x like a connection, and columnList affects the next .select
> >statement? That seems rather unfunctional -- why not just put it in the
> >select method call?
> x is just an object representing one table. columnList does affect the next
> .select statement right now. I did this because x, as an object, has state
> information. It knows what columns it has, and what columns to retrieve. I
> want to separate the "I know what columns to retrieve" from the "I know
> which rows to retrieve" behavior to ease the syntax in making calls--I'm
> trying to make it as painless as possible to interact with the tables.
Maybe it's that I started programming in functional languages, or that I
like the set-like analogies in an RDBMS, but I can't stand stateful
APIs, especially with databases -- ADO is like this, as I remember, as
well as DBAPI and others. I just don't understand why you'd want to do
it that way -- perhaps if you have another object, like a
selection/querying object, which holds this state separate from the
connection. So, if you have a connection object conn, you might do:
sel = conn.selection()
## sel.rows() will now be all rows
sel.where = "id=1 and name='apple'"
sel.columnList = ['col_one', 'col_two']
## sel.rows() now would be equivalent to:
## sel.rows(where="id=1 and name='apple'", columnList=['col_one', 'col_two'])
## but you can still do...
sel2 = conn.selection()
sel2.columnList = ['col_one', 'col_two']
## And then you can still access sel.rows() and get that result set.
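To make that concrete, here's a rough sketch of such a selection object
(I've left the connection out and used query() to show the SQL it would
run -- Selection and query() are my hypothetical names, nobody's actual
API):

```python
class Selection:
    """Hypothetical stateful query object: holds where/columnList
    apart from the connection and builds the SELECT on demand."""

    def __init__(self, table):
        self.table = table
        self.where = None        # SQL condition string, or None for all rows
        self.columnList = None   # list of column names, or None for *

    def query(self):
        cols = ', '.join(self.columnList) if self.columnList else '*'
        sql = 'SELECT %s FROM %s' % (cols, self.table)
        if self.where:
            sql = sql + ' WHERE %s' % self.where
        return sql
```

Each selection carries its own state, so two of them against the same
connection never step on each other.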
If you did happen to use the SQLBuilder code, you could potentially
treat the where clause more dynamically, like:
sel.where = column.id == 1
sel.where &= column.name == "apple"
I find this is actually a fairly common task.
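For what it's worth, a toy version of that expression-building trick looks
like this (Column and Clause are hypothetical names, not the real
SQLBuilder code; note that when combining inline you need parentheses
around each comparison, since & binds tighter than ==):

```python
class Clause:
    def __init__(self, sql):
        self.sql = sql
    def __and__(self, other):   # & stands in for "and", which can't be overloaded
        return Clause("(%s) AND (%s)" % (self.sql, other.sql))
    def __or__(self, other):    # | stands in for "or"
        return Clause("(%s) OR (%s)" % (self.sql, other.sql))
    def __str__(self):
        return self.sql

class Column:
    def __init__(self, name):
        self.name = name
    def __eq__(self, other):
        # Comparison builds SQL text instead of returning True/False.
        return Clause("%s = %r" % (self.name, other))

class _Columns:
    def __getattr__(self, name):
        return Column(name)

column = _Columns()
```

With that in place, `where &= column.name == "apple"` just AND's another
clause onto the accumulated expression.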