sqlobject-discuss Mailing List for SQLObject

SQLObject is a Python ORM.
From: Bud P. B. <bu...@si...> - 2003-05-28 09:29:36
On Tue, 27 May 2003 09:43:25 +0200 Magnus Lyckå <ma...@th...> wrote:

> I think the db driver should look for a standardized attribute in
> each parameter it gets, and if it finds it, it should call it as
> a method, and then receive a string which is the representation
> of data that we want to send, and a description of what type it
> should become in the backend. I think this combo is needed.
>
> If it gets '123' and DECIMAL it knows that it should replace the
> ? with 123, but if it gets '123' and VARCHAR it should replace ?
> with '123'. I haven't tried to implement that though.

The issue of types and marshalling in DB-API modules is definitely an interesting one. I see it as more complex than what you describe above. It is really one of type-casting that goes in both directions, to and from the db. Some DB-API modules (e.g., psycopg) seem to support user-defined type-casts. But that is not standardized, and I believe that psycopg's casting support is currently being redesigned and rewritten.

Since this was too terse to be understandable, here is an explanation. In Python, arguments are of given types. Types are the same as classes, i.e. encapsulated state and behavior. There is no restriction on what types there are, since users usually define their own. In contrast, in the dbms, types are restricted to "basic types" for which state is definitely in the foreground and behavior is secondary. Any kind of middleware has to bridge the conceptual gap by marshalling/unmarshalling or type-casting (or whatever you call it). Marshalling gets from a Python attribute to (in general) a number of db columns of simple type. In other words, it maps the encapsulated state to a list of simple types. Unmarshalling converts a list of simple types into an object of the given type with reconstructed state (and obviously, all the behavior).

In my understanding, the solution you outline above is only partial. It supports marshalling to a single db column. But it does not seem to support the unmarshalling, i.e., casting the column back to the original Python type.

Here is an example that illustrates this: assume we have an attribute of type "Location" that is a class with behavior such as "distance" or "containedInPolygon" and is physically represented by a tuple of numbers (such as lat/long, or UTM northing, UTM easting). Marshalling would then take a single location attribute to two columns (e.g., lat and long), and unmarshalling would reconstruct a Location instance from a pair of coordinate values.

While DB-APIs support some base types, there is no way they can support all possible types, since they are user defined and cannot be enumerated. So in my understanding, any middleware solution that supports the general case needs to have a layer that marshals between Python classes and tuples of basic values. Ian and I have discussed this kind of additional layer (which is currently lacking in SQLObject) earlier under the subject "higher-level types".

If you are interested in (much more) detail on this, I can send you some experimental code (or alpha quality code) that does this for single objects without relationships.

cheers
--b

/-----------------------------------------------------------------
| Bud P. Bruegger, Ph.D.
| Sistema (www.sistema.it)
| Via U. Bassi, 54
| 58100 Grosseto, Italy
| +39-0564-411682 (voice and fax)
\-----------------------------------------------------------------
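A minimal sketch of the two-way marshalling layer described above, using the Location example from the message. The class and the to/from function names are illustrative assumptions, not SQLObject API:

    import math

    class Location:
        """An attribute type with behavior, physically stored as two columns."""
        def __init__(self, lat, long):
            self.lat, self.long = lat, long

        def distance(self, other):
            # Crude planar distance in degrees, just to give the type behavior.
            return math.hypot(self.lat - other.lat, self.long - other.long)

    def location_to_sql(loc):
        """Marshal: map the encapsulated state to a tuple of simple column values."""
        return (loc.lat, loc.long)

    def location_from_sql(lat, long):
        """Unmarshal: reconstruct a full Location instance from two columns."""
        return Location(lat, long)

    row = location_to_sql(Location(42.76, 11.11))   # -> (42.76, 11.11)
    loc = location_from_sql(*row)                   # behavior restored
    print(loc.distance(Location(42.0, 11.0)))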
From: Ian B. <ia...@co...> - 2003-05-27 20:33:34
On Tue, 2003-05-27 at 02:43, Magnus Lyckå wrote:
> At 15:09 2003-05-26 -0500, Ian Bicking wrote:
> >I don't think the DB-API is complete enough to be considered "right"
> >for this. AFAIK, there is no standard way to indicate how non-standard
> >types should be quoted, and I cannot see what the *actual* SQL sent is.
>
> Right. I've brought up both issues on the db-sig mailing list recently.
>
> I think the db driver should look for a standardized attribute in
> each parameter it gets, and if it finds it, it should call it as
> a method, and then receive a string which is the representation
> of data that we want to send, and a description of what type it
> should become in the backend. I think this combo is needed.
>
> If it gets '123' and DECIMAL it knows that it should replace the
> ? with 123, but if it gets '123' and VARCHAR it should replace ?
> with '123'. I haven't tried to implement that though.
>
> I'd also like to see some kind of debug mode where the final SQL
> could be viewed before sending to the backend. It would also be
> useful if this string would pop up in tracebacks.

*If* those changes were made, then I'd feel much differently about using parameters. If you want to write up a PEP, to maybe take it past the discussion that's gone on for so long, then that'd be great, and I'd be happy to help proofread it or whatever... but it's really the authors of the database adapters who need convincing. Which may be easy, I don't know.

> >By giving up control to the underlying adapter, I fear that other
> >compatibility issues will be harder to fix later.
>
> Maybe... But if we keep the parameters separate as long as
> possible, we have lower entropy in our code. It's always
> fairly simple to pass in parameters, but it's not so simple
> to get them out of the string.

Right now I'm not constructing SQL until it's just about to be sent off, so it's not really a problem.

> >Additionally, the underlying vendor's driver *cannot* be the most
> >authoritative participant. That driver is written in C, with C types --
> >the Python type system is much richer. At the Python level we can offer
> >a much more complete interface (with things like SQLBuilder.func.NOW()).
>
> As I mentioned above, I think that could be fixed.
>
> >Parameter passing will solve this date quoting problem, but at some
> >later date I expect a problem that parameter passing *won't* solve, and
> >in fact will be made much more difficult to solve because of parameter
> >passing.
>
> Perhaps... but I'm not so certain, and as I said it's easier to go
> *from* separate parameters than to go *to* separate parameters.

I would still disagree, at least for SQLObject. Or rather, I would say that separate parameters is a decision that in most places is delayed quite a ways (until your connection is just about to send the SQL), so it's not a big deal either way.

> >Perhaps -- I don't even know, because I'm not using parameters.
>
> No backend uses %s. That's a DB-API invention, so no driver with
> paramstyle format or pyformat has backend support for parameter
> passing (and vice versa I think).
>
> >AFAIK, prepared statements and whatnot
> >only offer advantages in a certain context, a context which will not
> >exist in SQLObject.
>
> True. That makes SQLObject a bit different. It still seems that most
> people prefer parameter passing, but your tradeoff is not quite that
> of an ordinary application coder.

Well, prepared statements can reduce SQL parse time, but since there isn't any interface to prepare a statement well ahead of using that statement, it's not so important to us.

> >Can you change the ?'s reasonably efficiently, while protecting against
> >literals that contain ?'s ?
>
> It's just a few lines of code, and my thought was to cache these SQL
> strings in a dictionary, so we don't need to rerun the translation if
> it had been done before. Maybe that's overkill; it would cause some
> new problems. On my 700MHz Duron, I transform 4 simple SQL statements
> 1000 times in about 320 millisecs, i.e. 80 microsecs per call. Dictionary
> lookup in a caching version is about 45 millisecs for the 1000 lookups
> of each of the four strings. About 11 microsecs per call. I guess that's
> nothing compared to the overhead in calling the sql server. I should
> try though... (After my bookkeeping is with my accountant... That's what
> I should be doing now.)
>
>     quote_separation = re.compile(r"(.*?)('.*?')", re.DOTALL)
>     def parseQmark(sql):
>         '''parseQmark(sql_with_?) -> sql_with_%s
>
>         Parse SQL written with qmark parameter style (i.e. Direct SQL
>         or SQL CLI style) so that the python %-operator can be used to
>         pass in parameters. This means that % needs to be changed to %%,
>         and ? needs to be changed to %s if it's not inside quotes.
>         '''
>         sql = sql.replace('%', '%%')
>         # Extra "''" needed to make my re happy if sql doesn't end with '
>         notQuoted_quoted = quote_separation.findall(sql + "''")
>         replaced = [nq.replace('?', '%s') + q for (nq, q) in notQuoted_quoted]
>         return "".join(replaced)[:-2]  # remove extra ''

Well, for instance, MySQL allows strings like 'that\'s it', which would break this. But it also allows 'that''s it', so maybe people will just have to use the one. I don't know how some databases treat things like " -- is "user?" a valid column name in Postgres? I don't know... doing ad hoc SQL parsing seems like trouble to me. %s avoids that, since it doesn't do parsing.

The other idea, like:

    cursor.execute("INSERT INTO foo (bar, baz) VALUES (", foo, ", ", baz, ")")

would also work without doing parsing, even if it is a bit annoying. Personally I don't mind %s at all -- so long as you aren't hardcoding string literals into your queries, having to quote % isn't that big a deal.
From: Magnus <ma...@th...> - 2003-05-27 12:26:39
At 15:09 2003-05-26 -0500, Ian Bicking wrote:
>I don't think the DB-API is complete enough to be considered "right"
>for this. AFAIK, there is no standard way to indicate how non-standard
>types should be quoted, and I cannot see what the *actual* SQL sent is.

Right. I've brought up both issues on the db-sig mailing list recently.

I think the db driver should look for a standardized attribute in each parameter it gets, and if it finds it, it should call it as a method, and then receive a string which is the representation of data that we want to send, and a description of what type it should become in the backend. I think this combo is needed.

If it gets '123' and DECIMAL it knows that it should replace the ? with 123, but if it gets '123' and VARCHAR it should replace ? with '123'. I haven't tried to implement that though.

I'd also like to see some kind of debug mode where the final SQL could be viewed before sending to the backend. It would also be useful if this string would pop up in tracebacks.

>By giving up control to the underlying adapter, I fear that other
>compatibility issues will be harder to fix later.

Maybe... But if we keep the parameters separate as long as possible, we have lower entropy in our code. It's always fairly simple to pass in parameters, but it's not so simple to get them out of the string.

>Additionally, the underlying vendor's driver *cannot* be the most
>authoritative participant. That driver is written in C, with C types --
>the Python type system is much richer. At the Python level we can offer
>a much more complete interface (with things like SQLBuilder.func.NOW()).

As I mentioned above, I think that could be fixed.

>Parameter passing will solve this date quoting problem, but at some
>later date I expect a problem that parameter passing *won't* solve, and
>in fact will be made much more difficult to solve because of parameter
>passing.

Perhaps... but I'm not so certain, and as I said it's easier to go *from* separate parameters than to go *to* separate parameters.

>Yes, that implementation would work. You should be able to do that all
>in a subclass, without changing any current classes, though I imagine
>you'll have to override most of DBAPIConnection.

Well, all concrete drivers subclass DBAPIConnection... This will just override a few more methods... :)

>Perhaps -- I don't even know, because I'm not using parameters.

No backend uses %s. That's a DB-API invention, so no driver with paramstyle format or pyformat has backend support for parameter passing (and vice versa I think).

>AFAIK, prepared statements and whatnot
>only offer advantages in a certain context, a context which will not
>exist in SQLObject.

True. That makes SQLObject a bit different. It still seems that most people prefer parameter passing, but your tradeoff is not quite that of an ordinary application coder.

>Can you change the ?'s reasonably efficiently, while protecting against
>literals that contain ?'s ?

It's just a few lines of code, and my thought was to cache these SQL strings in a dictionary, so we don't need to rerun the translation if it had been done before. Maybe that's overkill; it would cause some new problems. On my 700MHz Duron, I transform 4 simple SQL statements 1000 times in about 320 millisecs, i.e. 80 microsecs per call. Dictionary lookup in a caching version is about 45 millisecs for the 1000 lookups of each of the four strings. About 11 microsecs per call. I guess that's nothing compared to the overhead in calling the sql server. I should try though... (After my bookkeeping is with my accountant... That's what I should be doing now.)

    quote_separation = re.compile(r"(.*?)('.*?')", re.DOTALL)
    def parseQmark(sql):
        '''parseQmark(sql_with_?) -> sql_with_%s

        Parse SQL written with qmark parameter style (i.e. Direct SQL
        or SQL CLI style) so that the python %-operator can be used to
        pass in parameters. This means that % needs to be changed to %%,
        and ? needs to be changed to %s if it's not inside quotes.
        '''
        sql = sql.replace('%', '%%')
        # Extra "''" needed to make my re happy if sql doesn't end with '
        notQuoted_quoted = quote_separation.findall(sql + "''")
        replaced = [nq.replace('?', '%s') + q for (nq, q) in notQuoted_quoted]
        return "".join(replaced)[:-2]  # remove extra ''

In cursor.execute(), we'd have something like:

    if sql in statement_cache:
        sql = statement_cache[sql]
    else:
        tmp = parseQmark(sql)
        statement_cache[sql] = tmp
        sql = tmp

This would enable drivers such as the ones for PostgreSQL, MySQL and SQLite to use paramstyle 'qmark'. Taking this one step further, we could then pass in :1, :2, :3 or :a, :b, :c etc instead of the values to get support for the drivers that use numbered or named parameter styles, or even @a, @b... to support Sybase. It's no big deal.

Disregarding the issue of SQLObject for a moment, I think it would be great if we managed to run all DB-API drivers with a single paramstyle. Maybe there should be a "compatible mode" in DB-API that enforced paramstyle qmark at the possible expense of some performance. "module.set_paramstyle('qmark')"?

-- 
Magnus Lycka (It's really Lyckå), ma...@th...
Thinkware AB, Sweden, www.thinkware.se
I code Python ~ The shortest path from thought to working program
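The posted parseQmark runs as-is once the missing import is added; a quick check of its behavior on a made-up query with a ? both inside and outside quotes:

    import re

    quote_separation = re.compile(r"(.*?)('.*?')", re.DOTALL)

    def parseQmark(sql):
        # Same algorithm as posted above: double %, then swap ? for %s
        # only in the segments that fall outside single-quoted literals.
        sql = sql.replace('%', '%%')
        notQuoted_quoted = quote_separation.findall(sql + "''")
        replaced = [nq.replace('?', '%s') + q for (nq, q) in notQuoted_quoted]
        return "".join(replaced)[:-2]

    print(parseQmark("SELECT * FROM X WHERE A = ? AND B LIKE 'X%' AND C = '?'"))
    # -> SELECT * FROM X WHERE A = %s AND B LIKE 'X%%' AND C = '?'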
From: Ian B. <ia...@co...> - 2003-05-26 20:08:24
On Mon, 2003-05-26 at 05:15, Magnus Lyckå wrote:
> But Date is really just an example... The thing is that if I use
> something like "INSERT INTO X (a,b,c) VALUES (?,?,?)" or
> "SELECT * FROM X WHERE A < ?", and pass in the literals in a params
> tuple, the problem goes away, regardless of what kind of data this
> might be. (Of course, I still have to provide data in the right
> format, but I don't have to solve quoting or escaping.)
>
> After all, the point with the second parameter to cursor.execute()
> is there to promote this kind of coding. As far as I understand, neither
> MySQL, PostgreSQL or SQLite supports Dynamic SQL or the SQL CLI properly,
> so there %s is used, and the parameters are passed into the SQL string in
> the interface, but for interfaces to databases with proper Dynamic SQL
> support, both the SQL string with question marks and the parameter tuple
> are just passed to a lower layer, and handled by the database vendor's
> driver. I think this is the right way to do it.
>
> There are certainly lots of other problems in various SQL dialects that
> we can't solve with this fix. Concatenation operators and join syntax etc
> make it difficult to always get SQL to work cross platform, but for
> handling of quoting and escaping of literals used in SQL queries, there
> *is* a right way to do it, so trying to solve it again in the Python layer
> seems... well, non-optimal.

I don't think the DB-API is complete enough to be considered "right" for this. AFAIK, there is no standard way to indicate how non-standard types should be quoted, and I cannot see what the *actual* SQL sent is. By giving up control to the underlying adapter, I fear that other compatibility issues will be harder to fix later.

Additionally, the underlying vendor's driver *cannot* be the most authoritative participant. That driver is written in C, with C types -- the Python type system is much richer. At the Python level we can offer a much more complete interface (with things like SQLBuilder.func.NOW()).

Parameter passing will solve this date quoting problem, but at some later date I expect a problem that parameter passing *won't* solve, and in fact will be made much more difficult to solve because of parameter passing.

> As far as I understand, it's only the code for INSERT and UPDATE that
> I *need* to fix. For INSERT it seems that the easiest way is to fix it
> in _queryInsertID and not even call _insertSQL. Just call execute with
> the values as a separate parameter, and pass in ",".join(['?']*len(values))
> in the second parenthesis.
>
> For updates I need to override _SO_update so that I just insert
> "SET A = ?, B = ?" and pass the parameters to self.query. This in
> turn means that I need to add an optional param parameter to query
> and _query, but it seems like a reasonable thing to do.
>
> Then I still don't have perfect support for selects, but I can do
> without that for now I think.

Yes, that implementation would work. You should be able to do that all in a subclass, without changing any current classes, though I imagine you'll have to override most of DBAPIConnection.

> >Well, for select in particular, maybe params would be fine. However,
> >SQLObject has maintained database independence fairly well, and the
> >annoying paramstyle issue would make code that uses params less
> >portable.
>
> So far, you only seem to support backends that don't support parameter
> passing though.

Perhaps -- I don't even know, because I'm not using parameters. That ignorance makes me happy ;) I don't see a big advantage to using parameters besides quoting -- AFAIK, prepared statements and whatnot only offer advantages in a certain context, a context which will not exist in SQLObject.

> > Though I'm certainly okay with:
> >
> >     Person.select('birthdate < %s', params=(Date(1999, 4, 12),))
> >
> >Where any non-param %'s have to be quoted (like %%), then the database
> >connection changes %s to ? or whatever the paramstyle is. Should be
> >easy enough to implement.
>
> As you might have seen in the db-sig mailing list, I recently wrote
> code that goes the other way: ? to %s. This has the advantage that
> you can write your SQL queries just the way a proper SQL book describes
> them, i.e. you can write "SELECT * FROM X WHERE A = ? AND B LIKE 'X%'"
> without having to think about doubling % or using any other non-SQL
> construct in the SQL code. It's also fairly simple to convert qmark
> style SQL to named, numbered or even Sybase @-style quoting if that
> is needed.

Can you change the ?'s reasonably efficiently, while protecting against literals that contain ?'s?

> >I don't use params, and don't plan to, for other SQL construction.
> >Mostly it's because I like having complete control over the SQL, which I
> >do when I do the string substitution myself.
>
> I think you are doing more work than you have to that way. As
> I said, for MySQL, PostgreSQL and SQLite it's no big difference,
> since they do that in a simple way in the interface code anyway,
> but for most commercial ones it might well be. At least for interfaces
> like ADO and ODBC where you use one interface but can have any of
> many backends behind that interface.
>
> This also means that you place a burden on the application developer
> to solve this right for filtering in the select statement.
>
> I do realize that changing this now is quite some work though...

It's not so much the work of changing this -- it wouldn't be all that difficult, as I've tried to keep SQL generation isolated -- but I'm not convinced this is the right thing to do.

Ian
From: Magnus <ma...@th...> - 2003-05-26 14:41:00
At 05:07 2003-05-25 -0500, Ian Bicking wrote:
>On Sun, 2003-05-25 at 04:43, Magnus Lyckå wrote:
> > That will at least cause problems with dates for my current
> > backend, MS Access 97, since the warped SQL dialect there uses
> > dates like #1999-04-12# instead of '1999-04-12'. :(
>
>That's really dumb, but I guess it's not your fault ;)

I agree on both points! :)

>You could mess around with SQLBuilder.sqlRepr if you don't mind making
>incompatible modifications. That's where the current quoting of
>mxDateTime values goes. I've thought about hooks for database-specific
>quoting, exactly like this situation, but I haven't done it. There's
>only a couple cases, since there's a reasonable lowest-common-
>denominator for most databases and data types.

The problem is that then I somehow have to figure out what database I have behind my ODBC source. I'm not even sure the problem exists in all Access versions. As far as I understand, more modern versions can be coerced into being more SQL92-like in their behaviour. I don't want to mess with this at all. I think the really correct way to handle this is to use parameter passing the way it's intended and supported by the interface, and let the lower layers pass in the parameters. Why should we solve a problem that is already solved in the ODBC drivers?

>You probably won't be able to use your date object directly, but if an
>instance defines a sqlRepr method, that will be used. So it could be
>something like:

But Date is really just an example... The thing is that if I use something like "INSERT INTO X (a,b,c) VALUES (?,?,?)" or "SELECT * FROM X WHERE A < ?", and pass in the literals in a params tuple, the problem goes away, regardless of what kind of data this might be. (Of course, I still have to provide data in the right format, but I don't have to solve quoting or escaping.)

After all, the point with the second parameter to cursor.execute() is there to promote this kind of coding. As far as I understand, neither MySQL, PostgreSQL or SQLite supports Dynamic SQL or the SQL CLI properly, so there %s is used, and the parameters are passed into the SQL string in the interface, but for interfaces to databases with proper Dynamic SQL support, both the SQL string with question marks and the parameter tuple are just passed to a lower layer, and handled by the database vendor's driver. I think this is the right way to do it.

There are certainly lots of other problems in various SQL dialects that we can't solve with this fix. Concatenation operators and join syntax etc make it difficult to always get SQL to work cross platform, but for handling of quoting and escaping of literals used in SQL queries, there *is* a right way to do it, so trying to solve it again in the Python layer seems... well, non-optimal.

As far as I understand, it's only the code for INSERT and UPDATE that I *need* to fix. For INSERT it seems that the easiest way is to fix it in _queryInsertID and not even call _insertSQL. Just call execute with the values as a separate parameter, and pass in ",".join(['?']*len(values)) in the second parenthesis.

For updates I need to override _SO_update so that I just insert "SET A = ?, B = ?" and pass the parameters to self.query. This in turn means that I need to add an optional param parameter to query and _query, but it seems like a reasonable thing to do.

Then I still don't have perfect support for selects, but I can do without that for now I think.

>Well, for select in particular, maybe params would be fine. However,
>SQLObject has maintained database independence fairly well, and the
>annoying paramstyle issue would make code that uses params less
>portable.

So far, you only seem to support backends that don't support parameter passing though.

> Though I'm certainly okay with:
>
>     Person.select('birthdate < %s', params=(Date(1999, 4, 12),))
>
>Where any non-param %'s have to be quoted (like %%), then the database
>connection changes %s to ? or whatever the paramstyle is. Should be
>easy enough to implement.

As you might have seen in the db-sig mailing list, I recently wrote code that goes the other way: ? to %s. This has the advantage that you can write your SQL queries just the way a proper SQL book describes them, i.e. you can write "SELECT * FROM X WHERE A = ? AND B LIKE 'X%'" without having to think about doubling % or using any other non-SQL construct in the SQL code. It's also fairly simple to convert qmark style SQL to named, numbered or even Sybase @-style quoting if that is needed.

>I don't use params, and don't plan to, for other SQL construction.
>Mostly it's because I like having complete control over the SQL, which I
>do when I do the string substitution myself.

I think you are doing more work than you have to that way. As I said, for MySQL, PostgreSQL and SQLite it's no big difference, since they do that in a simple way in the interface code anyway, but for most commercial ones it might well be. At least for interfaces like ADO and ODBC where you use one interface but can have any of many backends behind that interface.

This also means that you place a burden on the application developer to solve this right for filtering in the select statement.

I do realize that changing this now is quite some work though...

-- 
Magnus Lycka (It's really Lyckå), ma...@th...
Thinkware AB, Sweden, www.thinkware.se
I code Python ~ The shortest path from thought to working program
From: Bud P. B. <bu...@si...> - 2003-05-26 09:47:31
On 25 May 2003 02:05:34 -0500 Ian Bicking <ia...@co...> wrote:
> On Thu, 2003-05-22 at 16:10, Bud P.Bruegger wrote:
> > I'm wondering what pickle does--it's kind of the same problem. Does
> > it call __init__ during unpickling or just __new__??? While I haven't
> > looked into the details, I'm quite positive the latter holds..
>
> I don't know what exactly it does, but it doesn't call either __init__
> or __new__ -- I believe it calls __setstate__, if you want to change how
> pickle works on your object.

I just looked it up: by default, it saves and restores the values of __dict__ without calling __init__. If one wants __init__ to be called, __getinitargs__() needs to be provided, and the resulting tuple is stored as part of the pickle. __getstate__ and __setstate__ seem to be used if other values than the object's __dict__ need to be stored...

...

> The reality is that there is a disconnect between database objects and
> Python objects. SQLObject doesn't provide an object store, ala ZODB.
> It maps, but there's no expectation that it will be perfect -- so
> creating a Python object from the database row *is* a
> creation/construction operation, it's not just like we've brought the
> object into memory from a different storage mechanism (as with pickle or
> ZODB).

I think I agree with this. The Attribute layer that in my proposed architecture sits between a class and the columns actually makes this process explicit (I have toSQL and fromSQL functions). Still, in most cases, I believe these will not be necessary, and the default functions really just say to store and restore the __dict__.

> Personally I find properties a little funny to work with anyway. ...

I agree. My incentive is to impose as few limitations as possible on classes that can be made persistent..

> Using *Python* convention, if you want to ensure consistency you do so
> *as each attribute is set*. People don't write Python objects that have
> some uncommitted/committed state, where verification is delayed until
> the commit.
>
> If you need to check consistency among several attributes, you set them
> all at once. I don't see a problem with that -- it's what I do for
> non-database objects already.

hmm. you are probably right here, unless the app has a need for long transactions...

thanks for your feedback
--b
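A small runnable illustration of the pickle behavior described above: by default __init__ is not called on unpickling, and __getstate__/__setstate__ control what is stored and restored. This is standard library behavior, not anything SQLObject-specific:

    import pickle

    class Counter:
        def __init__(self, start):
            print("__init__ called")
            self.value = start

        def __getstate__(self):
            # Store only what we choose; the default would be self.__dict__.
            return {"value": self.value}

        def __setstate__(self, state):
            # Called on unpickling instead of __init__.
            print("__setstate__ called")
            self.__dict__.update(state)

    c = Counter(3)                      # prints "__init__ called"
    c2 = pickle.loads(pickle.dumps(c))  # prints "__setstate__ called" only
    print(c2.value)                     # 3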
From: Ian B. <ia...@co...> - 2003-05-25 10:06:24
On Sun, 2003-05-25 at 04:43, Magnus Lyckå wrote:
> I've been thinking of adding support for adodbapi to SQLObject.
> Has anyone done that? Any comments?
>
> The problem I see is that SQLObject doesn't use the param part
> of cursor.execute(sql, param).
>
> That will at least cause problems with dates for my current
> backend, MS Access 97, since the warped SQL dialect there uses
> dates like #1999-04-12# instead of '1999-04-12'. :(

That's really dumb, but I guess it's not your fault ;)

You could mess around with SQLBuilder.sqlRepr if you don't mind making incompatible modifications. That's where the current quoting of mxDateTime values goes. I've thought about hooks for database-specific quoting, exactly like this situation, but I haven't done it. There's only a couple of cases, since there's a reasonable lowest-common-denominator for most databases and data types.

> But it seems I can get basic support by just overriding the
> insert or update methods. After all, dates are never key
> values, and I suppose I can live with the limitations in
> selects for the time being.

You probably won't be able to use your date object directly, but if an instance defines a sqlRepr method, that will be used. So it could be something like:

    class MyDate:
        def __init__(self, date):
            self.date = date
        def sqlRepr(self):
            return '#%s#' % self.date.strftime('%Y-%m-%d')

Again, obviously not portable among databases, and perhaps only moderately better than modifying the source of SQLBuilder. Someday DateTimeCol will be able to handle the necessary quoting, though that won't address the select issues.

> But in the long run, I'd like to be able to do
> Person.select('birthdate < ?', (Date(1999,4,12),)) though...
>
> What's the rationale behind not using params today? Would
> a change have any negative consequences (beyond some labour)?

Well, for select in particular, maybe params would be fine. However, SQLObject has maintained database independence fairly well, and the annoying paramstyle issue would make code that uses params less portable. Though I'm certainly okay with:

    Person.select('birthdate < %s', params=(Date(1999, 4, 12),))

where any non-param %'s have to be quoted (like %%), then the database connection changes %s to ? or whatever the paramstyle is. Should be easy enough to implement.

I don't use params, and don't plan to, for other SQL construction. Mostly it's because I like having complete control over the SQL, which I do when I do the string substitution myself.

Ian
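A sketch of the hook described above, i.e. a quoting function that defers to an object's sqlRepr method when one is present. The fallback quoting is deliberately naive and for illustration only, not SQLObject's actual sqlRepr:

    import datetime

    class MyDate:
        """MS-Access-style date literal, as in the message above."""
        def __init__(self, date):
            self.date = date
        def sqlRepr(self):
            return '#%s#' % self.date.strftime('%Y-%m-%d')

    def sqlrepr(value):
        # Objects that know their own SQL literal form win.
        if hasattr(value, 'sqlRepr'):
            return value.sqlRepr()
        if isinstance(value, str):
            # Naive escaping; real code must handle backslashes and
            # database-specific rules.
            return "'%s'" % value.replace("'", "''")
        return str(value)

    print(sqlrepr(MyDate(datetime.date(1999, 4, 12))))  # -> #1999-04-12#
    print(sqlrepr("O'Hara"))                            # -> 'O''Hara'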
From: Magnus <ma...@th...> - 2003-05-25 09:40:44
I've been thinking of adding support for adodbapi to SQLObject. Has anyone done that? Any comments?

The problem I see is that SQLObject doesn't use the param part of cursor.execute(sql, param).

That will at least cause problems with dates for my current backend, MS Access 97, since the warped SQL dialect there uses dates like #1999-04-12# instead of '1999-04-12'. :(

But it seems I can get basic support by just overriding the insert or update methods. After all, dates are never key values, and I suppose I can live with the limitations in selects for the time being.

But in the long run, I'd like to be able to do Person.select('birthdate < ?', (Date(1999,4,12),)) though...

What's the rationale behind not using params today? Would a change have any negative consequences (beyond some labour)?

-- 
Magnus Lycka (It's really Lyckå), ma...@th...
Thinkware AB, Sweden, www.thinkware.se
I code Python ~ The shortest path from thought to working program
From: Ian B. <ia...@co...> - 2003-05-25 07:04:50
On Thu, 2003-05-22 at 16:10, Bud P.Bruegger wrote:
> I'm wondering what pickle does--it's kind of the same problem. Does
> it call __init__ during unpickling or just __new__??? While I haven't
> looked into the details, I'm quite positive the latter holds..

I don't know what exactly it does, but it doesn't call either __init__ or __new__ -- I believe it calls __setstate__, if you want to change how pickle works on your object.

> For what it's worth, some free flowing thoughts on the issue of
> Person(12). It seems that our discussion so far has furthered each
> other's understanding (open source at work), so here some more
> provocative thoughts:
>
> A python class has only a single "constructor", ie. a combination of
> __new__ and __init__. If you want multiple constructors, you have to
> subclass. I do this regularly to provide a different __init__ with a
> signature specialized for different purposes. (And the rest of the
> subclass is unchanged).
>
> Applying this thought to Person(12), you really need an __init__ with
> a different signature (12, instead of name=..., age=...), so a
> subclass may be one solution. Is there a good naming convention for
> this? Maybe GetPerson(12), where GetPerson is a subclass of Person?
>
> But then, why does the act of retrieving an existing person from some
> storage (or namespace, to think more generally) need to be a class
> constructor? Why not Person.retrieve(12) or SQLObject.person(12) (both
> class methods)? Or actually Persons.get(12) may be a good way of
> putting it. (At least if the natural language of programming is
> English and application programmers don't choose to end their class
> names with an 's'.)
>
> If you went that latter way, you could have Person('Ian', 'Bicking',
> 'author of SQLObject') as a constructor instead of a classmethod
> Person.new('Ian', 'Bicking', '...'). The former way seems to be what
> I expect from plain Python. Is there any specific rationale why you
> reversed this?

Well, I don't have the entire reason formulated, but I'll give it a go. __new__/__init__ is called on every *Python* object creation. It'd be a pain to avoid that, and I don't think it *should* be avoided.

The reality is that there is a disconnect between database objects and Python objects. SQLObject doesn't provide an object store, ala ZODB. It maps, but there's no expectation that it will be perfect -- so creating a Python object from the database row *is* a creation/construction operation, it's not just like we've brought the object into memory from a different storage mechanism (as with pickle or ZODB).

From a purely pragmatic point of view, fetching is much more common than constructing, so it's nice that it's shorter too.

> > Oh, and I'm not sure what we're talking about as far as properties go. When I
> > said 'manually defining properties will be ignored' I meant SQLObject doesn't
> > care about them, and they'll work fine.
>
> Not sure whether I misunderstand, so to find out, here is how I think
> about the issue. Every column corresponds to a property whose setter
> method is overloaded to store the object in the dbms. If application
> programmers have a reason to (independently of SQLObject) make the
> same attribute into a property, they have to possibly call SQLObject's
> setter method from their setter method (in the same way as the
> __init__ of a subclass usually calls the __init__ of the superclass).
> One possible example of such a thing would be to log all state
> changes to a file.
>
> So if this understanding of what is happening is accurate, application
> programmers need to take special steps when defining properties related
> to columns. Or to avoid that, use the non-standard-python shortcut
> method to define properties. So in any case, the way to define
> properties (related to columns) is different from that of plain python.
> In contrast, if column values are simply represented in the __dict__
> of the instance, everything is as in plain python.

Personally I find properties a little funny to work with anyway. Hmmm... I guess using super() you can access properties in a superclass, but the old techniques don't apply. But anyway, if you want to add something to a property's getter or setter, you'd normally do it by subclassing, etc, but there isn't really a superclass available with the way SQLObject works. If you *do* subclass a SQLObject instance, you probably could do what you want. Anyway, the current situation just means you have to understand the way some methods work, like _SO_get_columnName, or _SO_getValue/_SO_setValue.

> > Agree with your summary that some form of transaction handling is required,
> > and hence delayed updates, if you want to sanely handle attribute
> > constraints/collisions.
>
> Yes, you're right about the constraints--I hadn't thought of these
> before. Moving from one consistent state of an object to another will
> often require the change of more than one attribute. Same idea as
> transactions...
>
> The other reason why I feel uncomfortable about updating on every
> single attribute change is performance. Single attribute getter and
> setter methods are an antipattern of distributed computing...
>
> While SQLObject gives the possibility to change several attributes at
> once (with the set method), in my view this is quite awkward: it kind
> of requires you to shadow the object with an object or an equivalent
> unstructured collection of local variables to prepare the new state.
> Usually, you would want to call several methods on the object in order
> to prepare a new consistent state. If you allow attribute assignments
> without committing, you can do that. The set method condemns you to
> do the same without using any class methods and thus gets in the way..

Using *Python* convention, if you want to ensure consistency you do so *as each attribute is set*. People don't write Python objects that have some uncommitted/committed state, where verification is delayed until the commit.

If you need to check consistency among several attributes, you set them all at once. I don't see a problem with that -- it's what I do for non-database objects already.

Ian
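A plain-Python sketch of the property situation under discussion: a base class that turns a column into a property with a storing setter, and a user subclass that wraps that setter to log changes. All names are illustrative, not SQLObject API:

    class Base(object):
        def _get_name(self):
            return getattr(self, '_name', None)

        def _set_name(self, value):
            # Stand-in for "write this column to the database".
            self._name = value

        name = property(_get_name, _set_name)

    class Logged(Base):
        def _set_name(self, value):
            print("name -> %r" % value)   # the user's added behavior
            Base._set_name(self, value)   # must call through to the storing setter

        # The property itself must be redefined, or assignment still
        # routes to Base's setter:
        name = property(Base._get_name, _set_name)

    p = Logged()
    p.name = "Ian"    # prints: name -> 'Ian'
    print(p.name)     # Ian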
From: Ian B. <ia...@co...> - 2003-05-25 02:41:43
Can anyone confirm that SQLite works okay when looping on .select(), then doing database operations inside? I never felt comfortable about the nature of the problem, or the solution.

Ian
From: Ian B. <ia...@co...> - 2003-05-24 22:23:11
On Tue, 2003-05-20 at 06:55, Bud P.Bruegger wrote:
> I've been thinking about the API for dealing with relations.
>
> person.roles is just perfect.
>
> In addition, what about person.rolesWhere(<some condition>)

Well, you can create a select query to do this, but you have to specify the join explicitly. Maybe with the interface below, it could look like person.roles.select(condition).

> And instead of person.addRole(r), what about person.roles.append(r)
> and similarly, r.removePerson(p) as r.persons.remove(p)

I started on this path, where joins returned a result of some sort, subclassing lists. But I realized I was getting too distracted at what was an early point of development, so I didn't do it. I might make that change yet. However, the result would look more like SelectResults, and not a subclass of list.

> To be pythonic, everything should behave as much as a list as
> possible. So I started to play around with inheriting from list and
> looking at the hooks (see attached file). I was expecting that there
> was a single interface to setting and getting list items (namely,
> __getitem__ and __setitem__), but it seems that other ways of
> modifying the list such as l.extend() or l.append() don't internally
> fall back on __getitem__ and __setitem__.

I think it's easier to make it an iterable object, with list-like methods, like SelectResults. Then if you want a list that doesn't change underneath you, you just call list(person.roles). Joins are also more like sets than lists, so that would be a better interface model: http://www.python.org/dev/doc/devel/lib/set-objects.html

> Also, someone could "recalculate" relationships and assign them:
> p.roles = throttleBack(p.roles)

I'm not sure I understand what you're thinking here.

Ian
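A sketch of the iterable, set-like join result gestured at above, with an in-memory set standing in for the join table; everything here is invented for illustration, not SQLObject's actual join classes:

    class JoinResults(object):
        """Iterable, set-like view over a relation. A real version would
        query and update the join table instead of a local set."""
        def __init__(self, items=()):
            self._items = set(items)

        def __iter__(self):
            # Iterate over a snapshot, so callers can mutate while looping.
            return iter(list(self._items))

        def __contains__(self, item):
            return item in self._items

        def append(self, item):        # list-flavored spelling of add()
            self._items.add(item)

        def remove(self, item):
            self._items.discard(item)

    roles = JoinResults(["admin"])
    roles.append("editor")
    print(sorted(roles))               # ['admin', 'editor']
    print(list(roles))                 # list(person.roles)-style snapshot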
From: Ian B. <ia...@co...> - 2003-05-24 21:34:49
On Thu, 2003-05-22 at 10:18, Luke Opperman wrote:
> My thoughts for post-0.4:
>
> * MemoryTransaction or an equivalent.

Sure.

> * Some decision on concurrency control/resolution (should SQLObject offer any,
> and if so then let's do it.)

If we can come to decisions ;) And implementations, of course.

> * Actual enforcement of a constraints/validators system, at least providing
> toSQL/toPython functionality. (Ian, I know you've been busy on FormEncode, and
> that will be a help here :)

Yes, that's distracted me from SQLObject stuff recently.

I'd also like to do some cleanup of names:

* sqlRepr -> __sqlrepr__
* *Connection -> *Store
* _columns -> __columns__ ?

Maybe others. I don't like the _setting style, but maybe lots of __setting__ names just get excessive, and they shouldn't be public (i.e., no underscores). The old names would still work for a version or two, with warnings being printed maybe in 0.6. sqlRepr might be more quickly deprecated, but none of the others.

The constraints/conversion stuff is getting closer. I want to do one more renaming on Validator, and change the class hierarchy slightly (so there's an option for more trimmed-down validators), but that's it. So it should be able to go in fine.

Joins will be redone, definitely. They'll be put in their own module, like Col, and a Join/SOJoin distinction will be made. That should make the code significantly cleaner -- not so much shorter, but it will be clear what code gets invoked at what point of the process. I'll probably add a one-to-one join, and maybe a more generic join for annotated relations (like the Worker-Customer + rate example). Maybe implement orderBy in the database too. Open to consideration is join results becoming iterators, ala select results...

I think that's all that's been on my mind, at least short-term. FormEncode will start to become more involved at some point, but the coupling will be loose -- I'm going to try to create an interface for publishing objects (and classes) via FormEncode, and SQLObject should be able to implement that fairly trivially (once I figure out what "that" is).

Ian
From: Ian B. <ia...@co...> - 2003-05-24 21:20:48
On Thu, 2003-05-22 at 10:07, Luke Opperman wrote:
> Ian et al -
>
> Following on Mohan's problems with 0.3, and restarting a thread from a few
> weeks ago, what do you have in mind before releasing 0.4? Haven't seen any
> major CVS updates since I submitted the transaction implementation, and we're
> using CVS+MemoryTransaction+MultiRegistry+odds&ends in a couple apps about to
> go live here. So at least for my use cases, CVS has not revealed any bugs in
> the last week or two. How about an 0.4b after the holiday weekend?

The other big thing that's been holding me up is that I'd like to rework the documentation. Actually, I have to, since there are a number of stylistic things that have also changed. Documentation... sigh...

> Here's my list of maintained changes against CVS, just for the record (and if
> anyone thinks they should go in for a new release...):
>
> * Multiple classRegistry's, via optional class-level "_registry" string.

Okay, this has been committed.

> * Converting _defaultOrder to dbcolumn via column lookup if it's in
> python-style. (Select orderBy expects db-style, but it makes more sense to me
> to specify in python-style.)

Okay. I implemented it differently than you, you might want to check it in CVS. (It's implemented in SelectResults.)

> * Use of _defaultOrder in Joins performJoin(), for now a naive python sort.

Yes, but you should be able to turn it off as well. I added an orderBy argument to joins, where None means no ordering, and unspecified means pick it up from the appropriate class.

> * Minor changes in new (setting inst._connection if passed in), and joins
> (using inst's _connection) that make logical sense, and allow
> MemoryTransaction to work.

I think this is already correct. The connection gets passed from new to __new__, then from __new__ to _init, and in _init it gets set.

> * SQLObject.copyToConnection()

I think your implementation only does something for the particular case of MemoryTransaction -- it doesn't do a general copy between connections. Maybe it could go in, but undocumented and with a warning comment.

> * aliasing of destroy to destroySelf for backwards compatibility until destroy
> is meaningful..

I don't think destroy will ever be meaningful. I'm still thinking of getting rid of destroySelf in lieu of the delete class method.

> * A few miscellaneous columns. Most important, BooleanCol. (However, BooleanCol
> doesn't actually work properly with Postgres, as Postgres refuses to accept
> integers for BOOL fields; it's a long story. Need to implement toSQL/toPython
> Column functionality to fix this. But that's probably for 0.5 :)

Yeah, I know, it sucks. Right now I'll just document using the sqlType argument to Col, and how to create the right _set_ method for a boolean column. BooleanCol can go in properly when it works without hacks.

> * MemoryTransaction, although I'd understand if this were left out of this
> release. :) Related note: I need to write some docs on MemoryTransaction
> still.

Yes, docs are important. Given those it can go in, marked experimental for now.

Ian
From: Bud P. B. <bu...@si...> - 2003-05-23 08:31:56
Actually, I've thought more about the issue below. The problem isn't properties; the problem is when you define them: either at class creation time (via __metaclass__ and inheritance from SQLObject) or at runtime after the class is created. In the former case, you define property setters and getters before the user can, and the user has to take your setters and getters into consideration--in the latter case, you have to take possible setters and getters defined by the user into consideration. This should be possible through introspection...???

--b

On Thu, 22 May 2003 23:10:33 +0200 "Bud P. Bruegger" <bu...@si...> wrote:
> > Oh, and I'm not sure what we're talking about as far as properties go. When
> > I said 'manually defining properties will be ignored' I meant SQLObject
> > doesn't care about them, and they'll work fine.
>
> Not sure whether I misunderstand, so to find out, here is how I think
> about the issue. Every column corresponds to a property whose setter
> method is overloaded to store the object in the dbms. If application
> programmers have a reason to (independently of SQLObject) make the
> same attribute into a property, they have to possibly call SQLObject's
> setter method from their setter method (in the same way as the
> __init__ of a subclass usually calls the __init__ of the superclass).
> One possible example of such a thing would be to log all state
> changes to a file.
>
> So if this understanding of what is happening is accurate, application
> programmers need to take special steps when defining properties related
> to columns. Or to avoid that, use the non-standard-python shortcut
> method to define properties. So in any case, the way to define
> properties (related to columns) is different from that of plain python.
> In contrast, if column values are simply represented in the __dict__
> of the instance, everything is as in plain python.

/-----------------------------------------------------------------
| Bud P. Bruegger, Ph.D.
| Sistema (www.sistema.it)
| Via U. Bassi, 54
| 58100 Grosseto, Italy
| +39-0564-411682 (voice and fax)
\-----------------------------------------------------------------
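A sketch of the introspection suggested above: a metaclass that installs a column property only when the user has not already defined one. Entirely illustrative (and in modern metaclass syntax), not SQLObject's actual metaclass:

    class ColumnMeta(type):
        def __new__(mcs, name, bases, ns):
            for col in ns.get('_columns', ()):
                # Respect a property the user defined first.
                if isinstance(ns.get(col), property):
                    continue
                ns[col] = mcs._make_property(col)
            return type.__new__(mcs, name, bases, ns)

        @staticmethod
        def _make_property(col):
            key = '_' + col
            def getter(self):
                return getattr(self, key, None)
            def setter(self, value):
                setattr(self, key, value)   # stand-in for a database write
            return property(getter, setter)

    class Person(metaclass=ColumnMeta):
        _columns = ('name', 'age')

        # User-defined property: the metaclass leaves it alone.
        @property
        def age(self):
            return 42

    p = Person()
    p.name = 'Bud'
    print(p.name, p.age)   # Bud 42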
From: Brad B. <br...@bb...> - 2003-05-23 01:58:49
On 05/22/03 16:49, Luke Opperman wrote:
> > 1. Why isn't connection = conn making this Do The Right Thing? I
> > thought the semantics I've shown above would be creating a new DB
> > connection every time, and thus hopefully have the child avoid
> > interfering with the parent.
>
> I'm going to make a wild guess, and noting that CVS is still missing certain
> fixes that were pointed out when I last posted MemoryTransaction...:
>
> around line 683, in new(), add the following line:
>
>       if kw.has_key('connection'):
>           inst = cls(None, connection=kw['connection'])
>     +     inst._connection = kw['connection']
>           del kw['connection']
>
> There are related bugs in RelatedJoin.add and RelatedJoin.remove, where they
> should use the connection from inst and not self.callingClass.
>
> > 2. Is there any way to use SQLObject safely in a threaded environment
> > like the os.fork'ing example I have above?
>
> So long as you don't have any of the potential concurrency problems of trying
> to update the same object from separate threads etc, I think the connection
> problem above will solve it. And even in those cases, SQLObject shouldn't blow
> up, you just don't know what data you'll end up with. :)

Well, I tried your solution but it didn't fix my problem. This isn't because there was anything wrong with your solution, but more because I found other problems in my own code (e.g. I hadn't quite covered passing connections to *all* of my SQLObject-based classes being hit in the child process :/). For simplicity's sake, I'm doing this now:

    pid = os.fork()
    if not pid:
        # child process -- process the batch file whilst control
        # has been returned to the browser
        try:
            conn = PostgresConnection(
                host='localhost',
                db='foo',
                user='bar',
                passwd='baz'
            )
            MerchantAccount._connection = conn
            MerchantProxyAccount._connection = conn
            MerchantUser._connection = conn
            PurchaseTransaction._connection = conn
            try:
                batch_file.process()
            except TooManyErrors, err:
                pass
        finally:
            os._exit(0)

This also fixed another bug where the child would never return in my Webware servlet (three cheers for os._exit()! :). Thanks for helping me see what I was doing wrong. :)

-- 
Brad Bollenbach
BBnet.ca
From: Luke O. <lu...@me...> - 2003-05-22 22:03:52
> 1. Why isn't connection = conn making this Do The Right Thing? I
> thought the semantics I've shown above would be creating a new DB
> connection every time, and thus hopefully have the child avoid
> interfering with the parent.

I'm going to make a wild guess, and noting that CVS is still missing certain fixes that were pointed out when I last posted MemoryTransaction...:

around line 683, in new(), add the following line:

      if kw.has_key('connection'):
          inst = cls(None, connection=kw['connection'])
    +     inst._connection = kw['connection']
          del kw['connection']

There are related bugs in RelatedJoin.add and RelatedJoin.remove, where they should use the connection from inst and not self.callingClass.

> 2. Is there any way to use SQLObject safely in a threaded environment
> like the os.fork'ing example I have above?

So long as you don't have any of the potential concurrency problems of trying to update the same object from separate threads etc, I think the connection problem above will solve it. And even in those cases, SQLObject shouldn't blow up, you just don't know what data you'll end up with. :)

- Luke
From: Brad B. <br...@bb...> - 2003-05-22 21:45:01
I have a Webware servlet that has code like this:

    pid = os.fork()
    if not pid:
        try:
            batch_file.process()
        except TooManyErrors, err:
            pass

At some point in the code that batch_file.process() is calling, I instantiate one of my SQLObject-based classes which is mapped to a database table that acts as a logger for purchase transactions:

    conn = PostgresConnection(
        host='localhost',
        db='foo',
        user='bar',
        passwd='somepass'
    )
    self.purchase_transaction = PurchaseTransaction.new(
        merchantProxyID = self.merchant_proxy_id,
        transactionID = self.txID,
        transactionDate = now(),
        amount = self.amount,
        productDesc = self.description,
        cardholderName = self.cardholder_name,
        firstTwoCCDigits = self.card_number[0:2],
        lastFourCCDigits = self.card_number[-4:],
        expireMonth = self.card_expire_month,
        expireYear = self.card_expire_year,
        emailAddress = self.email,
        connection = conn
    )

I'm doing this because I want to return control to the web browser whilst I go off and process lots of credit card transactions from the batch_file in the background.

The problem is that if this batch_file.process() stuff is executing in the background and I continue to do database retrievals and such through the web, it's quite easy for me to create various kinds of exceptions: Error: no results to fetch, OperationalError: not connected to server, etc. The error messages aren't consistent, but what *is* consistent is that it's always dying on something to do with getting data from the database whilst this child process is executing in the background. It seems like the child and parent processes are trying to read from the same socket, with obvious trouble happening from there.

Two questions then:

1. Why isn't connection = conn making this Do The Right Thing? I thought the semantics I've shown above would be creating a new DB connection every time, and thus hopefully have the child avoid interfering with the parent.

2. Is there any way to use SQLObject safely in a threaded environment like the os.fork'ing example I have above?

-- 
Brad Bollenbach
BBnet.ca
From: Bud P. B. <bu...@si...> - 2003-05-22 21:11:45
|
On Thu, 22 May 2003 13:50:22 -0500 Luke Opperman <lu...@me...> wrote:

> Indeed, my information was incomplete. Here's the real story:
>
> class instantiation:
> __init__ will be called if __new__ returns an instance of the same class.

This is when you call the class as a constructor. Just calling __new__ gets all the lower-level allocation in order, and __init__ usually assigns values to __dict__. So since these values were already assigned and we made them persistent in some way, they don't have to be reassigned... Also, the consistency of the values was checked during the initial assignment, so we can assume that the values we have are consistent...

I'm wondering what pickle does -- it's kind of the same problem. Does it call __init__ during unpickling, or just __new__??? While I haven't looked into the details, I'm quite positive the latter holds..

> 2.2.1 has a bug that disobeys this, and will call __init__ regardless.
>
> SQLObject always does full instantiation, not calling just __new__. I can't
> find any direct documentation against just calling __new__, so I suppose
> that's one way to avoid __init__. But it means you can't have something like
> Person(12).

For what it's worth, some free-flowing thoughts on the issue of Person(12). It seems that our discussion so far has furthered each other's understanding (open source at work), so here are some more provocative thoughts:

A python class has only a single "constructor", i.e. a combination of __new__ and __init__. If you want multiple constructors, you have to subclass. I do this regularly to provide a different __init__ with a signature specialized for different purposes. (And the rest of the subclass is unchanged.)

Applying this thought to Person(12), you really need an __init__ with a different signature (12, instead of name=..., age=...), so a subclass may be one solution. Is there a good naming convention for this? Maybe GetPerson(12), where GetPerson is a subclass of Person?

But then, why does the act of retrieving an existing person from some storage (or namespace, to think more generally) need to be a class constructor at all? Why not Person.retrieve(12) or SQLObject.person(12) (both class methods)? Or actually, Persons.get(12) may be a good way of putting it. (At least if the natural language of programming is English and application programmers don't choose to end their class names with an 's'.)

If you went that latter way, you could have Person('Ian', 'Bicking', 'author of SQLObject') as a constructor instead of the classmethod Person.new('Ian', 'Bicking', '...'). The former seems to be what I would expect from plain Python. Is there any specific rationale for why you reversed this?

> Oh, and I'm not sure what we're talking about as far as properties go. When I
> said 'manually defined properties will be ignored' I meant SQLObject doesn't
> care about them, and they'll work fine.

Not sure whether I misunderstand, so to find out, here is how I think about the issue. Every column corresponds to a property whose setter method is overloaded to store the object in the dbms. If application programmers have a reason (independently of SQLObject) to make the same attribute into a property, they possibly have to call SQLObject's setter method from their own setter method (in the same way as the __init__ of a subclass usually calls the __init__ of the superclass). One possible example of such a thing would be to log all state changes to a file.

So if this understanding of what is happening is accurate, application programmers need to take special steps when defining properties related to columns. Or, to avoid that, they can use the non-standard-python shortcut method to define properties. So in any case, the way to define properties (related to columns) is different from that of plain python. In contrast, if column values are simply represented in the __dict__ of the instance, everything is as in plain python.

> Agree with your summary that some form of transaction handling is required,
> and hence delayed updates, if you want to sanely handle attribute
> constraints/collisions.

Yes, you're right about the constraints -- I hadn't thought of those before. Moving from one consistent state of an object to another will often require changing more than one attribute. Same idea as transactions...

The other reason I feel uncomfortable about updating on every single attribute change is performance. Single-attribute getter and setter methods are an antipattern of distributed computing...

While SQLObject gives you the possibility to change several attributes at once (with the set method), in my view this is quite awkward: it more or less requires you to shadow the object with another object, or an equivalent unstructured collection of local variables, to prepare the new state. Usually, you would want to call several methods on the object in order to prepare a new consistent state. If you allow attribute assignments without committing, you can do that. The set method condemns you to do the same without using any class methods and thus gets in the way..

Enough for today... cheers --bud
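To make the classmethod idea above concrete, here is a minimal plain-Python sketch -- no SQLObject involved, and fetch_row plus its _ROWS table are made-up stand-ins for a real database query -- of an ordinary constructor living alongside a retrieve() alternate constructor that bypasses __init__:

    # Stub standing in for a real database lookup (illustration only).
    _ROWS = {12: {'name': 'Ian', 'age': 30}}
    def fetch_row(table, id):
        return _ROWS[id]

    class Person(object):
        def __init__(self, name, age):
            # ordinary constructor: build a brand-new person
            self.name = name
            self.age = age

        def retrieve(cls, id):
            # alternate "constructor": rebuild from storage without
            # ever running __init__
            row = fetch_row('person', id)
            obj = cls.__new__(cls)
            obj.name = row['name']
            obj.age = row['age']
            return obj
        retrieve = classmethod(retrieve)   # pre-2.4 (no-decorator) spelling

With this shape, Person('Ian', 30) behaves like plain Python, while Person.retrieve(12) plays the role that Person(12) plays in SQLObject today.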
From: Ian B. <ia...@co...> - 2003-05-22 19:07:42
Huh... that is weird. Maybe I misinterpreted the __new__/__init__ documentation, and I guess I never did really test it out. So I'm guessing my use of _init in __new__ can be replaced by __init__, and all will be well... but now I want to know what the real story is too, so I'll have to look at the Python docs again.

On Thu, 2003-05-22 at 11:40, Bud P. Bruegger wrote:
> On Thu, 22 May 2003 11:02:36 -0500
> Luke Opperman <lu...@me...> wrote:
>
> > See above, it's impossible in python to call __new__ and avoid __init__
> > to my knowledge.
>
> While I'm surely not a Python wizard, this kind of surprises me. So I
> ran a little test:
>
> #-------------------------------------------------
> #!/usr/local/bin/python
>
> class C(object):
>     def __init__(self):
>         self.bud = "was here"
>
> c1 = C()
> print c1.bud
>
> c2 = C.__new__(C)
> try:
>     print c2.bud
> except AttributeError:
>     print "c2 has no attribute 'bud'"
> #-------------------------------------------------
>
> That produces the following output:
>
> #-------------------------------------------------
> was here
> c2 has no attribute 'bud'
> #-------------------------------------------------
>
> So I don't think __init__ ran in any way!
>
> Also, if __init__ were to run implicitly, where would it get the
> parameters from? I avoided parameters other than self above, but this
> seems weird to me...
>
> --b
From: Luke O. <lu...@me...> - 2003-05-22 19:05:12
> While I'm surely not a Python wizard, this kind of surprises me. So I
> ran a little test:

Indeed, my information was incomplete. Here's the real story:

class instantiation: __init__ will be called if __new__ returns an instance of the same class. (Python 2.2.1 has a bug that disobeys this, and will call __init__ regardless.)

SQLObject always does full instantiation, not calling just __new__. I can't find any direct documentation against just calling __new__, so I suppose that's one way to avoid __init__. But it means you can't have something like Person(12). Other options are to return a different class (an SQLObjectCached?) in __new__, as mentioned above, or to redefine __init__ to check for a cached version. Thanks to exarkun on #python for the initial idea (some maybe-code here, regarding cache syntax):

    class Cacheable(type):
        cached = Cache()

        def __new__(klass, name, bases, attrs):
            init = attrs.get('__init__', None)
            def __init__(self, *args, **kw):
                if init and not self.__class__.cached.get(id):
                    init(self, *args, **kw)
            attrs['__init__'] = __init__
            return type.__new__(klass, name, bases, attrs)

    class C(object):
        __metaclass__ = Cacheable

        def __new__(cls, id=None):
            inst = cls.cached.get(id, None)
            if not inst:
                inst = object.__new__(cls)
                cls.cached.put(inst)
            return inst

This is almost exactly what SQLObject does, except it doesn't override __init__ today. So it looks like it would be easy enough to make SQLObject behave 'more' appropriately...

Oh, and I'm not sure what we're talking about as far as properties go. When I said 'manually defined properties will be ignored' I meant SQLObject doesn't care about them, and they'll work fine.

Agree with your summary that some form of transaction handling is required, and hence delayed updates, if you want to sanely handle attribute constraints/collisions.

Enjoy, - Luke
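The maybe-code above needs a Cache class to run; here is a self-contained variant of the same idea -- a plain dict in place of the hypothetical Cache, and a hand-written guard in __init__ in place of the metaclass -- that can be run as-is to watch a cache hit skip initialization:

    # Self-contained sketch of cache-aware instantiation: the second
    # CachedById(12) returns the same instance and skips its setup.
    class CachedById(object):
        _cache = {}

        def __new__(cls, id=None):
            inst = cls._cache.get(id)
            if inst is None:
                inst = object.__new__(cls)
                inst._from_cache = False
                cls._cache[id] = inst
            else:
                inst._from_cache = True
            return inst

        def __init__(self, id=None):
            # __init__ runs on *every* CachedById(...) call, cache hit
            # or not -- exactly the behavior discussed in this thread --
            # so the one-time setup is guarded by hand.
            if self._from_cache:
                return
            self.id = id
            self.loaded = True   # stands in for fetching column values

CachedById(12) is CachedById(12) evaluates to True, and only the first call executes the setup below the guard -- the manual check plays the role that Luke's metaclass automates.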
From: Ian B. <ia...@co...> - 2003-05-22 19:03:23
On Thu, 2003-05-22 at 11:02, Luke Opperman wrote:
> > What in plain python classes normally goes in __init__ should go in
> > the _init method for SQLObject subclasses. _init seems to be called
> > by SQLObject's new method.
>
> A little misunderstanding here. Off the top of my head, this problem will
> occur for any system that may transparently return an internally cached
> object on 'creation'. You're free to do whatever you please in an SQLObject
> __init__ method, so long as you realize it will be called whether this is a
> new object (Person.new()), an existing row (Person(12)), or an existing
> object (Person(12) that returns a cached value). It seems to me that
> 'typical' python __init__ stuff is only relevant to the first case, and that
> any ORM system should provide ways to override the first two separately (for
> SO, .new and ._init), and probably hide the third case from users. But we
> can probably agree that this behavior is not 'normal' python behavior, and
> that it would be nice if __init__ could be used for the first case... but
> what do you do for the second case?

Yes, _init is just a workaround for __init__ being called on a cache hit. I don't know why Python is designed like this -- it seems totally dumb to me. object.__new__ should call __init__, though maybe for some reason people wanted to get at the object before __init__ but after instantiation (which is what you currently can do). It's just dumb and annoying, and this is the workaround. The only alternative is to rewrite __init__ to _init in the metaclass, but then (for instance) calling the superclass's __init__ will fail.

> Note that the call cycle in python (SQLObject) is:
>
> for Person.new():
>     new
>     __new__   (explicit in new)
>     __init__  (implicit after __new__)
>     set       (explicit in new)
>     _init     (explicit in new/finishCreate)

Hmmm... maybe that's a little funny.

> > 2. Possible Alternative Approach
> > --------------------------------
> >
> > A possible alternative approach makes a major change in the look and
> > feel inasmuch as persistence methods (insert, update, ..) need to be
> > called explicitly.
> >
> > * same goes for resolution of collisions in concurrency control. I
> > don't see a way of hiding these issues from the application
> > programmer
>
> Agreed. I suppose this directly relates to my problem with any foreignKey
> access possibly causing an SQLObjectNotFound error; with constraints and
> collisions, any attribute access could conceivably need to be safe-guarded.

If you turn off column caching, yes. With caching (_cacheValues), then potentially all input could be validated/converted when it was fetched. But I would expect the checks run on input from the database to be pretty light -- it's setting attributes that's more likely to cause an exception.

It would be possible to batch setting and getting. Setting, for instance, is already batched during new() creation (based on _SO_creating). That could be generalized slightly, so that there was a sync() method (or update() or insert() or something). That wouldn't be a big change. It might look like:

    obj = SomeObject(12)
    obj.batch()
    obj.a = 10
    obj.c = 20
    obj.update()

Maybe batch (or probably another name) could take arguments that would also get it to handle locking. For getting, the equivalent to obj.update() might be obj.sync(), which would refetch the object from the database. This all makes sense mostly if you are using _cacheValues, but not if you are caching instances.

Ian
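As a thought experiment only -- batch() and update() are names Ian is proposing, not existing SQLObject API -- the buffering behavior he sketches could look something like this:

    # Hypothetical sketch of the proposed batch()/update() protocol.
    class BatchingMixin(object):
        def batch(self):
            # start buffering writes instead of issuing one UPDATE
            # per attribute assignment
            self.__dict__['_pending'] = {}

        def __setattr__(self, name, value):
            pending = self.__dict__.get('_pending')
            if pending is not None:
                pending[name] = value                   # buffer it
            else:
                object.__setattr__(self, name, value)   # write-through

        def update(self):
            # flush the buffered writes as one logical operation
            pending = self.__dict__.pop('_pending', {})
            for name, value in pending.items():
                object.__setattr__(self, name, value)
            # a real implementation would emit a single UPDATE here

One consequence worth noting: between batch() and update(), reads still see the old values, which is something a real design would have to take a position on.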
From: Brad B. <br...@bb...> - 2003-05-22 18:52:31
On 05/22/03 11:04, Brad Bollenbach wrote:
> On 05/22/03 10:18, Luke Opperman wrote:
> > To start a separate thread, I'd like to open up discussion of what new
> > things people are working on or considering for 0.5 (since 0.4 will be
> > mostly a maintenance release; as I count it, the new features are just
> > improved createSQL stuff and changes to Col to allow SQLObject
> > inheritance to work properly.)
> >
> > My thoughts for post-0.4:
> >
> > * MemoryTransaction or an equivalent.
> >
> > * Some decision on concurrency control/resolution (should SQLObject
> > offer any, and if so then let's do it.)
> >
> > * Actual enforcement of a constraints/validators system, at least
> > providing toSQL/toPython functionality. (Ian, I know you've been busy
> > on FormEncode, and that will be a help here :)
> >
> > That's all I've got for now.
>
> The framework needs a solid test suite. I can see there are some bugs
> that have crept into SQLObject that would probably never have been
> there in the first place with test-driven development (e.g. dropTable
> not dropping sequences for PostgreSQL backends).
>
> I've been insanely busy with work and talks in the past few weeks
> (spent six hours on the road last night going to and returning from
> giving a talk to the Brandon Unix User Group!), am moving to Quebec
> City on May 30, and have to prepare for the EuroPython talk late next
> month about SQLObject.
>
> But I'm definitely interested in retrofitting a test suite onto the
> framework (basically by writing unit tests against the documented API)
> and hope that future SQLObject contributors will submit unit tests
> with their code.

Just updated from CVS, and I can see that tests/test.py is growing. Cool. :)

-- Brad Bollenbach BBnet.ca
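The kind of API-level test Brad describes can start very small. A sketch, with an in-memory StubPerson standing in for a real SQLObject class so the snippet runs anywhere (a real test would import the database-backed class instead):

    import unittest

    class StubPerson(object):
        # In-memory stand-in mimicking the .new()/fetch-by-id usage
        # shown elsewhere in this thread.
        _by_id = {}
        _next_id = [1]

        def new(cls, name):
            p = cls.__new__(cls)
            p.id, p.name = cls._next_id[0], name
            cls._next_id[0] += 1
            cls._by_id[p.id] = p
            return p
        new = classmethod(new)

        def get(cls, id):
            return cls._by_id[id]
        get = classmethod(get)

    class PersonRoundTrip(unittest.TestCase):
        def test_new_then_fetch(self):
            p = StubPerson.new('Ian')
            self.assertEqual(StubPerson.get(p.id).name, 'Ian')

        def test_cached_identity(self):
            p = StubPerson.new('Bud')
            self.assert_(StubPerson.get(p.id) is p)

    if __name__ == '__main__':
        unittest.main()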
From: Bud P. B. <bu...@si...> - 2003-05-22 17:26:11
Hi Luke, thanks for the explanations. Some responses below..

On Thu, 22 May 2003 11:02:36 -0500 Luke Opperman <lu...@me...> wrote:

> A little misunderstanding here. Off the top of my head this problem will
> occur for any system that may transparently return an internally cached
> object on 'creation'.

I see cases where an _init hook is necessary. Say you store a derived attribute (that is redundant) but you don't want to put it in the db -- then, every time you get the object from the db, you have to recompute that derived one. But apart from these special cases, I don't necessarily believe that you really need it... A normal __init__ should do.

> You're free to do whatever you please in an SQLObject __init__
> method, so long as you realize it will be called whether this is a new
> object (Person.new()), an existing row (Person(12)), or an existing object
> (Person(12) that returns a cached value).

But what happens if I give it arguments other than self? The only example that comes to mind quickly is:

    class Balance(object):
        def __init__(self, amount, currency):
            amountInLocalCurrency = amount * factor(currency)
            self.amount = amountInLocalCurrency

While amount would typically be stored in the db, currency is just for one-time use and thrown out..

> It seems to me that 'typical' python __init__ stuff is only relevant to the
> first case, and that any ORM system should provide ways to override the
> first two separately (for SO, .new and ._init), and probably hide the third
> case from users. But we can probably agree that this behavior is not
> 'normal' python behavior, and that it would be nice if __init__ could be
> used for the first case... but what do you do for the second case?

In the second case I do something along the following lines:

    curs.execute('SELECT id, name, age, ... FROM person_table '
                 'WHERE id = %s', (requestedId,))
    columnVals = curs.fetchone()
    p = desiredClass.__new__(desiredClass)
    p.name = columnVals[1]
    p.age = columnVals[2]
    # etc.

As I wrote in the separate message, I don't think this calls __init__.

> Note that the call cycle in python (SQLObject) is:
>
> for Person.new():
>     new
>     __new__   (explicit in new)
>     __init__  (implicit after __new__)
>     set       (explicit in new)
>     _init     (explicit in new/finishCreate)
>
> for Person(12) (not cached):
>     __new__
>     _init     (explicit in __new__)
>     __init__  (implicit after __new__)
>
> for Person(12) (cached):
>     __new__
>     __init__  (implicit after __new__)

I must say I'm puzzled about this implicit __init__. Can you enlighten me on that?

> > 2. Possible Alternative Approach
> > --------------------------------
> >
> > A possible alternative approach makes a major change in the look and
> > feel inasmuch as persistence methods (insert, update, ..) need to be
> > called explicitly.
> >
> > * same goes for resolution of collisions in concurrency control. I
> > don't see a way of hiding these issues from the application
> > programmer
>
> Agreed. I suppose this directly relates to my problem with any foreignKey
> access possibly causing an SQLObjectNotFound error; with constraints and
> collisions, any attribute access could conceivably need to be safe-guarded.
> Not sure how I feel about that. While I initially don't like explicit
> storage commands, I'm also not excited about how to deal with the above.

I imagine that a typical use case is that some piece of code assigns values attribute by attribute and, once done, commits the changes. Since there is no way of knowing automatically when an application programmer wants to commit, there needs to be a persistence-related command anyhow. Call it storeNcommit or whatever -- but I don't see a way of doing this all implicitly. This is to say that if you want to do a dbms app without transactions, it would work. But that kind of defeats the purpose of using a dbms in the first place (at least for me).

> > * I believe (haven't verified) that it should be possible for users
> > to define properties in lieu of just normal __dict__ items without
> > any adverse effects. (I reason that the use of getattr and
> > setattr by the middleware works equally well)
>
> Just to be clear, there's no danger in setting your own properties in
> SQLObject. SQLObject tries to help by letting you just define _get/_set and
> get a property generated, but manually created properties will just be
> ignored.

Agreed, you can do that. But the way you do it is not the normal python way -- and this is exactly my point. If we manage to get SQLObject to work as much as possible in the normal python way, it is easy to learn and use (some people would call that "no magic"), and it is easy to use SQLObject with all kinds of pre-existing code that now needs persistence. I personally believe that business logic should be written without a (at least specific) persistence method in mind and then made persistent in a second step. The more "magic", the more rewriting and surprises..

> I think the key thing to think about here is how collisions or constraints
> affect using SQLObject at the attribute level.

And probably transactions (see my point above that I don't see how you get away without explicit persistence commands when using them).

cheers --b
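Bud's "log all state changes" scenario, in plain Python, might look like the sketch below -- an ordinary property whose setter logs before assigning. Under SQLObject the assignment line would presumably delegate to the generated column setter instead; that hook is not shown here:

    # Plain-Python sketch: a property that logs every state change.
    class Person(object):
        def __init__(self, name):
            self._name = None
            self.name = name          # goes through the property setter

        def _get_name(self):
            return self._name

        def _set_name(self, value):
            print 'name changed: %r -> %r' % (self._name, value)
            self._name = value        # an ORM setter would also hit the db

        name = property(_get_name, _set_name)

Creating Person('Ian') and then assigning p.name = 'Bud' logs both transitions; the point is only that the user-written setter has to cooperate with whatever setter the middleware installs.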
From: Bud P. B. <bu...@si...> - 2003-05-22 16:41:32
On Thu, 22 May 2003 11:02:36 -0500 Luke Opperman <lu...@me...> wrote:

> See above, it's impossible in python to call __new__ and avoid __init__
> to my knowledge.

While I'm surely not a Python wizard, this kind of surprises me. So I ran a little test:

    #-------------------------------------------------
    #!/usr/local/bin/python

    class C(object):
        def __init__(self):
            self.bud = "was here"

    c1 = C()
    print c1.bud

    c2 = C.__new__(C)
    try:
        print c2.bud
    except AttributeError:
        print "c2 has no attribute 'bud'"
    #-------------------------------------------------

That produces the following output:

    #-------------------------------------------------
    was here
    c2 has no attribute 'bud'
    #-------------------------------------------------

So I don't think __init__ ran in any way!

Also, if __init__ were to run implicitly, where would it get the parameters from? I avoided parameters other than self above, but this seems weird to me...

--b
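The resolution to the puzzle, in code form: the implicit __init__ call lives in type.__call__, which runs when the *class itself* is called -- so C() triggers it, while a bare C.__new__(C) never does, which is exactly what the test shows. Roughly (a simplified sketch, not CPython's actual source):

    # What C(*args, **kw) effectively does under the hood:
    def simulate_call(cls, *args, **kw):
        obj = cls.__new__(cls, *args, **kw)   # allocate
        if isinstance(obj, cls):
            # __init__ receives the very same arguments the class was
            # called with -- that answers "where would it get them from"
            obj.__init__(*args, **kw)
        return obj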
From: Luke O. <lu...@me...> - 2003-05-22 16:17:26
Quoting "Bud P. Bruegger" <bu...@si...>: > Important is probably that there are no explicit database methods such > as insert (store, save) and update. This is done implicitly while > using the object (i.e., attribute access). Yep, and this is the main selling point for SQLObject for me. A class acts (in use - i know you disagree about defining classes) just like a python object. > When SQLObject retrieves an object from the db, it first gets a list > of column values and then calls the objects class (eg., p=Person()) to > create an object instance and then sets all the values. [I derive > this from the statement from the manual that "__init__ is called > everytime an object is just fetched from the cache"]. > > What in plain python classes normally goes in __init__ should go in > the _init method for SQLObject subclasses. _init seems to be called > by SQLObject's new method. A little misunderstanding here. Off the top of my head this problem will occur for any system that may transparently return an internally cached object on 'creation'. You're free to do whatever you please in an SQLObject __init__ method, so long as you realize it will be called whether this is a new object (Person.new()), an exisiting row (Person(12)), or an existing object (Person(12) that returns a cached value). It seems to me that 'typical' python __init__ stuff is only relevant to the first case, and that any ORM system should provide ways to override the first two separately (for SO, .new and ._init), and probably probably hide the third case from users. But we can probably agree that this behavior is not 'normal' python behavior, that it would be nice if __init__ could be used for the first case... but what do you do for the second case? Note that the call cycle in python (SQLObject) is: for Person.new(): new __new__ (explicit in new) __init__ (implicit after __new__) set (explicit in new) _init (explicit in new/finishCreate) for Person(12) (not cached): __new__ _init (explicit in __new__) __init__ (implicit after __new__) for Person(12) (cached): __new__ __init__ (implicit after __new__) > 2. Possible Alternative Approach > -------------------------------- > > A possible alternative approach makes a major change in the look and > feel inassuch that persistence methods (insert, update, ..) need to be > called explicitly. > > * same goes for resolution of collisions in concurrancy control. I > don't see a way of hiding these issues from the application > programmer Agreed. I suppose this directly relates to my problem with any foreignKey access possibly causing an SQLObjectNoFound error; with constraints and collisions, any attribute access could conceivably need to be safe-guarded. Not sure how I feel about that. While I initially don't like explicit storage commands, I'm also not excited about how to deal with the above. > So in this case, column values could be stored in normal attributes > (in the __dict__) without the need for properties that overload the > setter method to update the db. > > To retrieve an object from the db, a first step retrieves the column > values, then creates an empty object using <class>.__new__(<class>) > (thus avoiding __init__) and then assign the (unmarshelled) values to > the __dict__ (or better through use of normal setattr). See above, it's impossible in python to call __new__ and avoid __init__ to my knowledge. > * I believe (haven't verified) that it should be possible for users > to define properties in lieu of just normal __dict__ items without > any adverse effects. 
(I reason that the use of getattr and > setattr by the middleware works equally well) Just to be clear, there's no danger in setting your own properties in SQLObject. SQLObject tries to help by letting you just define _get/_set and get a property generated, but manually created properties will just be ignored. I think the key thing to think about here is how collisions or constraints affect using SQLObject at the attribute level. - Luke |
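The "_get/_set shortcut" Luke mentions can be illustrated with a small metaclass that turns _get_xxx/_set_xxx pairs into properties. To be clear, this is an illustration of the general technique, not SQLObject's actual internals:

    # Illustrative sketch: generate a property from _get_*/_set_* pairs.
    class AutoProps(type):
        def __new__(meta, name, bases, attrs):
            for key in attrs.keys():
                if key.startswith('_get_'):
                    prop = key[len('_get_'):]
                    getter = attrs[key]
                    setter = attrs.get('_set_' + prop)
                    attrs[prop] = property(getter, setter)
            return type.__new__(meta, name, bases, attrs)

    class Person(object):
        __metaclass__ = AutoProps    # Python 2 spelling, to match the thread

        def _get_name(self):
            return self._name
        def _set_name(self, value):
            self._name = value.strip()   # e.g. normalize on assignment

After this, p = Person(); p.name = '  Ian  '; p.name yields 'Ian' -- the class author wrote only the two methods, and the property appeared "for free".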