modeling-users Mailing List for Object-Relational Bridge for python (Page 29)
Status: Abandoned
Brought to you by: sbigaret
Messages per month:

| Year | Jan | Feb | Mar | Apr | May | Jun | Jul | Aug | Sep | Oct | Nov | Dec |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 2002 |  |  |  |  |  |  |  | 2 | 3 |  |  |  |
| 2003 | 19 | 55 | 54 | 48 | 41 | 40 | 156 | 56 | 90 | 14 | 41 | 32 |
| 2004 | 6 | 57 | 38 | 23 | 3 | 40 | 39 | 82 | 31 | 14 |  | 9 |
| 2005 |  | 4 | 13 |  | 5 | 2 |  | 1 |  |  |  | 1 |
| 2006 | 1 | 1 | 9 | 1 |  | 1 | 5 |  | 5 | 1 |  |  |
| 2007 |  |  |  |  |  |  |  |  |  | 2 | 1 |  |
| 2009 |  |  |  |  |  |  | 4 |  |  |  |  |  |
| 2011 |  |  |  |  |  |  | 1 |  |  |  |  |  |
From: Yannick G. <yan...@sa...> - 2003-07-21 14:28:27

Hi,

I'd like to set my foreign keys by hand, ex.:

    i18n = I18N()
    ec.insert(i18n)
    i18n.setMasterId(masterSnapshot["id"])
    ec.saveChanges()

Is there a way to do it without breaking anything? Since I switched to raw
fetches, I end up with a dict that is unusable with
addObjectToBothSidesOfRelationship(). In fact, having
addObjectToBothSidesOfRelationship() accept snapshots would be nice. :)

--
Yannick Gingras
Byte Gardener, Savoir-faire Linux inc.
(514) 276-5468
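One possible approach to the question above, sketched with loudly labeled assumptions: release 0.9-pre-10 (announced further down this page) adds EditingContext.faultForRawRow(), which turns a raw-row dictionary back into a real object that the usual relationship-manipulation call can accept. The faultForRawRow() argument order, the "WithKey" method name and the entity/relationship names are assumptions, not confirmed API, and the snippet reuses `ec`, `I18N` and `masterSnapshot` from the message above.

    # Assumed API -- a sketch, not a confirmed recipe.
    master = ec.faultForRawRow(masterSnapshot, 'Master')   # raw row -> real object (argument order assumed)

    i18n = I18N()
    ec.insert(i18n)
    # EOF-style relationship manipulation; method and relationship names are assumptions
    i18n.addObjectToBothSidesOfRelationshipWithKey(master, 'master')
    ec.saveChanges()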
From: Yannick G. <yan...@sa...> - 2003-07-21 13:27:33

On July 21, 2003 09:01 am, Sebastien Bigaret wrote:
> Does nobody use CustomObject.snapshot()? Is it safe to make the
> modification suggested in my previous post (included below)?

I used to, but I prefer the raw fetch.

--
Yannick Gingras
Byte Gardener, Savoir-faire Linux inc.
(514) 276-5468
From: Sebastien B. <sbi...@us...> - 2003-07-21 13:01:07

Hi,

Does nobody use CustomObject.snapshot()? Is it safe to make the modification
suggested in my previous post (included below)?

You can either reply to this post, or comment on bug item #774989 at
https://sourceforge.net/tracker/index.php?func=detail&aid=774989&group_id=58935&atid=489335

-- Sébastien.

Note: AccessArrayFaultHandler instances are proxy-like objects handling
to-many relationships that are not fetched yet (they are responsible for the
lazy initialization of to-many rels).

I wrote:
> Working on the ability to fetch raw rows, I incidentally found out
> that the snapshot method is not behaving the way I thought it was
> (this case was not tested and left undetermined).
>
> The current behaviour is (using the StoreEmployees model & test data):
>
> >>> ec=EditingContext()
> >>> circus=ec.fetch('Store', 'corporateName == "Flying Circus"')[0]
> >>> circus.getEmployees().isFault()
> 1
> >>> circus.snapshot()
> {'corporateName': 'Flying Circus', 'employees': None}
> >>> len(circus.getEmployees()) # clears the fault and fetches the array
> 3
> >>> pprint.pprint(circus.snapshot())
> {'corporateName': 'Flying Circus',
>  'employees': [<Modeling.GlobalID.KeyGlobalID instance at 0x8539884>,
>                <Modeling.GlobalID.KeyGlobalID instance at 0x85271f4>,
>                <Modeling.GlobalID.KeyGlobalID instance at 0x851ccfc>]}
>
> As you can see, when the array of 'employees' is faulted (it has not
> been fetched yet), it appears as None in the snapshot. It seems quite
> weird to me. I'd highly prefer to return something like this:
>
> >>> pprint.pprint(circus.snapshot())
> {'corporateName': 'Flying Circus',
>  'employees': <Modeling.FaultHandler.AccessArrayFaultHandler instance at 0x8539034>}
>
> I'll probably change this, but I'd like to get your opinion on that,
> especially I'd like to know how you handle this if you're already
> using CustomObject.snapshot() in your own projects, and if such a
> modification could fit your needs.
>
> -- Sébastien.
From: Sebastien B. <sbi...@us...> - 2003-07-21 09:40:21

Hi all,

I've finally installed an Oracle db, and made the corresponding database
layer. You'll find it at
https://sourceforge.net/tracker/index.php?func=detail&aid=774894&group_id=58935&atid=489337

It is made of a patch and a tarball archive. This is for 0.9-pre-10. The
installation procedure is described there.

It has been tested on linux/debian woody (3.0) with Oracle 8i (v.8.1.7.0.0)
and Oracle 9i (v9.2.0.1.0).

Dependencies: python db-adaptor DCOracle2 at http://www.zope.org/Members/matt/dco2.

Supported sql datatypes:

  char, varchar, varchar2, nchar, nvarchar2, number, decimal, int, integer,
  smallint, float, numeric, real, date, time

with the additional datatypes for 9i:

  timestamp, timestamp with time zone, timestamp with local time zone

Note 1: not all datatypes were tested, esp. timestamp[l]tz, I had no time
for these.

Note 2: the framework does not mess with the returned python attribute. For
example, if you have a DATE column, you'll get a DCOracle2.OracleDate
object. Refer to tests/testPackages/AuthorBooks/Book.py for an example of
how they can be automatically converted into mx.DateTime objects (and back).

Note 3: because of the format DCOracle2.OracleDate uses in a string context,
every session is automatically altered with:

  alter session set nls_date_format = 'YYYY-MM-DD HH24:mi:ss'

The only limitation wrt the other db layers is that it is not yet capable of
creating a database (i.e. creating a user), so you'll have to do it by hand
or use one of the defaults (e.g. system).

If you test it, I'd like to hear from you here!

Regards,

-- Sébastien.
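For readers without the test package at hand, here is a minimal sketch of the kind of conversion Note 2 refers to. The class, attribute and accessor names are assumptions (the shipped example lives in tests/testPackages/AuthorBooks/Book.py); only the idea of round-tripping the adaptor's DATE value through a string into mx.DateTime comes from the message.

    # Hedged sketch, not the shipped Book.py: convert the DATE value handed
    # back by the Oracle adaptor into an mx.DateTime object in a custom accessor.
    from mx import DateTime

    class Book:  # in a real model this would be a Modeling custom class
        def getReleaseDate(self):
            raw = getattr(self, '_releaseDate', None)
            if raw is not None and not isinstance(raw, DateTime.DateTimeType):
                # with nls_date_format set as above, str(raw) is
                # 'YYYY-MM-DD HH24:MI:SS', which DateTimeFrom() can parse
                raw = DateTime.DateTimeFrom(str(raw))
                self._releaseDate = raw
            return raw

        def setReleaseDate(self, value):
            self._releaseDate = value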
From: Sebastien B. <sbi...@us...> - 2003-07-18 15:57:22

Hi all,

Release 0.9-pre-10 is out! The most prominent additions are:

* the ability to fetch raw rows (instead of fully initialized objects),
* two new qualifier operators have been introduced: IN and NOT IN,
* parsing of qualifier strings has been made significantly faster.

The documentation has been updated to document the new features, as well as
how the framework reacts to fetches when objects have been inserted, updated
or deleted. Please refer to section 4.5 for details, and in particular:

- http://modeling.sf.net/UserGuide/ec-fetch-reflect-changes.html
- http://modeling.sf.net/UserGuide/ec-fetch-raw-rows.html

It also contains some bug fixes, one of which (ticket #772997) concerns the
cancellation of an object's deletion within an EditingContext, which was
mishandled by the framework.

Among the different things we have already discussed here, you should note
that:

- DatabaseContext.batchFetchRelationship() is not included yet,
- the PyModel branch has been merged into the trunk but is not officially
  announced since it lacks documentation for now. This will be for the next
  release.

Best regards,

-- Sébastien.

------------------------------------------------------------------------
0.9-pre-10 (2003/07/18)
-----------------------

* Fixed bug #772997: deleted then re-inserted objects not correctly handled.

* Added the ability to fetch raw rows (dictionaries instead of fully
  initialized objects) --see FetchSpecification.setFetchesRawRows() and
  EditingContext.fetch()'s parameter 'rawRows'. Also added the possibility
  to turn these rows into real objects --see EditingContext.faultForRawRow().
  Documentation updated.

* Added CustomObject.snapshot_raw(), support for the future ability to fetch
  raw rows (see above).

* Rewrote trace() statements in QualifierParser to avoid the unnecessary
  formatting of its arguments when it is not enabled. On my machine this
  speeds up the parsing of qualifier strings up to x7.

* Added operators 'in' and 'not in' for fetch qualifiers. Operators 'AND',
  'OR' and 'NOT' can now be written with lower-case characters.

[Merged branch brch-0_9pre7-1-PyModel]
Note: PyModels are not officially announced w/ this release, because there's
no documentation yet. See mailing-list archives for details, or go there and
ask.

* Fixed: adaptorModel() could raise instead of returning None when model's
  adaptorName is not set.

* Model: added updateModelWithCFG(), loadModel(), searchModel()
  ModelSet: DEPRECATED method: updateModelWithCFG() --moved into Model, will
  be removed in v0.9.1.

* Added Modeling.PyModel and Modeling.tests.test_PyModel.

* Added tests/testPackages/StoreEmployees/pymodel_StoreEmployees.py and
  updated StoreEmployees/__init__.py: now loads the model from the PyModel.

* Changed ClassDescription's delete rules: constants DELETE_CASCADE,
  DELETE_DENY, DELETE_NULLIFY and DELETE_NOACTION are now strings (were:
  integers)
  --> Relationship.setDeleteRule() updated to accept the old integer values
  (backward compatibility).
------------------------------------------------------------------------
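A minimal sketch of the two headline features of this release, reusing the StoreEmployees example quoted elsewhere on this page. The 'rawRows' keyword and faultForRawRow() are named in the changelog above; the exact qualifier-string syntax for the new 'in' operator, the second store name and the faultForRawRow() argument order are assumptions.

    from Modeling.EditingContext import EditingContext

    ec = EditingContext()

    # fetch dictionaries instead of fully initialized objects
    # ('in' qualifier syntax shown here is assumed, not taken from the docs)
    rows = ec.fetch('Store',
                    'corporateName in ("Flying Circus", "Some Other Store")',
                    rawRows=1)

    # turn one raw row back into a real object only when it is needed
    store = ec.faultForRawRow(rows[0], 'Store')
    print rows[0]['corporateName'], store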
From: Yannick G. <yan...@sa...> - 2003-07-18 13:26:24

On July 18, 2003 07:00 am, Sebastien Bigaret wrote:
> Yes, thanks for pointing this out. This is documented there:
> http://psyco.sourceforge.net/psycoguide/metaclass.html
> http://psyco.sourceforge.net/psycoguide/node19.html#l2h-19
>
> The nice thing here is that you do not even have to change your class
> declaration, just add 'from psyco.classes import *'. I still have to
> study the effect of changing the classes of the framework to new-style.
>
> However, a quick try with this enabled on an EditingContext,
> DatabaseContext and DatabaseChannel does not show any improvement
> (perf. are in fact slightly worse).

My experience with Psyco shows that the code optimization does not always
kick in on the 1st try. Enabling Psyco in profiling mode (which is
unfortunately incompatible with the profiler in the standard library) will
generate code optimization here and there even a few minutes after startup.

As usual, I never experienced the kind of performance boost reported in the
documentation: 40% faster max with OBB (http://openbeatbox.org), but since
most of the hard work (raster op.) is already done in C++ (PyQt), it was
expected.

--
Yannick Gingras
Byte Gardener, Savoir-faire Linux inc.
(514) 276-5468
From: Sebastien B. <sbi...@us...> - 2003-07-18 12:41:29

I wrote:
[...]
> [raw row] 1st fetch (ec empty): 0.618131995201
> [raw row] 2nd fetch (objects already loaded): 0.61448597908
> [raw row] 3rd fetch (objects already loaded): 2.24008309841
[...]
> 2. You probably already noticed that fetching raw rows is
>    significantly slower when the objects are loaded. The reason is
>    that objects already loaded are checked for modification, because
>    as I explained in my previous post we have to return the
>    modifications, not the fetched data, if an object has been modified.
>
>    I'm currently studying this, I have an implementation that does not
>    consume more time when no objects are modified:
>
>    [raw row] 1st fetch (ec empty): 0.595005989075
>    [raw row] 2nd fetch : 0.585139036179
>    [raw row] 3rd fetch (objects already loaded): 0.607128024101
>
>    However, still, we cannot avoid the additional payload when some
>    objects are modified, and the more modified objects we have, the
>    slower the fetch will be (the first figure, 2.24, would be the upper
>    limit here, when all objects are modified).
>
>    Second, I do not want to commit this quicker implementation now,
>    because a problem remains: if the database has been changed in the
>    meantime, you can get raw rows whose values are not the same as those
>    of _unmodified_ objects in the EditingContext. I'm not sure if this is
>    a significant problem, or, to put it differently, whether we have to
>    pay extra cpu-time to ensure that this does not happen. But I feel a
>    little touchy about making an exception to the general rule.

Okay, I thought I had solved this and committed it in cvs
[DatabaseChannel.fetchObject() v1.15]. It gave the following figures, on the
same basis (py2.2 -O / no psyco):

  [raw row] 1st fetch (ec empty): 0.60
  [raw row] 2nd fetch : 0.59
  [raw row] 3rd fetch (objects already loaded): 0.66

Alas, I then tried it with all objects modified, and it took... about 1
minute for 5000 objects.

I thought testing whether the object was modified was a good idea, and it
was when no objects were modified. But when all 5000 objects are modified,
looking in a list of len(5000) for an object takes avg. 2500 look-ups, hence
2500 calls to __eq__. For 5000 objects, that's 5000*2500=12.5e6 calls to
__eq__!

So back to the old behaviour, and the following figures (py2.2 -O):

  [raw row] 1st fetch (ec empty): 0.522832036018
  [raw row] 2nd fetch : 0.516697049141
  [raw row] 3rd fetch (objects already loaded): 1.80115604401
  [raw row] 4th fetch (all objects modified): 1.70906305313

I'll commit this soon. And I guess it's time for me to stop annoying you
with all these performance considerations.

-- Sébastien.
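A standalone toy benchmark illustrating the 12.5e6 __eq__ argument above; this is not the framework's code, and keying modified objects by a hashable global ID is shown only as one conceivable way to avoid the linear scan.

    import time

    class Obj:
        def __init__(self, gid):
            self.gid = gid
        def globalID(self):
            return self.gid
        def __eq__(self, other):
            return isinstance(other, Obj) and self.gid == other.gid

    objects = [Obj(i) for i in range(5000)]

    # O(n) membership test per object: ~n/2 __eq__ calls each,
    # i.e. roughly 5000 * 2500 = 12.5e6 calls overall -- the slow case above.
    start = time.time()
    for o in objects:
        o in objects
    print "list scan:   %.2fs" % (time.time() - start)

    # O(1) lookup keyed by a hashable global ID avoids the scan entirely.
    by_gid = {}
    for o in objects:
        by_gid[o.globalID()] = o
    start = time.time()
    for o in objects:
        o.globalID() in by_gid
    print "dict lookup: %.2fs" % (time.time() - start)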
From: Sebastien B. <sbi...@us...> - 2003-07-18 12:18:45

> >> Attempting, and failing, to keep up with you... ;)
> >
> > Do you mean that you tested it and it failed? I've committed slight
> > changes this afternoon, because of an unhandled situation when using
> > nested ECs. However it shouldn't have failed in a "standard" situation.
> > Could you be more explicit?
>
> No, did not try it. Failing -- in terms of keeping up with your dev rhythm ;)

Okay... I sometimes have problems interpreting English correctly, especially
after twilight!

> Thanks for the numbers! It helps with the overall picture of what goes on...
> However, just a question about the raw fetch:
>
> > [raw row] 1st fetch (ec empty): 0.618131995201
> > [raw row] 2nd fetch (objects already loaded): 0.61448597908
> > [raw row] 3rd fetch (objects already loaded): 2.24008309841
> ...
> > 2. You probably already noticed that fetching raw rows is
> >    significantly slower when the objects are loaded. The reason is
> >    that objects already loaded are checked for modification, because
> >    as I explained in my previous post we have to return the
> >    modifications, not the fetched data, if an object has been modified.
>
> OK, i see why the 2nd fetch in this case could be significantly longer.
> But, why the 3rd? Barring any modifications to the loaded objects,
> there should not be any differences between the 2nd and subsequent
> raw fetches?

Sorry, a raw copy-paste between two functions and you get the wrong message.
It should read:

  1st fetch (ec empty)
  2nd fetch (ec still empty)
  3rd fetch (objects already loaded)

So you get what was explained: slower if all objects are already loaded.

> I only meant have two versions of a sample qualifier for a specific query,
> and use it in an internal test/profiling script... nothing at all for the
> public api!

Seems like I was at least half asleep yesterday evening!

-- Sébastien.
From: Mario R. <ma...@ru...> - 2003-07-18 11:58:13

On Jeudi, juil 17, 2003, at 19:23 Europe/Valletta, Sebastien Bigaret wrote:
>
> Mario Ruggier <ma...@ru...> wrote:
>> On jeudi, juil 17, 2003, at 16:43 Europe/Amsterdam, Sebastien Bigaret wrote:
>
> Hey, you probably meant: "at 16:43 Europe/Paris" ;)

Ha! It's annoying isn't it? It insists on adding the time and place of the
message being replied to, but the place is the receiving end (or, worse,
where it thinks the receiving end is -- when the setting is Geneva, it uses
Zurich! Hey, i'd rather be in Amsterdam than in Zurich ;). Plus, there is
some language interference between my settings and the system settings...
Anyway, it is the Apple Mail 1.2.5 client, and actually if anyone knows how
to configure the leading text that is auto added to replies, i'd like to
know! (not found how to do it yet)

>>> Full functionality has been integrated in cvs yesterday evening, and I've
>>> completed the documentation today. All this will be in the next release.
>>
>> Attempting, and failing, to keep up with you... ;)
>
> Do you mean that you tested it and it failed? I've committed today
> afternoon slight changes, because of an unhandled situation when using
> nested ECs. However it shouldn't have failed in a "standard" situation.
> Could you be more explicit?

No, did not try it. Failing -- in terms of keeping up with your dev rhythm ;)

...

> I've a test db w/ 5000 simple objects, I'll try some test and
> report. This could be added in the unittests, right.

Thanks for the numbers! It helps with the overall picture of what goes on...
However, just a question about the raw fetch:

> [raw row] 1st fetch (ec empty): 0.618131995201
> [raw row] 2nd fetch (objects already loaded): 0.61448597908
> [raw row] 3rd fetch (objects already loaded): 2.24008309841
...
> 2. You probably already noticed that fetching raw rows is
>    significantly slower when the objects are loaded. The reason is
>    that objects already loaded are checked for modification, because
>    as I explained it in my previous post we have to return the
>    modifications, not the fetched data, if an object has been modified.

OK, i see why the 2nd fetch in this case could be significantly longer.
But, why the 3rd? Barring any modifications to the loaded objects,
there should not be any differences between the 2nd and subsequent
raw fetches?

>> And, the classic fetch may be further broken up into two, one built with
>> Qualifiers and the other with RawQualifiers (as per recent thread), to keep
>> an eye on this known possible bottleneck on the system.
>
> Sorry, I think I won't do this: we've just clarified the API, I do not
> feel like splitting the fetch in two. However, what I *will* do is:
>
> 1. document the alternate (and less cpu-consuming) way of building
>    qualifiers,
>
> 2. add a section in the user's guide dealing with performance tuning,
>    and explaining this particular point among others.
>
> I think this should be enough, isn't it?

I only meant have two versions of a sample qualifier for a specific query,
and use it in an internal test/profiling script... nothing at all for the
public api!

> -- Sébastien.

mario
From: Sebastien B. <sbi...@us...> - 2003-07-18 11:01:15

Yannick Gingras <ygi...@yg...> wrote:
> Another trick is that Psyco is *much* more effective with the new style
> classes (derived from object). Since you want to keep Zope support it
> may be a problem, but there is another trick that most people ignore.
> The parent in a class declaration is an expression. So this is
> valid: [snipped]

Yes, thanks for pointing this out. This is documented there:

  http://psyco.sourceforge.net/psycoguide/metaclass.html
  http://psyco.sourceforge.net/psycoguide/node19.html#l2h-19

The nice thing here is that you do not even have to change your class
declaration, just add 'from psyco.classes import *'. I still have to study
the effect of changing the classes of the framework to new-style.

However, a quick try with this enabled on an EditingContext, DatabaseContext
and DatabaseChannel does not show any improvement (perf. are in fact
slightly worse).

-- Sébastien.
From: Yannick G. <ygi...@yg...> - 2003-07-18 10:35:31

On Friday 18 July 2003 04:19, Sebastien Bigaret wrote:
> > And if you read carefully the python cookbook you will find some
> > really impressive performance boost while using some map.. and
> > other in list or dict. But there is a drawback code isn't really
> > eye candy after this tweak
>
> I'll check that however.

Another trick is that Psyco is *much* more effective with the new style
classes (derived from object). Since you want to keep Zope support it may be
a problem, but there is another trick that most people ignore. The parent in
a class declaration is an expression. So this is valid:

    class Object:
        pass

    def topLevelClass():
        try:
            return object      # new-style base when available
        except NameError:
            return Object      # fallback for older Pythons

    class MySuperFastClass(topLevelClass()):
        pass

--
Yannick Gingras
Coder for OBB : Optimum Brawny Buspirone
http://OpenBeatBox.org
From: Sebastien B. <sbi...@us...> - 2003-07-18 08:19:45

Jerome Kerdreux <Jer...@fi...> wrote:
> Yep .. 2.2 is faster than 2.1 but the big performance boost is in
> 2.3. (I read this a couple of times over the net).

While this is still experimental (py2.3 is not officially supported yet),
you're definitely right:

  Python2.3 / +psyco
  ------------------
  [std] 1st fetch : 4.89 / 3.57
  [std] 2nd fetch : 0.73 / 0.64

  Python2.3 -O / +psyco
  ---------------------
  [std] 1st fetch : 4.65 / 3.55
  [std] 2nd fetch : 0.72 / 0.64

> Another thing :
> Do you use if __debug__ in the Modeling (I don't have the code
> since i'm at work)? Because the most important thing w/ -O is that
> it doesn't marshall the code inside if __debug__, so this can really
> be a big improvement.

No, I don't use this, thanks for the tip.

> And if you read carefully the python cookbook you will find some
> really impressive performance boosts while using map... and other
> list or dict operations. But there is a drawback: the code isn't
> really eye candy after this tweak.

I'll check that however.

-- Sébastien.
From: Jerome K. <Jer...@fi...> - 2003-07-18 07:00:28

Sebastien Bigaret wrote:
> That's more meat for the forthcoming 'tuning performance' section in the
> guide. As a general conclusion, py2.2 is faster than py2.1, and the '-O'
> option is definitely worth a try.

Yep .. 2.2 is faster than 2.1, but the big performance boost is in 2.3. (I
read this a couple of times over the net.)

Another thing: do you use if __debug__ in the Modeling code (I don't have
the code since i'm at work)? Because the most important thing w/ -O is that
it doesn't marshall the code inside if __debug__, so this can really be a
big improvement.

And if you read carefully the python cookbook you will find some really
impressive performance boosts while using map... and other list or dict
operations. But there is a drawback: the code isn't really eye candy after
this tweak.

> BTW: who is using the framework w/ python2.1 alone? And with py2.1 and
> (because of) zope?
>
> -- Sébastien.

I don't.
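A small self-contained illustration of the `if __debug__:` tip above: blocks guarded by `__debug__` are removed at compile time when Python runs with -O, so debug-only tracing (and its argument formatting) costs nothing in optimized mode. The function and message names are purely illustrative, not taken from the Modeling code.

    def trace(msg):
        print "TRACE:", msg

    def expensive_repr(obj):
        # stands in for costly argument formatting that -O should skip entirely
        return repr(obj) * 10

    def do_work(data):
        if __debug__:
            # this whole block is dropped at compile time under `python -O`,
            # so the expensive formatting below is never even evaluated there
            trace("working on " + expensive_repr(data))
        return len(data)

    print do_work([1, 2, 3])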
From: Sebastien B. <sbi...@us...> - 2003-07-17 23:35:22

Sebastien Bigaret <sbi...@us...> wrote:
> [std] 1st fetch : 7.20251297951
> [std] 2nd fetch : 1.03094005585
>
> [raw row] 1st fetch (ec empty): 0.618131995201
> [raw row] 2nd fetch (objects already loaded): 0.61448597908
> [raw row] 3rd fetch (objects already loaded): 2.24008309841
>
> [psycopg] 1st fetch: 0.038547039032
> [psycopg] 2nd fetch: 0.0789960622787

I guess I had no luck when running [std], i.e. the standard
EditingContext.fetch(), on 5000 objects. Here are corrected figures,
averaged over 5 executions:

Python2.1
---------
normal:
  [std] 1st fetch : 6.74
  [std] 2nd fetch : 0.96

python -O:
  [std] 1st fetch : 6.25
  [std] 2nd fetch : 0.91

Python2.2
---------
normal:
  [std] 1st fetch : 5.80
  [std] 2nd fetch : 0.90

python -O:
  [std] 1st fetch : 5.30
  [std] 2nd fetch : 0.83

Out of curiosity, I played ~20 minutes w/ psyco. After some tries, with the
following code on top of the script:

------------------------------------------------------------------------
import psyco
from threading import _RLock
from Modeling.DatabaseContext import DatabaseContext
from Modeling.DatabaseChannel import DatabaseChannel
from Modeling.ClassDescription import classDescriptionForName
psyco.bind(DatabaseContext.initializeObject)
psyco.bind(DatabaseChannel.fetchObject)
psyco.bind(classDescriptionForName)
psyco.bind(_RLock.acquire)
psyco.bind(_RLock.release)
------------------------------------------------------------------------

I got the following figures:

Python2.1+psyco
---------------
normal:
  [std] 1st fetch : 5.70
  [std] 2nd fetch : 0.81

python -O:
  [std] 1st fetch : 5.52
  [std] 2nd fetch : 0.79

Python2.2+psyco
---------------
normal:
  [std] 1st fetch : 4.32
  [std] 2nd fetch : 0.72

python -O:
  [std] 1st fetch : 4.21
  [std] 2nd fetch : 0.68

Note on psyco: this was just tuned for fetching, and for fetching very
simple objects, that's all. This would need further investigation,
obviously. But since I tried it, I thought I could share; I find it amazing
to be able to get a significant performance improvement just a few minutes
after having installed it.

That's more meat for the forthcoming 'tuning performance' section in the
guide. As a general conclusion, py2.2 is faster than py2.1, and the '-O'
option is definitely worth a try.

BTW: who is using the framework w/ python2.1 alone? And with py2.1 and
(because of) zope?

-- Sébastien.
From: Sebastien B. <sbi...@us...> - 2003-07-17 18:36:52

Mario Ruggier <ma...@ru...> wrote:
> Given the same db, what is the performance difference between
> the following fetches (for logically equivalent queries)?
> - 1st time "classic" fetch (in empty EC)
> - 2nd time "classic" fetch (objects in resultset already known to EC)
> - 1st time raw fetch (in empty EC)
> - 2nd time raw fetch (objects in resultset already known to EC)
> - 1st time dbapi2.0 execute query (direct via python adaptor)
> - 2nd time dbapi2.0 execute query (direct via python adaptor)

Here are the figures. Test database: 5000 objects with 3 attributes: a PK, a
FK (a to-one relation to an object of a different type), and a text field.

  [std] 1st fetch : 7.20251297951
  [std] 2nd fetch : 1.03094005585

  [raw row] 1st fetch (ec empty): 0.618131995201
  [raw row] 2nd fetch (objects already loaded): 0.61448597908
  [raw row] 3rd fetch (objects already loaded): 2.24008309841

  [psycopg] 1st fetch: 0.038547039032
  [psycopg] 2nd fetch: 0.0789960622787

Comments:

1. No surprise, fetching real objects takes much more time than a simple
   fetch w/ psycopg, and the raw psycopg fetch is the fastest.

   Maybe it's time for me to study the fetching process in detail, to see
   where it can be enhanced. This could be done after 0.9-pre-10, i.e. after
   finishing the documentation for PyModels first.

2. You probably already noticed that fetching raw rows is significantly
   slower when the objects are loaded. The reason is that objects already
   loaded are checked for modification, because, as I explained in my
   previous post, we have to return the modifications, not the fetched data,
   if an object has been modified.

   I'm currently studying this, I have an implementation that does not
   consume more time when no objects are modified:

     [raw row] 1st fetch (ec empty): 0.595005989075
     [raw row] 2nd fetch : 0.585139036179
     [raw row] 3rd fetch (objects already loaded): 0.607128024101

   However, still, we cannot avoid the additional payload when some objects
   are modified, and the more modified objects we have, the slower the fetch
   will be (the first figure, 2.24, would be the upper limit here, when all
   objects are modified).

   Second, I do not want to commit this quicker implementation now, because
   a problem remains: if the database has been changed in the meantime, you
   can get raw rows whose values are not the same as those of _unmodified_
   objects in the EditingContext. I'm not sure if this is a significant
   problem, or, to put it differently, whether we have to pay extra cpu-time
   to ensure that this does not happen. But I feel a little touchy about
   making an exception to the general rule.

> It would be interesting to keep an eye on these values, for a particular
> setup, thus when changes to the system are made, unexpected performance
> side effects may still be observed. Maybe such a script can be added to
> the tests?

Sorry, I did not read you right: the idea of observing these figures to
detect the impact of changes on performance is indeed a very good idea. That
will be done, sure.

Cheers,

-- Sébastien.
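A hedged sketch of the kind of micro-benchmark script discussed here, in the spirit of the figures above. The entity name 'MyEntity' is a placeholder (the real test database is not named in the thread), the 'rawRows' keyword is the one announced for 0.9-pre-10, and calling fetch() with only an entity name is assumed to return all of its objects.

    import time
    from Modeling.EditingContext import EditingContext

    def timed(label, func):
        start = time.time()
        func()
        print "%-40s %.3f s" % (label, time.time() - start)

    ec = EditingContext()
    timed("[std] 1st fetch (ec empty)",       lambda: ec.fetch('MyEntity'))
    timed("[std] 2nd fetch (objects loaded)", lambda: ec.fetch('MyEntity'))
    timed("[raw row] fetch (objects loaded)", lambda: ec.fetch('MyEntity', rawRows=1))

    ec2 = EditingContext()
    timed("[raw row] fetch (ec empty)",       lambda: ec2.fetch('MyEntity', rawRows=1))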
From: Sebastien B. <sbi...@us...> - 2003-07-17 17:23:42

Mario Ruggier <ma...@ru...> wrote:
> On jeudi, juil 17, 2003, at 16:43 Europe/Amsterdam, Sebastien Bigaret wrote:

Hey, you probably meant: "at 16:43 Europe/Paris" ;)

> > I wrote:
> >> Okay, back to the functionality: as it is made in the patch, fetchesRawRows
> >> misses two important functionalities:
> >>
> >> 1. it must behave the way a normal fetch behaves. This means the inserted
> >>    objects must be present, while deleted objects shouldn't be returned.
> >>
> >> 2. It does not work at all for nested ECs.
> >>
> >> I thought that those of you who are already using the patch should be
> >> aware of this.
> >>
> >> I'm currently working on both problems. Unittests are written, now I'm
> >> on the code itself. When integrating this into the CVS, it will behave
> >> as expected in both situations. I'll report then here.
> >
> > Full functionality has been integrated in cvs yesterday evening, and I've
> > completed the documentation today. All this will be in the next release.
>
> Attempting, and failing, to keep up with you... ;)

Do you mean that you tested it and it failed? I've committed slight changes
this afternoon, because of an unhandled situation when using nested ECs.
However it shouldn't have failed in a "standard" situation. Could you be
more explicit?

> Anyhow, just a small clarification: this means that raw fetches must
> in any case exist within an editing context? I.e if the raw fetch
> implies objects not yet in the EC, these objects are loaded, and
> "CustomObject initialised", into the EC?

It does not matter whether the objects are already loaded or not. And
fetching raw rows never creates any object. When I said:

> > 1. it must behave the way a normal fetch behaves. This means the
> >    inserted objects must be present, while deleted objects
> >    shouldn't be returned.

I meant that inserted objects should appear when a fetch for raw rows is
made (at least, if the object would appear in a normal fetch: remember that
if you insert, say, a Book, then fetch all books, you'll get the new one
along with the others --even if the new object is not saved in the database
yet. I've just added a section in the User's Guide about this today!)

BTW, same for deleted objects: they shouldn't appear in the result set of a
fetch (objects or raw rows).

For completeness, I'll add that when an object is modified, it should appear
in its modified state in the result set.

> Given the same db, what is the performance difference between
> the following fetches (for logically equivalent queries)?
> - 1st time "classic" fetch (in empty EC)
> - 2nd time "classic" fetch (objects in resultset already known to EC)
> - 1st time raw fetch (in empty EC)
> - 2nd time raw fetch (objects in resultset already known to EC)
> - 1st time dbapi2.0 execute query (direct via python adaptor)
> - 2nd time dbapi2.0 execute query (direct via python adaptor)
>
> It would be interesting to keep an eye on these values, for a particular
> setup, thus when changes to the system are made, unexpected performance
> side effects may still be observed. Maybe such a script can be added to
> the tests?

I've a test db w/ 5000 simple objects, I'll try some tests and report. This
could be added in the unittests, right.

> And, the classic fetch may be further broken up into two, one built with
> Qualifiers and the other with RawQualifiers (as per recent thread), to keep
> an eye on this known possible bottleneck on the system.

Sorry, I think I won't do this: we've just clarified the API, I do not feel
like splitting the fetch in two. However, what I *will* do is:

1. document the alternate (and less cpu-consuming) way of building
   qualifiers,

2. add a section in the user's guide dealing with performance tuning, and
   explaining this particular point among others.

I think this should be enough, isn't it?

-- Sébastien.
From: Mario R. <ma...@ru...> - 2003-07-17 16:55:59

On jeudi, juil 17, 2003, at 16:43 Europe/Amsterdam, Sebastien Bigaret wrote:
>
> I wrote:
>> Okay, back to the functionality: as it is made in the patch,
>> fetchesRawRows misses two important functionalities:
>>
>> 1. it must behave the way a normal fetch behaves. This means the
>>    inserted objects must be present, while deleted objects shouldn't be
>>    returned.
>>
>> 2. It does not work at all for nested ECs.
>>
>> I thought that those of you who are already using the patch should be
>> aware of this.
>>
>> I'm currently working on both problems. Unittests are written, now I'm
>> on the code itself. When integrating this into the CVS, it will behave
>> as expected in both situations. I'll report then here.
>
> Full functionality has been integrated in cvs yesterday evening, and I've
> completed the documentation today. All this will be in the next release.

Attempting, and failing, to keep up with you... ;)

Anyhow, just a small clarification: this means that raw fetches must in any
case exist within an editing context? I.e if the raw fetch implies objects
not yet in the EC, these objects are loaded, and "CustomObject initialised",
into the EC?

Given the same db, what is the performance difference between the following
fetches (for logically equivalent queries)?

- 1st time "classic" fetch (in empty EC)
- 2nd time "classic" fetch (objects in resultset already known to EC)
- 1st time raw fetch (in empty EC)
- 2nd time raw fetch (objects in resultset already known to EC)
- 1st time dbapi2.0 execute query (direct via python adaptor)
- 2nd time dbapi2.0 execute query (direct via python adaptor)

It would be interesting to keep an eye on these values, for a particular
setup, thus when changes to the system are made, unexpected performance side
effects may still be observed. Maybe such a script can be added to the
tests?

And, the classic fetch may be further broken up into two, one built with
Qualifiers and the other with RawQualifiers (as per recent thread), to keep
an eye on this known possible bottleneck on the system.

mario
From: Sebastien B. <sbi...@us...> - 2003-07-17 14:44:03

I wrote:
> Okay, back to the functionality: as it is made in the patch, fetchesRawRows
> misses two important functionalities:
>
> 1. it must behave the way a normal fetch behaves. This means the inserted
>    objects must be present, while deleted objects shouldn't be returned.
>
> 2. It does not work at all for nested ECs.
>
> I thought that those of you who are already using the patch should be
> aware of this.
>
> I'm currently working on both problems. Unittests are written, now I'm
> on the code itself. When integrating this into the CVS, it will behave
> as expected in both situations. I'll report then here.

Full functionality has been integrated in cvs yesterday evening, and I've
completed the documentation today. All this will be in the next release.

-- Sébastien.
From: Yannick G. <yan...@sa...> - 2003-07-17 13:43:53

On July 17, 2003 12:33 am, you wrote:
> Yannick Gingras <ygi...@yg...> wrote:
> > No problem. And as I asked in private, why LaTeX instead of DocBook?
> > DocBook is a really simple markup with implied support for unicode
> > since it's XML. 30 different tags max in a regular document. KDE,
> > Gnome, Python and Linux are all moving to DocBook. I would have put
> > in the markup myself but this LaTeX thing is beyond my mere mortal
> > capabilities.
>
> [laziness speaking] I couldn't find a reference for python/docbook, do
> you have any?

Hmmm... it seems that the Python DocBook doc is nowhere to be found now...
The project was put on hold for some reason... But DocBook is still a nice
markup; you can find more info in the on-line version of the ultimate book:

  http://docbook.sourceforge.net/

or a quick introduction on the Linux Documentation Project:

  http://tldp.org/LDP/LDP-Author-Guide/usingdocbooktags.html

> Why latex? Because I have known latex for years and have never tried
> docbook, it's as simple as that. Moreover, I've found the python
> documentation process, based on latex, quite handy for generating docs in
> html/pdf. And well, before considering a migration, I'd better have a look
> at docbook itself.

OK, I must admit that since the official Python doc is LaTeX and they supply
templates, it's a better move to stick with LaTeX.

> > No, the doc is not that bad. This is just a reminder that code is
> > meant to be read by both humans and computers. Furthermore, a good
> > programming language will be more expressive than english.
>
> [...]
>
> > The case of the unit tests is really special. The fact that those are
> > your quality assurance kit means that they are guaranteed to work.
> > There is in those tests an amazing amount of working examples and we
> > need just that: a picture. I think that the doc should mention them,
> > they are more valuable than you think.
>
> You're probably right.
>
> However this is already there in the doc, admittedly in little
> characters, at http://modeling.sf.net/UserGuide/ec-object-uniquing.html;
> I'll see how I can make this more prominent.

You should make it hard to miss:

  WARNING WARNING WARNING
  If you don't read the unit tests, you will miss all the fun!
  WARNING WARNING WARNING

;)

--
Yannick Gingras
Byte Gardener, Savoir-faire Linux inc.
(514) 276-5468
From: Mario R. <ma...@ru...> - 2003-07-17 13:17:51

On mercredi, juil 16, 2003, at 21:35 Europe/Amsterdam, Sebastien Bigaret wrote:
> Yannick Gingras <yan...@sa...> wrote:
...
>> Indeed it does the job but, as it was discussed some time ago on the
>> mailing list, it does not enable case insensitive match. A case
>> insensitive match with u"éé" encoded in utf-8 will look for "Ã©Ã©",
>> "ã©Ã©", "Ã©ã©" and "ã©ã©", which does not make any sense once put back
>> in unicode. "éé", "Éé" and "ÉÉ" are respectively "Ã©Ã©", "ÃÃ©" and "ÃÃ"
>> once encoded.
>>
>> So it may be wise to let the user make the utf-8 trick. That way he
>> won't blame you for the weird result of case insensitive match. On
>> the other hand, some databases like Postgresql detect encoding and
>> perform a decent case insensitive match with utf-8 data.
>
> This needs investigation. If some of you could provide working python
> code with unicode and psycopg/pypgsql/pgdb/mysqldb/sqlitedb, please
> share. I've not time for this now.

The option I had adopted when I came across this problem was to work
completely in utf-8, from front to back. [For web, this is easily done by
encoding the pages in utf-8, i.e. setting the response header:
setHeader('Content-Type', 'text/html; charset=utf-8').] Thus, the modeling
layer always gets utf-8, which is exchanged with the db as is.

This, however, means that case insensitive searches do not work -- if you
case-insensitive search for é or É, you will only get one or the other. This
is not nice, but I can live with it, at least for now.

But it is possible to work in unicode on the client side, and Postgres
allows unicode queries, i.e. the sql query itself is in unicode. The data in
the DB being in utf-8, I would expect that

  u"select * from sometable where upper(someprop) like upper('%é%')"

would give all rows where someprop contains é or É. But, no, it does not
seem to work... at least I have not figured it out yet. If anyone wants to
play, I have tried to understand this with the code below (working without
modeling):

    '''
    Assume an i18nText table that contains at least the 2 rows:
    en      fr      ...
    key     clé
    KEY     CLÉ
    '''
    # dbname = ; dbuser = ; dbpass =
    #
    from pyPgSQL import PgSQL
    con = PgSQL.connect(database=dbname, user=dbuser, password=dbpass,
                        client_encoding=('utf-8', 'replace'),
                        unicode_results=1)
    cur = con.cursor()
    cur.execute('SET CLIENT_ENCODING TO UNICODE')
    cur.execute(u"SELECT fr FROM i18nText WHERE en = 'key' ")
    _dbrset = cur.fetchall()
    match_on = _dbrset[0][0]  # evaluates to the unicode string "clé"
    cur.execute(u"SELECT * FROM i18nText WHERE upper(fr) LIKE upper('%" + match_on + "%') ")
    dbrset = cur.fetchall()
    cur.close()

However, this only returns the row for fr='clé'. If I change it to match on
en='KEY', the uppercase row for fr='CLÉ' is returned. Is this the behaviour
that I should expect? How should the upper function behave on unicode
strings?

(Note that I have sys.setdefaultencoding('utf-8') in my sitecustomize.py,
and I am not sure about all the effects this has. As for any other special
settings on the DB, I do not remember any.)

mario
From: Sebastien B. <sbi...@us...> - 2003-07-17 12:38:39

Yannick Gingras <ygi...@yg...> wrote:
> No problem. And as I asked in private, why LaTeX instead of DocBook?
> DocBook is a really simple markup with implied support for unicode
> since it's XML. 30 different tags max in a regular document. KDE,
> Gnome, Python and Linux are all moving to DocBook. I would have put
> in the markup myself but this LaTeX thing is beyond my mere mortal
> capabilities.

[laziness speaking] I couldn't find a reference for python/docbook, do you
have any?

Why latex? Because I have known latex for years and have never tried
docbook, it's as simple as that. Moreover, I've found the python
documentation process, based on latex, quite handy for generating docs in
html/pdf. And well, before considering a migration, I'd better have a look
at docbook itself.

> No, the doc is not that bad. This is just a reminder that code is
> meant to be read by both humans and computers. Furthermore, a good
> programming language will be more expressive than english.
[...]
> The case of the unit tests is really special. The fact that those are
> your quality assurance kit means that they are guaranteed to work.
> There is in those tests an amazing amount of working examples and we need
> just that: a picture. I think that the doc should mention them, they
> are more valuable than you think.

You're probably right.

However this is already there in the doc, admittedly in little characters,
at http://modeling.sf.net/UserGuide/ec-object-uniquing.html; I'll see how I
can make this more prominent.

Regards,

-- Sébastien.
From: Sebastien B. <sbi...@us...> - 2003-07-17 11:45:41

Hi,

Working on the ability to fetch raw rows, I incidentally found out that the
snapshot method is not behaving the way I thought it was (this case was not
tested and left undetermined).

The current behaviour is (using the StoreEmployees model & test data):

  >>> ec=EditingContext()
  >>> circus=ec.fetch('Store', 'corporateName == "Flying Circus"')[0]
  >>> circus.getEmployees().isFault()
  1
  >>> circus.snapshot()
  {'corporateName': 'Flying Circus', 'employees': None}
  >>> len(circus.getEmployees()) # clears the fault and fetches the array
  3
  >>> pprint.pprint(circus.snapshot())
  {'corporateName': 'Flying Circus',
   'employees': [<Modeling.GlobalID.KeyGlobalID instance at 0x8539884>,
                 <Modeling.GlobalID.KeyGlobalID instance at 0x85271f4>,
                 <Modeling.GlobalID.KeyGlobalID instance at 0x851ccfc>]}

As you can see, when the array of 'employees' is faulted (it has not been
fetched yet), it appears as None in the snapshot. It seems quite weird to
me. I'd highly prefer to return something like this:

  >>> pprint.pprint(circus.snapshot())
  {'corporateName': 'Flying Circus',
   'employees': <Modeling.FaultHandler.AccessArrayFaultHandler instance at 0x8539034>}

I'll probably change this, but I'd like to get your opinion on that,
especially I'd like to know how you handle this if you're already using
CustomObject.snapshot() in your own projects, and if such a modification
could fit your needs.

-- Sébastien.
From: Yannick G. <ygi...@yg...> - 2003-07-16 22:05:27

On Wednesday 16 July 2003 17:46, you wrote:
> Before I appear too stupid when trying things: UTF-8 is just a
> specific way of encoding unicode in eight bits, right, i.e. one
> character is translated to a series of one to many characters, right?
> It has nothing to do w/ ISO-8859-1, ISO-8859-15 etc., which are just
> correspondence tables between a number coded in one byte and a given
> character, right?

Yup, the ISO-8859-X are one-byte encodings with a limited subset of
characters, whereas UTF-8 puts a unicode char on one or more 8-bit bytes
(hence the 8 in its name). A nice property of UTF-8 is that the first 128
chars (0-127) are encoded on a single byte and keep the same char code as
their ASCII equivalent. So a pure ASCII string (no ISO-8859-X here) is 100%
valid UTF-8. UTF-8 is not the most efficient encoding for unicode, but this
particular compatibility makes it the most widespread.

UTF-8 is NOT compatible with Latin-1 (aka ISO-8859-1). Most Latin-1 chars
with a char code over 127 are encoded on 2 bytes in UTF-8.

Why UTF-8 and not pure unicode? Because everything is damn too buggy to
handle it! The C++ coder will in fact want to die converting all those
std::string declarations, and the C coder simply can't use pointer
arithmetic anymore (the width is not fixed). Luckily we use Python! :D

--
Yannick Gingras
Coder for OBB : Offside Bumptious Bastnaesite
http://OpenBeatBox.org
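A tiny, self-contained illustration of the point above, using nothing beyond the standard unicode string methods of the Python 2 line discussed in this thread: the same character takes one byte in ISO-8859-1 but two in UTF-8, while pure ASCII bytes are unchanged.

    e_acute = u"\u00e9"                      # u"é"

    print len(e_acute.encode("latin-1"))     # 1 byte in ISO-8859-1
    print len(e_acute.encode("utf-8"))       # 2 bytes in UTF-8
    print u"abc".encode("utf-8") == "abc"    # pure ASCII is identical in UTF-8 -> true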
From: SoaF at H. <so...@la...> - 2003-07-16 21:48:15

Sebastien Bigaret wrote:
> This needs investigation. If some of you could provide working python
> code with unicode and psycopg/pypgsql/pgdb/mysqldb/sqlitedb, please
> share. I've not time for this now.

I don't have working code, since I resolve this by calling
XMLUtils.unicode2Str() .. before doing the request .. :(

> However, speaking of case-insensitive match: if postgresql supports it,
> then it should work, since the SQL WHERE clause behind is UPPER(...)
> LIKE UPPER(...) --pure theory and not tested, so if someone feels like
> testing it, go ahead :)
>
> -- Sébastien.

In fact I think this is MySQL-dependent, since I don't manage to put
anything unicode in the database (even without modeling and with the latest
MySQLdb).

Something else: be careful that "éé" isn't unicode-only. So I think the DBA
does some translation to utf-8 on the fly (so it works).

Bye Bye
From: Sebastien B. <sbi...@us...> - 2003-07-16 21:46:56

Yannick Gingras <yan...@sa...> writes:
> Couldn't evaluate expression SELECT t0.gl_id, t0.account, t0.control_acct,
> t0.uom_id, t0.acct_type, t0.is_active, t0.gvt_code FROM GL t0 WHERE
> (t0.is_active <> -255 AND t0.account LIKE '%éé%' AND t0.gvt_code LIKE
> '%%'). Reason: exceptions.UnicodeError: ASCII encoding error: ordinal not
> in range(128)
> Traceback (most recent call last):
>   File "/usr/lib/python2.2/site-packages/Modeling/DatabaseChannel.py",
>     line 381, in selectObjectsWithFetchSpecification
>     entity)
>   File "/usr/lib/python2.2/site-packages/Modeling/DatabaseAdaptors/AbstractDBAPI2AdaptorLayer/AbstractDBAPI2AdaptorChannel.py",
>     line 297, in selectAttributes
>     raise GeneralAdaptorException, msg
> Modeling.Adaptor.GeneralAdaptorException
[...]

Okay, then that comes from the database, if we're at this point. Data was
encoded w/ utf-8 before being sent.

The fact is, the framework is definitely not ready to accept raw unicode
strings (the validation fails, then if we solve this SQLExpression fails to
build the sql statement, etc.).

Before I appear too stupid when trying things: UTF-8 is just a specific way
of encoding unicode in eight bits, right, i.e. one character is translated
to a series of one to many characters, right? It has nothing to do w/
ISO-8859-1, ISO-8859-15 etc., which are just correspondence tables between a
number coded in one byte and a given character, right?

> I'll see what I can do, but unicode support is not in MySQL 4.0, it's in
> 4.1 which is still alpha...
>
> http://www.mysql.com/doc/en/Nutshell_4.1_features.html

mysql is known to be quite active, do you know when this will become the
production release?

Postgresql supports unicode/utf-8:
  http://developer.postgresql.org/docs/postgres/multibyte.html

SQLite supports it also (w/ some precautions for ORDER BY, LIKE, etc.):
  http://groups.yahoo.com/group/sqlite/message/1675

These links are for the archives. I /think/ I begin to see the light in this
"mess" :)

-- Sébastien.
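A hedged sketch of the workaround this thread converges on (encode unicode values by hand before they reach the qualifier, so the framework only ever sees plain byte strings, much like the unicode2Str() call mentioned above). The entity and attribute names are borrowed from the traceback purely for illustration, and the '==' qualifier syntax is the one shown elsewhere on this page; none of this is the framework's official unicode answer.

    from Modeling.EditingContext import EditingContext

    def utf8(value):
        # hand only plain (byte) strings to the framework and the adaptor
        if isinstance(value, unicode):
            return value.encode('utf-8')
        return value

    ec = EditingContext()
    needle = utf8(u'\u00e9\u00e9')                    # u"éé" -> '\xc3\xa9\xc3\xa9'
    rows = ec.fetch('GL', 'account == "%s"' % needle)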