Thread: Re: [Modeling-users] Double Result
Status: Abandoned
Brought to you by:
sbigaret
From: Yannick G. <yan...@sa...> - 2003-07-30 13:39:58
|
=2D----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 On July 29, 2003 11:09 pm, you wrote: > As a quick workaround, try typing the same at mysql prompt, but with > SELECT DISTINCT: does it help? Yes ! : ) > If it does, you can add the following method to > DatabaseAdaptors.MySQLAdaptorLayer.MySQLSQLExpression: > [sniped] Problem fixed ! : ) Thanks ! =2D --=20 Yannick Gingras Byte Gardener, Savoir-faire Linux inc. (514) 276-5468 =2D----BEGIN PGP SIGNATURE----- Version: GnuPG v1.2.1 (GNU/Linux) iD8DBQE/J8qrrhy5Fqn/MRARAhRgAJ9yxvTCCRiy/j+CGl/uWivMaBnJNACfW44d QzcVOOUTLtXWl80RAJhQnhQ=3D =3DzpIN =2D----END PGP SIGNATURE----- |
From: <Jer...@fi...> - 2003-07-30 16:27:27
|
On Wednesday 30 July 2003 16:52, Sebastien Bigaret wrote: > Hi, > > Good to know this fixed your problem. May I ask you to be a little > more explicit about the version of mysql you use? --possibly off-list, > if for any reasons you don't want to expose it publicly. > > I'm asking this because I'm trying to know whether this should be > included in the core, or if I just need to add a warning (along w/ the > patch) for versions<x.y.z in the User's Guide (I cannot possibly have > all versions of mysql installed here to test ;). Moreover, I would > like to ask you to try some sample sql statements on your db, if > possible, against the tests databases (author_books and > store_employees). > > Thanks in advance, > > - S=E9bastien. > Hum i'm quite sure that it isn't a MySQL bug .. but a SQL one .=20 This kind of 'too much result' usually occur while using a wrong 'LEFT OUTER JOIN' or 'GROUP' directive . Pay caution that=20 select * from t1 left outer join t2 on t1.ID =3D=3D t2.FK_t1 is different=20 from=20 select * from t2 left outer join t1 on t1.ID =3D=3D t2.FK_t1.=20 Even thought you will get the same kind of result.=20 Bye bye ..=20 |
From: Sebastien B. <sbi...@us...> - 2003-07-30 17:44:53
|
Jer...@fi... wrote: > Hum i'm quite sure that it isn't a MySQL bug .. but a SQL one .=20 > This kind of 'too much result' usually occur while using a wrong > 'LEFT OUTER JOIN' or 'GROUP' directive . Pay caution that=20 >=20 > select * from t1 left outer join t2 on t1.ID =3D=3D t2.FK_t1 is different= =20 > from=20 > select * from t2 left outer join t1 on t1.ID =3D=3D t2.FK_t1.=20 > Even thought you will get the same kind of result.=20 Yes, I'm no sql guru, but I know such things can happen and that's why I'd like to make sure. If I could reproduce it on my msql 4.0 I'll conclude the same --but maybe I'm not trying the right thing. Yannick, could you please perform the following: - create, if it's not already done, the AUTHOR_BOOKS test database e.g.: [mysqladmin -h localhost -u root create "AUTHORBOOKS"] =20=20 - change the current directory to Modeling/tests and run python ./test_EditingContext_Global.py -d MySQL -r (you'll get errors for DROP TABLE --this is okay) =20=20 - at the mysql prompt for db AUTHOR_BOOKS copy-paste the following: INSERT INTO PK_SEQ_WRITER VALUES (18); INSERT INTO PK_SEQ_BOOK VALUES (20); INSERT INTO BOOK VALUES (NULL,'t3',20,18); INSERT INTO BOOK VALUES (NULL,'t2',19,18); INSERT INTO BOOK VALUES (NULL,'t1',18,18); INSERT INTO WRITER VALUES ('','w test1',NULL,0,NULL,18); Now, try these three select statements: select t0.id, t0.title, t0.price from WRITER t1 INNER JOIN BOOK t0 ON (t0.fk_writer_id=3Dt1.id) where (t0.title like '%' AND t0.price <> 0 AND t1.last_name like '%'); and: select t0.id, t0.title, t0.price=20 from BOOK t0=20 INNER JOIN WRITER t1 ON (t0.fk_writer_id=3Dt1.id) where (t0.title like '%' AND t0.id <> 0 AND t1.last_name like '%'); and: select t0.id, t0.title, t0.price=20 from BOOK t0=20 INNER JOIN WRITER t1 ON (t1.id=3Dt0.fk_writer_id) where (t0.title like '%' AND t0.price <> 0 AND t1.last_name like '%'); Do you see any duplicates?? NB: it seems to me that these requests corresponds to/are variant of your original query returning duplicates, which was: SELECT t0.activity_id, t0.code, t0.enterprise_id, t0.is_active=20 FROM ABCI18N t1 INNER JOIN ABC t0 ON ( t0.activity_id=3Dt1.activity_id ) WHERE (t0.code LIKE '%%' AND t0.is_active <> -255 AND t1.title LIKE '%ABC= %') -- S=E9bastien. |
From: Yannick G. <yan...@sa...> - 2003-07-30 19:05:55
|
=2D----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 On July 30, 2003 01:45 pm, Sebastien Bigaret wrote: > Yes, I'm no sql guru, but I know such things can happen and that's why > I'd like to make sure. If I could reproduce it on my msql 4.0 I'll > conclude the same --but maybe I'm not trying the right thing. > > > Yannick, could you please perform the following: > > - create, if it's not already done, the AUTHOR_BOOKS test database > e.g.: [mysqladmin -h localhost -u root create "AUTHORBOOKS"] > =20 > - change the current directory to Modeling/tests and run python > ./test_EditingContext_Global.py -d MySQL -r (you'll get errors for > DROP TABLE --this is okay) Works perfectly > - at the mysql prompt for db AUTHOR_BOOKS copy-paste the following: > [sniped] > Do you see any duplicates?? No dupes > NB: it seems to me that these requests corresponds to/are variant of > your original query returning duplicates, which was: > > SELECT t0.activity_id, t0.code, t0.enterprise_id, t0.is_active > FROM ABCI18N t1 INNER JOIN ABC t0 ON ( t0.activity_id=3Dt1.activity_id ) > WHERE (t0.code LIKE '%%' AND t0.is_active <> -255 AND t1.title LIKE > '%ABC%') What may be different is that I have 2 records in ABCI18N that in fact all= =20 matcch "%ABC%" :=20 mysql> select * from ABCI18N where title like "%ABC%"; +----+-------------+---------+-------------------------------------------+-= =2D---------------------------------+ | id | activity_id | lang_id | description |= =20 title | +----+-------------+---------+-------------------------------------------+-= =2D---------------------------------+ | 3 | 2 | 1 | Description de l'activit=C3=A9 num=C3=A9ro A= BC123 |=20 Titre de l'activit=C3=A9 ABC123 | | 4 | 2 | 2 | Description of the activity code ABC123 |= =20 Title of the activity code ABC123 | +----+-------------+---------+-------------------------------------------+-= =2D---------------------------------+ So this may explain why the select match twice here and not on you database. =2D --=20 Yannick Gingras Byte Gardener, Savoir-faire Linux inc. http://www.savoirfairelinux.com/ =2D----BEGIN PGP SIGNATURE----- Version: GnuPG v1.2.1 (GNU/Linux) iD8DBQE/KBVQrhy5Fqn/MRARAiWiAKCRV+l9eVtgV5mUUcudiZAHg4S3dgCcDTbY fKgJCsjlk1AHESpmKAhprts=3D =3DRXii =2D----END PGP SIGNATURE----- |
From: Sebastien B. <sbi...@us...> - 2003-07-30 21:04:10
|
To all: what Yannick revealed with this is a bug that impacts ALL ------ adaptors: Postgresql, MySQL, SQLite and Oracle. Summary: ec.fetch() can return duplicates in its result set. At the end of the message there's a patch that fixes the problem, until a new release is made. Yannick Gingras <yan...@sa...> writes: > > NB: it seems to me that these requests corresponds to/are variant of > > your original query returning duplicates, which was: > > > > SELECT t0.activity_id, t0.code, t0.enterprise_id, t0.is_active > > FROM ABCI18N t1 INNER JOIN ABC t0 ON ( t0.activity_id=3Dt1.activity_i= d ) > > WHERE (t0.code LIKE '%%' AND t0.is_active <> -255 AND t1.title LIKE > > '%ABC%') >=20 > What may be different is that I have 2 records in ABCI18N that in fact al= l=20 > matcch "%ABC%" :=20 > mysql> select * from ABCI18N where title like "%ABC%"; > +----+-------------+---------+-------------------------------------------= +-----------------------------------+ > | id | activity_id | lang_id | description = |=20 > title | > +----+-------------+---------+-------------------------------------------= +-----------------------------------+ > | 3 | 2 | 1 | Description de l'activit=C3=A9 num=C3=A9ro= ABC123 |=20 > Titre de l'activit=C3=A9 ABC123 | > | 4 | 2 | 2 | Description of the activity code ABC123 = |=20 > Title of the activity code ABC123 | > +----+-------------+---------+-------------------------------------------= +-----------------------------------+ >=20 > So this may explain why the select match twice here and not on you databa= se. Thanks *a lot* for reporting this and giving the details. It reveals a real bug in the framework. In fact I was not careful enough when writing my test queries. The following query is the equivalent one, this time: SELECT t0.ID,t0.FIRST_NAME,t0.LAST_NAME,t0.AGE,t0.BIRTHDAY,t0.FK_WRITER_ID FROM WRITER t0 INNER JOIN BOOK t1 ON t0.ID=3Dt1.FK_WRITER_ID WHERE ((t0.AGE < 100 OR t0.AGE > 200) AND t1.title LIKE '%'); and it returns duplicates as well, for the data inserted when initializing the test db (no insert needed): id | first_name | last_name | age | birthday | fk_writer_id=20 ----+------------+-----------+-----+---------------------+-------------- 2 | Francois | Rabelais | 508 | 1484-07-02 18:16:12 |=20=20=20=20=20= =20=20=20=20=20=20=20=20 3 | Frederic | Dard | 81 | 1921-06-29 04:56:34 | 2 3 | Frederic | Dard | 81 | 1921-06-29 04:56:34 | 2 3 | Frederic | Dard | 81 | 1921-06-29 04:56:34 | 2 So J=E9r=F4me was right and what we have here is not a mysql bug at all --it's the way SQL behaves: all JOIN results in a cartesian product so we can get duplicates. I don't know how many times I've forgotten that, but I bet it's not the last one :/ For details, see for example: http://developer.postgresql.org/docs/postgres/sql-select.html =3D> the bug is in the SQL generated by the framework itself. It makes it possible to get duplicates in the result set returned by EditingContext.fetch(). You'll find attached a simple patch solving the problem. I won't have the time to make a new release this evening, and probably not before the week-end. I plan to do it not latter than sunday evening. Regards, -- S=E9bastien. ------------------------------------------------------------------------ --- SQLExpression.py 24 Jul 2003 12:07:25 -0000 1.19 +++ SQLExpression.py 30 Jul 2003 20:36:34 -0000 @@ -491,7 +491,7 @@ lockClause): """ """ - statement=3DselectString or 'SELECT ' + statement=3DselectString or 'SELECT DISTINCT ' statement+=3DcolumnList statement+=3D' FROM '+tableList if lockClause: ------------------------------------------------------------------------ |
From: <Jer...@fi...> - 2003-07-31 00:14:29
|
On Wednesday 30 July 2003 22:50, Sebastien Bigaret wrote: > So J=E9r=F4me was right and what we have here is not a mysql bug at all > --it's the way SQL behaves: all JOIN results in a cartesian product so > we can get duplicates. I don't know how many times I've forgotten that, > but I bet it's not the last one :/ For details, see for example: > http://developer.postgresql.org/docs/postgres/sql-select.html > > =3D> the bug is in the SQL generated by the framework itself. It makes it > possible to get duplicates in the result set returned by > EditingContext.fetch(). > > You'll find attached a simple patch solving the problem. I won't have > the time to make a new release this evening, and probably not before > the week-end. I plan to do it not latter than sunday evening. > > Regards, > > -- S=E9bastien. > > > ------------------------------------------------------------------------ > --- SQLExpression.py 24 Jul 2003 12:07:25 -0000 1.19 > +++ SQLExpression.py 30 Jul 2003 20:36:34 -0000 > @@ -491,7 +491,7 @@ > lockClause): > """ > """ > - statement=3DselectString or 'SELECT ' > + statement=3DselectString or 'SELECT DISTINCT ' > statement+=3DcolumnList > statement+=3D' FROM '+tableList > if lockClause: > ------------------------------------------------------------------------ I'm thinking that you really need to check this carefully since sometimes we don't want to use disctinct, no ? hum deleting or count may have the same trouble no ?=20 I haven't work w/ modeling since 5 months right now so I may say wrong=20 things .. pay caution Bye Bye |
From: Sebastien B. <sbi...@us...> - 2003-07-31 11:30:08
|
Jer...@fi... wrote: [About SELECT DISTINCT] > I'm thinking that you really need to check this carefully since sometimes > we don't want to use disctinct, no ? hum deleting or count may have the > same trouble no ?=20 >=20 > I haven't work w/ modeling since 5 months right now so I may say wrong=20 > things .. pay caution No, you're right as far as ec.fetchCount() is concerned: it is also affected by this bug. Alas, I've no patch to propose for fetchCount() yet: I can't find a portable way of making this (see below). Deletes, inserts or updates are not affected by this bug, because the framework never joins tables for these operations. About fetch(): I can't think of any situations where you would not want the DISTINCT stuff. If you can think of such situations, I think it's time to speak ;) However, I'm still not sure that the DISTINCT keyword will always be added to any select statements, it can be better to add it only when tables are joined. Now back on fetchCount(): I've found these two forms: Form #1 ------- SELECT count(DISTINCT(t0.id)) [as count] FROM WRITER t0 INNER JOIN BOOK t1 ON t0.ID=3Dt1.FK_WRITER_ID WHERE ((t0.AGE < 100 OR t0.AGE > 200) AND t1.title LIKE '%'); Form #2 ------- SELECT count(*) from ( SELECT DISTINCT t0.ID,t0.FIRST_NAME,t0.LAST_NAME,t0.AGE,t0.BIRTHDAY,t0.FK_W= RITER_ID FROM WRITER t0 INNER JOIN BOOK t1 ON t0.ID=3Dt1.FK_WRITER_ID WHERE ((t0.AGE < 100 OR t0.AGE > 200) AND t1.title LIKE '%')) And here is how db-server accepts them: |Postgres| MySQL | SQLite | O9i | O8i | ---+--------+-------+--------+-----+-----| #1 | Y | Y | N | Y | Y | ---+--------+-------+--------+-----+-----| #2 | Y[1] | N | Y | Y | Y | ---+--------+-------+--------+-----+-----+ [1] for postgresql, 'as foo' should be appended at the end of the SQL statement Form #1 is clearly an extension of ANSI SQL-92, but I can't find whether form #2 is SQL-92 or not; I suspect it is, and that a from statement can be a sub-query (the so-called table subqueries), but I'd like to hear from you if you're sure about it. I think I'll probably make #2 the default in SQLExpression, and use #1 for MySQL. -- S=E9bastien. |
From: Sebastien B. <sbi...@us...> - 2003-08-03 11:09:29
|
Hi, A complete fix for bug #780495 (when ec.fetch() is joining two tables or more, the returned set of objects could have duplicates) has been submitted onto the main trunk. Corrected files are: SQLExpression.py v1.21 DatabaseAdaptors/MySQLAdaptorLayer/MySQLSQLExpression.py v1.6 DatabaseAdaptors/PostgresqlAdaptorLayer/PostgresqlSQLExpression.py v1.6 test_SQLExpression.py v1.7 Details: the 'DISTINCT' keywork is only added when needed, ie when two tables or more are joined. The corresponding SQL statement for counting rows / ec.fetchCount() is generated following the general rules exposed in the previous post (incl. below). -- S=E9bastien. Sebastien Bigaret <sbi...@us...> wrote: > Jer...@fi... wrote: > [About SELECT DISTINCT] > > I'm thinking that you really need to check this carefully since sometim= es > > we don't want to use disctinct, no ? hum deleting or count may have the > > same trouble no ?=20 > >=20 > > I haven't work w/ modeling since 5 months right now so I may say wrong= =20 > > things .. pay caution >=20 > No, you're right as far as ec.fetchCount() is concerned: it is also > affected by this bug. Alas, I've no patch to propose for fetchCount() > yet: I can't find a portable way of making this (see below). >=20 > Deletes, inserts or updates are not affected by this bug, because the > framework never joins tables for these operations. >=20 > About fetch(): I can't think of any situations where you would not want > the DISTINCT stuff. If you can think of such situations, I think it's > time to speak ;) However, I'm still not sure that the DISTINCT keyword > will always be added to any select statements, it can be better to add > it only when tables are joined. >=20 >=20 > Now back on fetchCount(): I've found these two forms: >=20 > Form #1 > ------- > SELECT count(DISTINCT(t0.id)) [as count] > FROM WRITER t0 > INNER JOIN BOOK t1 ON t0.ID=3Dt1.FK_WRITER_ID > WHERE ((t0.AGE < 100 OR t0.AGE > 200) AND t1.title LIKE '%'); >=20 > Form #2 > ------- > SELECT count(*) from ( > SELECT DISTINCT t0.ID,t0.FIRST_NAME,t0.LAST_NAME,t0.AGE,t0.BIRTHDAY,t0.FK= _WRITER_ID > FROM WRITER t0 > INNER JOIN BOOK t1 ON t0.ID=3Dt1.FK_WRITER_ID > WHERE ((t0.AGE < 100 OR t0.AGE > 200) AND t1.title LIKE '%')) >=20 > And here is how db-server accepts them: >=20 > |Postgres| MySQL | SQLite | O9i | O8i | > ---+--------+-------+--------+-----+-----| > #1 | Y | Y | N | Y | Y | > ---+--------+-------+--------+-----+-----| > #2 | Y[1] | N | Y | Y | Y | > ---+--------+-------+--------+-----+-----+ >=20 > [1] for postgresql, 'as foo' should be appended at the end of the SQL > statement >=20 > Form #1 is clearly an extension of ANSI SQL-92, but I can't find whether > form #2 is SQL-92 or not; I suspect it is, and that a from statement can > be a sub-query (the so-called table subqueries), but I'd like to hear > from you if you're sure about it. >=20 > I think I'll probably make #2 the default in SQLExpression, and use #1 > for MySQL. |
From: Sebastien B. <sbi...@us...> - 2003-07-30 14:51:48
|
Hi, Good to know this fixed your problem. May I ask you to be a little more explicit about the version of mysql you use? --possibly off-list, if for any reasons you don't want to expose it publicly. I'm asking this because I'm trying to know whether this should be included in the core, or if I just need to add a warning (along w/ the patch) for versions<x.y.z in the User's Guide (I cannot possibly have all versions of mysql installed here to test ;). Moreover, I would like to ask you to try some sample sql statements on your db, if possible, against the tests databases (author_books and store_employees). Thanks in advance, - S=E9bastien. Yannick Gingras <yan...@sa...> wrote: > On July 29, 2003 11:09 pm, you wrote: > > As a quick workaround, try typing the same at mysql prompt, but with > > SELECT DISTINCT: does it help? >=20 > Yes ! >=20 > : ) >=20 > > If it does, you can add the following method to > > DatabaseAdaptors.MySQLAdaptorLayer.MySQLSQLExpression: > > [sniped] >=20 > Problem fixed ! >=20 > : ) >=20 > Thanks ! >=20 > --=20 > Yannick Gingras > Byte Gardener, Savoir-faire Linux inc. > (514) 276-5468 |
From: Yannick G. <yan...@sa...> - 2003-07-30 16:18:09
|
=2D----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 On July 30, 2003 10:52 am, Sebastien Bigaret wrote: > Good to know this fixed your problem. May I ask you to be a little > more explicit about the version of mysql you use? --possibly off-list, > if for any reasons you don't want to expose it publicly. # rpm -qa | grep mysql mysql-devel-3.23.56-1.80 php-mysql-4.2.2-8.0.7 mysql-server-3.23.56-1.80 mysql-3.23.56-1.80 It's a stock Red Hat 8.0 > I'm asking this because I'm trying to know whether this should be > included in the core, or if I just need to add a warning (along w/ the > patch) for versions<x.y.z in the User's Guide (I cannot possibly have > all versions of mysql installed here to test ;). Moreover, I would > like to ask you to try some sample sql statements on your db, if > possible, against the tests databases (author_books and > store_employees). Sure, no problem. =2D --=20 Yannick Gingras Byte Gardener, Savoir-faire Linux inc. http://www.savoirfairelinux.com/ =2D----BEGIN PGP SIGNATURE----- Version: GnuPG v1.2.1 (GNU/Linux) iD8DBQE/J+FBrhy5Fqn/MRARAmeNAJsEQf58Ibj0cWsEw/4n5d8OYUJQ/gCfUS3Y OIAsjHmTcv6SncbOQWAbJ04=3D =3DPbqQ =2D----END PGP SIGNATURE----- |