Thread: [SQLObject] Patch: support for locking rows
SQLObject is a Python ORM.
Brought to you by:
ianbicking,
phd
From: David F. <df...@kl...> - 2006-02-22 00:22:09
Attachments:
sqlobject_locking.diff
|
Attached is a patch that makes SELECT statement "FOR UPDATE" so that the rows are locked within the current transaction. This is enabled by sqlmeta.= lockRows=3DTrue. As a bonus here's the documentation update, including the missing docu for = createSQL: =2D-- SQLObject.txt (revision 1598) +++ SQLObject.txt (working copy) @@ -734,6 +734,17 @@ values are: `indexes`: A list of all the indexes for this class. +`createSQL`: + SQL queries run after table creation. createSQL can be a string with a + single SQL command, a list of SQL commands, or a dictionary with keys t= hat + are dbNames and values that are either single SQL command string or a l= ist + of SQL commands. This is usually for ALTER TABLE commands. + ``%s``, if present, is substituted with the name of the table. + +`lockRows`: + A boolean (default False). If True, then any SELECT statement will be + ``FOR UPDATE`` so that the corresponding rows are write-locked until + the end of the current transaction. There is also one instance attribute: =2D-=20 David Faure -- fa...@kd..., df...@kl... KDE/KOffice developer, Qt consultancy projects Klar=C3=A4lvdalens Datakonsult AB, Platform-independent software solutions |
From: Oleg B. <ph...@ma...> - 2006-02-22 08:27:25
|
On Wed, Feb 22, 2006 at 01:21:58AM +0100, David Faure wrote: > +`createSQL`: > + SQL queries run after table creation. createSQL can be a string with a > + single SQL command, a list of SQL commands, or a dictionary with keys that > + are dbNames and values that are either single SQL command string or a list > + of SQL commands. This is usually for ALTER TABLE commands. > + ``%s``, if present, is substituted with the name of the table. Thank you! > +`lockRows`: > + A boolean (default False). If True, then any SELECT statement will be > + ``FOR UPDATE`` so that the corresponding rows are write-locked until > + the end of the current transaction. I think this should be an attribute or patameter of .select(), not sqlmeta. You don't want to write MyTable.sqlmeta.lockRows = True MyTable.select(...) MyTable.sqlmeta.lockRows = False especially in a multitreaded application... Oleg. -- Oleg Broytmann http://phd.pp.ru/ ph...@ph... Programmers don't die, they just GOSUB without RETURN. |
From: David F. <df...@kl...> - 2006-02-22 12:27:02
|
On Wednesday 22 February 2006 09:27, Oleg Broytmann wrote: > On Wed, Feb 22, 2006 at 01:21:58AM +0100, David Faure wrote: > > +`createSQL`: > > + SQL queries run after table creation. createSQL can be a string wit= h a > > + single SQL command, a list of SQL commands, or a dictionary with ke= ys that > > + are dbNames and values that are either single SQL command string or= a list > > + of SQL commands. This is usually for ALTER TABLE commands. > > + ``%s``, if present, is substituted with the name of the table. >=20 > Thank you! >=20 > > +`lockRows`: > > + A boolean (default False). If True, then any SELECT statement will= be > > + ``FOR UPDATE`` so that the corresponding rows are write-locked until > > + the end of the current transaction. >=20 > I think this should be an attribute or patameter of .select(), not > sqlmeta. You don't want to write >=20 > MyTable.sqlmeta.lockRows =3D True > MyTable.select(...) > MyTable.sqlmeta.lockRows =3D False >=20 > especially in a multitreaded application... I wasn't thinking of toggling lockRows on and off, but rather as something = that would always be on in applications that use transactions. The problem with modifying select() is that it's only one case where a SELE= CT statement=20 is being made. I see 4 calls to _SO_selectOne (plus one to the Alt version)= in main.py.=20 =46or instance when creating an instance of the sqlobject-derived-class, wh= en calling sync() (not sure what that's for), and apparently also when accessi= ng since values out of an instance, which can call loadvalue or getvalue. How should we then control what happens there, if not by some kind of object setting like sqlmeta.lockRows? I think that sqlmeta.lockRows is the best way to make -all- selects lock ro= ws, independently of the operation and level of abstraction being used=20 (creating an instance, calling select() or accessing columns). =2D-=20 David Faure -- fa...@kd..., df...@kl... KDE/KOffice developer, Qt consultancy projects Klar=E4lvdalens Datakonsult AB, Platform-independent software solutions |
From: Oleg B. <ph...@ma...> - 2006-02-22 13:01:19
|
On Wed, Feb 22, 2006 at 01:26:49PM +0100, David Faure wrote: > I think that sqlmeta.lockRows is the best way to make -all- selects lock rows, I dont think there is a general need to *always* lock rows for a table. Oleg. -- Oleg Broytmann http://phd.pp.ru/ ph...@ph... Programmers don't die, they just GOSUB without RETURN. |
From: David F. <df...@kl...> - 2006-02-22 13:21:20
|
On Wednesday 22 February 2006 14:01, Oleg Broytmann wrote: > On Wed, Feb 22, 2006 at 01:26:49PM +0100, David Faure wrote: > > I think that sqlmeta.lockRows is the best way to make -all- selects loc= k rows, >=20 > I dont think there is a general need to *always* lock rows for a table. Well, my customer needs it, but I can understand your point of view. Could you help me a bit so that I can do the right thing then? What's the idea behind loadvalue and getvalue issueing a SELECT statement? I thought the select was done by MyObject.select(), so why is select done (= again? instead?) when accessing a given column of the object? =2D-=20 David Faure -- fa...@kd..., df...@kl... KDE/KOffice developer, Qt consultancy projects Klar=E4lvdalens Datakonsult AB, Platform-independent software solutions |
From: Oleg B. <ph...@ma...> - 2006-02-22 13:28:26
|
On Wed, Feb 22, 2006 at 02:21:06PM +0100, David Faure wrote: > Could you help me a bit so that I can do the right thing then? > What's the idea behind loadvalue and getvalue issueing a SELECT statement? > I thought the select was done by MyObject.select(), so why is select done (again? instead?) > when accessing a given column of the object? class sqlmeta: cacheValues = False Now EVERY access to an attribute causes SELECT or UPDATE. Oleg. -- Oleg Broytmann http://phd.pp.ru/ ph...@ph... Programmers don't die, they just GOSUB without RETURN. |
From: David F. <df...@kl...> - 2006-02-22 13:35:15
|
On Wednesday 22 February 2006 14:28, Oleg Broytmann wrote: > On Wed, Feb 22, 2006 at 02:21:06PM +0100, David Faure wrote: > > Could you help me a bit so that I can do the right thing then? > > What's the idea behind loadvalue and getvalue issueing a SELECT stateme= nt? > > I thought the select was done by MyObject.select(), so why is select do= ne (again? instead?) > > when accessing a given column of the object? >=20 > class sqlmeta: > cacheValues =3D False >=20 > Now EVERY access to an attribute causes SELECT or UPDATE. OK. Now if I want those SELECTS to lock the rows, how would I do that, if n= ot with an sqlmeta boolean? =2D-=20 David Faure -- fa...@kd..., df...@kl... KDE/KOffice developer, Qt consultancy projects Klar=E4lvdalens Datakonsult AB, Platform-independent software solutions |
From: Oleg B. <ph...@ma...> - 2006-02-22 13:40:19
|
On Wed, Feb 22, 2006 at 02:34:59PM +0100, David Faure wrote: > > class sqlmeta: > > cacheValues = False > > > > Now EVERY access to an attribute causes SELECT or UPDATE. > > OK. Now if I want those SELECTS to lock the rows, how would I do that, if not with an sqlmeta boolean? Why do you want to lock the row with cacheValues=False? Oleg. -- Oleg Broytmann http://phd.pp.ru/ ph...@ph... Programmers don't die, they just GOSUB without RETURN. |
From: David F. <df...@kl...> - 2006-02-22 13:48:54
|
On Wednesday 22 February 2006 14:40, Oleg Broytmann wrote: > On Wed, Feb 22, 2006 at 02:34:59PM +0100, David Faure wrote: > > > class sqlmeta: > > > cacheValues =3D False > > >=20 > > > Now EVERY access to an attribute causes SELECT or UPDATE. > >=20 > > OK. Now if I want those SELECTS to lock the rows, how would I do that, = if not with an sqlmeta boolean? >=20 > Why do you want to lock the row with cacheValues=3DFalse? Because another process might be accessing the same table row at the same t= ime. And since it's another process, it can't share the sqlobject instance, the = concurrency happens at the SQL level. =2D-=20 David Faure -- fa...@kd..., df...@kl... KDE/KOffice developer, Qt consultancy projects Klar=E4lvdalens Datakonsult AB, Platform-independent software solutions |
From: Oleg B. <ph...@ma...> - 2006-02-22 13:52:04
|
On Wed, Feb 22, 2006 at 02:48:42PM +0100, David Faure wrote: > > Why do you want to lock the row with cacheValues=False? > > Because another process might be accessing the same table row at the same time. I do understand what locking is. I do not understand locking with cacheValues=False. I am sure you do not need cacheValues=False with SELECT FOR UPDATE. Oleg. -- Oleg Broytmann http://phd.pp.ru/ ph...@ph... Programmers don't die, they just GOSUB without RETURN. |
From: David F. <df...@kl...> - 2006-02-23 10:15:56
Attachments:
sqlrace.py
|
On Wednesday 22 February 2006 14:51, Oleg Broytmann wrote: > On Wed, Feb 22, 2006 at 02:48:42PM +0100, David Faure wrote: > > > Why do you want to lock the row with cacheValues=3DFalse? > >=20 > > Because another process might be accessing the same table row at the sa= me time. >=20 > I do understand what locking is. I do not understand locking with > cacheValues=3DFalse. I am sure you do not need cacheValues=3DFalse with > SELECT FOR UPDATE. OK, here's a full testcase then. See definition of the Aggregates and MyValues tables (sqlobjects) in the at= tached sqlrace.py. Each transaction is "adding a new entry to the MyValues table and updating= =20 the total in the Aggregates table (which has a single row)". Run this script once first so that it creates the tables, then run it twice= at the same time from two different shells. If the (only) row in the Aggregates table isn't = locked, then you get a race where both scripts read the same old value and, in their transac= tion, write out the same new value. But they both managed to insert a row into MyValues, so the total got out of sync. To check the results after running two concurrent sqlrace.py scripts, use t= his SQL: select total from test.aggregates select sum(value) from test.my_values The values should match, but they don't when lockRows is not set to True. SELECT FOR UPDATE is needed in order to prevent two transactions (in two di= fferent processes) from reading the same value at the same time and stepping onto e= ach other's toes when updating it. And given that the Aggregates table will obviously always be used this way (updating a total), being able to set lockRows=3Dtrue once and for all in t= he definition of that table makes sense to me; more than having to remember to pass "lock the rows" in each and every operation on that table - which woul= d be impossible anyway for "agg.total =3D agg.total + newVal". =2D-=20 David Faure -- fa...@kd..., df...@kl... KDE/KOffice developer, Qt consultancy projects Klar=E4lvdalens Datakonsult AB, Platform-independent software solutions |
From: Stuart B. <st...@st...> - 2006-02-25 10:27:41
Attachments:
signature.asc
|
David Faure wrote: > And given that the Aggregates table will obviously always be used this way > (updating a total), being able to set lockRows=true once and for all in the > definition of that table makes sense to me; more than having to remember to > pass "lock the rows" in each and every operation on that table - which would be > impossible anyway for "agg.total = agg.total + newVal". It does not seem obvious that the Aggregates table will always be used the way you describe. I would assume that the reason to store this information in the database is so that it can be retrieved from the database at some point. If the processes that retrieve this information cause row level locks to be taken out, then you are creating the following issues: - Scalability issues. Multiple readers cannot read the value from the Aggregates table at the same time, as they need to wait their turn to acquire the lock - Security issues. Connections that just need to read the value from the Aggregates table would need UPDATE permissions on the table (at least with PostgreSQL) - Deadlock issues. The more locks you take out, the more likely you are to trigger deadlocks. I think FOR UPDATE would be a good addition. I would expect it to be a parameter passed through to select. Even if people decide to go with a flag in the table definition, I think the parameter to select is still required to override the flag for the common case of only needing FOR UPDATE in rare situations. (It seems you are trying to emulate a SERIALIZABLE transaction isolation level btw. - if MySQL supports this it might be a better solution to your problem). -- Stuart Bishop <st...@st...> http://www.stuartbishop.net/ |
From: Oleg B. <ph...@ph...> - 2006-02-26 15:47:01
|
On Wed, Feb 22, 2006 at 01:21:58AM +0100, David Faure wrote: > +`createSQL`: > + SQL queries run after table creation. createSQL can be a string with a > + single SQL command, a list of SQL commands, or a dictionary with keys that > + are dbNames and values that are either single SQL command string or a list > + of SQL commands. This is usually for ALTER TABLE commands. > + ``%s``, if present, is substituted with the name of the table. Added to the documentation. Thank you! Oleg. -- Oleg Broytmann http://phd.pp.ru/ ph...@ph... Programmers don't die, they just GOSUB without RETURN. |
From: David F. <df...@kl...> - 2006-03-08 14:06:32
|
On Sunday 26 February 2006 16:46, Oleg Broytmann wrote: > On Wed, Feb 22, 2006 at 01:21:58AM +0100, David Faure wrote: > > +`createSQL`: > > + SQL queries run after table creation. createSQL can be a string wit= h a > > + single SQL command, a list of SQL commands, or a dictionary with ke= ys that > > + are dbNames and values that are either single SQL command string or= a list > > + of SQL commands. This is usually for ALTER TABLE commands. > > + ``%s``, if present, is substituted with the name of the table. >=20 > Added to the documentation. Thank you! Hmm, the last line shouldn't have been added, since the support for '%s' ha= sn't been committed. =2D-- docs/SQLObject.txt (revision 1634) +++ docs/SQLObject.txt (working copy) @@ -739,7 +739,6 @@ values are: single SQL command, a list of SQL commands, or a dictionary with keys t= hat are dbNames and values that are either single SQL command string or a l= ist of SQL commands. This is usually for ALTER TABLE commands. =2D ``%s``, if present, is substituted with the name of the table. There is also one instance attribute: =2D-=20 David Faure -- fa...@kd..., df...@kl... KDE/KOffice developer, Qt consultancy projects Klar=E4lvdalens Datakonsult AB, Platform-independent software solutions |
From: Oleg B. <ph...@ma...> - 2006-03-08 14:38:33
|
On Wed, Mar 08, 2006 at 03:06:25PM +0100, David Faure wrote: > Hmm, the last line shouldn't have been added, since the support for '%s' hasn't been committed. > > --- docs/SQLObject.txt (revision 1634) > +++ docs/SQLObject.txt (working copy) > @@ -739,7 +739,6 @@ values are: > single SQL command, a list of SQL commands, or a dictionary with keys that > are dbNames and values that are either single SQL command string or a list > of SQL commands. This is usually for ALTER TABLE commands. > - ``%s``, if present, is substituted with the name of the table. Fixed. Thank you. Oleg. -- Oleg Broytmann http://phd.pp.ru/ ph...@ph... Programmers don't die, they just GOSUB without RETURN. |
From: David F. <df...@kl...> - 2006-03-09 13:02:04
Attachments:
select_for_update.diff
sqlrace.py
|
On Saturday 25 February 2006 11:27, you wrote: > I think FOR UPDATE would be a good addition. I would expect it to be a > parameter passed through to select. Even if people decide to go with a fl= ag > in the table definition, I think the parameter to select is still required > to override the flag for the common case of only needing FOR UPDATE in ra= re > situations. OK, I have reworked the support for "select for update" so that it now look= s like obj.select(forUpdate =3D True) Patch and testcase attached. I didn't include a documentation patch since I'm not sure where it should g= o; e.g. the connection parameter of select isn't documented in SQLObject.txt (which is entry-level docu, not complete docu so it's understandable), and select() isn't in interface.py. =2D-=20 David Faure -- fa...@kd..., df...@kl... KDE/KOffice developer, Qt consultancy projects Klar=E4lvdalens Datakonsult AB, Platform-independent software solutions |
From: Oleg B. <ph...@ph...> - 2006-09-26 12:24:18
|
Hello! On Thu, Mar 09, 2006 at 02:01:58PM +0100, David Faure wrote: > OK, I have reworked the support for "select for update" so that it now looks like > obj.select(forUpdate = True) > > Patch and testcase attached. I have applied the patch without the test (the test still needs some work) and committed in the revision 1948 in the trunk. I also added a note about in in SQLObject.txt, in the section "Transactions". Oleg. -- Oleg Broytmann http://phd.pp.ru/ ph...@ph... Programmers don't die, they just GOSUB without RETURN. |
From: Oleg B. <ph...@ph...> - 2006-03-14 15:50:39
|
On Thu, Mar 09, 2006 at 02:01:58PM +0100, David Faure wrote: > OK, I have reworked the support for "select for update" so that it now looks like > obj.select(forUpdate = True) > > Patch and testcase attached. You have forgotten to patch (of forgot to include diff) sresults.py. And your test didn't catch that. Oleg. -- Oleg Broytmann http://phd.pp.ru/ ph...@ph... Programmers don't die, they just GOSUB without RETURN. |
From: Oleg B. <ph...@ph...> - 2006-03-21 09:25:05
|
On Tue, Mar 14, 2006 at 06:50:14PM +0300, Oleg Broytmann wrote: > On Thu, Mar 09, 2006 at 02:01:58PM +0100, David Faure wrote: > > OK, I have reworked the support for "select for update" so that it now looks like > > obj.select(forUpdate = True) > > > > Patch and testcase attached. > > You have forgotten to patch (of forgot to include diff) sresults.py. And > your test didn't catch that. Oops, sorry, my mistake. SelectResults just passess self.ops to the dbConnection. > for i in xrange(100): > print i > > try: > agg = Aggregates.select(connection = t, forUpdate = True).getOne() > except: > agg = Aggregates(total = 0, connection = t) I have a problem converting the test to SQLObject. In the beginning the test database is always empty so .select() always raises a "not found" exception. Oleg. -- Oleg Broytmann http://phd.pp.ru/ ph...@ph... Programmers don't die, they just GOSUB without RETURN. |
From: David F. <df...@kl...> - 2006-03-21 09:46:44
|
On Tuesday 21 March 2006 10:24, Oleg Broytmann wrote: > > for i in xrange(100): > > print i > >=20 > > try: > > agg =3D Aggregates.select(connection =3D t, forUpdate = =3D True).getOne() > > except: > > agg =3D Aggregates(total =3D 0, connection =3D t) >=20 > I have a problem converting the test to SQLObject. In the beginning the > test database is always empty so .select() always raises a "not found" > exception. =2E.. which is caught by the "except:", and then a row is created by that A= ggregates constructor, isn't it? I must be misunderstanding the issue. Do you mean that it creates trouble when two instances of the script are ru= nning in parallel? It is true that I always ran it once first, to create the row. This probably means that there should be some initialization code ran first. =2D-=20 David Faure -- fa...@kd..., df...@kl... KDE/KOffice developer, Qt consultancy projects Klar=E4lvdalens Datakonsult AB, Platform-independent software solutions |
From: Oleg B. <ph...@ma...> - 2006-03-21 10:09:12
|
On Tue, Mar 21, 2006 at 10:46:25AM +0100, David Faure wrote: > On Tuesday 21 March 2006 10:24, Oleg Broytmann wrote: > > > for i in xrange(100): > > > print i > > > > > > try: > > > agg = Aggregates.select(connection = t, forUpdate = True).getOne() > > > except: > > > agg = Aggregates(total = 0, connection = t) > > > > I have a problem converting the test to SQLObject. In the beginning the > > test database is always empty so .select() always raises a "not found" > > exception. > ... which is caught by the "except:", and then a row is created by that Aggregates constructor, isn't it? Not with the SQLObject tests. > I must be misunderstanding the issue. > Do you mean that it creates trouble when two instances of the script are running > in parallel? It is true that I always ran it once first, to create the row. > This probably means that there should be some initialization code ran first. Please add the initializations. I've converted the test to the SQLObject test and it is: from sqlobject import * from dbtest import * class BaseSQLObject(SQLObject): class sqlmeta: cacheValues = False class Aggregates(BaseSQLObject): total = IntCol() class AggValues(BaseSQLObject): value = IntCol() def test_1(): if not supports('selectForUpdate'): return setupClass([Aggregates, AggValues]) t = Aggregates._connection.transaction() for i in xrange(100): try: agg = Aggregates.select(connection=t, forUpdate=True).getOne() except: agg = Aggregates(total=0, connection=t) newVal = i % 10 + 1 agg.total = agg.total + newVal newValue = AggValues(value=newVal, connection=t) t.commit() Oleg. -- Oleg Broytmann http://phd.pp.ru/ ph...@ph... Programmers don't die, they just GOSUB without RETURN. |