|
From: John P. R. <ro...@cs...> - 2016-07-03 17:18:25
|
Hi Ralf:
In message <201...@ru...>,
Ralf Schlatterbeck writes:
>On Sun, Jul 03, 2016 at 01:15:40AM +0000, Mercurial Commits wrote:
>> issue2550853 - better error handling and cleanup on some postgres
>> tests by Stuart McGraw.
>>
>> +- issue2086536 - back_postgresql: fixing pg_command and prefering
>> + psycopg2. Patch done by Philipp Gortan (mephinet). His patch
>> + also improves handling of retryable errors. Applied and
>> + edited by John Rouillard.
>>
>> + if "FATAL" not in response :
>> + msgs = (
>> 'is being accessed by other users',
>> 'could not serialize access due to concurrent update',
>> + )
>> + for msg in msgs :
>> + if msg in response :
>> + time.sleep(0.1)
>> + return 0
>> + raise RuntimeError (response)
>> return 1
>
>Wouldn't that just ignore non-fatal errors?
>Or am I missing something here?
The whole context is:
try:
cursor.execute(command)
except psycopg.DatabaseError, err:
response = str(err).split('\n')[0]
if "FATAL" not in response :
msgs = (
'is being accessed by other users',
'could not serialize access due to concurrent update',
)
for msg in msgs :
if msg in response :
time.sleep(0.1)
return 0
raise RuntimeError (response)
return 1
I am reading that as:
try to execute the SQL
if we get a database error
split the response apart
if the database error is not FATAL
check to see if it matches one of the retryable errors
(in the msgs array)
if it does match a retryable error:
delay and return 0 (false) indicating that the call should be
retried by the caller
if the error is FATAL or does not match a retryable error, raise a
RuntimeError exception.
Only if the initial cursor.execute didn't raise an exception do we
return 1 indicating a successful run.
The (only) caller for this function is this loop in db_command:
try:
for n in range(10):
if pg_command(cursor, command):
return
finally:
conn.close()
raise RuntimeError('10 attempts to create database failed')
if pg_command returns 1, the function returns. If pg_command returns 0
10 times we raise a runtime error.,
The code that this patch replaces is:
- if response.find('FATAL') != -1:
- raise RuntimeError(response)
- else:
- msgs = [
'is being accessed by other users',
'could not serialize access due to concurrent update',
- ]
- can_retry = 0
- for msg in msgs:
- if response.find(msg) == -1:
- can_retry = 1
- if can_retry:
which I read (with much effort) as:
if I find FATAL I raise a runtime error
otherwise:
loop over all messages and
if the message is not found set the can_retry flag
(I think this test is wrong. It should be:
if response.find(msg) != -1:
i.e. set can_retry if the string IS found
If my interpretation is true, can_retry is set
for every non-fatal error. At least one of the
two strings will not be found resulting
in can_retry = 1.)
see if can_retry was set to 1, if so, delay and return 0
I think the patch is doing the right thing. Also to me the patched
version reads better.
Trying to understand X.find('STRING') != -1 took me a bit. Wrapping
that as X.found(STRING) where boolean true means I found STRING in X
otherwise not would have fixed this code. Pythonic: STRING not in X
is also an improvement.
Also the two errors we want to retry are not FATAL.
From:
http://www.question-defense.com/2008/11/12/postgres-error-database-dev-is-being-accessed-by-other-users
we should be seeing:
ERROR: database "dev" is being accessed by other users
Same with concurrent update. From:
https://www.postgresql.org/docs/9.1/static/transaction-iso.html
ERROR: could not serialize access due to concurrent update
So that's my analysis of the error.
Thoughts, quips, comments, evasions, questions or answers?
--
-- rouilj
John Rouillard
===========================================================================
My employers don't acknowledge my existence much less my opinions.
|