[Sqlalchemy-tickets] Issue #4368: Connection reset stale while recycling pool connection (zzzeek/sq
Brought to you by:
zzzeek
From: Carlo S. <iss...@bi...> - 2018-11-14 16:41:21
|
New issue 4368: Connection reset stale while recycling pool connection https://bitbucket.org/zzzeek/sqlalchemy/issues/4368/connection-reset-stale-while-recycling Carlo Suriano: I am experiencing long periods of stale on customer production environment due to recycling pool parameter. The architecture of the solution is the following: 1. Different concurrent tasks are launched on a job scheduler which has 4 workers. Each worker creates a session pool with SQLAlchemy `create_engine(<connection string>, pool_size=8, max_overflow=0, pool_recycle=3600, pool_timeout=30)` 2. Each task executes a complex workflow, which consists of SELECT queries in cascade; 3. The underlying DB is an Oracle RAC with failover and three available addresses for each of the two failover nodes; Randomically, during the execution of a task, the pool recycler verify the overcoming of the 3600 timeout and triggers the recycling of the connection. This causes a stale that lasts about 15 minutes (sistematically) after which SQLAlchemy succeeds to close and re-open connection and then continues the execution. This is an excerpt of the library logger: ``` [myPID] YYYY-MM-DD 12:07:59,974 pool.py: 513 INFO Connection <cx_Oracle.Connection to <my_connection_string>> exceeded timeout; recycling [myPID] YYYY-MM-DD 12:07:59,974 pool.py: 248 DEBUG Closing connection <cx_Oracle.Connection to <my_connection_string>> [myPID] YYYY-MM-DD 12:23:28,294 pool.py: 255 ERROR Exception closing connection <cx_Oracle.Connection to <my_connection_string>> Traceback (most recent call last): File "myEnv/lib/python2.7/site-packages/sqlalchemy/pool.py", line 250, in _close_connection self._dialect.do_close(connection) File "myEnv/lib/python2.7/site-packages/sqlalchemy/engine/default.py", line 412, in do_close dbapi_connection.close() DatabaseError: ORA-25408: can not safely replay call [myPID] YYYY-MM-DD 12:23:28,401 pool.py: 538 DEBUG Created new connection <cx_Oracle.Connection to <my_connection_string>> [myPID] YYYY-MM-DD 12:23:28,405 base.py: 487 INFO BEGIN (implicit) [myPID] YYYY-MM-DD 12:23:28,406 base.py: 912 INFO SELECT XXX FROM YYY WHERE ZZZ ``` Since the connection string is constant for each object of the pool, it is not possible to distinguish between the connection being handled on the logs. A quick fix is to set the recycle param at -1, but could the stale be due to wrong management of the failover on SQLAlchemy? |