[Sqlalchemy-tickets] Issue #4368: Connection reset stale while recycling pool connection (zzzeek/sq

SourceForge Headquarters 1320 Columbia Street Suite 310 San Diego, CA 92101 +1 (858) 422-6466

New issue 4368: Connection reset stale while recycling pool connection
https://bitbucket.org/zzzeek/sqlalchemy/issues/4368/connection-reset-stale-while-recycling

Carlo Suriano:

I am experiencing long periods of stale on customer production environment due to recycling pool parameter. The architecture of the solution is the following:

1. Different concurrent tasks are launched on a job scheduler which has 4 workers. Each worker creates a session pool with SQLAlchemy `create_engine(<connection string>, pool_size=8, max_overflow=0, pool_recycle=3600, pool_timeout=30)`

2. Each task executes a complex workflow, which consists of SELECT queries in cascade;

3. The underlying DB is an Oracle RAC with failover and three available addresses for each of the two failover nodes;

Randomically, during the execution of a task, the pool recycler verify the overcoming of the 3600 timeout and triggers the recycling of the connection. This causes a stale that lasts about 15 minutes (sistematically) after which SQLAlchemy succeeds to close and re-open connection and then continues the execution.

This is an excerpt of the library logger:
```
[myPID]	YYYY-MM-DD 12:07:59,974	pool.py: 513	INFO	Connection <cx_Oracle.Connection to <my_connection_string>> exceeded timeout; recycling
[myPID]	YYYY-MM-DD 12:07:59,974	pool.py: 248	DEBUG	Closing connection <cx_Oracle.Connection to <my_connection_string>>
[myPID]	YYYY-MM-DD 12:23:28,294	pool.py: 255	ERROR	Exception closing connection <cx_Oracle.Connection to <my_connection_string>>
Traceback (most recent call last):
  File "myEnv/lib/python2.7/site-packages/sqlalchemy/pool.py", line 250, in _close_connection
    self._dialect.do_close(connection)
  File "myEnv/lib/python2.7/site-packages/sqlalchemy/engine/default.py", line 412, in do_close
    dbapi_connection.close()
DatabaseError: ORA-25408: can not safely replay call

[myPID]	YYYY-MM-DD 12:23:28,401	pool.py: 538	DEBUG	Created new connection <cx_Oracle.Connection to <my_connection_string>>
[myPID]	YYYY-MM-DD 12:23:28,405	base.py: 487	INFO	BEGIN (implicit)
[myPID]	YYYY-MM-DD 12:23:28,406	base.py: 912	INFO	
                    SELECT XXX
                    FROM YYY
                    WHERE ZZZ
```

Since the connection string is constant for each object of the pool, it is not possible to distinguish between the connection being handled on the logs.

A quick fix is to set the recycle param at -1, but could the stale be due to wrong management of the failover on SQLAlchemy?

[Sqlalchemy-tickets] Issue #4368: Connection reset stale while recycling pool connection (zzzeek/sq

[Sqlalchemy-tickets] Issue #4368: Connection reset stale while recycling pool connection (zzzeek/sqlalchemy)