[Sqlalchemy-tickets] Issue #3927: Reliability on connection error (zzzeek/sqlalchemy)
Brought to you by:
zzzeek
From: Xiaojun L. <iss...@bi...> - 2017-03-02 06:20:58
|
New issue 3927: Reliability on connection error https://bitbucket.org/zzzeek/sqlalchemy/issues/3927/reliability-on-connection-error Xiaojun Lin: I'm using postgreql with sqlalchemy 1.0.11. Some connection errors occur randomly with low possibility(once 1 or 2 days) while the DB is running normally. The traceback looks like: ``` #!python 2017-02-07 15:49:41.843 147684 ERROR neutron.api.v2.resource File "/usr/lib/python2.7/site-packages/neutron/db/l3_agentschedulers_db.py", line 428, in get_l3_agents_hosting_routers 2017-02-07 15:49:41.843 147684 ERROR neutron.api.v2.resource l3_agents = [binding.l3_agent for binding in query] 2017-02-07 15:49:41.843 147684 ERROR neutron.api.v2.resource File "/usr/lib64/python2.7/site-packages/sqlalchemy/orm/query.py", line 2736, in __iter__ 2017-02-07 15:49:41.843 147684 ERROR neutron.api.v2.resource return self._execute_and_instances(context) 2017-02-07 15:49:41.843 147684 ERROR neutron.api.v2.resource File "/usr/lib64/python2.7/site-packages/sqlalchemy/orm/query.py", line 2749, in _execute_and_instances 2017-02-07 15:49:41.843 147684 ERROR neutron.api.v2.resource close_with_result=True) 2017-02-07 15:49:41.843 147684 ERROR neutron.api.v2.resource File "/usr/lib64/python2.7/site-packages/sqlalchemy/orm/query.py", line 2740, in _connection_from_session 2017-02-07 15:49:41.843 147684 ERROR neutron.api.v2.resource **kw) 2017-02-07 15:49:41.843 147684 ERROR neutron.api.v2.resource File "/usr/lib64/python2.7/site-packages/sqlalchemy/orm/session.py", line 905, in connection 2017-02-07 15:49:41.843 147684 ERROR neutron.api.v2.resource execution_options=execution_options) 2017-02-07 15:49:41.843 147684 ERROR neutron.api.v2.resource File "/usr/lib64/python2.7/site-packages/sqlalchemy/orm/session.py", line 912, in _connection_for_bind 2017-02-07 15:49:41.843 147684 ERROR neutron.api.v2.resource conn = engine.contextual_connect(**kw) 2017-02-07 15:49:41.843 147684 ERROR neutron.api.v2.resource File "/usr/lib64/python2.7/site-packages/sqlalchemy/engine/base.py", line 2039, in contextual_connect 2017-02-07 15:49:41.843 147684 ERROR neutron.api.v2.resource self._wrap_pool_connect(self.pool.connect, None), 2017-02-07 15:49:41.843 147684 ERROR neutron.api.v2.resource File "/usr/lib64/python2.7/site-packages/sqlalchemy/engine/base.py", line 2078, in _wrap_pool_connect 2017-02-07 15:49:41.843 147684 ERROR neutron.api.v2.resource e, dialect, self) 2017-02-07 15:49:41.843 147684 ERROR neutron.api.v2.resource File "/usr/lib64/python2.7/site-packages/sqlalchemy/engine/base.py", line 1401, in _handle_dbapi_exception_noconnection 2017-02-07 15:49:41.843 147684 ERROR neutron.api.v2.resource util.raise_from_cause(newraise, exc_info) 2017-02-07 15:49:41.843 147684 ERROR neutron.api.v2.resource File "/usr/lib64/python2.7/site-packages/sqlalchemy/util/compat.py", line 200, in raise_from_cause 2017-02-07 15:49:41.843 147684 ERROR neutron.api.v2.resource reraise(type(exception), exception, tb=exc_tb) 2017-02-07 15:49:41.843 147684 ERROR neutron.api.v2.resource File "/usr/lib64/python2.7/site-packages/sqlalchemy/engine/base.py", line 2074, in _wrap_pool_connect 2017-02-07 15:49:41.843 147684 ERROR neutron.api.v2.resource return fn() 2017-02-07 15:49:41.843 147684 ERROR neutron.api.v2.resource File "/usr/lib64/python2.7/site-packages/sqlalchemy/pool.py", line 376, in connect 2017-02-07 15:49:41.843 147684 ERROR neutron.api.v2.resource return _ConnectionFairy._checkout(self) 2017-02-07 15:49:41.843 147684 ERROR neutron.api.v2.resource File "/usr/lib64/python2.7/site-packages/sqlalchemy/pool.py", line 713, in _checkout 2017-02-07 15:49:41.843 147684 ERROR neutron.api.v2.resource fairy = _ConnectionRecord.checkout(pool) 2017-02-07 15:49:41.843 147684 ERROR neutron.api.v2.resource File "/usr/lib64/python2.7/site-packages/sqlalchemy/pool.py", line 485, in checkout 2017-02-07 15:49:41.843 147684 ERROR neutron.api.v2.resource rec.checkin() 2017-02-07 15:49:41.843 147684 ERROR neutron.api.v2.resource File "/usr/lib64/python2.7/site-packages/sqlalchemy/util/langhelpers.py", line 60, in __exit__ 2017-02-07 15:49:41.843 147684 ERROR neutron.api.v2.resource compat.reraise(exc_type, exc_value, exc_tb) 2017-02-07 15:49:41.843 147684 ERROR neutron.api.v2.resource File "/usr/lib64/python2.7/site-packages/sqlalchemy/pool.py", line 482, in checkout 2017-02-07 15:49:41.843 147684 ERROR neutron.api.v2.resource dbapi_connection = rec.get_connection() 2017-02-07 15:49:41.843 147684 ERROR neutron.api.v2.resource File "/usr/lib64/python2.7/site-packages/sqlalchemy/pool.py", line 594, in get_connection 2017-02-07 15:49:41.843 147684 ERROR neutron.api.v2.resource self.connection = self.__connect() 2017-02-07 15:49:41.843 147684 ERROR neutron.api.v2.resource File "/usr/lib64/python2.7/site-packages/sqlalchemy/pool.py", line 607, in __connect 2017-02-07 15:49:41.843 147684 ERROR neutron.api.v2.resource connection = self.__pool._invoke_creator(self) 2017-02-07 15:49:41.843 147684 ERROR neutron.api.v2.resource File "/usr/lib64/python2.7/site-packages/sqlalchemy/engine/strategies.py", li 2017-02-07T15:49:41.889+08:00 localhost neutron-server ERROR [pid:147684] [MainThread][tid:92991152] [req-d70f28f9-7fa7-46d3-bc9a-bace54158571] [resource.py:131 resource] (client error): (psycopg2.OperationalError) SSL connection has been closed unexpectedly ``` I've been looking into this problem for a long time but was unable to find a possible cause from the DB side, so I tried to add some retries on sqlalchemy side when connection errors occur. The code may look like this: before: ``` #!python def __connect(self): try: self.starttime = time.time() connection = self.__pool._invoke_creator(self) self.__pool.logger.debug("Created new connection %r", connection) return connection except Exception as e: self.__pool.logger.debug("Error on connect(): %s", e) raise ``` after: ``` #!python def __connect(self, try_times=3): tmp_exc = Exception("try_times is zero") for i in range(try_times): try: self.starttime = time.time() connection = self.__pool._invoke_creator(self) self.__pool.logger.debug("Created new connection %r", connection) return connection except Exception as e: self.__pool.logger.debug("Error on connect(): %s", e) tmp_exc = e raise tmp_exc ``` Here my problem is: 1. Does anyone have any idea about what the "SSL connection has been closed unexpectedly" error might be caused by? 2. Is the "retry" way a proper solution to this? If not, what's the better suggestion? |