Re: [Firebird-devel] Max transaction duration

Hi,

I'd like to summarize the problem and outline the possible paths we can 
take on this issue.

First, what is the real problem that a transaction timeout should fix? 
What we want is that runaway transactions and queries do not consume 
engine resources forever or in an uncontrolled manner. The proposed 
solution is to impose a limit on how long a transaction can live (we could 
also impose limits on CPU usage, memory usage etc.). We hope that this 
would cure many causes of stalled or exhausted engine resources (the OAT 
problem is the most critical one): dead clients, badly written clients, 
malicious clients, and something I call self-healing in mission-critical 
systems. But is this approach really a cure for them?

a) We acknowledged that the engine can already detect dead clients and 
take the right action. Of course, we can improve the detection to be more 
precise, use fewer resources and detect more *dead* conditions. But we 
definitely don't need to impose a timeout to solve the dead-client 
problem, because it's already solved in a more or less satisfactory way.

b) Many said that a timeout is probably not a good cure for badly written 
applications. The engine definitely should help developers identify the 
problem, and may provide a way for administrators to "fix" the *immediate* 
problem by killing a runaway transaction, connection or query, but the 
real cure is to fix the badly behaving application. I like the temporary 
system tables approach taken by IB7, which makes it possible to identify 
the occurrence of various problems and to kill transactions, connections 
or statements. I think we should analyze its pros and cons and seriously 
consider implementing it (or anything with equivalent capabilities) in 
Firebird.

c) Malicious code - i.e. an *intentionally* badly behaving application - 
is the real problem that we're not able to solve right now, and that a 
timeout might solve. But can it really? I have my doubts. It's clear that 
applications have different needs for system time and resources, so we 
must provide a way to fine-tune the timeout or other limits (a user-
defined amount at start and on renewal). We have to provide an API for 
that, but the same API would also be there for malicious code to use. If 
we don't impose a hard limit (even a user-configurable one) on 
transaction/connection/query time, we will solve nothing that way, and 
even with a hard limit we will put only a small obstacle in the way of 
malicious code writers. Moreover, anything we do to solve this problem 
should not put unneeded obstacles in the way of regular developers, and a 
timeout would.
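
To make the "user-defined amount at start and renewal, capped by a hard 
limit" idea in c) concrete, here is a minimal sketch. It is not an 
existing Firebird API: TimeoutPolicy, hard_limit_s and grant() are 
hypothetical names, and the assumption is that the hard limit applies to 
the total lifetime of the transaction, so renewals cannot extend past it.

```python
from dataclasses import dataclass


@dataclass
class TimeoutPolicy:
    # Hypothetical administrator-configured ceiling on total transaction
    # lifetime, in seconds. Not an actual Firebird setting.
    hard_limit_s: float

    def grant(self, requested_s: float, elapsed_s: float = 0.0) -> float:
        """Return the seconds granted for a start or renewal request.

        The request is clamped so that elapsed time plus the grant never
        exceeds the hard limit, no matter what the client asks for.
        """
        remaining = max(self.hard_limit_s - elapsed_s, 0.0)
        return min(max(requested_s, 0.0), remaining)


policy = TimeoutPolicy(hard_limit_s=600.0)

# A well-behaved client asks for 300 s at start and gets it.
first = policy.grant(requested_s=300.0)

# After 300 s a client asks for an hour more; only 300 s are left.
renewal = policy.grant(requested_s=3600.0, elapsed_s=300.0)
```

Note that the clamp only bounds a single transaction. Nothing in this 
sketch stops a client from starting a new transaction right away, which 
is the "small obstacle" concern above.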

d) Another use of a timeout is to help mission-critical systems not fall 
to their knees when an unexpected problem occurs. These are usually very 
busy systems, and it is normal for such a system to take some actions on 
its own to minimize the impact of a failure, or to take an alternate path, 
because people are too slow to react. A timeout may help there (at least 
it's normal practice, as David pointed out), but do we *really* need 
timeouts *in the engine*? Would an independent monitor application (it may 
use temporary system tables or any other API) that observes and terminates 
transactions/connections/queries according to user-defined rules be 
enough? It would definitely be a more flexible solution, but would it be 
acceptable for such mission-critical systems?
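
The rule-evaluation part of such an independent monitor could look 
roughly like the sketch below. Everything here is an assumption for 
illustration: TxnInfo stands in for whatever a snapshot of the temporary 
system tables (or any other API) would return, and the rule helpers are 
hypothetical. Actually fetching the snapshot and killing the selected 
transactions is deliberately left out, since no such interface exists in 
the engine yet.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List


@dataclass(frozen=True)
class TxnInfo:
    # Hypothetical snapshot row for one active transaction.
    txn_id: int
    user: str
    age_s: float  # seconds since the transaction started


# A rule returns True when the transaction should be terminated.
Rule = Callable[[TxnInfo], bool]


def max_age(limit_s: float) -> Rule:
    """Rule: any transaction older than limit_s seconds."""
    return lambda t: t.age_s > limit_s


def max_age_for_user(user: str, limit_s: float) -> Rule:
    """Rule: a stricter age limit for one particular user."""
    return lambda t: t.user == user and t.age_s > limit_s


def select_victims(snapshot: Iterable[TxnInfo],
                   rules: Iterable[Rule]) -> List[int]:
    """Return ids of transactions matched by at least one rule."""
    rules = list(rules)
    return [t.txn_id for t in snapshot if any(rule(t) for rule in rules)]


snapshot = [
    TxnInfo(txn_id=1, user="batch", age_s=5000.0),
    TxnInfo(txn_id=2, user="web", age_s=45.0),
    TxnInfo(txn_id=3, user="web", age_s=120.0),
]
rules = [max_age(3600.0), max_age_for_user("web", 60.0)]

victims = select_victims(snapshot, rules)  # ids 1 and 3
```

The point of keeping the rules outside the engine is that each site can 
define its own without any change to the API used by regular 
applications.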

What we didn't take into account in the recent discussion is the overhead 
of any timeout or keep-alive solution. A client-controlled approach seems 
to scale better than a server-controlled one, while a server-controlled 
one seems to be more precise. But both will impose additional overhead in 
network traffic and system resources.

Another angle that was mentioned, but not explored very thoroughly, is the 
backward compatibility of any timeout solution with current applications. 
A solution that uses an extended API would solve problem d), but not b) 
and c); other methods would be more or less incompatible.

So, where do we want to go from here?

Best regards
Pavel Cisar
http://www.ibphoenix.com
For all your up-to-date Firebird and
InterBase information
