From: adem <ade...@ex...> - 2002-12-17 14:43:56
Hi,

I don't know if this is doable, but how about adding an option along the lines of:

"Send an eMail to <a list of eMail addresses> if any transaction has been running for longer than <time in seconds>"

It would be nicer still if this were dynamically and remotely alterable.

This would be similar to what 3ware IDE RAID cards do: they send an eMail to a prespecified address whenever they have something to say about their own operation. Our people just love it --they want all RAID hardware to be 3ware for that (hear that, Adaptec! :-).

Next I would be asking whether a tiny web server could be put into FB so that we could do the admin remotely without any 3rd party code, but that would be going overboard <G>

--
Cheers,
Adem

"Pavel Cisar" <pc...@us...> wrote in message news:3DFF0B1F.21578.767BA1@localhost...
> Hi,
>
> I'd like to summarize the problem and outline the possible paths we can
> take re. this issue.
>
> First, what's the real problem that a trn timeout should fix? What we
> want is that runaway transactions and queries do not consume engine
> resources forever or in an uncontrolled manner. The proposed solution is
> to impose a limit on how long a transaction can live (we could also
> impose limits on CPU usage, memory usage etc.). We hope that this would
> cure many causes of stalled or exhausted engine resources (the OAT
> problem is the most critical one): dead clients, badly written clients,
> malicious clients, and something I call self-healing in mission-critical
> systems. But is this approach really a cure for them?
>
> a) We acknowledged that the engine can already detect dead clients and
> take the right action. Of course, we can improve the detection system to
> be more precise, use fewer resources and detect more *dead* conditions.
> But we definitely don't need to impose a timeout to solve the dead-client
> problem, because it's already solved in a more or less satisfactory way.
>
> b) Many said that a timeout is probably not a good cure for badly written
> applications. The engine definitely should help developers identify the
> problem, and may provide a way for administrators to "fix" the
> *immediate* problem by killing a runaway transaction, connection or
> query, but the real cure is to fix the badly behaving application. I
> liked the temporary system tables approach taken by IB7, which makes it
> possible to identify the occurrence of various problems and to kill
> transactions, connections or statements. I think we should analyze its
> pros and cons and seriously consider implementing it (or anything with
> equivalent capabilities) in Firebird.
>
> c) Malicious code - i.e. an *intentionally* badly behaving application -
> is the real problem that we're not able to solve right now, and that a
> timeout may solve. But can it really? I have my doubts. It's clear that
> applications have different needs for system time and resources, so we
> must provide a way to fine-tune the timeout or other limits (a
> user-defined amount at start and on renewal). We have to provide an API
> for that, but that same API would also be there for malicious code to
> use. If we don't impose a hard limit (even a user-configurable one) on
> transaction/connection/query time, we will solve nothing that way, and
> even with a hard limit we will only throw a small obstacle in the way of
> malicious code writers. Moreover, anything we do to solve this problem
> should not put unneeded obstacles in the way of regular developers, and a
> timeout would.
>
> d) Another use of a timeout is to help mission-critical systems not fall
> to their knees when an unexpected problem occurs. These are usually very
> busy systems, and it is normal for such a system to take some actions on
> its own to minimize the impact of a failure or to take an alternate path,
> because people are too slow to react.
> A timeout may help there (at least it's normal practice, as David pointed
> out), but do we *really* need timeouts *in the engine*? Could an
> independent monitor app (it may use temporary system tables or any other
> API) that observes and terminates transactions/connections/queries
> according to user-defined rules be enough? It would definitely be a more
> flexible solution, but would it be acceptable for such mission-critical
> systems?
>
> What we didn't take into account in the recent discussion is the overhead
> of any timeout or keep-alive solution. A client-controlled approach seems
> to scale better than a server-controlled one, but a server-controlled one
> seems to be more precise. Both will impose additional overhead in network
> traffic and system resources.
>
> Another angle that was mentioned, but not very thoroughly, is the
> backward compatibility of any timeout solution with current applications.
> A solution that uses an extended API would solve d), but not the b) and
> c) problems; other methods would be more or less incompatible.
>
> So, where do we want to go from here?
>
> Best regards
> Pavel Cisar
> http://www.ibphoenix.com
> For all your up-to-date Firebird and
> InterBase information
>
> _______________________________________________
> Firebird-devel mailing list
> Fir...@li...
> https://lists.sourceforge.net/lists/listinfo/firebird-devel
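
To make the "independent monitor app" idea from point d) (and the "send an eMail if a transaction runs longer than <time in seconds>" request) a bit more concrete, here is a minimal sketch in Python of what one polling pass of such a monitor might look like. Everything in it is an assumption made for illustration: `TxnInfo`, `older_than`, `find_offenders` and `run_monitor_pass` are hypothetical names, not part of Firebird or InterBase, and how the snapshot of active transactions is obtained (temporary system tables or some other API) is deliberately left out.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List


@dataclass(frozen=True)
class TxnInfo:
    """Hypothetical snapshot of one active transaction (not an engine structure)."""
    txn_id: int
    user: str
    age_seconds: float  # how long the transaction has been running


# A rule inspects one transaction and returns True if it should be flagged.
Rule = Callable[[TxnInfo], bool]


def older_than(limit_seconds: float) -> Rule:
    """Build a rule that flags transactions running longer than the limit."""
    return lambda txn: txn.age_seconds > limit_seconds


def find_offenders(active: Iterable[TxnInfo], rules: Iterable[Rule]) -> List[TxnInfo]:
    """Return every transaction that matches at least one user-defined rule."""
    rule_list = list(rules)
    return [txn for txn in active if any(rule(txn) for rule in rule_list)]


def run_monitor_pass(
    active: Iterable[TxnInfo],
    rules: Iterable[Rule],
    act: Callable[[TxnInfo], None],
) -> List[TxnInfo]:
    """One polling pass: apply the chosen action to each offending transaction.

    The action is supplied by the caller; it could notify an administrator
    or ask the server to terminate the transaction (both are assumptions).
    """
    offenders = find_offenders(active, rules)
    for txn in offenders:
        act(txn)
    return offenders
```

Keeping the rules and the action outside the engine is what would make this approach more flexible than a built-in timeout: the same pass could only report at first and be switched to terminating offenders later, without touching applications. Whether polling like this is precise and cheap enough for mission-critical systems is exactly the open question raised above.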