From: Michel B. <mi...@bo...> - 2005-06-09 17:33:29
On Thursday 09 June 2005 18:34, Michael Storz wrote:
>
> Therefore a possible change to the algorithm would be to incorporate the
> relation between from_awl and domain_awl, something like:

To complete what I wrote in my previous message:

One of the reasons I chose _not_ to combine them, but to test domain_awl
first, is performance. If we find an entry in domain_awl, we don't need to
perform the query against from_awl (the "and" condition in Perl does not
evaluate the following condition if the previous one doesn't match), so we
save a query against the bigger from_awl table whenever there is an entry in
domain_awl. That is likely to be the case for big servers that send us a lot
of mail, which are more likely than others to generate a high number of
"legitimate entries" in connect, for example if their IP changes.

If we wanted to mix the count from domain_awl with the count from from_awl,
we would need to query both tables every time, which could result in a
performance loss. That would be annoying, especially for big sites...

-- 
Michel Bouissou <mi...@bo...> OpenPGP ID 0xDDE8AC6E
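
The short-circuit argument above can be illustrated with a minimal sketch. It is written in Python rather than Perl, and it is not the actual SQLgrey code: `CountingTable`, `needs_greylisting`, `combined_count` and the string keys are hypothetical names, and the in-memory sets merely stand in for the domain_awl and from_awl database tables. The only point is to count how many lookups each approach issues.

```python
class CountingTable:
    """Hypothetical stand-in for an auto-whitelist table; counts lookups."""

    def __init__(self, entries):
        self.entries = set(entries)
        self.queries = 0  # number of lookups issued against this table

    def contains(self, key):
        self.queries += 1
        return key in self.entries


def needs_greylisting(domain_awl, from_awl, domain_key, from_key):
    # `and` short-circuits: when the domain_awl lookup hits, the left
    # operand is False and from_awl is never queried.
    return (not domain_awl.contains(domain_key)) and (
        not from_awl.contains(from_key)
    )


def combined_count(domain_awl, from_awl, domain_key, from_key):
    # Mixing both results means both tables are queried on every call.
    in_domain = domain_awl.contains(domain_key)
    in_from = from_awl.contains(from_key)
    return int(in_domain) + int(in_from)
```

With the first function, a sender whose domain is already in domain_awl costs one lookup; with the second, every message costs two, including a lookup against the larger table.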