From: Michael S. <Mic...@lr...> - 2005-05-07 19:25:16
On Sat, 7 May 2005, Lionel Bouton wrote:

> Michael Storz wrote the following on 07.05.2005 00:13 :
>
> >Analyzing our from_awl, I found the following:
> >
> >The table has 365.208 entries from 178.026 different ip addresses.
> >From these ip addresses
> >
> >- 129.210 have exactly one entry and this is with sender_domain =
> >  "-undef-"
> >- 38.904 have only entries without sender_domain = "-undef-"
> >- only 9.912 have entries with both kind of sender_domains
> >
> >If we split the from_awl in 2 tables
> >
> >- from_awl: sender_domain <> "-undef-"
> >- dsn_awl: sender_domain = "-undef-"
> >
> >(...)
> >
>
> Ok. I consider this a design bug. In the original design from_awl is the
> first step towards domain_awl. But DSNs can't go into domain_awl
> (obviously because of the lack of domain...). This won't make it in
> 1.6.0, but it is now in my TODO.
>

Well, I wouldn't call it a design bug, I would call it an optimization.
This sounds much more positive :-)

What I am trying to do is get all the backscatter out of from_awl. At the
moment backscatter mainly results from DSNs and forwards, as far as I can
see. For both, I've suggested new tables.

I am interested in how stable from_awl and domain_awl become once this is
done. This leads us to the question of how stable the relationships of
email communication are. How many new communication partners are found,
and how many old relationships end? What percentages can we expect?

> >If we include connect_awl, I don't think we need a split of this table,
> >because the backscatter DSNs will propagate fast into the dsn_awl, only
> >normal DSNs will stay in connect_awl.
> >
>
> This I don't understand. How do we make the difference between
> backscatter DSNs and normal DSNs ?

Sorry, my explanation was a little bit too short. What I meant was: most
of the time, when a spammer uses one of our domains as the originator of
his spam, the left side (the local part) of the address was generated.
Therefore, the DSNs coming back to us are directed to a lot of different
recipients. Aggregation will move such DSNs very fast to dsn_awl.

On the other hand, if a local user makes an error in a recipient address,
most of these emails will not leave our system. Only a few will be
accepted by other systems and then generate a DSN. Such DSNs will
therefore stay in connect_awl, because there are not enough of them for
aggregation.

This is the reason why I said backscatter DSNs will go to dsn_awl whereas
normal DSNs will stay in connect_awl. Looking at a single DSN, you are
right, we can't decide whether it is a backscatter DSN or a normal DSN.

Michael Storz
-------------------------------------------------
Leibniz-Rechenzentrum     ! <mailto:St...@lr...>
Barer Str. 21             ! Fax: +49 89 2809460
80333 Muenchen, Germany   ! Tel: +49 89 289-28840
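
For illustration, the proposed split of from_awl could be sketched roughly
as follows, in Python with an in-memory SQLite database. The column set
(sender_name, sender_domain, src), the layout of dsn_awl, the helper names
and the sample addresses are assumptions made for this sketch, not
SQLgrey's actual schema or code. The aggregation step that would move
entries from connect_awl into dsn_awl is not shown.

```python
import sqlite3

UNDEF = "-undef-"  # sender_domain value recorded for DSNs (empty sender)


def make_demo_db():
    """Build an in-memory from_awl with a few made-up rows (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE from_awl (sender_name TEXT, sender_domain TEXT, src TEXT)"
    )
    conn.executemany(
        "INSERT INTO from_awl VALUES (?, ?, ?)",
        [
            (UNDEF, UNDEF, "192.0.2.1"),            # IP with a DSN entry only
            ("alice", "example.org", "192.0.2.2"),  # IP with normal mail only
            ("bob", "example.net", "192.0.2.3"),    # IP with both kinds
            (UNDEF, UNDEF, "192.0.2.3"),
        ],
    )
    return conn


def classify_ips(conn):
    """Count source IPs that have only DSN rows, only non-DSN rows, or both."""
    counts = {"dsn_only": 0, "other_only": 0, "both": 0}
    query = (
        "SELECT SUM(sender_domain = ?), SUM(sender_domain <> ?) "
        "FROM from_awl GROUP BY src"
    )
    for dsn, other in conn.execute(query, (UNDEF, UNDEF)):
        if dsn and other:
            counts["both"] += 1
        elif dsn:
            counts["dsn_only"] += 1
        else:
            counts["other_only"] += 1
    return counts


def split_from_awl(conn):
    """Move every '-undef-' row out of from_awl into a separate dsn_awl table."""
    conn.execute("CREATE TABLE dsn_awl (src TEXT)")
    conn.execute(
        "INSERT INTO dsn_awl SELECT src FROM from_awl WHERE sender_domain = ?",
        (UNDEF,),
    )
    conn.execute("DELETE FROM from_awl WHERE sender_domain = ?", (UNDEF,))
    conn.commit()


if __name__ == "__main__":
    conn = make_demo_db()
    print(classify_ips(conn))
    split_from_awl(conn)
    print(conn.execute("SELECT COUNT(*) FROM dsn_awl").fetchone()[0])
```

Run against a real table, classify_ips would give the same kind of
breakdown as the three counts quoted above; after split_from_awl, from_awl
holds only entries that have a sender domain.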