From: Andrew F. <and...@sk...> - 2006-08-11 11:11:55
I am testing sqlgrey and have come across a problem with the
deverp_user routine. It is missing some obvious VERP addresses and
is messing up some obvious non-VERP ones.

Here are some examples:

  perl -d sqlgrey.dist
  ...
  DB<1> print deverp_user('bounce-32993-4906286','and...@sk...');
  bounce-#

That shows what it *should* be doing. Now for some problems:

  DB<2> print deverp_user('grbounce-JbS8DwUAAAC4pJBj3F9adwKL1ZerRRc_=andrew.findlay=skills-1st.co.uk', 'and...@sk...');
  grbounce-JbS8DwUAAAC4pJBj3F9adwKL1ZerRRc_=RCPT

The source here is a Google alert service.

  DB<3> print deverp_user('andrew.findlay','an...@ex...')
  RCPT.findlay
  DB<5> print deverp_user('andrew.findlay','drew@x.y')
  anRCPT.findlay

Here we see a very simple sender address being messed up by an
even simpler recipient address.

The appended patch for 1.7.3 fixes both issues, but may prevent the
correct de-VERPing of BATV addresses. I have not found any in my logs
that match draft-levine-mass-batv-02, but it would seem fairly easy
to recognise the simple 'prvs' scheme reliably.

I think it may be best to change deverp_user to use one simple
explicit pattern for each known VERP style rather than try to build
clever regexes to match several at once.
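
As a rough illustration of the one-pattern-per-style idea, here is a
minimal Perl sketch. It is not the sqlgrey code: the function name
deverp_user_sketch, the two styles it recognises, and the example.org
recipient are all made up for illustration, and it assumes the
recipient is embedded as lhs=rhs. BATV 'prvs' tags are not handled.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical sketch, not sqlgrey's deverp_user: each known VERP style
# gets its own explicit substitution instead of one combined regex.
sub deverp_user_sketch {
    my ($user, $rcpt) = @_;

    # Split the recipient into local part and domain.
    my ($lhs, $rhs) = split /\@/, $rcpt, 2;
    return $user unless defined $lhs && defined $rhs;
    my ($l, $r) = (quotemeta($lhs), quotemeta($rhs));

    # Style 1 (assumed): whole recipient embedded as "lhs=rhs" right
    # after a separator. A bare local part is never matched on its own.
    $user =~ s/(?<=[*=.-])$l=$r/RCPT/i;

    # Style 2 (assumed): "bounce-" or "grbounce-" tag followed by a
    # variable token; keep the tag and collapse the rest to "#".
    $user =~ s/^((?:gr)?bounce[._-])[0-9A-Za-z=_.-]+$/$1#/;

    return $user;
}

# jane.doe@example.org is a made-up recipient for the demonstration.
print deverp_user_sketch('bounce-32993-4906286',
                         'jane.doe@example.org'), "\n";   # bounce-#
print deverp_user_sketch('grbounce-AbC123_=jane.doe=example.org',
                         'jane.doe@example.org'), "\n";   # grbounce-#
print deverp_user_sketch('andrew.findlay', 'drew@x.y'), "\n";  # andrew.findlay
```

Because no rule matches a bare local part, 'drew' inside
'andrew.findlay' is left alone. Each additional VERP style would get
its own explicit substitution in the same manner.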

Andrew
-- 
-----------------------------------------------------------------------
|                 From Andrew Findlay, Skills 1st Ltd                 |
| Consultant in large-scale systems, networks, and directory services |
| http://www.skills-1st.co.uk/                        +44 1628 782565 |
-----------------------------------------------------------------------

--- sqlgrey.dist  Thu Aug 10 13:22:04 2006
+++ sqlgrey  Fri Aug 11 11:46:08 2006
@@ -1005,14 +1005,15 @@
 
     # build pattern with the 3 alternatives to match recipient in originator
     # BATV implementations use third or first alternative (first by abuse.net)
-    my $pat = qr/$rcpt_lhs$at_sep_re$rcpt_rhs|$rcpt_rhs$at_sep_re$rcpt_lhs|$rcpt_lhs/;
+    # (removed third pattern - too simple)
+    my $pat = qr/$rcpt_lhs$at_sep_re$rcpt_rhs|$rcpt_rhs$at_sep_re$rcpt_lhs/;
 
     # replace address with capital RCPT to be safe with deletes
     # (MySQL matches case insensitive unfortunately)
     $user =~ s/(?<=[\*=\.-])$pat|$pat(?=[\*=\.-])/RCPT/;
 
     # strip frequently used bounce/return masks
-    $user =~ s/((bo|bounce|notice-return|notice-reply)[\._-])[0-9a-z-_\.]+$/$1#/g; # Added by JR
+    $user =~ s/((bo|bounce|notice-return|notice-reply)[\._-])[0-9a-zA-Z=_\.-]+$/$1#/g; # Added by JR
 
     # strip hexadecimal sequences
     # at the beginning only if user will contain at least 4 consecutive alpha chars