From: cw <col...@gm...> - 2016-01-21 11:20:30
|
Hi, We're struggling with a significant issue that impacts quite heavily on the system. The issue is that a sender will start delivering a message with an attachment above the noprocessing limit and also whitelisted. The attachment comes in very slowly and the current example I have takes about 9 hours for a simple 5Mb email. Currently the MTA is set to time out at 1 hour, so the message is way past where it should have stopped. I did previously have ASSP set to time out connections at half an hour. The problem comes when for some unknown reason, after 5-10 minutes, the remote sender opens another connection and attempts to redeliver the message whilst the original message is still ongoing. My initial instinct with this was that is a serious misconfiguration on behalf of the remote sender. However, the senders affected are Gmail (including Google Apps), Yahoo Mail (including BT Internet) and SMTPRoutes (a large worldwide spam filtering solution run by GFI.com). It does not seem to affect all connections from them and only seems to affect specific messages (ie other messages will come from the same IPs fine whilst this message is "stuck). This screenshot shows what I am talking about, in this case we have about 250 connections open from that IP and all seem to be receiving data. This causes obvious slowness on our mail platform and wastes quite a lot of bandwidth. http://www.dolphinict.co.uk/connections.png The message concerned started at 17:09 on the 19th and is still going. What is even more odd is that the person concerned received the message at around 21:00 on the 19th successfully As all these platforms have many different IP addresses it is very difficult to troubleshoot. I am trying to get in touch with the support team for SMTPRoutes as we use some of their other products. I also have some connection debug files that are revelant and an example of the maillog below. Our datacentre has checked the transit links to the relevant datacentres and sees no issue. I'm struggling with how else to troubleshoot this. Am I the only one seeing this kind of behaviour? Our servers have been fully updated more than once (including perl modules) since this issue started. Our MD is starting to get twitchy and I can see him pushing us to consider third party hosted services instead of our own if this can't be tracked down. 2016-01-19 20:31:15 m1-35475-10020 [Worker_6] [TLS-in] 94.186.192.136 <su...@do...d> info: found message size announcement: 5.36 MByte 2016-01-19 20:31:15 m1-35475-10020 [Worker_6] [TLS-in] 94.186.192.136 <su...@do...d> IP 94.186.192.136 matches whiteListedIPs - with 94.186.128.0/17 2016-01-19 20:31:15 m1-35475-10020 [Worker_6] [TLS-in] 94.186.192.136 <su...@do...d> message proxied without processing - message size (5623467) is above 512000 (npSize). 2016-01-19 20:31:15 m1-35475-10020 [Worker_6] [TLS-in] 94.186.192.136 <su...@do...d> [SMTP Reply] 250 OK 2016-01-19 20:31:16 m1-35475-10020 [Worker_6] [TLS-in] 94.186.192.136 <su...@do...d> to: rec...@do...d [SMTP Reply] 250 Accepted 2016-01-19 20:31:16 m1-35475-10020 [Worker_6] [TLS-in] 94.186.192.136 <su...@do...d> to: rec...@do...d [SMTP Reply] 354 Enter message, ending with "." on a line by itself 2016-01-19 20:31:22 m1-35475-10020 [Worker_6] [TLS-in] 94.186.192.136 <su...@do...d> to: rec...@do...d Whitelisted sender address: su...@do...d for recipient rec...@do...d 2016-01-20 04:08:23 m1-35475-10020 [Worker_6] [TLS-in] 94.186.192.136 <se...@do...d> to: rec...@do...d [Plugin] calling plugin ASSP_AFC 2016-01-20 04:08:24 m1-35475-10020 [Worker_6] [TLS-in] 94.186.192.136 <se...@do...d> to: rec...@do...d info: 1 attachment found for Level-1 2016-01-20 04:08:24 m1-35475-10020 [Worker_6] [TLS-in] 94.186.192.136 <se...@do...d> to: rec...@do...d message proxied without processing (no bad attachments) 2016-01-20 04:08:24 m1-35475-10020 [Worker_6] [TLS-in] [MessageOK] 94.186.192.136 <se...@do...d> to: rec...@do...d message ok - (noProcessing - message size (5623467) is above 512000 (npSize)) - [KFI] -> /usr/local/assp/store/okmail/KFI--1513900.eml 2016-01-20 04:08:24 m1-35475-10020 [Worker_6] [TLS-in] 94.186.192.136 <se...@do...d> to: rec...@do...d info: received all data - all data moved to send queue (8) 2016-01-20 04:08:25 m1-35475-10020 [Worker_6] [TLS-in] 94.186.192.136 <se...@do...d> to: rec...@do...d info: no (more) data readable from 94.186.192.136 (connection closed by peer) - last command was 'DATA' |