
#14 gmail disconnecting mid send, getting ssl disconnect

Milestone: v1.0 (example)
Status: closed
Owner: nobody
Labels: None
Priority: 5
Updated: 2018-06-20
Created: 2015-01-14
Creator: Scott Mann
Private: No

I'm thinking this is probably a gmail specific issue, and perhaps not an issue with emailrelay, but was hoping to see if I was the only one experiencing it.

After a year of successfully running emailrelay as a proxy to gmail, I'm now repeatedly seeing something like this snippet from the logs:
emailrelay: 20150113.234752: info: rx<<: "354 Go ahead wt4sm19089014pab.4 - gsmtp"
emailrelay: 20150113.234752: info: tx>>: [248 line(s) of content]
emailrelay: 20150113.234752: info: tx>>: "."
emailrelay: 20150113.234811: warning: exception: read error: disconnected: ssl read
emailrelay: 20150113.234811: info: tx>>: "452 message processing failed: read error: disconnected: ssl read"
emailrelay: 20150113.234811: info: smtp client error: "read error: disconnected: ssl read"
emailrelay: 20150113.234811: info: failing file: "emailrelay.5704.220491.19.envelope.busy" -> "emailrelay.5704.220491.19.envelope.bad"

This looks to me like Google's SMTP server (TLS, port 587) is disconnecting shortly after the DATA payload is sent (about 19 seconds later in the excerpt above).

At first I suspected Google was doing some really restrictive content filtering, and if it saw something it didn't like, would just abort. The same emails seem to fail when retried. Some go through. But, utterly benign emails are failing. None of the emails have attachments, even.

Anyone else experiencing this?

Discussion

  • Scott Mann

    Scott Mann - 2015-01-14

    I believe I've figured out my problem. Gmail was timing out waiting for the data to finish transmitting, because it was expecting CRLF line endings and was instead getting a single linefeed on the last line, because the Python filter was writing out the content as binary. Neither the filter nor emailrelay has changed in a year, so I suspect gsmtp has become less permissive about line endings.

    I consider this issue closed, unless emailrelay is expected to transform line endings.

     

    Last edit: Scott Mann 2015-01-14
  • Scott Mann

    Scott Mann - 2015-01-16

    Correction -- still having issues. I'm almost certain Google has introduced some bug into smtp parsing, that my particular configuration is triggering. (Outlook 2010 -> emailrelay -> python script MIME processing -> gmail smtp)

    Some emails still don't go through when using \r\n as line endings in the content file, while some do. They will successfully transmit IF I strip all the carriage returns, leaving just the line feeds, which I believe is a violation of RFC 2821 (which requires \r\n-terminated lines in the DATA payload) and of RFC 2822 (which sets the 998 character maximum line length, excluding the \r\n).

    emailrelay interprets this as one single line, as shown in the logs:

    emailrelay: 20150115.172215: info: GSmtp::ClientProtocol: rx<<: "354 Go ahead y2sm2510594pdm.31 - gsmtp"
    emailrelay: 20150115.172215: info: GSmtp::ClientProtocol: tx>>: [1 line(s) of content]
    emailrelay: 20150115.172215: info: GSmtp::ClientProtocol: tx>>: "."
    emailrelay: 20150115.172216: info: GSmtp::ClientProtocol: rx<<: "250 2.0.0 OK 1421371330 y2sm2510594pdm.31 - gsmtp"

    However, it works. I'm not sure it should, but it does. I have not been able to isolate the presumed sequence of characters that Google is choking on. I feel like stripping carriage returns is the wrong solution.

     

    Last edit: Scott Mann 2015-01-16
  • Graeme Walker

    Graeme Walker - 2015-01-16

    It looks like the receiver is not seeing your "dot" terminator at the end of the DATA stream and then timing out after 30s of inactivity. That might be explained if you have a mixture of line endings, most importantly on the last line. As you say, the RFCs require CR LF line endings but many receivers will tolerate LF but then expect "LF . LF" as a terminator. If you have LF on the last line then emailrelay will be sending "LF . CR LF", which is not what the receiver will be waiting for.
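The line-ending cleanup discussed above can be sketched in a few lines of Python. This is only an illustration, not the filter script from this ticket: the function name `normalize_crlf` is hypothetical, and it assumes the filter reads and writes the content file as bytes. It converts bare LF and bare CR to CRLF and makes sure the last line ends with CRLF, so that the "." terminator is preceded by the sequence the receiver is waiting for.

```python
import re


def normalize_crlf(content: bytes) -> bytes:
    """Return content with every line ending as CRLF (hypothetical helper)."""
    # Collapse CRLF, bare CR and bare LF to LF first, so that existing
    # CRLF pairs are not turned into CR CR LF by the next step.
    unified = re.sub(rb"\r\n|\r|\n", b"\n", content)
    normalized = unified.replace(b"\n", b"\r\n")
    # Make sure the final line is terminated as well.
    if normalized and not normalized.endswith(b"\r\n"):
        normalized += b"\r\n"
    return normalized


if __name__ == "__main__":
    mixed = b"Subject: test\r\n\r\nfirst line\nlast line\n"
    print(normalize_crlf(mixed))
```

A filter using something like this would need to open the content file in binary mode (`"rb"` and `"wb"`), so that Python does not translate the line endings again on the way out.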

     
  • Scott Mann

    Scott Mann - 2015-01-17

    Thanks so much, Graeme -- that makes total sense. I remain mystified, however. I have some failed content/envelopes.bad file pairs that absolutely do not have any bare linefeeds in them, just CRLFs, and Google's still choking on them. I made sure all characters are single byte, 7 bit, ascii, and the emails otherwise appear valid. (I can provide a sample.) The last four bytes of the content in hex are 0d 0a 0d 0a, which based on your description, should work fine.

    My next step is probably a packet capture to see if something goofy is happening on the wire.

    I'm using emailrelay compiled unix-style under Cygwin, rather than a Windows build -- do you think that might be contributing?

    In the meantime, using LFs instead of CRLFs seems to work fine, so maybe I should just accept that and move on. =)
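The check described in this post (no bare linefeeds, and a known byte sequence at the end of the content file) can be automated roughly as follows. This is a sketch under assumptions: `line_ending_report` is a hypothetical name, and the content is assumed to have been read from a failed content file in binary mode.

```python
import re


def line_ending_report(content: bytes) -> dict:
    """Count line-ending variants and show the last four bytes in hex."""
    return {
        "crlf": len(re.findall(rb"\r\n", content)),
        # LF not preceded by CR
        "bare_lf": len(re.findall(rb"(?<!\r)\n", content)),
        # CR not followed by LF
        "bare_cr": len(re.findall(rb"\r(?!\n)", content)),
        # bytes.hex() with a separator needs Python 3.8 or later
        "tail_hex": content[-4:].hex(" "),
    }


if __name__ == "__main__":
    sample = b"Subject: test\r\n\r\nbody line\nlast line\r\n\r\n"
    print(line_ending_report(sample))
```

A file that really contains only CRLF endings should report zero for both `bare_lf` and `bare_cr`.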

     

    Last edit: Scott Mann 2015-01-17
  • Scott Mann

    Scott Mann - 2015-01-31

    Just an update. While things seemed to work fine for a week or two, they're now having issues again. Interestingly, if I change line endings back to CRLF, most messages send. Some do not send regardless of type of linefeed, and they fail with an ssl write error.

    I actually disabled my filter entirely, thinking it was mucking things up, but the same messages resent from the client still fail, in the same way.

    emailrelay: 20150130.163704: info: Main::Run::doForwarding: poll: still busy from last time
    emailrelay: 20150130.163704: info: GSmtp::ClientProtocol: tx>>: [18 line(s) of content]
    emailrelay: 20150130.163704: info: GSmtp::ClientProtocol: tx>>: [54 line(s) of content]
    emailrelay: 20150130.163723: warning: GNet::HeapClient::onException: exception: peer disconnected: ssl write
    emailrelay: 20150130.163723: error: Main::Run::pollingClientDone: polling: peer disconnected: ssl write
    emailrelay: 20150130.163723: info: GSmtp::Client: smtp client error: "peer disconnected: ssl write"

    Considering I didn't have any such problem for a year, with no config changes, I can only assume Google is making changes to their smtp server that are conflicting with emailrelay in some way.

    I still find it hard to believe that I'm the only one affected -- perhaps very few emailrelay users --forward-to gmail.

     
  • Scott Mann

    Scott Mann - 2015-01-31

    I'm starting to think I previously solved my CRLF-vs-LF problem a few weeks ago, and that my lingering issues are SSL related.

    I tried updating to TRUNK, built with debugging, to see if that would help. It didn't. So, I was going to try to packet capture, which, because gmail requires SSL/TLS, forced me to use stunnel so I could capture in clear text. However, mysteriously, the emails that refused to send earlier, send just fine when emailrelay communicates with stunnel in clear text, and stunnel handles the SSL with gmail, instead.

    I'm increasingly convinced that gmail updated their SSL/SMTP about a month ago, and that emailrelay is having issues with it, at least compiled under cygwin. But, I think stunnel will be a reasonable work-around until and if things are resolved.

     
  • Scott Mann

    Scott Mann - 2015-02-03

    Several days later, tunneling through stunnel has prevented previously frequent errors. Still curious that I'm alone in having, or at least reporting, the issue. Oh well, the work-around works. Thanks!

     
  • Scott Mann

    Scott Mann - 2015-02-20

    Last update -- a couple weeks and hundreds of emails later and not a single error or failed email. Using stunnel as a proxy definitely cleared up my problem connecting to smtp.gmail.com.

    And to recap, I believe there were two distinct issues I suffered, starting in early January, strictly related to gmail's smtp server. 1) They became intolerant of mixed use of LF and CRLF, and 2) SSL connections intermittently, but reproducibly failed at some point mid payload transfer, well after handshake.

    The solution to (1) was just cleaning up my own script. The solution to (2) was tunneling ssl via stunnel, launched as a local daemon.
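For anyone wanting to try the same work-around, a stunnel client section along these lines is one way to set it up. This is a sketch, not the configuration actually used in this ticket: the section name and the local port 2525 are made up for illustration, and `protocol = smtp` is assumed so that stunnel performs the STARTTLS negotiation on port 587 itself.

```ini
; stunnel.conf fragment (illustrative values only)
[gmail-smtp]
; stunnel acts as the TLS client towards gmail
client = yes
; emailrelay connects here in clear text (port number is an assumption)
accept = 127.0.0.1:2525
connect = smtp.gmail.com:587
; let stunnel negotiate STARTTLS before passing traffic through
protocol = smtp
```

With this in place emailrelay would be pointed at the local listener, for example `--forward-to 127.0.0.1:2525`, and run without `--client-tls`, since stunnel is handling the encryption.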

     
  • Graeme Walker

    Graeme Walker - 2015-02-20

    Thanks for all the feedback, and I'm glad combining emailrelay with stunnel is working well for you.

    I am very suspicious of cygwin here. I think it's great for building tools and utilities but I would not use it for mission-critical infrastructure. There is a lot of smoke-and-mirrors performed by the cygwin dll at run-time to create the illusion of a *ix environment, but it does not always pull it off completely.

    Can you also summarise to what extent the problem was reproducible? Early on you say that certain messages would fail repeatedly -- but that was presumably because of the line-ending issue. Near the end you also say "intermittently, but reproducibly", but there I guess you just mean that it keeps happening, not that it is message-specific.

    If you still have logs for some of the failures it might also be interesting to see how old the connections were at the point of failure. I could imagine that periodic TLS renegotiation might be a trigger, for example.
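Timing questions like this can be answered from the log timestamps. The sketch below assumes the `YYYYMMDD.HHMMSS` timestamp format seen in the log excerpts in this ticket; the helper names are hypothetical. It measures the time between any two log lines, so working out the age of a connection would mean passing in the line that records the connection being made, which is not shown in the excerpts here.

```python
import re
from datetime import datetime
from typing import Optional

# Matches the timestamp in lines such as
# 'emailrelay: 20150113.234752: info: tx>>: "."'
STAMP = re.compile(r"emailrelay: (\d{8}\.\d{6}): ")


def parse_stamp(line: str) -> Optional[datetime]:
    """Return the timestamp of an emailrelay log line, or None if absent."""
    match = STAMP.search(line)
    if match is None:
        return None
    return datetime.strptime(match.group(1), "%Y%m%d.%H%M%S")


def seconds_between(earlier: str, later: str) -> float:
    """Elapsed seconds between two log lines."""
    start, end = parse_stamp(earlier), parse_stamp(later)
    if start is None or end is None:
        raise ValueError("no emailrelay timestamp found")
    return (end - start).total_seconds()


if __name__ == "__main__":
    sent = 'emailrelay: 20150113.234752: info: tx>>: "."'
    failed = ("emailrelay: 20150113.234811: warning: exception: "
              "read error: disconnected: ssl read")
    print(seconds_between(sent, failed))  # 19.0
```

The two lines used in the example are taken from the first log excerpt in this ticket.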

     
  • Graeme Walker

    Graeme Walker - 2018-06-20
    • status: open --> closed
     
