#59 error when multiple mail received same time

Milestone: v1.0 (example)
Status: closed
Owner: nobody
Labels: None
Priority: 5
Updated: 2024-09-18
Created: 2024-03-13
Private: No

Hi,

When multiple e-mails are received at the same time (I'm not sure how many), an error occurs and emailrelay returns a "452 failed" code to the SMTP server. Normally it should ask me over the socket (filter), but I guess nothing arrives and it directly returns "452 failed".
How can we handle this?

1 Attachments

Discussion

  • Graeme Walker

    Graeme Walker - 2024-03-13

    It looks like you are just hitting the 64 handle limit when emailrelay calls the Windows API function WaitForMultipleObjects().

    https://learn.microsoft.com/en-us/windows/win32/api/synchapi/nf-synchapi-waitformultipleobjects

    Each server-side connection will use one handle for the connection and (in this case) another for the network filter. Do you have anywhere near 30 connections?

    If the remote clients are hanging on to idle connections for no good reason then you should try reducing the idle timeout (eg. "--idle-timeout=2").
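
As a rough illustration of the arithmetic above: WaitForMultipleObjects() accepts at most 64 handles (MAXIMUM_WAIT_OBJECTS), and each connection in this setup uses two. The number of handles reserved for listening sockets and internal events is an assumption made for this sketch, not a figure taken from emailrelay.

```python
MAXIMUM_WAIT_OBJECTS = 64  # documented limit of WaitForMultipleObjects()


def max_connections(handles_per_connection=2, reserved_handles=4):
    """Estimate how many connections fit into one wait call.

    reserved_handles is a hypothetical allowance for listening sockets
    and internal events; the real number depends on the configuration.
    """
    usable = MAXIMUM_WAIT_OBJECTS - reserved_handles
    return usable // handles_per_connection


# One handle for the connection plus one for the network filter.
print(max_connections(handles_per_connection=2))
# Without a network filter each connection needs only one handle.
print(max_connections(handles_per_connection=1))
```

With the assumed four reserved handles the first call gives 30, which is in line with the "anywhere near 30 connections" estimate.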

     
  • Yunus YILDIRIM

    Yunus YILDIRIM - 2024-04-29

    I'm not sure about "remote clients are hanging on to idle connections"; the address verifier and filters are probably what cause the waiting.
    But I still think 30 connections is too few. How can we increase this limit?
    This limit will always cause problems when programs send multiple e-mails in parallel, as opposed to ordinary users sending mail.

     
  • Graeme Walker

    Graeme Walker - 2024-04-29

    The 64-handle limit can't be trivially increased, but I will look into alternative approaches.

     
  • Graeme Walker

    Graeme Walker - 2024-05-12

    I have gone with a multi-threaded solution to this: the event loop will switch to using multiple threads once the number of handles it is dealing with reaches ~64, allowing it to increase to more than 3000. The code is checked in to svn trunk and git master as 2.5.3dev3 and it will go into the next release.

     
  • Yunus YILDIRIM

    Yunus YILDIRIM - 2024-08-19

    Hello,
    I've tested it a lot in test and prod environments.
    The app is incredibly fast and can handle large amounts of e-mail traffic.
    x100 or x1000 faster than the old version :)

    I'm not sure, but there seems to be a problem with handling traffic or creating content files. The contents of e-mails sent simultaneously seem to get mixed up.
    I haven't figured it out yet; I'm still testing.

     
  • Yunus YILDIRIM

    Yunus YILDIRIM - 2024-08-21

    With the old version (2.5) I never saw more than 60 e-mails per second (which is expected because of the handle limit).
    With the new version (2.6) I see over 450 e-mails per second, but I never see anything like "event loop using ... threads" in the log file.
    When I try to send 500 to 1000 e-mails per second the app gives errors like:

    warning: 127.0.0.1; GSmtp::ServerProtocol::messageAddContentFailed: failed to save message content
    info: 127.0.0.1; GSmtp::ServerSend: tx>>: "554 transaction failed"
    

    I think it would be better to return "452 failed" instead of "554 transaction failed"; otherwise the SMTP client never tries again.
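
The difference between the two replies comes from the first digit of the SMTP status code: 4xx replies are transient failures and 5xx replies are permanent ones. The helper below is a hypothetical sketch of how a sending program might decide whether to queue a message for another attempt; real clients apply their own retry policy.

```python
def should_retry(reply_line: str) -> bool:
    """Return True if an SMTP reply is a transient (4xx) failure.

    reply_line is the text of a reply such as "452 failed".
    This is an illustrative helper, not part of emailrelay or smtplib.
    """
    code = int(reply_line[:3])
    return 400 <= code <= 499


for reply in ("452 failed", "554 transaction failed"):
    action = "retry later" if should_retry(reply) else "give up"
    print(f"{reply!r}: {action}")
```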

     
  • Graeme Walker

    Graeme Walker - 2024-08-21

    So the increased throughput is because you were previously having to re-send e-mails that failed due to hitting the handle limit? That makes more sense than an intrinsic increase in performance.

    You will only get the "using n threads" message with verbose logging, although you might get some indication of multi-threading from the TaskManager "threads" and "handles" columns, particularly if you can compare 2.5 and 2.6 side by side.

    It's odd that you are getting the "failed to save message content" error. I would only expect that if the disk is full or the file system is in trouble. Are you running on bare metal with a local disk? Is there any indication of file system or hardware errors?

    I'm not sure about the status code -- a full disk is likely to be a permanent condition, for some definition of permanent, and you could argue that it meets this bit of RFC-2821's 5xx paragraph (with my edit at the end):

    Even some "permanent" error conditions can be corrected, so the human user may want to direct the SMTP client to reinitiate the command sequence by direct action at some point in the future (e.g., after [ freeing up space] ... )

    Hopefully it becomes moot if we can find out why writing the content file is failing.

     
  • Yunus YILDIRIM

    Yunus YILDIRIM - 2024-08-21

    I work locally, and when I get the "failed to save message content" error everything else is fine; CPU, RAM and disk are all fine.

    In fact, this error only occurs sometimes, and afterwards the system continues to receive and process e-mails.

    You can test it with the attached Python script. My emailrelay runs as a server with these options:
    --as-server --port 225 --close-stderr --hidden --filter exit:0 --log --log-address --log-file C:\ProgramData\E-MailRelay\emailrelay_server-log-%d.txt --log-time --pid-file C:\ProgramData\E-MailRelay\emailrelay_server.pid --spool-dir C:\ProgramData\E-MailRelay\spool_server --debug --verbose

     

    Last edit: Yunus YILDIRIM 2024-08-22
  • Graeme Walker

    Graeme Walker - 2024-08-21

    I think this must be that you are now hitting the limit on the number of open file descriptors in the C runtime library. I will look into it.

     
  • Yunus YILDIRIM

    Yunus YILDIRIM - 2024-08-22

    Better. I don't see any messageAddContentFailed errors in the log.
    Still testing.

     
  • Yunus YILDIRIM

    Yunus YILDIRIM - 2024-08-23

    The 2.6.x version is at least 10 times faster than the 2.5 version. It's very nice. Thanks!
    It sometimes gets stuck in my tests and I haven't figured out why yet. I can't tell whether it's due to the network or a problem with emailrelay.

    Also, I see this:

    emailrelay-2.6rc2: 20240822.122254.239: warning: 127.0.0.1; GSmtp::Server: new connection error: event loop overflow
    emailrelay-2.6rc2: 20240822.122254.239: warning: 127.0.0.1; GNet::Server::readEvent: connection rejected from 127.0.0.1:1398
    
     

    Last edit: Yunus YILDIRIM 2024-08-23
  • Graeme Walker

    Graeme Walker - 2024-08-25

    The "new connection error" implies that you are now hitting the limit of the multi-threaded event loop implementation, which is 3780 concurrent connections (63 * 60). Is that possible!? Please could you check with "netstat -a -n -p TCP" -- pipe the output to a file and count the number of lines. Please also check again for the "event loop using ... threads" message. It looks like you are using verbose logging so it would be very strange if you do not see it at all.
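
Counting the netstat lines can also be scripted. The sketch below tallies TCP connection states from the text produced by "netstat -a -n -p TCP"; the sample text is made up for illustration, and the column layout (protocol, local address, foreign address, state) is assumed to match the usual Windows output.

```python
from collections import Counter


def count_tcp_states(netstat_text, local_port=None):
    """Count TCP connection states in 'netstat -a -n -p TCP' output.

    If local_port is given, only lines whose local address ends with
    that port are counted.
    """
    counts = Counter()
    for line in netstat_text.splitlines():
        fields = line.split()
        # Skip headers, blank lines and anything that is not a TCP row.
        if len(fields) != 4 or fields[0] != "TCP":
            continue
        _proto, local, _remote, state = fields
        if local_port is not None and not local.endswith(f":{local_port}"):
            continue
        counts[state] += 1
    return counts


# Made-up sample; port 225 is the server port used in this thread.
SAMPLE = """
Active Connections

  Proto  Local Address          Foreign Address        State
  TCP    0.0.0.0:225            0.0.0.0:0              LISTENING
  TCP    127.0.0.1:225          127.0.0.1:1398         ESTABLISHED
  TCP    127.0.0.1:225          127.0.0.1:1399         TIME_WAIT
  TCP    127.0.0.1:1398         127.0.0.1:225          ESTABLISHED
"""

print(count_tcp_states(SAMPLE, local_port=225))
```

To use it on real output, read the file that netstat was piped to and pass its text to the function.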

     
  • Yunus YILDIRIM

    Yunus YILDIRIM - 2024-08-26

    I never see the "event loop using ... threads" message in the log. I have tried a lot but it never appears.
    Sometimes this message appears in the log:

    Main::Run::emit: too many notification events: discarding old ones
    

    I think I found the problem.
    After test e-mails are sent to emailrelay, the connection takes too long to close. As far as I understand, the socket stays in TIME_WAIT for about 120 seconds, and this prevents further connections from being established.

    You can see this in the attached images.
    The same port numbers wait in the TIME_WAIT state for 120 seconds after the connection was ESTABLISHED and the mail sending has finished.
    During this time the following is displayed in the log file:

    GSmtp::ServerPeer: smtp connection closed: smtp protocol done:SMTP_ClientIP_Address:12345
    
     
  • Graeme Walker

    Graeme Walker - 2024-08-26

    The "discarding old ones" warning is not important -- that refers to the event queue for updating the user interface "Status" tab (which does not exist when using "--hidden").

    I have done some more testing on the binary that I uploaded for you and I am reasonably confident that the basic operation of the multi-threaded event loop is okay: when I run a test program to connect and re-connect up to 3000 simultaneous connections I can see
    * "event loop using 50 threads" in the log
    * Windows Task Manager shows ~6000 handles and 50 threads, and
    * netstat lists ~3000 ESTABLISHED connections

    When going up to 4000 connections there are some "new connection error: event loop overflow" errors in the log and netstat tops out at about 3700 connections. This is all as expected because the hard limit on the number of handles is 3780.
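
The figures above can be checked with a little arithmetic. The sketch below assumes that the "63 * 60" limit means up to 60 threads each waiting on up to 63 handles; the thread does not describe how emailrelay actually distributes handles across threads, so the names and the even packing are assumptions for illustration only.

```python
import math

HANDLES_PER_THREAD = 63  # assumed: handles each thread can wait on
MAX_THREADS = 60         # assumed: maximum number of waiting threads


def threads_needed(handle_count):
    """Estimate the threads required to wait on handle_count handles."""
    limit = HANDLES_PER_THREAD * MAX_THREADS  # 3780, the quoted hard limit
    if handle_count > limit:
        raise OverflowError("event loop overflow")
    return math.ceil(handle_count / HANDLES_PER_THREAD)


print(threads_needed(3000))
print(threads_needed(3780))
```

For 3000 handles this gives 48 threads, in the same region as the "event loop using 50 threads" log message, and anything above 3780 overflows.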

    TIME_WAIT states in the server can be avoided by making sure that the client sends the SMTP QUIT command when it has finished, immediately followed by a half-close (FIN). The server responds with the SMTP OK response and its own half-close. Whoever goes first ends up in TIME_WAIT and you normally want that to be the client.
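
The client-side shutdown sequence described above can be sketched with a plain socket. This is a minimal illustration that assumes the mail transaction on the connected socket has already finished; quit_and_half_close is a hypothetical helper, not something provided by emailrelay or smtplib.

```python
import socket


def quit_and_half_close(sock, timeout=5.0):
    """Send QUIT, half-close, then read until the server closes.

    Because the client sends its FIN first, the client side of the
    connection is the one that ends up in TIME_WAIT.
    """
    sock.settimeout(timeout)
    sock.sendall(b"QUIT\r\n")
    # Half-close: our FIN goes out now, but we can still receive.
    sock.shutdown(socket.SHUT_WR)
    chunks = []
    while True:
        data = sock.recv(4096)
        if not data:  # the server has sent its own FIN
            break
        chunks.append(data)
    sock.close()
    return b"".join(chunks)
```

The return value is whatever the server sent before closing, normally its final reply to QUIT.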

    It is possible to prevent TIME_WAIT in the server altogether (IIRC) by setting the 'nolinger' socket option, but that is discouraged for security reasons. There is code in emailrelay to pass around socket options such as nolinger but there is currently no command-line option for it. I suppose it could go via --server-smtp-config and --client-smtp-config (eg. --server-smtp-config=nolinger,smtputf8) but that's not ideal.
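
The thread does not say exactly which socket setting the 'nolinger' option maps to. One common way to avoid TIME_WAIT on the closing side is SO_LINGER with a zero timeout, which makes close() abort the connection with a reset instead of a normal FIN. The sketch below shows that technique with Python sockets, purely as an assumption about what such an option might do; unsent data can be lost on an abortive close.

```python
import socket
import struct
import sys

# struct linger is two ints on most POSIX systems and two unsigned
# shorts on Windows.
_LINGER_FORMAT = "HH" if sys.platform == "win32" else "ii"


def set_zero_linger(sock):
    """Enable SO_LINGER with a zero timeout (abortive close)."""
    value = struct.pack(_LINGER_FORMAT, 1, 0)  # l_onoff=1, l_linger=0
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_LINGER, value)


def get_linger(sock):
    """Return the (l_onoff, l_linger) pair currently set on the socket."""
    size = struct.calcsize(_LINGER_FORMAT)
    raw = sock.getsockopt(socket.SOL_SOCKET, socket.SO_LINGER, size)
    return struct.unpack(_LINGER_FORMAT, raw)
```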

     
  • Yunus YILDIRIM

    Yunus YILDIRIM - 2024-08-27

    I see. Then everything is okay. Thanks for your help and development work; E-MailRelay just got better.

    By the way, can you share your testing tool for 3000 simultaneous connections, or do you have any suggestions for testing other than my Python script?

     
  • Graeme Walker

    Graeme Walker - 2024-08-27

    Thank you for your extreme beta testing and your previous feature requests -- I agree that they have made emailrelay better.

    I've attached my test program, as requested. It is written in C++ using the emailrelay libraries. Note that it does not do any SMTP, so it is only testing the basic scalability of the Windows event loop. In particular, it does not cause thousands of content files to be written simultaneously (!) so it would not have found the 512 file descriptor limit.

    I don't have any low-level suggestions for your script because all the magic happens in the smtplib library, but for testing emailrelay perhaps you could enable TLS encryption, use SMTP authentication and have larger message bodies. It would be interesting to see if there is a performance drop-off when the event loop switches to being multithreaded, and you will probably only see that with much larger messages.

    As discussed above, I have decided to add a "nolinger" keyword to the "--server-smtp-config" and "--client-smtp-config" options for v2.6, even though setting nolinger is generally considered a bad idea.

     
  • Graeme Walker

    Graeme Walker - 2024-09-18
    • status: open --> closed
     
  • Graeme Walker

    Graeme Walker - 2024-09-18

    Fixed in v2.6

     