
Outlook flooding Zimbra server

Maxxer
2014-08-29
2014-09-15
  • Maxxer

    Maxxer - 2014-08-29

    Hi.
    I'm using zpush-2.1.2-1873 (backend 59.6) with Outlook 2013, and so far it seemed to work fine. At least it works fine with small accounts and a 3-month sync window. Sadly, Outlook users want ALL their mail in Outlook, so I had to set SYNC_FILTERTIME_MAX to SYNC_FILTERTYPE_ALL.

    When I open Outlook on a VERY BIG mailbox it apparently works and starts syncing, but after a while it seems stuck. Worse, the Zimbra server starts suffering and becomes unresponsive to all other users, even the ones using the webmail.

    I'm using z-push behind Zimbra's nginx proxy, so requests for /Microsoft-Server-ActiveSync are redirected to a local Apache serving z-push. I have no errors in Apache's error.log. My suspicion is that when Outlook starts it opens a lot of connections and the server runs out of nginx processes. I had the default value of 4 and have now raised it to 8 since I have 4 cores, but I still have to check whether this helps. I also haven't found errors in nginx.log, so I'm not sure this is the problem, but it should do better with 8 processes.

    I have this in z-push-admin for the user:

    DeviceId:               8919be26ffd74d67a4a778921b4dcff8
    Device type:            WindowsOutlook
    UserAgent:              Outlook/15.0 (15.0.4613.1000; MSI; x86)
    ActiveSync version:     14.0
    First sync:             2014-08-05 18:32
    Last sync:              2014-08-29 10:41
    Total folders:          13
    Synchronized folders:   11 (1 in progress)
    Synchronized data:      Emails(8) Contacts Tasks Calendars
    Synchronization progress:
         Folder: Emails       Sync: Synchronizing    Status: 45% (24652/54924)
    Status:                 Not available
    WipeRequest on:         not set
    WipeRequest by:         not set
    Wiped on:               not set
    Attention needed:       27 messages need attention because they could not be synchronized
        Broken object:  'SyncMail' ignored on '2014-08-07 15:26'
        Information:    Subject: 'R: subject' - From: '"Name surname" <name.surname@domain.it>'
        Reason:         Message was causing loop (2)
        Item/Parent id: 9901/f2
    

    From what I see in the logs I have no problems with the DoSFilter, as nginx and apache are running on the same host as Zimbra.

    Any ideas what could cause this trouble? Any other hint on what I can tune in zimbra or apache to better handle big mailboxes sync?
    thanks

     
  • Maxxer

    Maxxer - 2014-09-02

    An update on the issue. The issue of the server not responding was not caused by z-push flood, but it was an iptables problem.

    But I still have problems with this specific client, as it's stuck forever looping on the same items:

     02/09/2014 11:42:49 [ 5634] [ERROR] [user@domain.net] Ignored broken message (SyncMail). Reason: '2' Folderid: 'f3' message id '92708'
     02/09/2014 11:43:03 [ 5634] [ERROR] [user@domain.net] Ignored broken message (SyncMail). Reason: '2' Folderid: 'f81349' message id '93615'
     02/09/2014 11:44:14 [ 6676] [ERROR] [user@domain.net] Ignored broken message (SyncMail). Reason: '2' Folderid: 'f81349' message id '93615'
     02/09/2014 12:12:51 [12155] [ERROR] [user@domain.net] Ignored broken message (SyncMail). Reason: '2' Folderid: 'f3' message id '94297'
     02/09/2014 12:13:06 [12155] [ERROR] [user@domain.net] Ignored broken message (SyncMail). Reason: '2' Folderid: 'f81349' message id '93615'
     02/09/2014 12:14:09 [ 6676] [ERROR] [user@domain.net] Ignored broken message (SyncMail). Reason: '2' Folderid: 'f3' message id '92708'
     02/09/2014 12:14:16 [ 6676] [ERROR] [user@domain.net] Ignored broken message (SyncMail). Reason: '2' Folderid: 'f81349' message id '93615'
     02/09/2014 12:43:09 [30749] [ERROR] [user@domain.net] Ignored broken message (SyncMail). Reason: '2' Folderid: 'f5' message id '94410'
     02/09/2014 12:43:10 [30749] [ERROR] [user@domain.net] Ignored broken message (SyncMail). Reason: '2' Folderid: 'f81349' message id '93615'
     02/09/2014 12:44:19 [ 4545] [ERROR] [user@domain.net] Ignored broken message (SyncMail). Reason: '2' Folderid: 'f5' message id '94408'
     02/09/2014 12:44:19 [ 4545] [ERROR] [user@domain.net] Ignored broken message (SyncMail). Reason: '2' Folderid: 'f81349' message id '93615'
     02/09/2014 13:08:32 [ 5634] [ERROR] [user@domain.net] Ignored broken message (SyncMail). Reason: '2' Folderid: 'f2' message id '94425'
     02/09/2014 13:08:33 [ 5634] [ERROR] [user@domain.net] Ignored broken message (SyncMail). Reason: '2' Folderid: 'f3' message id '94297'
     02/09/2014 13:08:35 [ 5634] [ERROR] [user@domain.net] Ignored broken message (SyncMail). Reason: '2' Folderid: 'f81349' message id '93615'
     02/09/2014 13:11:14 [31977] [ERROR] [user@domain.net] Ignored broken message (SyncMail). Reason: '2' Folderid: 'f2' message id '94424'
     02/09/2014 13:11:15 [31977] [ERROR] [user@domain.net] Ignored broken message (SyncMail). Reason: '2' Folderid: 'f3' message id '92708'
     02/09/2014 13:11:16 [31977] [ERROR] [user@domain.net] Ignored broken message (SyncMail). Reason: '2' Folderid: 'f5' message id '94410'
     02/09/2014 13:11:18 [31977] [ERROR] [user@domain.net] Ignored broken message (SyncMail). Reason: '2' Folderid: 'f81349' message id '93615'
     02/09/2014 13:13:40 [27822] [ERROR] [user@domain.net] Ignored broken message (SyncMail). Reason: '2' Folderid: 'f81349' message id '93615'
     02/09/2014 13:14:54 [30749] [ERROR] [user@domain.net] Ignored broken message (SyncMail). Reason: '2' Folderid: 'f2' message id '94425'
     02/09/2014 13:14:56 [30749] [ERROR] [user@domain.net] Ignored broken message (SyncMail). Reason: '2' Folderid: 'f3' message id '94297'
     02/09/2014 13:15:25 [30749] [ERROR] [user@domain.net] Ignored broken message (SyncMail). Reason: '2' Folderid: 'f81349' message id '93615'
     02/09/2014 13:16:09 [ 3150] [ERROR] [user@domain.net] Ignored broken message (SyncMail). Reason: '2' Folderid: 'f3' message id '92708'
     02/09/2014 13:16:19 [ 3150] [ERROR] [user@domain.net] Ignored broken message (SyncMail). Reason: '2' Folderid: 'f5' message id '94426'
     02/09/2014 13:16:20 [ 3150] [ERROR] [user@domain.net] Ignored broken message (SyncMail). Reason: '2' Folderid: 'f81349' message id '93615'
     02/09/2014 13:16:41 [26551] [ERROR] [user@domain.net] Ignored broken message (SyncMail). Reason: '2' Folderid: 'f5' message id '94410'
     02/09/2014 13:16:42 [26551] [ERROR] [user@domain.net] Ignored broken message (SyncMail). Reason: '2' Folderid: 'f81349' message id '93615'
     02/09/2014 13:42:55 [27138] [ERROR] [user@domain.net] Ignored broken message (SyncMail). Reason: '2' Folderid: 'f2' message id '94429'
     02/09/2014 13:42:56 [27138] [ERROR] [user@domain.net] Ignored broken message (SyncMail). Reason: '2' Folderid: 'f3' message id '94297'
     02/09/2014 13:42:58 [27138] [ERROR] [user@domain.net] Ignored broken message (SyncMail). Reason: '2' Folderid: 'f81349' message id '93615'
     02/09/2014 13:44:11 [27118] [ERROR] [user@domain.net] Ignored broken message (SyncMail). Reason: '2' Folderid: 'f3' message id '92708'
     02/09/2014 13:44:11 [27118] [ERROR] [user@domain.net] Ignored broken message (SyncMail). Reason: '2' Folderid: 'f5' message id '94426'
     02/09/2014 13:44:12 [27118] [ERROR] [user@domain.net] Ignored broken message (SyncMail). Reason: '2' Folderid: 'f81349' message id '93615'
     02/09/2014 14:13:16 [30749] [ERROR] [user@domain.net] Ignored broken message (SyncMail). Reason: '2' Folderid: 'f81349' message id '93615'
     02/09/2014 14:14:27 [27311] [ERROR] [user@domain.net] Ignored broken message (SyncMail). Reason: '2' Folderid: 'f81349' message id '93615'
     02/09/2014 14:43:03 [31977] [ERROR] [user@domain.net] Ignored broken message (SyncMail). Reason: '2' Folderid: 'f2' message id '94449'
     02/09/2014 14:43:04 [31977] [ERROR] [user@domain.net] Ignored broken message (SyncMail). Reason: '2' Folderid: 'f3' message id '94297'
     02/09/2014 14:43:19 [31977] [ERROR] [user@domain.net] Ignored broken message (SyncMail). Reason: '2' Folderid: 'f81349' message id '93615'
     02/09/2014 14:44:22 [27132] [ERROR] [user@domain.net] Ignored broken message (SyncMail). Reason: '2' Folderid: 'f3' message id '92708'
     02/09/2014 14:44:29 [27132] [ERROR] [user@domain.net] Ignored broken message (SyncMail). Reason: '2' Folderid: 'f81349' message id '93615'
     02/09/2014 15:04:36 [30749] [ERROR] [user@domain.net] Ignored broken message (SyncMail). Reason: '2' Folderid: 'f81349' message id '93615'
    

    I tried clearing loop information but still the same. How can I have it complete the sync?
    As said before the mailbox is pretty big.
    thanks

     
  • LiverpoolFCfan

    LiverpoolFCfan - 2014-09-02

    It looks like a few problem emails. You can see the message Ids and the folders for the messages repeating over and over. z-push should mark them as not-syncable after a few tries and move on - but it is possible if you keep clearing the loop indicator that you are just starting that loop process over again for each of them. I have to admit I don't know what the exact logic they use is in this area. I have never looked at it.

    You could take a look at the messages (show original) to see if you can identify any potential problems in the emails.

    You could also get a WBXML log of the items in question, and request help on the z-push forum.
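
    To see which messages repeat in a log like the one posted above, a small script can tally the folder id and message id pairs. This is only a rough sketch: the regular expression is written against the "Ignored broken message" lines quoted in this thread, and other z-push versions may format the line differently.

```python
import re
from collections import Counter

# Pattern written against the "Ignored broken message" lines quoted in this thread.
LINE_RE = re.compile(
    r"Ignored broken message \((?P<type>\w+)\)\. "
    r"Reason: '(?P<reason>[^']*)' "
    r"Folderid: '(?P<folder>[^']*)' "
    r"message id '(?P<msgid>[^']*)'"
)


def count_broken(lines):
    """Count how often each (folder id, message id) pair is reported."""
    counts = Counter()
    for line in lines:
        match = LINE_RE.search(line)
        if match:
            counts[(match.group("folder"), match.group("msgid"))] += 1
    return counts


# Three lines copied from the log excerpt earlier in the thread.
sample = [
    "02/09/2014 11:42:49 [ 5634] [ERROR] [user@domain.net] Ignored broken message (SyncMail). Reason: '2' Folderid: 'f3' message id '92708'",
    "02/09/2014 11:43:03 [ 5634] [ERROR] [user@domain.net] Ignored broken message (SyncMail). Reason: '2' Folderid: 'f81349' message id '93615'",
    "02/09/2014 11:44:14 [ 6676] [ERROR] [user@domain.net] Ignored broken message (SyncMail). Reason: '2' Folderid: 'f81349' message id '93615'",
]

counts = count_broken(sample)
# Most frequently reported messages first.
for (folder, msgid), n in counts.most_common():
    print(f"folder {folder} message {msgid}: reported {n} time(s)")
```

    Feeding it the whole log file instead of the sample list should show which handful of items account for most of the errors.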

     
  • Maxxer

    Maxxer - 2014-09-02

    It's the first time I've issued a "clearloop". Now I've done a "resync" to see if it changes anything. If it gets stuck at the same place I'll capture some debug info. thanks

     
  • Maxxer

    Maxxer - 2014-09-15

    Hi. I managed to install a test server for Outlook with this busy mailbox. The server is otherwise idle, so it is dedicated to this test. I configured Outlook with the account, it started syncing and after some time it brought up something like 20 emails. Now it's stuck... z-push-top shows no activity, the WBXML log is silent...

    I uploaded the debug log here:
    https://dl.dropboxusercontent.com/u/706934/zpushoutlook.log

    I didn't spot any misbehavior, if anyone has suggestions on why it isn't working...
    thanks

     
  • LiverpoolFCfan

    LiverpoolFCfan - 2014-09-15

    This looks like the classic problem with a huge mailbox - and the reason I recommend limiting the sync window.

    The client requests a Sync at 8:14:11

    15/09/2014 08:14:11 [29164] [DEBUG] [username@domain.it] Zimbra->GetMessageList(): START GetMessageList { folderid = f2; cutoffdate = 0; virtual = 0; offset = 0 }

    Due to the number of items in the mailbox, the loop to count up all the items is still running at 8:15:11

    15/09/2014 08:15:11 [29164] [DEBUG] [username@domain.it] Zimbra->GetMessageList(): START GetMessageList { folderid = f2; cutoffdate = 0; virtual = 1; offset = 43000 }

    by which time the client has given up on the first request and has re-issued it.

    If you search for the thread ID, you can see that the initial request did not in fact finish until

    15/09/2014 08:17:01 [29164] [DEBUG] [username@domain.it] -------- End

    almost 3 minutes after it started.

    ActiveSync clients give up after 30 seconds as far as I am aware. With the way the zimbra backend is currently implemented it cannot handle that kind of volume of email.
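
    As a rough check of the arithmetic, the two timestamps quoted above can be compared directly. This sketch assumes the dd/mm/yyyy hh:mm:ss format seen in the log lines, and the 30 second client timeout is the figure mentioned above, not a verified value.

```python
from datetime import datetime

# Timestamp format as it appears in the z-push log lines quoted above.
LOG_TIME_FORMAT = "%d/%m/%Y %H:%M:%S"

# Assumption: the client timeout mentioned in the post, not a verified value.
CLIENT_TIMEOUT_S = 30


def elapsed_seconds(start, end):
    """Return the whole seconds between two log timestamps."""
    t0 = datetime.strptime(start, LOG_TIME_FORMAT)
    t1 = datetime.strptime(end, LOG_TIME_FORMAT)
    return int((t1 - t0).total_seconds())


# Start of GetMessageList and the "-------- End" line for thread 29164.
duration = elapsed_seconds("15/09/2014 08:14:11", "15/09/2014 08:17:01")
exceeded = duration > CLIENT_TIMEOUT_S
print(f"request took {duration}s, over the assumed timeout: {exceeded}")
```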

     
  • Maxxer

    Maxxer - 2014-09-15

    ouch :(
    Does the limit apply to the number of mails in a single mailbox, or is it "global"?
    I'm trying to understand if there's any workaround to this issue.
    thanks

     
  • Maxxer

    Maxxer - 2014-09-15

    Hm, maybe it's a stupid question, not least because I don't really know the basics of the AS protocol... but what if, instead of recursing in GetMessageList, you just returned the current slice of results (up to $limit) and stored the current $offset somewhere? The next time the client asks for GetMessageList you would reply with a different array holding the next batch of messages, and so on. Could this work? It would indeed take [much] longer for the initial sync, I guess...
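
    To make the idea concrete, here is a minimal in-memory sketch of "return one batch and remember the offset". It is not z-push code: ResumableLister, fetch_page and the fake mailbox are hypothetical names made up for illustration, and whether the diff backend could work with partial lists is a separate question.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical callback: fetch_page(folder_id, offset, limit) returns up to
# `limit` message ids starting at `offset`. Not a real z-push or Zimbra call.
FetchPage = Callable[[str, int, int], List[int]]


class ResumableLister:
    """Return one batch per call and remember where the next call resumes."""

    def __init__(self, fetch_page: FetchPage, limit: int) -> None:
        self.fetch_page = fetch_page
        self.limit = limit
        # Kept in memory here; a real system would have to persist this
        # between requests.
        self.offsets: Dict[str, int] = {}

    def next_batch(self, folder_id: str) -> Tuple[List[int], bool]:
        offset = self.offsets.get(folder_id, 0)
        batch = self.fetch_page(folder_id, offset, self.limit)
        self.offsets[folder_id] = offset + len(batch)
        # A full batch suggests more items may remain; if the folder size is
        # an exact multiple of the limit, one extra empty batch is returned.
        more = len(batch) == self.limit
        return batch, more


# Fake mailbox with ten message ids, standing in for a real folder.
MAILBOX = {"f2": list(range(1, 11))}


def fake_fetch(folder_id: str, offset: int, limit: int) -> List[int]:
    return MAILBOX[folder_id][offset:offset + limit]


lister = ResumableLister(fake_fetch, limit=4)
batches = []
while True:
    batch, more = lister.next_batch("f2")
    batches.append(batch)
    if not more:
        break
print(batches)
```

    Each call does a bounded amount of work, which is the point of the proposal. The cost is that the full list only exists after several round trips.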

     
  • LiverpoolFCfan

    LiverpoolFCfan - 2014-09-15

    No easy way round the issue as far as I am aware. It would take a complete redesign of the sync mechanism: dropping the z-push diff backend and rebuilding it from scratch in some other way.

    ActiveSync was designed for keeping a rolling window (3 days to 2 weeks) of emails locally, with the rest searchable on the server. This is how it was implemented in this case too. It was never intended to cater for many thousands of emails. Most clients cannot handle the space requirements to sync that much email, though I understand Outlook would not have the same space restrictions as other clients.

     
