Thread: mbsync slow gmail All Email folder with +100K emails
From: Martin C. <mar...@gm...> - 2022-04-18 08:34:48
I am using mbsync (isync 1.4.4) on Ubuntu to sync Gmail over IMAP to my local machine. The Gmail All Mail folder has 100K+ emails and takes a very long time to sync.

Looking at the output of mbsync --debug, there is not a lot of network traffic, but it appears that for EVERY email in the All Mail folder there are entries like this:

```
entry (1,1,16,) ...
uid=5750 flags=S size=0 tuid=? ...
pair (53571,21018) ...
F: * 21018 FETCH (UID 53571 FLAGS (\Seen))
```

The entire output of the --debug command is approx. 24 MB, which I have saved as a log file. So it appears that the slowness is local.

Any idea how I can speed things up?
From: Oswald B. <osw...@gm...> - 2022-04-18 09:13:32
On Mon, Apr 18, 2022 at 10:34:29AM +0200, Martin Clausen wrote:
>Looking at the output from mbsync --debug there is not a lot of
>network traffic, [...] So it appears that the slowness is local.
>
that's weird, as mbsync is generally network-bound.

>Any idea how I can speed things up?
>
not without you finding out where the time is spent.
top/iotop/iftop for starters - where does mbsync stand out?
finding out during which execution phase the time is spent is the next
step; `strace -tt mbsync ...` or `mbsync -D ... | ts "%H:%M:%.S"` would
provide clues.
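One way to read a log produced by `mbsync -D ... | ts "%H:%M:%.S"` is to look for the largest pauses between consecutive lines. The following is a minimal sketch, not part of isync: the function names `parse_stamp` and `slowest_gaps` are made up for illustration, and it assumes every log line starts with an `HH:MM:SS.ffffff` timestamp followed by a space, as `ts` emits with that format string.

```python
import heapq
from typing import Iterable, List, Tuple


def parse_stamp(stamp: str) -> float:
    """Convert an 'HH:MM:SS.ffffff' timestamp to seconds since midnight."""
    hours, minutes, seconds = stamp.split(":")
    return int(hours) * 3600 + int(minutes) * 60 + float(seconds)


def slowest_gaps(lines: Iterable[str], top: int = 5) -> List[Tuple[float, str]]:
    """Return the `top` largest (gap_seconds, line_text) pairs, largest first.

    The gap attached to a line is the time that passed since the previous
    timestamped line, i.e. how long the program was busy or waiting
    before it printed this line.
    """
    gaps = []
    previous = None
    for line in lines:
        stamp, _, rest = line.rstrip("\n").partition(" ")
        try:
            now = parse_stamp(stamp)
        except ValueError:
            continue  # skip lines that do not start with a timestamp
        if previous is not None:
            delta = now - previous
            if delta < 0:
                delta += 24 * 3600  # the log ran past midnight
            gaps.append((delta, rest))
        previous = now
    return heapq.nlargest(top, gaps, key=lambda gap: gap[0])
```

Calling `slowest_gaps(open("mbsync.log", errors="replace"))` (the file name is hypothetical) lists the lines that were preceded by the longest waits. If those lines are mostly server responses, the time is probably spent waiting on the network rather than on local processing.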
From: Martin C. <mar...@gm...> - 2022-04-18 09:35:22
OK. I captured the output of mbsync -D gamil | ts "%H:%M:%.S". It is approx. 35 MB. I don't have much experience with analysing this kind of issue. What should I look for? It looks to me like the time is spent on a gigantic number of the 4 operations I listed, all for the All Mail folder.
From: Oswald B. <osw...@gm...> - 2022-04-19 09:45:32
On Mon, Apr 18, 2022 at 11:34:59AM +0200, Martin Clausen wrote:
>OK. I captured the output of mbsync -D gamil | ts "%H:%M:%.S". It is approx
>35 mb.
>
it should compress well. mail it to me privately.
From: Oswald B. <osw...@gm...> - 2022-04-19 11:45:29
On Mon, Apr 18, 2022 at 11:34:59AM +0200, Martin Clausen wrote:
>Looks to me like the time is spent on a gigantic number of the 4
>operations I listed all for the All Mail folder.
>
as i expected, by far most time is spent receiving IMAP FETCH responses.
so the network is definitely the limiting factor, and it's an expected
effect of syncing 130k+ messages. to quote myself from a recent thread:

>> the base imap protocol that mbsync uses isn't very efficient for
>> resyncs. a proper fix would require using the QRESYNC extension, but
>> that's a major project (also in the TODO). as a workaround, you can
>> do partial syncs, though it may be challenging to automate that
>> sufficiently to make it both reliable and convenient when working
>> with multiple clients.

but note that gmail doesn't support QRESYNC anyway. it supports
CONDSTORE, for which i also have a TODO item, but that won't help you
anytime soon.

so practically, you need to use --pull-new and --push, and do full syncs
only occasionally.

note that you're syncing both "All Mail" and some labels (== virtual
folders), which causes duplicate work. for performance, it would be best
to exclude All Mail entirely, and sync only "views" that are currently
relevant. you may consider the MaxMessages option as well.
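The "partial syncs mostly, full syncs only occasionally" advice could be automated roughly as follows. This is a minimal sketch under stated assumptions, not something shipped with isync: the helper name `mbsync_args`, the one-day default interval, and the idea of tracking the time of the last full sync are all illustrative. It also assumes that running `mbsync <channel>` without operation flags performs a full sync, which depends on the channel's `Sync` setting in the configuration.

```python
from datetime import datetime, timedelta
from typing import List, Optional


def mbsync_args(
    channel: str,
    last_full: Optional[datetime],
    now: datetime,
    full_every: timedelta = timedelta(days=1),  # illustrative default
) -> List[str]:
    """Build the mbsync command line for one scheduled run.

    A full sync is chosen when none has been recorded yet or the last one
    is older than `full_every`; otherwise only new messages are pulled and
    local changes are pushed.
    """
    if last_full is None or now - last_full >= full_every:
        # Assumption: no operation flags means the channel's configured
        # Sync operations apply, which is a full sync by default.
        return ["mbsync", channel]
    return ["mbsync", "--pull-new", "--push", channel]
```

The returned list can be passed to `subprocess.run(args, check=True)`, and the caller is responsible for storing the time of the last successful full sync. As noted in the message above, changes made from other clients (flag changes, deletions) are only picked up by the full syncs, so the interval is a trade-off between speed and freshness.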
From: Martin C. <mar...@gm...> - 2022-04-19 12:00:28
Thank you so much for looking into this, and for the tips on how to work around it.