From: Patrick O. <pat...@gm...> - 2011-01-04 09:04:53
Hello!

Let me add the SyncEvolution list, because the technical information may be relevant. For those who see this for the first time, it started with an open letter that I sent to the OpenSync list asking whether it really still makes sense to continue with two different projects instead of focusing on one:
http://sourceforge.net/mailarchive/forum.php?thread_name=20110103221846.GA21876%40foursquare.net&forum_name=opensync-devel

I was suggesting that SyncEvolution has the better baseline to continue from. Of course this requires further explanation, which this email is about. It is in reply to Graham because he raised several interesting technical questions.

On Mo, 2011-01-03 at 16:57 +0000, Graham Cobb wrote:
> On Monday 03 January 2011 13:32:52 Patrick Ohly wrote:
> > In this email I'd like to appeal to the OpenSync developers to
> > reconsider whether keeping OpenSync around really helps Linux and open
> > source syncing.
>
> Patrick,
>
> Thanks for your well considered thoughts.

[...]

> > Of course I am thinking of SyncEvolution here. It already works very
> > well for SyncML. I also have support for additional protocols, which I
> > will be able to publish soon. I think it would be worthwhile successor
> > of OpenSync, but obviously I'll need help to cover all the use cases
> > that you were shooting for with OpenSync.
>
> I do not have a good feel for what SyncEvolution can and cannot do: can you
> provide more information? One thing is the device access protocols, of
> course, but even if we all joined you and helped implement those, how would
> SyncEvolution compare with what OpenSync is intended to do?

SyncEvolution has grown organically over time (one could call it evolutionary...), instead of shooting for a grand design covering everything, like OpenSync did for 0.40. The main advantage is that there have been regular stable releases since the very beginning four years ago.
On the other hand, features for which there was no real need yet are still missing.

There have been different phases:
1. SyncML client for Evolution
2. SyncML client for additional storages (iPhone, Mac OS X, file)
3. backends contributed by external developers (Ove Kaaven: N900 calendar, Franz Knipp/m-otion.com: XMLRPC)
4. SyncML server (direct syncing with phones), both via Bluetooth and HTTP, using the Synthesis engine
5. non-SyncML protocols

The last point is the goal for SyncEvolution 1.2, in development right now. It still uses the Synthesis engine and everything that it provides (data conversion, conflict handling). SyncML is also still in use, but only as an internal protocol between two peers.

What a developer above the engine sees is the storage plugin (aka data source) interface. Conceptually, such a plugin must provide:
1. change tracking (otherwise only slow syncs work)
2. data import/export, either in the internal Synthesis format (field list) or in a backend specific text format that the engine understands

Further references:
* introduction to the Synthesis engine and its data conversion:
  http://syncevolution.org/development/pim-data-synchronization-why-it-so-hard
* convenience class for a data source which has id + revision string for each item and exchanges data as text:
  http://meego.gitorious.com/meego-middleware/syncevolution/blobs/master/src/syncevo/TrackingSyncSource.h
* base class with maximum freedom:
  http://meego.gitorious.com/meego-middleware/syncevolution/blobs/master/src/syncevo/SyncSource.h
* fully functional example backend:
  http://meego.gitorious.com/meego-middleware/syncevolution/trees/master/src/backends/file
* configuration handling:
  http://syncevolution.org/development/configuration-handling
* communication patterns and server mode:
  http://syncevolution.org/development/direct-synchronization-aka-syncml-server
* local sync:
  http://www.mail-archive.com/syn...@sy.../msg01419.html

Ove and Franz were able to implement their backends with very little assistance, so the documentation can't be that bad, although there's no doubt that documentation could always be better.

> For example, does it only handle pair-wise sync? If so, what is the
> implication of that restriction (do you have to designate one of your devices
> as master and sync everything else to it)?

Yes, sync is always between two peers. One storage should be the designated "master" copy of the data. Any data which cannot be stored by that "master" will get lost. The master could be in a capable system like EDS or Akonadi, or in the file backend, which can store anything that the sync engine itself can handle.

A sync topology is created by defining several of these 1:1 relationships. The master itself might be the client of another server, as long as there are no loops. There is currently no logic for keeping several of these peers in sync, but that could be added at a meta level (keep syncing until all changes have been distributed).

Unknown extensions are currently dropped. This could be changed, but leads to additional questions that would need to be sorted out: should such extensions be sent to all peers, or just the one who created them? What if different peers have a different understanding of "X-FOOBAR"?

It is safer to limit syncing to the data that is fully understood and modeled in the Synthesis configuration file. Currently this covers vCard 3.0 + extensions and iCalendar 2.0 (including UID + RECURRENCE-ID, VTIMEZONE, VALARM, but not attachments).

> Does it handle devices that have bugs or limited implementations (issues like
> capabilities and merging)?

Yes. The Synthesis engine has dealt with that for 10 years and contains a large collection of tools that can be used to deal with such problems, ranging from different data profiles to a full scripting language that can modify data on-the-fly.
The Synthesis engine uses capability descriptions to determine which properties are supported by an unknown peer and has smart merging techniques for individual properties.

For example, consider the case where a VEVENT was modified like this:
1. event in sync on peer A and B
2. DESCRIPTION is extended on peer A
3. SUMMARY is modified on peer B
4. syncing recognizes the conflict and resolves it by using the SUMMARY of peer B (because the item on B is more recent) and the DESCRIPTION of A (because the description of B is a subset of it)

These two properties are handled differently because the conflict resolution policy is configured differently to reflect the difference between single-line and multi-line text.

> What about missing unique IDs?

In such a case only slow syncs are possible. The Synthesis data modeling defines which properties are compared to find pairs. The drawback of a slow sync is that data removed on one side will be recreated.

I have thought a bit about that over Christmas, because I am now in that situation: I can modify the address book on my FRITZ!Box 7390 router, but it is an XML file with no unique identifier for each entry.

My idea is to do synchronization in multiple steps:
1. keep a local mirror of all contacts
2. do a slow sync against that mirror to find pairs; items in the mirror which have no corresponding entry on the router can be removed
3. two-way sync between the mirror and my master data
4. upload a copy of the mirror to the router

The simpler alternative would be to pick some properties and use those as a key, perhaps with hashing to keep the key size small.

> Conflicts?

See above. Client-wins/server-wins/most-recent-wins are all configurable. SyncEvolution itself uses most-recent-wins, with smart merging of some properties.

> And the
> many other issues that OpenSync has been adding complexity while trying to
> solve?

We would need to list those, but I'm fairly sure that much of it has been considered already.
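
To make the smart merging from the VEVENT example above more concrete (most recent wins for SUMMARY, the superset wins for DESCRIPTION), here is a minimal Python sketch. It is only an illustration and not how the Synthesis engine implements or configures this: the Event class, the integer "modified" timestamp and the plain substring test for "subset" are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class Event:
    """Hypothetical, heavily simplified stand-in for a VEVENT."""
    summary: str       # single-line text
    description: str   # multi-line text
    modified: int      # assumed last-modified timestamp of the item


def merge_events(a: Event, b: Event) -> Event:
    """Resolve a conflict between two copies of the same event."""
    newer, older = (a, b) if a.modified >= b.modified else (b, a)

    # Single-line text: the more recent item wins.
    summary = newer.summary

    # Multi-line text: if the newer text is contained in the older one,
    # the older text is a superset and is kept; otherwise fall back to
    # most-recent-wins.
    if newer.description in older.description:
        description = older.description
    else:
        description = newer.description

    return Event(summary, description, newer.modified)


# Peer A extended the DESCRIPTION, peer B changed the SUMMARY later.
peer_a = Event("Team meeting", "Agenda: budget\nBring laptops", modified=100)
peer_b = Event("Team meeting (room 4)", "Agenda: budget", modified=200)
merged = merge_events(peer_a, peer_b)
print(merged.summary)
print(merged.description)
```

With these two inputs the merged event takes the SUMMARY from peer B and the DESCRIPTION from peer A, as in step 4 of the example. A real policy would also have to deal with descriptions that were edited on both sides, which this sketch simply resolves in favor of the more recent item.
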
> In summary, I would like to understand why you feel that redirecting our
> efforts to SyncEvolution has any greater chance of success in solving the hard
> problems of syncing.

My own summary, more at a meta level than the details above:
* don't reinvent the wheel, use a mature engine (Synthesis)
* add features in small steps (more manageable, immediately useful)

--
Bye, Patrick Ohly
--
Pat...@gm...
http://www.estamos.de/