From: Mark E. <ma...@mp...> - 2013-08-26 14:40:48
|
Hi Chris, here's some more. filesync-escape-name.diff Replace non-reversible substitution of characters that aren't valid as filenames, with uri escaping. The old scheme resulted in unnecessary uid changes, and could have given collisions and duplications. filesync-control-flow.diff Errors we're being reported further up the call chain than necessary in the commit process, giving a confused idea of which function was actually in control. Mark |
From: Graham C. <g+o...@co...> - 2013-08-26 17:32:44
|
On 26/08/13 15:40, Mark Ellis wrote: > filesync-escape-name.diff > Replace non-reversible substitution of characters that aren't valid as > filenames, with uri escaping. The old scheme resulted in unnecessary uid > changes, and could have given collisions and duplications. I fairly strongly object to this change. I seem to remember, a very, very long time ago (probably in Armin's day) a discussion of this (on the mailing list, or in a bug report, I don't remember). I think the decision to use a non-reversible escape was made deliberately. The reason was to make the filenames as human-usable as possible. File-sync is not intended to be used for interchange (use a proper protocol!), it is intended to be convenient and easy for humans, who may exchange (.ics, etc) files with each other. By far the biggest problem is spaces. Spaces are allowed in all filesystems we target but not in URLs. They are also very common in the names of calendar and contact files. We really, really, really don't want to fill the user's directory up with %20's! The encoding has never been a problem in the real world. But, if the decision is that we should move to a reversible encoding then we need to use something a bit more clever. In that case, I suggest specifying a reserved_chars_allowed (to g_uri_escape_string) which includes at least space (probably " #$&(),", and possibly even "?!", although they are escaped today). By the way, other programs (like my OWASync), and indeed human beings, use different (also non-reversible) algorithms. These have always inter-worked successfully with OpenSync. I don't see the problem this is trying to fix. Graham |
From: Mark E. <ma...@mp...> - 2013-08-26 17:38:51
Attachments:
signature.asc
|
On Mon, 2013-08-26 at 19:15 +0200, Graham Cobb wrote: > On 26/08/13 15:40, Mark Ellis wrote: > > filesync-escape-name.diff > > Replace non-reversible substitution of characters that aren't valid as > > filenames, with uri escaping. The old scheme resulted in unnecessary uid > > changes, and could have given collisions and duplications. > > I fairly strongly object to this change. > > I seem to remember, a very, very long time ago (probably in Armin's day) > a discussion of this (on the mailing list, or in a bug report, I don't > remember). I think the decision to use a non-reversible escape was made > deliberately. > > The reason was to make the filenames as human-usable as possible. > File-sync is not intended to be used for interchange (use a proper > protocol!), it is intended to be convenient and easy for humans, who may > exchange (.ics, etc) files with each other. > > By far the biggest problem is spaces. Spaces are allowed in all > filesystems we target but not in URLs. They are also very common in the > names of calendar and contact files. We really, really, really don't > want to fill the user's directory up with %20's! > > The encoding has never been a problem in the real world. But, if the > decision is that we should move to a reversible encoding then we need to > use something a bit more clever. > > In that case, I suggest specifying a reserved_chars_allowed (to > g_uri_escape_string) which includes at least space (probably " #$&(),", > and possibly even "?!", although they are escaped today). > > By the way, other programs (like my OWASync), and indeed human beings, > use different (also non-reversible) algorithms. These have always > inter-worked successfully with OpenSync. I don't see the problem this > is trying to fix. > > Graham I have no problem with an alternative scheme, but to my mind this is currently broken. For example, create a group to sync evolution and file. Evolution commonly uses @ in it's uid scheme. Sync this group, anything that comes from evo to file has @ replaced with _ in the resulting filename. This has changed the uid, since file-sync stores the uid in the filename. Run another sync, anything that had @ replaced with _ gets sent as a deletion and an addition. Now, I don't think it's up to opensync to change item uid's for no reason, so this must be wrong. If file-sync is supposed to be usable in the wild, then this kind of behaviour will confuse and worry people at the least. If it's only for testing, then it's introducing artificial artifacts into the testing process, which is also wrong. Mark |
From: Graham C. <g+o...@co...> - 2013-08-27 21:32:43
|
On 26/08/13 18:38, Mark Ellis wrote: > For example, create a group to sync evolution and file. Evolution > commonly uses @ in it's uid scheme. Sync this group, anything that > comes from evo to file has @ replaced with _ in the resulting > filename. This has changed the uid, since file-sync stores the uid > in the filename. Run another sync, anything that had @ replaced > with _ gets sent as a deletion and an addition. Mark, I haven't touched the code of OpenSync for 5 years so I am sure your analysis is right. I thought I remembered that there was a table to map UIDs between plugins but I am probably just confused! If we need a reversible mapping then let's just agree on the reserved_chars_allowed list to provide to g_uri_escape_string so that special characters don't get turned into hex codes unnecessarily. I suggest using " #$&(),?!@'". They are all characters that appear fairly often in subject lines of meetings or in names (both of which often get used in file names for file-sync). Graham |
From: Chris F. <cd...@fo...> - 2013-09-13 17:49:10
|
On Tue, Aug 27, 2013 at 10:32:32PM +0100, Graham Cobb wrote: > If we need a reversible mapping then let's just agree on the > reserved_chars_allowed list to provide to g_uri_escape_string so that > special characters don't get turned into hex codes unnecessarily. > > I suggest using " #$&(),?!@'". They are all characters that appear > fairly often in subject lines of meetings or in names (both of which > often get used in file names for file-sync). Is there concensus in the form of a patch that I need to apply yet? :-) No rush, just checking. I don't want to be the one holding up production. - Chris |
From: Mark E. <ma...@mp...> - 2013-09-18 12:26:07
Attachments:
signature.asc
|
On Fri, 2013-09-13 at 19:49 +0200, Chris Frey wrote: > On Tue, Aug 27, 2013 at 10:32:32PM +0100, Graham Cobb wrote: > > If we need a reversible mapping then let's just agree on the > > reserved_chars_allowed list to provide to g_uri_escape_string so that > > special characters don't get turned into hex codes unnecessarily. > > > > I suggest using " #$&(),?!@'". They are all characters that appear > > fairly often in subject lines of meetings or in names (both of which > > often get used in file names for file-sync). > > Is there concensus in the form of a patch that I need to apply yet? :-) > > No rush, just checking. I don't want to be the one holding up production. > Hi Chris Actually, I was thinking about what Graham said initially, and it's left me in a dilemma. The encoding needs to be reversible, no doubt about that, and uri % encoding is nice and easy and recognisable, so I'm happy with that. Using the glib function, with Graham's additional allowed characters, is nice, easy, quick and safe, but as Graham said, does limit users in what they can call new files they want to sync. I've written an alternative which only escapes %/:*\\>< but it assumes everything is utf8 and/or ascii. So my response is, would we rather have something easy with a bit less functionality, or more functionality with built in assumptions that might have unforeseen consequences ? Mark |
From: Chris F. <cd...@fo...> - 2013-09-19 22:07:55
|
On Wed, Sep 18, 2013 at 01:25:46PM +0100, Mark Ellis wrote: > Using the glib function, with Graham's additional allowed characters, is > nice, easy, quick and safe, but as Graham said, does limit users in what > they can call new files they want to sync. I've written an alternative > which only escapes %/:*\\>< but it assumes everything is utf8 and/or > ascii. > > So my response is, would we rather have something easy with a bit less > functionality, or more functionality with built in assumptions that > might have unforeseen consequences ? If there are files that cannot be synced, just make sure there's a prominent message in the log to that effect. Maybe even an error, so that people don't mistakenly believe that their data is synced when it's not. Otherwise, I'll let you guys figure out the details. :-) - Chris |
From: Graham C. <g+o...@co...> - 2013-09-18 19:26:32
|
On 18/09/13 13:25, Mark Ellis wrote: > Actually, I was thinking about what Graham said initially, and it's left > me in a dilemma. Oops, sorry about that, Mark. Given the effort you are putting in on Opensync, I really don't want to slow you down! > Using the glib function, with Graham's additional allowed characters, is > nice, easy, quick and safe, but as Graham said, does limit users in what > they can call new files they want to sync. I've written an alternative > which only escapes %/:*\\>< but it assumes everything is utf8 and/or > ascii. > > So my response is, would we rather have something easy with a bit less > functionality, or more functionality with built in assumptions that > might have unforeseen consequences ? I am happy with either. If you are happy with the allowed characters I suggested then I am happy with those constraints. I think I covered the most likely ones people will use, except for : but we don't have much choice about that as it is invalid under Windows. I am also happy with the utf8 assumption, but you may be right to be cautious. Bottom line: whichever you prefer. My vote is go for the glib function with a fairly permissive list of allowed characters and don't worry about the other cases. Thanks for taking the time to think about this and sorry for holding you up so much. Graham |
From: Mark E. <ma...@mp...> - 2013-09-25 15:04:40
|
On Wed, 2013-09-18 at 21:26 +0200, Graham Cobb wrote: > On 18/09/13 13:25, Mark Ellis wrote: > > Actually, I was thinking about what Graham said initially, and it's left > > me in a dilemma. > > Oops, sorry about that, Mark. Given the effort you are putting in on > Opensync, I really don't want to slow you down! > Glad to have some input, there aren't enough people around here :) Right, I've gone with uri encoding with exceptions, I've reversed the order of the patches to make life easier for myself while I was fiddling, so here you go Chris. filesync-control-flow.diff Errors we're being reported further up the call chain than necessary in the commit process, giving a confused idea of which function was actually in control. filesync-escape-name.diff Replace non-reversible substitution of characters that aren't valid as filenames, with uri escaping with exceptions for commonly used characters in filenames. The old scheme resulted in unnecessary uid changes, and could have given collisions and duplications. Mark |
From: Graham C. <g+o...@co...> - 2013-09-29 17:59:28
|
On 25/09/13 16:04, Mark Ellis wrote: > Right, I've gone with uri encoding with exceptions, I've reversed the > order of the patches to make life easier for myself while I was > fiddling, so here you go Chris. They look good to me. Thanks, Mark, for your work on this. Graham |
From: Chris F. <cd...@fo...> - 2013-10-10 19:24:52
|
On Wed, Sep 25, 2013 at 04:04:10PM +0100, Mark Ellis wrote: > filesync-control-flow.diff > Errors we're being reported further up the call chain than necessary in > the commit process, giving a confused idea of which function was > actually in control. Thanks Mark! Committed. > filesync-escape-name.diff > Replace non-reversible substitution of characters that aren't valid as > filenames, with uri escaping with exceptions for commonly used > characters in filenames. The old scheme resulted in unnecessary uid > changes, and could have given collisions and duplications. Committed as well, with a small change to fix the NULL argument in g_uri_escape_string(). I changed it to TRUE, to match the other call you had. - Chris |
From: Mark E. <ma...@mp...> - 2013-10-14 15:35:41
Attachments:
signature.asc
|
> > Committed as well, with a small change to fix the NULL argument in > g_uri_escape_string(). I changed it to TRUE, to match the other > call you had. > > - Chris > Well spotted ! |