From: Henrik /K. <kaa...@us...> - 2009-09-21 16:41:52
Dear all,

I am not familiar enough with the inner workings of OpenSync to comment on the details ))-:
However, I do think a fix is necessary.
It is great to freeze the API for 0.39/0.40, but it would be great if it would actually work too!

/Henrik

Christian Hilgers wrote:
> Graham Cobb wrote:
>
>> The risk is that the real fix may involve an API change. However, I think
>> that can be done with (i) no changes needed in plugins which do not do the
>> SyncML-style batching, and (ii) the API changes would be limited to some
>> additional functions, not changes to the existing API.
>
> This should be possible without an API break, so no stopper for releasing 0.39
>
> Christian

--

Graham Cobb wrote:
> On Friday 18 September 2009 01:22:25 Henrik /KaarPoSoft wrote:
>
>> Is there any progress / comments / plans on ticket 1078?
>> See also comments on the mailing list regarding
>> "Remove pendingLimit from OSyncQueue".
>
> I was hoping someone might comment on my suggestion back in April (attached).
> On the other hand, I have not done anything about it since that message --
> not even drafted the API necessary to implement my suggestion. My fault,
> sorry.
>
>> As far as I can see, this problem prevents any sync with SyncML devices,
>> as the SyncML protocol may have several changes in one message,
>> which I believe OpenSync cannot handle now...
>
> My proposal is to disable the timeout handling completely for now and for me
> (or someone else if they wish) to implement a real fix over the next few
> months. I am not going to get a chance to work on this for the next several
> weeks, unfortunately (although I can check in a change to just disable
> timeout handling straight away if that is what people want).
>
> The risk is that the real fix may involve an API change. However, I think
> that can be done with (i) no changes needed in plugins which do not do the
> SyncML-style batching, and (ii) the API changes would be limited to some
> additional functions, not changes to the existing API.
>
> Graham
>
> ------------------------------------------------------------------------
>
> Subject: Re: [Opensync-devel] [RFC] Remove pendingLimit from OSyncQueue
> From: Graham Cobb <g+o...@co...>
> Date: Sat, 25 Apr 2009 17:22:00 +0100
> To: Daniel Gollub <go...@b1...>
> CC: Michael Bell <mic...@cm...>, Opensync Devel <ope...@li...>
>
> On Tuesday 14 April 2009 13:39:43 Daniel Gollub wrote:
>
>> On Tuesday 14 April 2009 02:25:29 pm Michael Bell wrote:
>>
>>> I don't think that a dependency between two pipes is a good idea but I
>>> understand that IPC has limits. I also used IBM MQ series in the past
>>> which has nothing to do with IPC. So is our message queue implementation
>>> a real IPC implementation which requires limits?
>>
>> No idea - maybe Graham can answer this.
>
> Sorry about the delay -- I have been away and have not had a chance to spend
> any time on this until today.
>
> I understand the problem, and the current timeout/limit mechanism definitely
> deadlocks with the way the async plugins work today (I am ignoring any
> changes suggested in this thread as I am not sure I understand exactly what
> has been proposed).
>
> The pendingLimit is there to allow timeouts to work properly. If the
> pendingLimit is just removed, the timeouts will break as they did before (the
> timeout starts counting at the wrong time and so if there are a large number
> of transactions queued up the timeout fires too early). But let's review
> what timeouts are for and how we would **like** them to behave.
>
> As I understand it, the main purpose of the timeouts is to deal with cases
> where the remote device (or some intermediary) has got stuck and is no longer
> proceeding with transactions (but not returning errors). It also helps with
> cases where the plugin tries to send a message but does not notice that there
> is an error (e.g. a socket has been disconnected) and it will never get a
> reply. This is, of course, a plugin bug but it is useful that the timeout
> mechanism also protects against that problem.
>
> There is a secondary use for timeouts and that is to protect against problems
> in the IPC mechanism itself -- e.g. a process has stopped and is no longer
> reading the pipe. This is a smaller consideration and can be handled by
> mechanisms within the IPC itself if necessary, so let's ignore it for now.
>
> There seem to be three plugin architectures which are relevant (I thought,
> when I was rewriting the timeout code, that there were only the first two but
> I now realise there is a third):
>
> 1) Synchronous plugin (most plugins are like this, I believe): when a
> transaction is received by the plugin (e.g. Connect or Get Changes or Commit)
> it does synchronous writes to send messages to the device and synchronous
> reads to get messages from the device. If the device stops responding, the
> thread executing the plugin will just wait. No other plugin messages will be
> handled while it is waiting.
>
> 2) Asynchronous but transaction-at-a-time plugin (maybe there are none like
> this): when the transaction is received, the plugin sends the message to the
> device and then returns. The thread polls the socket and resumes when the
> response is received. However, other plugin messages can be handled while it
> is waiting -- so further updates will cause further messages to be sent to
> the device. If the device stops responding, the engine will keep sending
> updates which the plugin will send, although it is not seeing any responses.
>
> 3) Asynchronous, multiple transaction plugin (like SyncML): when the
> transaction is received, it is stored internally to the plugin. Nothing is
> sent until a message fills up or the last transaction is received. Then all
> updates are sent and, when responses are received, the updates are completed.
> If the device stops responding then all or some updates will not receive a
> response.
>
> For 1 and 2, the timeout is protecting each single commit and the value should
> be set based on the time needed for that transaction. In the case of 2, this
> means the pendingLimit is needed -- it limits the number of updates that
> might already be queued ahead of this one and so allows the timeout value to
> be calculated (i.e. pendingLimit * maximum time for one update).
>
> For 3, however, it is much harder. One option would be to set the timeouts
> for each commit (and the commit_all) based on the time the device needs to
> complete a maximum sized message of updates. On the other hand, that doesn't
> allow for the fact that the OpenSync engine itself might take some time (in
> complex cases) to even provide enough updates to fill a message: timeouts
> were not intended to have to take into account OpenSync engine processing.
>
> Another option for 3 is to set no timeouts at all on the individual commits,
> but set a timeout on the commit_all. The problem with that is that the
> timeout value for the commit_all is potentially unlimited (it is not limited
> to a single message of commits as the commit_all will start as soon as the
> commits have all been queued and hundreds of messages may have been sent to
> the device).
>
> On the other hand, the plugin itself knows what is going on. So, I think the
> best option for case 3 is for the plugin itself to control the timeouts. I
> suggest that in this case, there are no timeouts on the commit or commit_all,
> but that the plugin itself sets a timeout when it has assembled a message and
> sent it to the device. I.e. add some sort of OsyncStartTimeout(int timeout)
> and OSyncStopTimeout() calls. The plugin would start the timeout when the
> first message was sent to the device. Whenever a response is received, the
> timeout would be stopped and, if there were one or more messages still
> waiting for responses, it would be started again. If the timeout actually
> fires, then the OSyncQueue code completes all pending operations with a
> timeout error (just as in the existing timeout processing).
>
> This does mean that plugins using this third architecture have to
> have some extra complexity. But I can't see an alternative, if we want to
> keep the timeout protection. Of course, we could decide that for this
> release of OpenSync we disable timeout processing altogether -- and add
> it back in later (with additions to the API at that time).
>
> Does anyone have an alternative suggestion? If not, I will spec up a
> suggested API for the timeout operations.
>
> Graham
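
For reference, the two timeout schemes described in the attached message can be sketched roughly as follows. This is only an illustration under assumptions, not OpenSync code: plugin_timer, timer_start, timer_stop, on_message_sent, on_response, timer_expired and queue_timeout are hypothetical names, time is passed in as a plain count of seconds instead of being read from a real clock, and "complete all pending operations with a timeout error" is reduced to clearing a counter.

```c
#include <assert.h>
#include <stdbool.h>

/* All names below are hypothetical; none of them is OpenSync API. */
typedef struct {
    unsigned      timeout;    /* seconds allowed for one device response */
    unsigned      in_flight;  /* messages sent but not yet answered */
    bool          armed;      /* true while the timer is running */
    unsigned long deadline;   /* absolute time at which the timer fires */
} plugin_timer;

static void timer_init(plugin_timer *t, unsigned timeout)
{
    t->timeout = timeout;
    t->in_flight = 0;
    t->armed = false;
    t->deadline = 0;
}

/* Stand-ins for the proposed start/stop timeout calls. */
static void timer_start(plugin_timer *t, unsigned long now)
{
    t->armed = true;
    t->deadline = now + t->timeout;
}

static void timer_stop(plugin_timer *t)
{
    t->armed = false;
}

/* Architecture 3: the plugin has just sent one assembled message. */
static void on_message_sent(plugin_timer *t, unsigned long now)
{
    t->in_flight++;
    if (!t->armed)            /* start only when the first message goes out */
        timer_start(t, now);
}

/* Architecture 3: a response arrived; stop, then restart if work remains. */
static void on_response(plugin_timer *t, unsigned long now)
{
    assert(t->in_flight > 0);
    t->in_flight--;
    timer_stop(t);
    if (t->in_flight > 0)
        timer_start(t, now);
}

/* Returns true if the timer fired; every pending operation would then be
 * completed with a timeout error (modelled here by clearing the counter). */
static bool timer_expired(plugin_timer *t, unsigned long now)
{
    if (t->armed && now >= t->deadline) {
        t->armed = false;
        t->in_flight = 0;
        return true;
    }
    return false;
}

/* Architectures 1 and 2: worst-case timeout for a queued update,
 * i.e. pendingLimit * maximum time for one update. */
static unsigned queue_timeout(unsigned pending_limit, unsigned max_update_time)
{
    return pending_limit * max_update_time;
}
```

With a 30 second timeout, sending two messages at t=0 and t=10 leaves the deadline at 30; a response at t=20 moves it to 50, so a device that stalls after the first response is detected 30 seconds after its last sign of life rather than 30 seconds after the first send.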