From: <p.d...@gm...> - 2017-02-27 07:58:15
Hi Alan

Sorry, I still haven't had a chance to look at your code (poorly 4-year-old over the weekend). But you are definitely correct not to use a sleep to avoid a race. Particularly when the system is under strain, all bets are off regarding timings. And like you said, it's just not good practice.

Phil

From: Alan W. Irwin
Sent: 27 February 2017 06:51
To: p.d...@gm...
Cc: PLplot development list
Subject: Re: [Plplot-devel] The status of the wxwidgets IPC development

On 2017-02-25 17:44-0800 Alan W. Irwin wrote:

> However, I certainly agree mutual use of the same resource (shared
> memory) is a tricky world. And now that you have encouraged me to
> think about races, I discovered there is indeed a race condition that
> could explain this bug. I have now worked around that race (commit
> 4e6932e) and please see that commit message for more commentary
> concerning this type of race. Assuming I really did understand this
> race, I am virtually positive my simple crude fix will deal with it
> without any noticeable reduction in speed. However, time will tell about
> that.

Hi Phil:

I think the 10 ms sleep used in the above crude workaround would likely always work going forward, because it would be pretty unusual for the OS scheduler to deny a process access to the CPU for essentially 10 million instructions. Nevertheless, that argument depends on processor speed and on assumptions about scheduler details. Having thought a lot more about this, I would far prefer to avoid sleep workarounds for race conditions, not only on these grounds but also simply as a matter of good IPC style.

Therefore, I plan to turn the current two-semaphore approach into a three-semaphore approach. m_wsem and m_rsem will continue to be used for the details of a complete transfer of an array of bytes, but an additional m_tsem semaphore (where "t" stands for transfer) will be used so that only one such transfer of bytes can be done at a given time.

As far as I can tell, this change means I can completely drop the moveBytesReaderReversed variant of moveBytesWriter and the moveBytesWriterReversed variant of moveBytesReader, which is a really nice simplification. Furthermore, I plan to rename moveBytesWriter to transmitBytes and moveBytesReader to receiveBytes, where both transmitBytes and receiveBytes will be used by either of -dev wxwidgets or wxPLViewer as needed, depending simply on the direction of data flow.

The additional m_tsem semaphore will be initialized to 1; transmitBytes will start by calling sem_wait on that semaphore and will end by calling sem_post on that semaphore. That simple change means that if wxPLViewer uses transmitBytes to send data that is received by -dev wxwidgets with receiveBytes, and -dev wxwidgets then follows up by calling transmitBytes to send data back that is received by wxPLViewer with a call to receiveBytes, that second use of transmitBytes will be halted by the sem_wait until the first use of transmitBytes is entirely completed. That is, any call to transmitBytes by either side of the IPC connection cannot possibly race with a previous call to that routine by either side.

Anyhow, I like this pure semaphore way to avoid the race condition much more than the 10 ms sleep, and I hope to get it completely implemented tomorrow.

Alan
__________________________

Alan W. Irwin

Astronomical research affiliation with Department of Physics and
Astronomy, University of Victoria (astrowww.phys.uvic.ca).
Programming affiliations with the FreeEOS equation-of-state
implementation for stellar interiors (freeeos.sf.net); the Time
Ephemerides project (timeephem.sf.net); PLplot scientific plotting
software package (plplot.sf.net); the libLASi project
(unifont.org/lasi); the Loads of Linux Links project (loll.sf.net);
and the Linux Brochure Project (lbproject.sf.net).
__________________________

Linux-powered Science
__________________________
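
For readers following the thread, the three-semaphore plan described in the message above could look roughly like the C sketch below. It is only an illustration under several assumptions, not the PLplot implementation: two threads stand in for the two processes, unnamed POSIX semaphores (sem_init) stand in for whatever semaphores the real code uses, and the per-chunk handshake between m_rsem and m_wsem is a guess, since the message does not spell it out. Channel, CHUNK, channel_init, viewer_side, viewer_got and run_demo are hypothetical names, and error handling (for example sem_wait returning EINTR) is omitted.

```c
#include <assert.h>
#include <pthread.h>
#include <semaphore.h>
#include <stddef.h>
#include <string.h>

enum { CHUNK = 4 };        /* hypothetical size of the shared area */

typedef struct {
    sem_t m_wsem;          /* assumed role: receiver posts "chunk consumed" */
    sem_t m_rsem;          /* assumed role: transmitter posts "chunk ready" */
    sem_t m_tsem;          /* from the message: starts at 1, one transfer at a time */
    char  area[CHUNK];     /* stand-in for the shared memory area */
} Channel;

static void channel_init(Channel *c)
{
    /* pshared = 0 because this sketch uses threads, not processes */
    sem_init(&c->m_wsem, 0, 0);
    sem_init(&c->m_rsem, 0, 0);
    sem_init(&c->m_tsem, 0, 1);
}

static void transmitBytes(Channel *c, const char *src, size_t n)
{
    sem_wait(&c->m_tsem);              /* block until any earlier transfer is done */
    for (size_t done = 0; done < n; ) {
        size_t len = n - done < CHUNK ? n - done : CHUNK;
        memcpy(c->area, src + done, len);
        sem_post(&c->m_rsem);          /* tell the receiver a chunk is ready */
        sem_wait(&c->m_wsem);          /* wait for the receiver's acknowledgement */
        done += len;
    }
    sem_post(&c->m_tsem);              /* transfer entirely completed */
}

static void receiveBytes(Channel *c, char *dst, size_t n)
{
    for (size_t done = 0; done < n; ) {
        size_t len = n - done < CHUNK ? n - done : CHUNK;
        sem_wait(&c->m_rsem);          /* wait for a chunk */
        memcpy(dst + done, c->area, len);
        sem_post(&c->m_wsem);          /* acknowledge it */
        done += len;
    }
}

static Channel chan;
static char viewer_got[16];

/* Plays the wxPLViewer role: receive first, then send a reply back. */
static void *viewer_side(void *arg)
{
    (void)arg;
    receiveBytes(&chan, viewer_got, 6);
    transmitBytes(&chan, "pong!", 6);
    return NULL;
}

/* Plays the -dev wxwidgets role; returns 0 if both messages arrive intact. */
static int run_demo(void)
{
    char device_got[16] = {0};
    pthread_t t;

    channel_init(&chan);
    if (pthread_create(&t, NULL, viewer_side, NULL) != 0)
        return -1;
    transmitBytes(&chan, "ping!", 6);
    receiveBytes(&chan, device_got, 6);
    pthread_join(t, NULL);
    return (strcmp(viewer_got, "ping!") == 0 &&
            strcmp(device_got, "pong!") == 0) ? 0 : 1;
}
```

In this sketch, without m_tsem the replying side could post its acknowledgement on m_wsem and then, inside its own transmitBytes, immediately consume that same post before the first transmitter wakes up. Holding m_tsem for the whole of transmitBytes prevents that, which is the property the message argues for. Calling run_demo() from a main function (built with pthread support on a POSIX system) exercises one round trip in each direction.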