From: Alan W. I. <ir...@be...> - 2017-04-17 23:31:03
|
On 2017-04-16 16:55-0400 Hazen Babcock wrote:

> On 02/21/2017 09:26 PM, Alan W. Irwin wrote:
>>
>> I have now been in contact with the OP, Barry Warsaw of python.org, of
>> that thread who was quite helpful. For example, Barry told me that
>> Python is designed so it is frankly impossible for
>>
>> import Plframe
>> from Plframe import *
>>
>> to race (i.e., the first import completely finishes before the second
>> one starts). And I cannot find any other cases where Plframe is
>> imported. So I think the best bet for explaining this *.pyc
>> Python-generated file corruption is some unknown Python 2 bug that
>> does not have anything to do with races. I got the sense from
>> what Barry said that he feels Python 3 is now much more reliable than
>> Python 2. So this may be another instance of that general idea.
>>
>> Anyhow, I think the next step is to test whether this corruption
>> occurs for Python 3. (And if it does I get the sense that Barry would
>> be anxious to figure out what that Python 3 bug was.)
>>
>> @Hazen:
>>
>> This issue lends lots of additional motivation for making PLplot work
>> correctly with Python 3. So please go ahead and push your Python 3
>> topic as soon as it is in reasonable state, and we can mature it
>> further (if necessary) from there.
>
> Pushed.

Hi Hazen:

Thanks for pushing your work.

> One possibly important thing to note is that Python3 does not allow a
> mix of tabs and spaces in a file. So these changes are less extensive
> than they might appear as a lot of it was converting the examples to be
> all spaces. And hopefully our file formatting utility will not introduce
> regressions.

I agree we should use only blanks for Python indentation (just as in our styled C code) since using a mixture of tabs and blanks for this purpose is just an accident waiting to happen.
I have just checked, and scripts/style_source.sh and scripts/remove_trailing_whitespace.sh do not affect Python indentation, so your changed files should retain blank indentation from now on unless someone introduces tab indentation by mistake when editing these files.

In addition, I noticed there was still some tab indentation left in our Python source files. I found those using

    find . -name '*.py*' |grep -vE '(pyc|~)$' |xargs grep -l $'\t'

and fixed them in commit dd9d258. Both your commit and that commit build and run here for both Python 2 and Python 3 confirming your test results on Lubuntu. However, see below for PostScript difference issues that your changes introduced.

For the record, the following Debian Jessie packages were installed for my tests:

    python2.7
    libpython2.7-dev:amd64
    python-numpy
    python3.4
    libpython3.4-dev:amd64
    python3-numpy

Debian Jessie does not configure its etc-alternatives system for Python so by default CMake finds /usr/bin/python which is an (indirect) symbolic link to /usr/bin/python2.7 and python library and numpy that are consistent with that version. To try the Python 3 alternative, I used the CMake option

    -DPython_ADDITIONAL_VERSIONS:STRING=3

which found /usr/bin/python3.4 and python library and numpy that was consistent with that version.

So all appears well with the build system and the python build. Furthermore, the standard python examples execute without any obvious run-time error such as segfaults for both Python 2 and 3. However, there are now PostScript consistency issues _both_ for Python 2 and Python 3 results that were not there for Python 2 before your push.
python (2)
  Missing examples            :
  Differing graphical output  :  33
  Missing stdout              :
  Differing stdout            :  23
python (3)
  Missing examples            :
  Differing graphical output  :  02 09 11 14a 15 16
  Missing stdout              :
  Differing stdout            :

I obtained these results both before and after my changes so I assume the python 2 issues were introduced by your commit, and the python 3 issues are due to some remaining python 2 to python 3 conversion issues.

I feel it is important to get back to Python 2 PostScript difference perfection and to also achieve that perfection for Python 3 so I am willing to work on the above issues starting with the Python 2 case and then following up with the Python 3 case in ascending order of the examples. Assuming you want to help with this effort without duplicating my work, please start with the Python 3 issues in descending order by example until we meet in the middle. :-)

Note also that there are additional non-standard Python examples in the examples/python subdirectory so those should all be checked to make sure they work with both Python 2 and Python 3. And similarly for the pyqt4 and pyqt5 standard examples.

So there is obviously a lot more checking/debugging that needs to be done, but you have certainly made a good Python 3 start with this push.

Alan
__________________________
Alan W. Irwin

Astronomical research affiliation with Department of Physics and Astronomy,
University of Victoria (astrowww.phys.uvic.ca).

Programming affiliations with the FreeEOS equation-of-state
implementation for stellar interiors (freeeos.sf.net); the Time
Ephemerides project (timeephem.sf.net); PLplot scientific plotting
software package (plplot.sf.net); the libLASi project
(unifont.org/lasi); the Loads of Linux Links project (loll.sf.net);
and the Linux Brochure Project (lbproject.sf.net).
__________________________

Linux-powered Science
__________________________
|
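The tab search described in the message above (the `find ... | grep ... | xargs grep -l $'\t'` pipeline) can also be sketched in Python for platforms where those shell tools are not handy. This is only an illustrative sketch and not part of the PLplot tree: the function name `files_with_tabs` is hypothetical, and the filename filter is an approximation of the shell pattern (names containing `.py`, excluding those ending in `pyc` or `~`).

```python
import os


def files_with_tabs(root="."):
    """Return sorted paths of '*.py*' files under root that contain a tab character."""
    hits = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            # Approximate the shell pipeline: keep names matching '*.py*',
            # but skip compiled files (*pyc) and editor backups (*~).
            if ".py" not in name or name.endswith(("pyc", "~")):
                continue
            path = os.path.join(dirpath, name)
            with open(path, "rb") as handle:
                if b"\t" in handle.read():
                    hits.append(path)
    return sorted(hits)


if __name__ == "__main__":
    for path in files_with_tabs("."):
        print(path)
```

Like the shell version, this reports any tab in a file, not only tabs used for indentation, so a hit still needs a manual look.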
From: James D. <ji...@di...> - 2017-04-17 13:39:05
|
I finally got cmake to generate a Makefile with the wingdi and wingcc drivers enabled. When running cmake on Windows 10 + msys2, there is a "System is unknown to cmake" message, which results in several variables being unset. I sent a message to the cmake mailing list and I will wait to see what happens. A quick perusal of our mailing list reveals that others have grappled with this. I think some tweaking of the cmake/modules/wingcc.cmake and cmake/modules/wingdi.cmake might be needed; however, I think waiting for some action from the cmake group might be prudent.

I was able to get cmake to work with the following command

    cmake -G "Unix Makefiles" -DCMAKE_INSTALL_PREFIX=/opt/plplot -DPLD_wingdi=ON -DPLD_wingcc=ON -DWIN32=1 -DMINGW=1 -DMINGWLIBPATH=/d/msys64/usr/lib/w32api ../plplot/

Unfortunately, the make fails while building plstream.cc

    /d/plplot/plplot/bindings/c++/plstream.cc: In member function 'virtual void cxx_pltr2::xform(PLFLT, PLFLT, PLFLT&, PLFLT&) const':
    /d/plplot/plplot/bindings/c++/plstream.cc:102:9: error: 'cerr' was not declared in this scope
         cerr << "cxx_pltr2::xform, Invalid coordinates\n";
         ^~~~
    /d/plplot/plplot/bindings/c++/plstream.cc:102:9: note: suggested alternative:

Any thoughts?
|
From: Hazen B. <hba...@ma...> - 2017-04-16 21:55:45
|
On 02/21/2017 09:26 PM, Alan W. Irwin wrote:
>
> I have now been in contact with the OP, Barry Warsaw of python.org, of
> that thread who was quite helpful. For example, Barry told me that
> Python is designed so it is frankly impossible for
>
> import Plframe
> from Plframe import *
>
> to race (i.e., the first import completely finishes before the second
> one starts). And I cannot find any other cases where Plframe is
> imported. So I think the best bet for explaining this *.pyc
> Python-generated file corruption is some unknown Python 2 bug that
> does not have anything to do with races. I got the sense from
> what Barry said that he feels Python 3 is now much more reliable than
> Python 2. So this may be another instance of that general idea.
>
> Anyhow, I think the next step is to test whether this corruption
> occurs for Python 3. (And if it does I get the sense that Barry would
> be anxious to figure out what that Python 3 bug was.)
>
> @Hazen:
>
> This issue lends lots of additional motivation for making PLplot work
> correctly with Python 3. So please go ahead and push your Python 3
> topic as soon as it is in reasonable state, and we can mature it
> further (if necessary) from there.

Pushed.

One possibly important thing to note is that Python3 does not allow a mix of tabs and spaces in a file. So these changes are less extensive than they might appear as a lot of it was converting the examples to be all spaces. And hopefully our file formatting utility will not introduce regressions.

-Hazen
|
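The tabs-versus-spaces restriction mentioned above can be reproduced with a few lines of Python 3. This is a minimal sketch, not PLplot code: the helper name `mixes_tabs_and_spaces` and the two sample strings are made up for illustration. More precisely, Python 3 raises `TabError` when a block's indentation mixes tabs and spaces inconsistently (that is, when its meaning would depend on the tab width).

```python
def mixes_tabs_and_spaces(source):
    """Return True if Python 3 rejects source for inconsistent tab/space indentation."""
    try:
        compile(source, "<example>", "exec")
    except TabError:
        return True
    return False


# One block indented first with a tab, then with eight spaces.
mixed = "if True:\n\tx = 1\n        y = 2\n"
# The same block indented with spaces only.
spaces_only = "if True:\n    x = 1\n    y = 2\n"

print(mixes_tabs_and_spaces(mixed))
print(mixes_tabs_and_spaces(spaces_only))
```

Under Python 3 the first call should report the mixed block as rejected and the second should accept the spaces-only block; Python 2 was more lenient by default, which is why such files could go unnoticed there.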
From: Alan W. I. <ir...@be...> - 2017-04-15 06:57:10
|
On 2017-04-14 20:54-0700 Alan W. Irwin wrote:

> To summarize, if you discovered a standards-compliant change
> to our new Fortran binding implementation that would allow use of
>
> export FFLAGS='-O3 -std=f2003 -pedantic -Wall -Wextra'
>
> with gfortran, then that would allow our users/developers to at least
> do minimal standards compliance checking with gfortran. But if you
> feel that is too much trouble and/or the fix would obfuscate our
> fortran binding code too much simply to quiet a gfortran build error
> that is likely spurious, then I am content with the current
> recommendation
>
> export FFLAGS='-O3 -Wall -Wextra'
>
> Of course, the (significant) downside of the current recommendation is
> it means our users/developers cannot check Fortran standards
> compliance with gfortran and must rely on NAG instead for such
> checking.

Hi Arjen:

After thinking about this some more I have become concerned that my gfortran test results are correct, and thus our code is not compliant with the pure Fortran 2003 standard nor the pure Fortran 2008 standard. For example, you could get those test results if our code is compliant with a mixture of the Fortran 2003 and Fortran 2008 standards, but with parts not compatible with pure Fortran 2003 and parts not compatible with pure Fortran 2008.

For now, I would prefer to stick to pure Fortran 2003 on the principle that that standard is likely better supported by compilers than Fortran 2008. So if the NAG compiler has options equivalent to gfortran's -std=f2003 -pedantic to check compliance with the pure Fortran 2003 standard, it may reveal exactly the same issue discovered by gfortran, and that would be strong motivation for addressing that issue.

By the way, the "f95" name is now obviously a misleading term describing our Fortran binding and examples.
Is it time to go through our entire source tree (including documentation and build system) and replace _all_ references to "f95" (e.g., bindings/f95, examples/f95, ENABLE_f95) by the "fortran" equivalent (e.g., bindings/fortran, examples/fortran, ENABLE_fortran)? I would prefer to replace the "f95" name by "fortran" rather than "f2003" since that would allow us to move to pure Fortran 2008 compliance when the time comes without any further name changes.

Alan

>
> Alan
> _______________________________________________
> Plplot-devel mailing list
> Plp...@li...
> https://lists.sourceforge.net/lists/listinfo/plplot-devel
|
From: Alan W. I. <ir...@be...> - 2017-04-15 03:54:39
|
Hi Arjen:

I had to rework (commit 58dc757) the suggestion in README.developers about the recommended gfortran compiler options because the prior suggestion

    export FFLAGS='-std=f95 -O3 -fall-intrinsics -fvisibility=hidden -pedantic -Wall -Wextra'

did not work with our new fortran binding. (As an aside, in my tests I dropped -fvisibility=hidden because I believe that makes no sense for Fortran code. I also dropped -fall-intrinsics since I prefer using -Wintrinsics-std which is automatically deployed with -Wall if -fall-intrinsics is not specified.)

So here are the actual flags I tried.

1. export FFLAGS='-O3 -std=f95 -pedantic -Wall -Wextra'

Those options generated a build error (as expected) because the Fortran 95 standard does not include support for the ISO_C_BINDING module that we use to implement the new fortran binding.

2. export FFLAGS='-O3 -std=f2003 -pedantic -Wall -Wextra'

Those options generated the following type of build error:

    included_plplot_real_interfaces.f90:2601.25:
        Included at /home/software/plplot/HEAD/plplot.git/bindings/f95/plplot_double.f90:117:

    c_loc(plotentries), size(plotentries, kind=private_plint) )
        1
    Error: Fortran 2008: Array of interoperable type at (1) to C_LOC which is nonallocatable and neither assumed size nor explicit size

for what I consider to be a spurious reason (see the revised README.developers for further discussion).

3. export FFLAGS='-O3 -std=f2008 -pedantic -Wall -Wextra'

got rid of the above type of error message, but also generated a whole new set of build errors.

4. export FFLAGS='-O3 -Wall -Wextra'

does work without any build errors so this is now what I recommend in README.developers.

To summarize, if you discovered a standards-compliant change to our new Fortran binding implementation that would allow use of

    export FFLAGS='-O3 -std=f2003 -pedantic -Wall -Wextra'

with gfortran, then that would allow our users/developers to at least do minimal standards compliance checking with gfortran.
But if you feel that is too much trouble and/or the fix would obfuscate our fortran binding code too much simply to quiet a gfortran build error that is likely spurious, then I am content with the current recommendation

    export FFLAGS='-O3 -Wall -Wextra'

Of course, the (significant) downside of the current recommendation is it means our users/developers cannot check Fortran standards compliance with gfortran and must rely on NAG instead for such checking.

Alan
|
From: Alan W. I. <ir...@be...> - 2017-03-30 22:15:08
|
On 2017-03-30 14:00-0700 Alan W. Irwin wrote:

> [....] The
> commit [25d120e] message indicates the comprehensive test script options that
> should be used to do the requested testing [of wxwidgets].

P.S. Those options included -DPLPLOT_WX_DEBUG_OUTPUT=ON -DPLPLOT_WX_NANOSEC=ON. The first of those adds lots of debugging messages which normally are not needed. The second of those puts nanosec time stamps on those debugging messages with an experimental method that normally should only work on Linux. So they worked for me on Linux, but I suggest for general comprehensive testing of our wxwidgets device and wxwidgets binding you drop both these options, but especially the second one on non-Linux platforms.

Alan
|
From: Alan W. I. <ir...@be...> - 2017-03-30 21:00:29
|
The new three named semaphores approach to wxwidgets IPC that I implemented should "just work" on all platforms. The reason why I claim that is the method works on Linux and POSIX named semaphores are widely supported on all POSIX platforms (so should work on all free Unix platforms and Mac OS X and other proprietary Unix platforms). Also, I just checked and Cygwin apparently has complete support for POSIX named semaphores. So the new IPC method should just work in that case as well. Furthermore, I copied how Phil handled (in his earlier implementation of a different IPC method) the Windows mutex equivalent of named POSIX semaphores for the case when WIN32 is #defined. Therefore, the new IPC method should just work in that case (the MSVC and MinGW-w64/MSYS2 Windows platforms) as well.

However, currently the only test result I have for the new IPC method is for the Linux platform so testing is requested for all the other platforms I mentioned above.

I have now (commit 25d120e) disabled all versions of IPC other than the new three named semaphores approach to ensure all further testing will be with that new IPC method and as the first step in the planned removal of all other IPC methods we have tried in the past. The commit message indicates the comprehensive test script options that should be used to do the requested testing. Those options ensured that the comprehensive test of our wxwidgets device and bindings only took 15 minutes to complete on Linux.

I am especially interested in equivalent comprehensive test wxwidgets results for Cygwin, MinGW-w64/MSYS2, and Mac OS X since previous attempts to use that bash test script have worked well on all those platforms. Theoretically, that comprehensive bash script should also work on MSVC (if you set environment variables to access Cygwin or MinGW-w64/MSYS2 unix functionality such as bash.exe). But so far, nobody has figured out exactly how to do that so you may be limited to just hand testing of -dev wxwidgets on MSVC.

Alan
|
From: Alan W. I. <ir...@be...> - 2017-03-13 06:27:44
|
On 2017-03-09 23:07-0800 Alan W. Irwin wrote:

[...]
> In sum, I believe we need to make the two improvements above (solve
> the wxPLViewer order of magnitude inefficiency issue and display
> wxPLViewer plots immediately for each part of plbuf that is received)
> before we can get believable timing results. Therefore it is much too
> soon to judge any of the "IPC3 unnamed semaphores", "IPC3 named
> semaphores", and "IPC single mutex" approaches based on any apparent
> timing issues that are occurring now, and once we get more reliable
> timing my guess is we will see no important difference in timing
> results between the three different IPC methods that are currently
> implemented.

To Phil and Pedro:

I finally figured out a way to substantially reduce how the wxPLViewer app was interfering with timing measurements for -dev wxwidgets!

To work around the wxPLViewer rendering inefficiency problem mentioned above, I ensured nothing is rendered by that app by making a local modification of utils/wxplframe.cpp to #define the PLPLOT_WX_NOPLOT macro. (See commit 7cfd6ce to see exactly the few lines of code that are skipped when that local modification is made). To work around the uncertainty about what the -np option is actually doing, I don't use that option for my timing tests (see the command below). Finally, (although I am not sure this makes much difference but I state this for the record) I also used a minimal desktop (fvwm) rather than KDE to vastly reduce the number of background desktop tasks that are contending for the cpu and memory when doing timing tests.
So for a given CMake configuration, and after setting

    export TIMEFORMAT=$'real\t%3R'

to use a one-line real time difference format for time results, a timing run is done using the following command

    (for N in $(seq --format='%02.0f' 0 34 |grep -vE '20'); do echo $N; examples/c/x${N}c -dev wxwidgets >&/dev/null; killall -9 wxPLViewer; (time examples/c/x${N}c -dev wxwidgets >&/dev/null); killall -9 wxPLViewer ; done) >| <captured results file> 2>&1

where the name of the <captured results file> depends on the configuration being tested to keep the results conveniently separated. Within that command, I run each of the 34 (!) examples twice so that the second timed run will be done with cached memory results. After each example completes I kill the corresponding wxPLViewer instance because there is a slight bug with the PLPLOT_WX_NOPLOT macro such that it leaves an empty wxPLViewer GUI rather than properly terminating that.

With the above procedure I now have reasonable timing consistency. For example, timing results differ for a given example by a maximum of 13 ms from one run of the above command to the next for identical configurations.

For the -DPL_WXWIDGETS_IPC3=ON case if I compare results derived from the above command for both named and unnamed semaphores, the differences are within the 13 ms noise of the time measurement. That is an important result which means that eventually in the interests of simplifying our code (and because unnamed semaphores are not universally available on all platforms) we will likely want to remove the unnamed semaphores variant and retain the named semaphores variant of the 3-semaphores approach, i.e., drop all code that currently is only compiled when the PL_HAVE_UNNAMED_POSIX_SEMAPHORES macro is #defined.
If I compare results derived from the above command for -DPL_WXWIDGETS_IPC3=ON (three semaphores approach) versus -DPL_WXWIDGETS_IPC3=OFF (one mutex, circular buffer approach), I get the following result from ndiff (where the -abs 0.100 option ignores absolute time difference less than 100 ms, "ON_OFF" refers to results collected with -DPL_WXWIDGETS_IPC3=ON -DPL_HAVE_UNNAMED_POSIX_SEMAPHORES=OFF, and "OFF" refers to results collected with -DPL_WXWIDGETS_IPC3=OFF):

    ndiff -abs .100 ../IPC3_timing_ON_OFF_alt11.txt ../IPC3_timing_OFF_alt11.txt
    16c16 example 07
    < real 0.392
    --- field 2 absolute error 1.14e-01
    > real 0.506
    30c30 example 14
    < real 0.521
    --- field 2 absolute error 1.42e-01
    > real 0.379
    36c36 example 17
    < real 10.474
    --- field 2 absolute error 1.65e-01
    > real 10.309
    54c54 example 27
    < real 0.292
    --- field 2 absolute error 1.11e-01
    > real 0.403
    66c66 example 33
    < real 1.543
    --- field 2 absolute error 1.07e+00
    > real 2.610
    ### Maximum absolute error in matching lines = 9.00e-02 at line 8 field 2

The above results mean the three-semaphores IPC approach is more efficient than the single mutex + circular buffer IPC approach for 3 out of these 5 examples (i.e., 60 per cent of the examples) with differences > 100 ms. However, if I had printed out all differences greater than the maximum timing noise of ~13 ms that fraction drops to ~20 per cent.

Of course, for multiple cpu hardware (such as my two-cpu PC), circular buffers do have a general speed advantage in ideal circumstances since one cpu can be writing to the buffer with -dev wxwidgets and the other cpu can be reading from that buffer with wxPLViewer. Whereas with the three semaphores approach writing to the buffer is blocked while it is being read from and reading from the buffer is blocked while it is being written to. So that is a factor of two advantage to circular buffers in ideal circumstances for multiple cpu hardware.
However, there are some additional overheads with the circular buffer approach (such as timing loops and checks that reading is not getting ahead of writing) which it appears under certain circumstances (i.e., 60 per cent of the large difference examples and 20 per cent of all examples) are larger than that theoretical saving.

Of course, in general, IPC takes a relatively small time to complete compared to the rendering cost (e.g., example 33 with more than 100 pages takes only a few seconds for IPC) so I believe the above differences in IPC cost are not that important in the grand scheme of things. So I suggest (based on the much more important factor of code clarity) that eventually (say, once you guys have dealt with the issue with the -locate option for example 1) we should use the three named semaphores IPC approach exclusively (by forcing the PL_WXWIDGETS_IPC3 macro to always be #defined and the PL_HAVE_UNNAMED_POSIX_SEMAPHORES macro to always be #undefined). And once we are completely happy with that configuration (and presumably before the next release) we should follow up by doing a massive code cleanup to drop the code that is ignored when we use those forced settings.

Please do fix the -locate issue and evaluate the code clarity for yourself for the case when PL_WXWIDGETS_IPC3 is #defined and PL_HAVE_UNNAMED_POSIX_SEMAPHORES #undefined sooner rather than later.

N.B. The reason why I am emphasizing "sooner" is it is important to deal with such issues relatively early in this release cycle rather than in some last-minute rush before a release when our judgement will likely be impaired by release stress. Also, I am extremely familiar with the -DPL_WXWIDGETS_IPC3=ON version of the code now so I can easily answer your questions about it while that won't be as much the case later. So your attention to these wxwidgets requests from me as soon as you have any spare time at all would be much appreciated!

Alan
|
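The threshold comparison in the message above (keep only timing differences larger than 100 ms, then count how often the three-semaphores build is faster) can be sketched in Python. This is an illustrative sketch only, not a reimplementation of ndiff: the function name `large_differences` and the dictionary layout are assumptions, and the numbers are the five real-time values copied from the ndiff output quoted in the message.

```python
def large_differences(on_times, off_times, threshold=0.100):
    """Return (example, on, off) for examples whose times differ by more than threshold seconds."""
    rows = []
    for example in sorted(on_times):
        on, off = on_times[example], off_times[example]
        if abs(on - off) > threshold:
            rows.append((example, on, off))
    return rows


# Real times in seconds, copied from the quoted ndiff output.
# "on" is -DPL_WXWIDGETS_IPC3=ON (three named semaphores),
# "off" is -DPL_WXWIDGETS_IPC3=OFF (single mutex + circular buffer).
ipc3_on = {"07": 0.392, "14": 0.521, "17": 10.474, "27": 0.292, "33": 1.543}
ipc3_off = {"07": 0.506, "14": 0.379, "17": 10.309, "27": 0.403, "33": 2.610}

rows = large_differences(ipc3_on, ipc3_off)
faster_with_ipc3 = [example for example, on, off in rows if on < off]
print(faster_with_ipc3, len(faster_with_ipc3) / len(rows))
```

With these five values, examples 07, 27 and 33 come out faster with the three-semaphores build, which matches the "3 out of these 5 examples" figure in the message.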
From: Alan W. I. <ir...@be...> - 2017-03-11 22:56:34
|
To Phil and Pedro:

On 2017-03-09 23:07-0800 Alan W. Irwin wrote:

[...]
> Unnamed POSIX semaphores are known not to work on Mac OS X (and likely
> a number of other proprietary Unices) and Windows. Thus, for the
> -DPL_WXWIDGETS_IPC3=ON case I plan to implement a CMake test that will
> determine whether PL_HAVE_UNNAMED_POSIX_SEMAPHORES is OFF or ON for a
> given platform without need for user input.

DONE as of commit 14e70d7. This completes my planned IPC work on wxwidgets, and I look forward to your follow ups to this work.

Alan
|
From: Alan W. I. <ir...@be...> - 2017-03-10 18:42:15
|
On 2017-03-10 11:02-0500 Jim Dishaw wrote: > >> On Feb 21, 2017, at 4:45 PM, Alan W. Irwin <ir...@be...> wrote: >> >> @Jim, Phil, and Arjen: >> >> I used the git SF server just this morning with no issues. Also, for >> the reasons discussed in README.developers you should avoid all gui >> versions or "enhanced" versions of git (i.e., try to stick as much as >> possible to the real thing). Bearing those constraints in mind, that >> file recommends <https://github.com/msysgit/msysgit> for Windows >> users, but I just discovered from looking at that site that it has >> been obsoleted and msysgit developers now recommend using the "Git for >> Windows" <https://git-for-windows.github.io/> version of git instead. >> (I confirmed from that website it considers itself light-weight >> [check!] and it does have a command-line version [check!]). So please >> give the command-line version of that project a try, and let us know >> whether it works well for you (which would allow us to recommend that >> Windows version of git in our README.developers file). >> > > I just pushed my first patch set after recovering from my VM failure. I’m using the Git for Windows and everything appears to have worked. Hi Jim: I am glad to hear you are up and running with "Git on Windows". I confirm your push process worked. Your good result motivated me to make additional changes (commit 658796c) to README.developers concerning git command-line availability (including replacing the msysgit reference with the "Git for Windows" reference). As part of the research for that update, I consulted <http://git-scm.com/book/en/Getting-Started-Installing-Git>, and it turns out that book has also now been updated to replace its previous reference to msysgit with "Git for Windows". 
So that book's recommendation and your own good experience leave me confident that we are doing the right thing to recommend "Git for Windows" in addition (now) to other Windows git command-line possibilities such as the git packages from either Cygwin or MinGW-w64/MSYS2.

Alan
|
From: Jim D. <ji...@di...> - 2017-03-10 16:03:13
|
> On Feb 21, 2017, at 4:45 PM, Alan W. Irwin <ir...@be...> wrote:
>
> @Jim, Phil, and Arjen:
>
> I used the git SF server just this morning with no issues. Also, for the reasons discussed in README.developers you should avoid all gui versions or "enhanced" versions of git (i.e., try to stick as much as possible to the real thing). Bearing those constraints in mind, that file recommends <https://github.com/msysgit/msysgit> for Windows users, but I just discovered from looking at that site that it has been obsoleted and msysgit developers now recommend using the "Git for Windows" <https://git-for-windows.github.io/> version of git instead. (I confirmed from that website it considers itself light-weight [check!] and it does have a command-line version [check!]). So please give the command-line version of that project a try, and let us know whether it works well for you (which would allow us to recommend that Windows version of git in our README.developers file).

I just pushed my first patch set after recovering from my VM failure. I’m using Git for Windows and everything appears to have worked.

I would appreciate it if someone who uses Windows can double-check that the wingdi driver works. One can print directly from the wingdi window by right-clicking and selecting Print.
|
From: Alan W. I. <ir...@be...> - 2017-03-10 07:08:03
|
To Phil and Pedro:

I have now (commit 096527c) implemented named semaphores variants (both POSIX and Windows) of the 3-semaphore approach that was previously only implemented for POSIX unnamed semaphores. This should complete my planned wxwidgets work (except for the test of unnamed semaphores capability mentioned below), and the rest will likely be up to you guys.

As far as I can tell on POSIX platforms the named semaphores variant of the three-semaphores approach gives identical plotted results to the unnamed semaphores variant. I obviously could not test the Windows variant of the three-semaphores approach that I implemented so I am leaving that up to you.

Use the cmake option -DPLPLOT_WX_DEBUG_OUTPUT=ON to get useful debug information about the header and plbuf results of transmitBytes and receiveBytes from both the -dev wxwidgets side of the IPC and the wxPLViewer side of the IPC.

Use the cmake option -DPL_WXWIDGETS_IPC3=ON to exercise the 3-semaphore approach on any platform.

Unnamed POSIX semaphores are known not to work on Mac OS X (and likely a number of other proprietary Unices) and Windows. Thus, for the -DPL_WXWIDGETS_IPC3=ON case I plan to implement a CMake test that will determine whether PL_HAVE_UNNAMED_POSIX_SEMAPHORES is OFF or ON for a given platform without need for user input. However, as an interim measure until I get that test implemented, our build system sets PL_HAVE_UNNAMED_POSIX_SEMAPHORES to OFF by default, and knowledgeable users that _know_ their platform (e.g., Linux and likely the *BSD variants) supports unnamed semaphores can try such semaphores by setting -DPL_HAVE_UNNAMED_POSIX_SEMAPHORES=ON.

I did a large number of timing tests for the various POSIX variants, which I refer to below as:

* "IPC3 unnamed semaphores" (i.e., -DPL_WXWIDGETS_IPC3=ON -DPL_HAVE_UNNAMED_POSIX_SEMAPHORES=ON).
* "IPC3 named semaphores" (i.e., -DPL_WXWIDGETS_IPC3=ON -DPL_HAVE_UNNAMED_POSIX_SEMAPHORES=OFF).
* "IPC single mutex" (i.e., -DPL_WXWIDGETS_IPC3=OFF, which effectively uses the same code that Phil implemented before I made the additions to that code for the above two cases).

One apparent result is "IPC3 unnamed semaphores" and "IPC3 named semaphores" are the same speed within the very large replication noise (see below), with just one example where "IPC3 unnamed semaphores" appears to be significantly faster (by a factor of 1.7). But take this result with a huge grain of salt, because it doesn't make much sense that only one example is affected by some timing difference between these two cases. In any case, I believe these two cases should be essentially identical in speed (i.e., the time taken to access the semaphores should be negligible compared to the time required for memcpy to transmit header and plbuf data through the shared memory on the transmitBytes side and extract header and plbuf data from the shared memory on the receiveBytes side). Also, I have proved (for example 8, which has a very large plbuf filled with plot directives) that generation of that plbuf data and transfer of those data from -dev wxwidgets to wxPLViewer takes less than 0.8 seconds. So assuming transfer takes less than generation (and I assume substantially less), variants on how that transfer is done should hardly affect the timing of example 8, and the effect of such variants on other examples should be much less.

Another apparent result is "IPC single mutex" is significantly (factor of two) faster _and_ slower than the other two, depending on which example you use for the comparison, with the faster results substantially outnumbering the slower results.
But there doesn't seem to be any pattern to which examples are faster and which slower, and in any case I think the memcpy time should be nearly identical in all cases (e.g., for the "IPC single mutex" case where a circular buffer is used you still have to copy bytes to that buffer on the transmitting side and copy those bytes from that buffer on the receiving side just as in the 3-semaphores approach with or without named semaphores).

I now provide more details on how I got these timing results in case someone can figure out why they are so unreliable.

My timing results were done using the time command with its output reformatted to a single line corresponding to the real time interval using

    export TIMEFORMAT=$'real\t%3R'

(After all tests were done I restored the normal 3-line time format by

    export TIMEFORMAT=$'\nreal\t%3lR\nuser\t%3lU\nsys\t%3lS'

.)

After building the test_c_wxwidgets target (to ensure all dependencies are built), I then collected time information for -dev wxwidgets on most of our C examples (with exceptions 08, 17, 20, 25, 31, 32, 33, and 34) typically using something like the following:

    (for N in $(seq --format='%02.0f' 0 30 |grep -vE '08|17|20|25'); do echo $N; (time examples/c/x${N}c -dev wxwidgets >&/dev/null); sleep 5; done) >| ../IPC3_timing_ON_OFF_alt5.txt 2>&1

Note these time commands only measure the time taken by the -dev wxwidgets side of the IPC, and do not measure the time required by the other side of the IPC (wxPLViewer). The purpose of the sleep command is to allow time for the wxPLViewer command to finish so that it (usually, i.e., except in those cases where wxPLViewer takes longer than 5 seconds) does not interfere with the time measurement of subsequent examples.

Unfortunately, the fundamental result I have from such tests is these results are not reliable! One issue is if I run the above command multiple times I get substantially different results for all examples.
For example, you can easily get time variations of 20 per cent for the same standard example from one run to the next if using the -np option for the examples/c/x${N}c commands, and it is even worse without that option (as above) because it is hard to be exactly consistent with how you run through the pages of an example by hitting the enter key by hand. In other words, both with and without the -np option, the cpu time (or more likely the memory or some other resource) consumed by wxPLViewer is likely interfering with the timing of the examples/c/x${N}c command in an inconsistent way.

It should be possible to reduce this problem by a very large degree by making wxPLViewer a lot more efficient. A particularly egregious result is example 8 (which is why it was skipped above). As discussed before, it takes something like 3 seconds per page for wxPLViewer to render the 10 pages of example 8 _once that example is completely done with all IPC finished_. And during that 3 seconds per page wxPLViewer consumes virtually no cpu. In comparison, xcairo and wxwidgets generate and display all 10 pages of that example in less than 3 seconds, so we are discussing at least an order of magnitude discrepancy (and virtually all of it idle time for the wxPLViewer case) in time required to render all 10 pages of the example. That huge disparity is a puzzle to me since on Linux the wxwidgets library is essentially a wrapper for a subset of the GTK+ library suite which is accessed more directly by xcairo. Therefore, you would expect -dev xcairo and wxPLViewer (invoked by -dev wxwidgets) to take roughly the same amount of time to complete for example 8.

Anyhow, example 8 has lots (probably thousands) of plfill calls, so I am wondering if wxPLViewer is translating plfill into some generic wxwidgets library call that is extraordinarily inefficient, and there might be an alternative wxwidgets library API to use to make fills that would be much more efficient than what we are using now?
But the inefficiency of wxPLViewer after an example has completed may have nothing to do with that, because a major clue is virtually no CPU time is consumed by (independent, i.e., after IPC is finished) wxPLViewer. Instead it appears to be spending the vast majority of its time waiting for some non-IPC event. So perhaps the issue is the event setup that is used for wxPLViewer is inefficient for some reason. (For example, it is possible one of the types of events that are handled may currently be firing essentially continuously.) Anyhow, a careful review of that event handling code should be done with debug printout each time an event fires to make sure there are no continuously firing events.

I am also concerned with the reliability of any timing results for the -np option, which simply shows blank screens in all cases. Probably the "IPC3 unnamed semaphores", "IPC3 named semaphores", and "IPC single mutex" approaches are implemented correctly because there are no obvious run-time errors such as segfaults, and for the two IPC3 cases with the -DPLPLOT_WX_DEBUG_OUTPUT=ON cmake option you get printouts verifying the data that is sent is received properly. Nevertheless, there is currently no visual plot rendering evidence to support (or refute) that conclusion for the -np case. The solution to this issue is for wxPLViewer to display plot results immediately as the parts of plbuf are acquired to verify the rendering is done correctly. (This "immediate" approach is already done for most of the other PLplot interactive devices including the old version of wxwidgets.) That change would likely allow the -locate option to work correctly for example 1 and should also solve the example 17 issue where only the final result is displayed rather than properly plotting the intermediate results that are required to get to that final result.
In sum, I believe we need to make the two improvements above (solve the wxPLViewer order of magnitude inefficiency issue and display wxPLViewer plots immediately for each part of plbuf that is received) before we can get believable timing results. Therefore it is much too soon to judge any of the "IPC3 unnamed semaphores", "IPC3 named semaphores", and "IPC single mutex" approaches based on any apparent timing issues that are occurring now, and once we get more reliable timing my guess is we will see no important difference in timing results between the three different IPC methods that are currently implemented.

For now (except to implement a CMake test that would determine PL_HAVE_UNNAMED_POSIX_SEMAPHORES without user input), I believe I am done with this project, and I request you follow up with these steps:

* Build and test wxwidgets on Windows using -DPL_WXWIDGETS_IPC3=ON -DPL_HAVE_UNNAMED_POSIX_SEMAPHORES=OFF to verify that this variant of the three-semaphores approach is working correctly on Windows. I am fairly confident it will work (or at least any issues will be small typographical ones) because I copied the syntax for initializing, posting, waiting for, and destroying the 3 semaphores for the Windows case following how that was done for the Windows variant of the "IPC single mutex" case.

* Evaluate the code clarity of the "IPC3 unnamed semaphores" and "IPC3 named semaphores" approaches compared to the "IPC single mutex" approach. My opinion is the three-semaphores approach (whether using the unnamed POSIX, named POSIX, or named Windows semaphores variants) has much improved code clarity compared to the "IPC single mutex" approach. But it will be interesting to see if your independent assessment of that question also supports that conclusion. :-)

* Get examples/c/x01c -locate -dev wxwidgets to work for all three approaches. The fundamental issue here is the wxPLViewer end as currently designed refuses to render the plot for partial plbuf results before -dev wxwidgets finishes sending the complete plbuf. So at least part of the fix here is to redesign wxPLViewer to allow rendering of partial plbuf results. (See discussion above concerning two other benefits of that approach.)

Alan
|
From: Alan W. I. <ir...@be...> - 2017-03-04 19:59:44
|
On 2017-02-28 12:19-0800 Alan W. Irwin wrote:

> On 2017-02-27 15:34-0800 Alan W. Irwin wrote:
[...]
> My remaining plans for the -DPL_WXWIDGETS_IPC2=ON case are as follows:
>
> 1. Implement interactivity so that C example 1 works with -locate mode and C example 20 works.
>
> 2. Named semaphore variant for POSIX systems.
>
> 3. Named semaphore variant for Windows systems. (I should be able to implement this following what is done for -DPL_WXWIDGETS_IPC2=OFF, but I will need your help to test this variant).

Hi Phil:

I have committed (79987df) my implementation for step 1. above which should be equivalent to the patch I circulated here. This implementation does not work, i.e.,

    examples/c/x01c -dev wxwidgets

works fine but

    examples/c/x01c -dev wxwidgets -locate

hangs, but I think it is close because the -DPL_WXWIDGETS_IPC2=OFF case has similar problems (see the commit message for further details). Since the problem appears to be pre-existing I would appreciate you fixing this yourself once you get a chance to look at and evaluate the -DPL_WXWIDGETS_IPC2=ON code.

For now, my plan is to continue with steps 2. and 3. above.

Alan
|
From: Alan W. I. <ir...@be...> - 2017-03-04 03:25:25
|
Hi Phil:

I have now had more of a chance to read your e-mail so I have some more detailed responses to it below.

On 2017-03-03 16:02-0000 Phil Rosenberg wrote:
[...]
> So a couple of things spring to mind. Firstly you must at some point re-enter the event loop to allow the viewer to start doing stuff again. If you are sat in a loop in your communication code just waiting for something to happen then it never will because the execution point is in your communication code and not dealing with the mouse pointer.

I am glad you brought up wxPLViewer events because I am still trying to understand those. For example, I presume (from the name) that wxPlFrame::OnCheckTimer is called every time the event timer fires, and that will continue at various rates (depending on what is set) until the timer is killed when m_header.completeFlag is true. And similarly, wxPlFrame::OnMouse is called every time there is a mouse event, and so on for all other events in the event table. So until the timer is stopped, I assume the net effect is wxPlFrame::OnCheckTimer (which in turn calls ReadTransmission that processes all the different transmission types sent by -dev wxwidgets) is effectively run simultaneously with wxPlFrame::OnMouse and the rest of the responses to different events. Have I interpreted what goes on in event processing correctly?

Of course, in direct answer to your concern, if ReadTransmission just blocked waiting for a semaphore, then wxPLViewer would hang, but that blocking should not happen unless -dev wxwidgets has incorrectly terminated its transmission type requests to ReadTransmission, and there is no sign of that type of bug in the code right now according to debug messages being printed out by ReadTransmission.
In fact, according to those, ReadTransmission returns (as does OnCheckTimer that called it), and the hang occurs elsewhere, before the example 1 plotbuf (which is created before transmissionType == transmissionLocate is sent from the -dev wxwidgets side) is plotted. So the first order of business is to get those plotbuf commands plotted.

> calling wxYield will go check the event loop, deal with that stuff then return, but it's often not the best solution because nested calls to wxYield aren't permitted. If possible it is better to return from your function and use some other event to trigger your return communication. Perhaps this could be at the end of the method that deals with capturing mouse activity.

Just to remind you, all that transmissionLocate does is call SetPageAndUpdate() (just like what happens for most other transmission types), and then it sets

    m_locateMode = true;

which has no effect on the rest of transmissionLocate but which has a profound effect on how wxPlFrame::OnMouse is executed, i.e., that handler returns immediately when that variable is false and processes the mouse event whenever that variable is true. So all that may be working fine, and the issue is simply to get the plbuf commands plotted.

> Another thing is that to cause the plot to be redrawn you must tell wxWidgets that there is something new to draw. I think the method for this is Refresh(). Then you must allow the event loop to be entered to allow the redraw to actually take place.

SetPageAndUpdate (referred to above) does normally call Refresh() if there is anything more to be plotted. So it appears the current ReadTransmission code does everything you suggest, i.e., handles transmissionLocate flawlessly. However, I now realize there is an issue in wxPlFrame::OnMouse which attempts (when m_locateMode = true) to collect mouse information and send it back to -dev wxwidgets, and evidently that is happening in the wrong order (before the plot is displayed).
How can I absolutely assure that the plot is displayed before wxPlFrame::OnMouse is entered?

> A final thing to think about is that locate must be dealt with immediately. It sounded like you were holding locate requests and continuing to send plot commands.

Locate is dealt with as follows (which I think you can classify as "immediately"). The code in question is in the patch I sent you. We have the following sequence of calls in wxPLDevice::Locate

    TransmitBuffer( pls, transmissionLocate );
    m_outputMemoryMap.receiveBytes( true, &m_header, sizeof ( MemoryMapHeader ) );
    *graphicsIn = m_header.graphicsIn;

The first sends both transmissionLocate and plbuf to wxPLViewer which processes those in ReadTransmission as I have described above (also confirmed by debug messages). Then that code returns and after that the plot should be displayed (which isn't happening for some reason) and the event loop should be running. After that, when a mouse event occurs wxPlFrame::OnMouse should be entered and that method collects information about that mouse event in a header, sends that header back to -dev wxwidgets with transmitBytes which is received by the above receiveBytes command and sent back to example 1 for printout via graphicsIn.

I hope that is clear. In any case, my next steps are to confirm that wxPlFrame::OnMouse is being entered properly each time a mouse button is clicked, and to try to discover why there is no display of the plot before that happens (for both -DPL_WXWIDGETS_IPC2=ON and -DPL_WXWIDGETS_IPC2=OFF).

> As I said I haven't looked at the code, just read your email, but these are the first places to look as a guess.
>
> Also there is a debug #define which when used means that plplot does not launch the viewer, instead it just sits and waits for you to launch it manually. This way you can launch the viewer in a debugger to see what is going on in there.
Thanks again, Phil, for these remarks which seem to imply I am on the right track so I am likely looking for a small bug rather than a gross misdesign.

By the way, I have now looked at the third (deprecated) version of the code, and that is more sophisticated in how it treats keyboard and mouse events (e.g., special crosshairs in locate mode). So I plan to do a lot more with wxPlFrame::OnKey and wxPlFrame::OnMouse once I can get the plot displayed properly before they are entered.

Alan
|
From: Alan W. I. <ir...@be...> - 2017-03-03 21:13:32
|
On 2017-03-03 16:02-0000 Phil Rosenberg wrote:

> Hi Alan
> Sorry for not replying sooner. I was waiting until I had time to actually read the code, which unfortunately I haven't had chance to do still, but I didn't want to leave it any longer without at least guessing what might be happening.

Hi Phil:

Your guesses are quite helpful so keep them coming. :-)

> So a couple of things spring to mind. Firstly you must at some point re-enter the event loop to allow the viewer to start doing stuff again. If you are sat in a loop in your communication code just waiting for something to happen then it never will because the execution point is in your communication code and not dealing with the mouse pointer. calling wxYield will go check the event loop, deal with that stuff then return, but it's often not the best solution because nested calls to wxYield aren't permitted. If possible it is better to return from your function and use some other event to trigger your return communication. Perhaps this could be at the end of the method that deals with capturing mouse activity.
>
> Another thing is that to cause the plot to be redrawn you must tell wxWidgets that there is something new to draw. I think the method for this is Refresh(). Then you must allow the event loop to be entered to allow the redraw to actually take place.
>
> A final thing to think about is that locate must be dealt with immediately. It sounded like you were holding locate requests and continuing to send plot commands. But maybe I misunderstood this as doing so would hang the plplot side while pllocate awaited its return.
>
> As I said I haven't looked at the code, just read your email, but these are the first places to look as a guess.
>
> Also there is a debug #define which when used means that plplot does not launch the viewer, instead it just sits and waits for you to launch it manually. This way you can launch the viewer in a debugger to see what is going on in there.

Thanks for those ideas, which I haven't completely absorbed yet, but I will look closely at them later. The reason for putting that off is there is a new surprising result I have established, which is even for -DPL_WXWIDGETS_IPC2=OFF

    examples/c/x01c -dev wxwidgets -locate

does not work correctly, i.e., there is no display of the plot until after you have clicked the mouse away from where you guess a viewport will appear. So the net result is locate mode must be finished before the plot can appear. So it was this code I was trying to follow for the -DPL_WXWIDGETS_IPC2=ON case, but that is obviously not a good model to follow. (By the way, I would not advise working on that -DPL_WXWIDGETS_IPC2=OFF issue because there is a good possibility (assuming you like everything about the new -DPL_WXWIDGETS_IPC2=ON approach once you have had a chance to look at it) that the -DPL_WXWIDGETS_IPC2=OFF code will be completely removed once I get -DPL_WXWIDGETS_IPC2=ON working for this example.)

Just now I tried the third (deprecated) version of the wxwidgets code using -DOLD_WXWIDGETS=ON, and that works fine with the above example with both mouse clicks and key presses being properly reported. Therefore, I have a lot of hope that a deeper look into that deprecated code will give me a model I can follow to get the above example to work correctly with -DPL_WXWIDGETS_IPC2=ON. So please remain ready to answer questions this weekend just in case I need some further help to try and translate how that deprecated code implements interactivity to the -DPL_WXWIDGETS_IPC2=ON case.

Alan
|
From: Alan W. I. <ir...@be...> - 2017-03-02 08:26:59
|
On 2017-02-28 12:19-0800 Alan W. Irwin wrote:

> My remaining plans for the -DPL_WXWIDGETS_IPC2=ON case are as follows:
>
> 1. Implement interactivity so that C example 1 works with -locate mode and C example 20 works.

Hi Phil:

I am having trouble with implementing interactivity and could use some help from you to figure this out, although I will be trying independently to figure this out on my own using gdb.

The attached patch implements the required (as far as I can tell) small changes to implement interactivity for -DPL_WXWIDGETS_IPC2=ON. As a result of these changes, the entire code should have the following logic for dealing with locate mode for -DPL_WXWIDGETS_IPC2=ON:

* In example 1 after all normal PLplot calls to generate the plot have been called (which fills up plbuf with 24272 bytes of information, see below), the locate mode loop generates a series of plGetCursor calls.

* Each of those plGetCursor calls ends up (for -dev wxwidgets) as a call to wxPLDevice::Locate and

    TransmitBuffer( pls, transmissionLocate );

* TransmitBuffer calls transmitBytes to send header information (including transmissionLocate = 4 and plbufAmountToTransmit = 24272) and the 24272 bytes in plbuf, to be received by wxPLViewer using receiveBytes. The results are

    Before transmitBytes
    transmissionType = 4
    plbufAmountToTransmit = 24272
    viewerOpenFlag = 1
    locateModeFlag = 1
    completeFlag = 0
    After receiveBytes
    transmissionType = 4
    plbufAmountToTransmit = 24272
    viewerOpenFlag = 1
    locateModeFlag = 1
    completeFlag = 0
    Successful read of plbuf

  i.e., that transmission of the plbuf information and transmissionLocate to wxPLViewer was a complete success.
* Because plbufAmountToTransmit is non-zero, wxPLViewer appends that plbuf information to the correct element of the m_pageBuffers array and conditionally (although the result is the condition is true because the above 24272 says there is a lot more data to plot) calls SetPageAndUpdate as per normal, which should plot the page in preparation for the locate mode.

* After plbuf is processed, transmissionLocate calls SetPageAndUpdate one more time (which has no effect because everything should be plotted now), then sets m_locateMode = true.

* Here my understanding gets fuzzy, but wxPlFrame::OnMouse is part of a table of events, which I presume means it is called whenever there is any mouse activity. But when m_locateMode is true, then in wxPlFrame::OnMouse information about the latest mouse click is collected into the m_header.graphicsIn struct, and then the whole header is sent back to -dev wxwidgets with transmitBytes (which because of the three-semaphore design will wait for the end of the previous transmitBytes before proceeding, so I don't think we are up against any race condition here). There wxPLDevice::Locate receives it with receiveBytes and passes it back to the example 1 locate loop, which prints out values collected for a particular mouse click, then (unless there is an exit from locate mode by a mouse click off of a viewport) keeps repeating the whole process indefinitely.

I think this design should "just work" (especially because my changes to implement it in the attached patch are so small and straightforward).

What happens in practice is the whole code just hangs without even displaying the plot (something it should do regardless of anything else right after the above "Successful read of plbuf" message).

I am pretty sure the problem is due to some minor bug.
For example, the wxPlFrame::OnKey routine is just empty for the -DPL_WXWIDGETS_IPC2=ON case, and maybe I need to implement something there as well for that case, but I cannot see how that would stop the whole display showing (as it does for example 1 if the -locate option is not used), and I don't see how it would stop wxPlFrame::OnMouse from working properly. So if you spot that bug please let me know.

Of course, the answer may be there is a fundamental flaw in the above design, so if you feel that is the case please let me know. And your clarification of my fuzzy understanding of when wxPlFrame::OnMouse gets called would be helpful as well.

Alan
|
From: Alan W. I. <ir...@be...> - 2017-02-28 20:19:15
|
On 2017-02-27 15:34-0800 Alan W. Irwin wrote:

> The current status is all examples work (i.e., there are no obvious
> run-time errors and there are no rendering issues other than
> previously known ones for the -DPL_WXWIDGETS_IPC2=OFF case) other than
> 02 and 14 (where an exception is consistently thrown by wxPLViewer for
> currently unknown reasons), and examples 01 (in locate mode) and 20
> (because interactivity not implemented yet). I next plan to work on
> the example 2 issue.

Hi Phil:

The example 2 and 14 issues for -DPL_WXWIDGETS_IPC2=ON turned out to have the same cause, which has now (commit 245815f) been fixed.

My remaining plans for the -DPL_WXWIDGETS_IPC2=ON case are as follows:

1. Implement interactivity so that C example 1 works with -locate mode and C example 20 works.

2. Named semaphore variant for POSIX systems.

3. Named semaphore variant for Windows systems. (I should be able to implement this following what is done for -DPL_WXWIDGETS_IPC2=OFF, but I will need your help to test this variant.)

I don't believe any of these remaining developments are going to require large code changes.

Alan
From: Alan W. I. <ir...@be...> - 2017-02-27 23:34:40
|
On 2017-02-26 22:51-0800 Alan W. Irwin wrote: [....] > Therefore, I plan to turn the current two-semaphore approach into a > three semaphore approach where m_wsem and m_rsem will continue to be > used for the details of a complete transfer of an array of bytes, but > an additional m_tsem semaphore (where "t" stands for transfer) will be > used so that only one such transfer of bytes can be done at a given > time. As far as I can tell, this change means I can completely drop > the moveBytesReaderReversed variant of moveBytesWriter and the > moveBytesWriterReversed variant of moveBytesReader which is a really > nice simplification. Furthermore, I plan to rename moveBytesWriter to > transmitBytes and moveBytesReader to receiveBytes where both > transmitBytes and receiveBytes will be used by either of -dev > wxwidgets or wxPLViewer as needed depending simply on the direction of > data flow. > > The additional m_tsem semaphore will be initialized to 1; > transmitBytes will start by calling sem_wait on that semaphore and > will end by calling sem_post on that semaphore. That simple changes > means if wxPLViewer uses transmitBytes to send data that is received > by -dev wxwidgets with receiveBytes, and then -dev wxwidgets follows > up by calling transmitBytes to send data back that is received by > wxPLViewer with a call to receiveBytes, that second use of > transmitBytes will be halted by the sem_wait until that first use of > transmitBytes is entirely completed, i.e., any call by either side of > the IPC connection to transmitBytes cannot possibly race with a > previous call to that routine by either side. > > Anyhow, I like this pure semaphore way to avoid the race condition > much more than the 10 ms sleep, and I hope to get it completely > implemented tomorrow. Hi Phil: Done as of commit c39f93d. 
As a result, once again I think I am at a stage in this -DPL_WXWIDGETS_IPC2=ON development where all further changes will be small ones so I encourage you to look at this code in anticipation that you will want to reorganize it along the lines you discussed before. The current status is all examples work (i.e., there are no obvious run-time errors and there are no rendering issues other than previously known ones for the -DPL_WXWIDGETS_IPC2=OFF case) other than 02 and 14 (where an exception is consistently thrown by wxPLViewer for currently unknown reasons), and examples 01 (in locate mode) and 20 (because interactivity not implemented yet). I next plan to work on the example 2 issue.

Alan
From: <p.d...@gm...> - 2017-02-27 07:58:15
|
Hi Alan

Sorry, I still haven't had chance to look at your code (poorly 4 year old over the weekend). But you are definitely correct to not use a sleep to avoid a race. I think particularly in situations where the system is under strain all bets are off regarding timings. And like you said, it's just not good practice.

Phil

From: Alan W. Irwin Sent: 27 February 2017 06:51 To: p.d...@gm... Cc: PLplot development list Subject: Re: [Plplot-devel] The status of the wxwidgets IPC development

[....]
From: Alan W. I. <ir...@be...> - 2017-02-27 06:51:10
|
On 2017-02-25 17:44-0800 Alan W. Irwin wrote:

> However, I certainly agree mutual use of the same resource (shared
> memory) is a tricky world. And now that you have encouraged me to
> think about races, I discovered there is indeed a race condition that
> could explain this bug. I have now worked around that race (commit
> 4e6932e) and please see that commit message for more commentary
> concerning this type of race. Assuming I really did understand this
> race, I am virtually positive my simple crude fix will deal with it
> without any noticeable reduction in speed. However, time will tell about
> that.

Hi Phil:

I think the 10 ms sleep used in the above crude workaround would likely always work going forward because it would be pretty unusual for the OS scheduler to not give a process access to the cpu for essentially 10 million instructions. Nevertheless, that argument does depend on process speed and assumptions about scheduler details, so having thought a lot more about this, I would far prefer to avoid sleep workarounds for race conditions, not only on these grounds but also simply as a matter of good IPC style.

Therefore, I plan to turn the current two-semaphore approach into a three-semaphore approach where m_wsem and m_rsem will continue to be used for the details of a complete transfer of an array of bytes, but an additional m_tsem semaphore (where "t" stands for transfer) will be used so that only one such transfer of bytes can be done at a given time. As far as I can tell, this change means I can completely drop the moveBytesReaderReversed variant of moveBytesWriter and the moveBytesWriterReversed variant of moveBytesReader, which is a really nice simplification. Furthermore, I plan to rename moveBytesWriter to transmitBytes and moveBytesReader to receiveBytes, where both transmitBytes and receiveBytes will be used by either of -dev wxwidgets or wxPLViewer as needed, depending simply on the direction of data flow.

The additional m_tsem semaphore will be initialized to 1; transmitBytes will start by calling sem_wait on that semaphore and will end by calling sem_post on that semaphore. That simple change means that if wxPLViewer uses transmitBytes to send data that is received by -dev wxwidgets with receiveBytes, and then -dev wxwidgets follows up by calling transmitBytes to send data back that is received by wxPLViewer with a call to receiveBytes, that second use of transmitBytes will be halted by the sem_wait until the first use of transmitBytes is entirely completed, i.e., any call by either side of the IPC connection to transmitBytes cannot possibly race with a previous call to that routine by either side.

Anyhow, I like this pure semaphore way to avoid the race condition much more than the 10 ms sleep, and I hope to get it completely implemented tomorrow.

Alan
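The three-semaphore scheme described in this message can be sketched roughly as follows. This is an illustration under stated assumptions, not the code that was committed: the Channel struct, the 8-byte buffer, the chunking loop and demoRoundTrip are hypothetical, the semaphores are unnamed POSIX semaphores shared between two threads of one process (rather than between -dev wxwidgets and wxPLViewer through shared memory), and EINTR handling is omitted. Only the roles of the three semaphores, the names transmitBytes and receiveBytes, and the rule that the transfer semaphore starts at 1 and brackets each transmitBytes call are taken from the message.

```cpp
#include <semaphore.h>

#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>
#include <thread>

// Hypothetical stand-in for the shared-memory area; the real layout differs.
struct Channel
{
    sem_t wsem;      // posted by the receiver: "chunk consumed"
    sem_t rsem;      // posted by the transmitter: "chunk ready"
    sem_t tsem;      // held for the duration of one complete transfer
    char  buffer[8]; // deliberately tiny so a transfer needs several chunks

    Channel()
    {
        // pshared = 0: shared between threads of this process only (assumption).
        // wsem and rsem start blocked (0); tsem starts at 1.
        const int rc = sem_init( &wsem, 0, 0 ) | sem_init( &rsem, 0, 0 ) | sem_init( &tsem, 0, 1 );
        assert( rc == 0 );
        (void) rc;
    }
    ~Channel()
    {
        sem_destroy( &wsem );
        sem_destroy( &rsem );
        sem_destroy( &tsem );
    }
};

// Send n bytes. Both sides must agree on n beforehand.
void transmitBytes( Channel &channel, const void *src, std::size_t n )
{
    sem_wait( &channel.tsem ); // only one transfer may be in flight
    const char *bytes = static_cast<const char *>( src );
    for ( std::size_t sent = 0; sent < n; )
    {
        const std::size_t chunk = std::min( sizeof( channel.buffer ), n - sent );
        std::memcpy( channel.buffer, bytes + sent, chunk );
        sem_post( &channel.rsem ); // tell the receiver a chunk is ready
        sem_wait( &channel.wsem ); // wait until the receiver has copied it
        sent += chunk;
    }
    sem_post( &channel.tsem ); // transfer complete; the next one may start
}

// Receive n bytes sent by a matching transmitBytes call.
void receiveBytes( Channel &channel, void *dest, std::size_t n )
{
    char *bytes = static_cast<char *>( dest );
    for ( std::size_t got = 0; got < n; )
    {
        const std::size_t chunk = std::min( sizeof( channel.buffer ), n - got );
        sem_wait( &channel.rsem ); // wait for a chunk
        std::memcpy( bytes + got, channel.buffer, chunk );
        sem_post( &channel.wsem ); // release the transmitter
        got += chunk;
    }
}

// One side sends a message; the other receives it and sends it back reversed.
std::string demoRoundTrip( const std::string &message )
{
    Channel           channel;
    const std::size_t n = message.size();

    std::thread viewer( [&channel, n]() {
        std::string received( n, '\0' );
        receiveBytes( channel, &received[0], n );
        std::reverse( received.begin(), received.end() );
        transmitBytes( channel, received.data(), n ); // blocks on tsem if needed
    } );

    std::string reply( n, '\0' );
    transmitBytes( channel, message.data(), n );
    receiveBytes( channel, &reply[0], n );
    viewer.join();
    return reply;
}
```

In this sketch the viewer thread's call to transmitBytes cannot touch the buffer until the first transmitBytes has returned from its final sem_wait on wsem and posted tsem, which is the ordering guarantee the message is after.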
From: Alan W. I. <ir...@be...> - 2017-02-26 01:44:42
|
On 2017-02-25 12:12-0000 p.d...@gm... wrote: > > >> But that failure (one or both semaphores are not blocked with zero >> values) should be impossible because the fact is both semaphores are >> initialized in that expected (blocked) state (that check succeeded for >> the initial call to moveBytesReaderReversed on the wxPLViewer side), >> and at the end of moveBytesReaderReversed when that header transfer >> succeeded (as measured on the -dev wxwidgets side) the check is made >> again that the semaphores are left in the proper blocked state. > > Welcome to the world of multithread bugs and race conditions 😊. I haven't looked at the code but my first guess should be some sort of race condition. You cannot rely on any operation completing in one process before the other, unless you have an explicit check for it. I seem to remember having an initialisation flag in the shared memory that gets set by wxPlViewer to indicate all initialisation is complete and the viewer is ready for communication to begin. If you don't have a similar check then you can’t rely on things being ready, including the semaphores being initialised. Note that “overtaking” can occur anywhere, including midway through a single line of code. Hi Phil: Thanks for that welcome. :-) Last night my hypothesis (something clobbered) to explain why this "PLMemoryMap::moveBytesWriter: attempt to start transfer with semaphores not in correct blocked state." exception was being thrown went down in flames because I got an absolutely clean report from valgrind from both the -dev wxwidgets and wxPLViewer side. So your additional feedback concerning the likely possibility of race conditions to explain this bug was quite useful. I believe you are referring above in the -DPL_WXWIDGETS_IPC2=OFF case to wxPLViewer setting the viewerOpenFlag of the header to signal -dev wxwidgets that the wxPLViewer is properly initialized and -dev wxwidgets waiting to proceed until that flag is set. 
When you get a chance to look at the -DPL_WXWIDGETS_IPC2=ON code case, you will see that is exactly how moveBytesReaderReversed is initially used on the wxPLViewer side and moveBytesWriterReversed initially used on the -dev wxwidgets side. However, I certainly agree mutual use of the same resource (shared memory) is a tricky world. And now that you have encouraged me to think about races, I discovered there is indeed a race condition that could explain this bug. I have now worked around that race (commit 4e6932e) and please see that commit message for more commentary concerning this type of race. Assuming I really did understand this race, I am virtually positive my simple crude fix will deal with it without any noticeable reduction in speed. However, time will tell about that. So at this time I am pretty sure only two issues are left with the -DPL_WXWIDGETS_IPC2=ON case. One of those is to implement interactivity, and the other (which I plan to start working on now) is to deal with the issue (or issues?) where wxwidgets detects an uncaught exception being thrown every time examples 2 or 14 are executed. Thanks once again for your useful feedback above.

Alan
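The "explicit check" idea discussed in this exchange (a flag that the viewer sets once its initialisation is complete, and that the device waits for before starting communication) might look something like the sketch below. It is not the PLplot implementation: the Header struct here contains only a flag named after viewerOpenFlag, the functions announceViewerReady, waitForViewer and demoHandshake are hypothetical, the two sides are threads rather than processes, and a real cross-process version would need the flag to live in shared memory and be lock-free. The short sleep inside the loop only sets the polling interval; correctness depends on the flag, not on timing.

```cpp
#include <atomic>
#include <cassert>
#include <chrono>
#include <thread>

// Hypothetical header; only the flag's name is taken from the thread.
struct Header
{
    std::atomic<bool> viewerOpenFlag{ false };
};

// Viewer side: call once all initialisation is complete.
void announceViewerReady( Header &header )
{
    header.viewerOpenFlag.store( true, std::memory_order_release );
}

// Device side: wait until the viewer reports ready, or give up after timeout.
// Returns true if the flag was seen set.
bool waitForViewer( const Header &header, std::chrono::milliseconds timeout )
{
    assert( timeout.count() >= 0 );
    const auto deadline = std::chrono::steady_clock::now() + timeout;
    while ( !header.viewerOpenFlag.load( std::memory_order_acquire ) )
    {
        if ( std::chrono::steady_clock::now() >= deadline )
            return false;
        // Polling interval only; the loop exits on the flag, not on elapsed time.
        std::this_thread::sleep_for( std::chrono::milliseconds( 1 ) );
    }
    return true;
}

// Start a pretend viewer that needs some time to initialise, then wait for it.
bool demoHandshake()
{
    Header      header;
    std::thread viewer( [&header]() {
        std::this_thread::sleep_for( std::chrono::milliseconds( 20 ) ); // pretend setup work
        announceViewerReady( header );
    } );
    const bool ready = waitForViewer( header, std::chrono::milliseconds( 5000 ) );
    viewer.join();
    return ready;
}
```

If no viewer ever sets the flag, waitForViewer returns false after the timeout instead of letting the device proceed on the assumption that the other side is ready.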
From: <p.d...@gm...> - 2017-02-25 12:12:43
|
>But that failure (one or both semaphores are not blocked with zero
>values) should be impossible because the fact is both semaphores are
>initialized in that expected (blocked) state (that check succeeded for
>the initial call to moveBytesReaderReversed on the wxPLViewer side),
>and at the end of moveBytesReaderReversed when that header transfer
>succeeded (as measured on the -dev wxwidgets side) the check is made
>again that the semaphores are left in the proper blocked state.

Welcome to the world of multithread bugs and race conditions 😊. I haven't looked at the code but my first guess should be some sort of race condition. You cannot rely on any operation completing in one process before the other, unless you have an explicit check for it. I seem to remember having an initialisation flag in the shared memory that gets set by wxPlViewer to indicate all initialisation is complete and the viewer is ready for communication to begin. If you don't have a similar check then you can’t rely on things being ready, including the semaphores being initialised. Note that “overtaking” can occur anywhere, including midway through a single line of code.

Phil
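For readers unfamiliar with the "blocked state" check quoted in this message, a check of that kind can be written with the POSIX sem_getvalue call, roughly as sketched here. The free function below only borrows the name areBothSemaphoresBlocked from the thread; its signature, the use of unnamed semaphores, and demoBlockedCheck are assumptions made for illustration. Note that POSIX only promises that the reported value was correct at some moment during the call, so a check like this is a snapshot that the other process can invalidate immediately afterwards, which is consistent with the point about races made above.

```cpp
#include <semaphore.h>

#include <cassert>

// Hypothetical free-function version of the check discussed in the thread.
// Returns true only if both semaphores currently report "no units available".
bool areBothSemaphoresBlocked( sem_t &wsem, sem_t &rsem )
{
    int wvalue = 0;
    int rvalue = 0;
    if ( sem_getvalue( &wsem, &wvalue ) != 0 || sem_getvalue( &rsem, &rvalue ) != 0 )
        return false; // could not query: treat as "not in the expected state"

    // POSIX permits either 0 or a negative count of waiters for a locked
    // semaphore (Linux reports 0), so accept both.
    return wvalue <= 0 && rvalue <= 0;
}

// Single-threaded demonstration: blocked after initialisation to 0,
// no longer blocked once one semaphore has been posted.
bool demoBlockedCheck()
{
    sem_t     wsem;
    sem_t     rsem;
    const int rc = sem_init( &wsem, 0, 0 ) | sem_init( &rsem, 0, 0 );
    assert( rc == 0 );
    (void) rc;

    const bool before = areBothSemaphoresBlocked( wsem, rsem );
    sem_post( &rsem );
    const bool after = areBothSemaphoresBlocked( wsem, rsem );

    sem_destroy( &wsem );
    sem_destroy( &rsem );
    return before && !after;
}
```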
From: Alan W. I. <ir...@be...> - 2017-02-25 07:45:38
|
On 2017-02-24 21:32-0800 Alan W. Irwin wrote:

> In sum, the -DPL_WXWIDGETS_IPC2=ON has been largely matured with the
> only [3 issues left].

Hi Phil:

The corollary to this maturation which I forgot to mention is all remaining changes to the -DPL_WXWIDGETS_IPC2=ON code should be small. So I thank you for your consideration in not causing me a lot of conflicts up to now, but from now on please feel free to start reorganizing the wxwidgets code along the lines you mentioned when you saw the direction I was going with my first commit in this -DPL_WXWIDGETS_IPC2=ON series.

It is also time to remind you of follow-up goals that I mentioned before once -DPL_WXWIDGETS_IPC2=ON has been completely matured (by solving the remaining 3 issues I mentioned and verifying there are no compromises on speed with this approach). Those goals are as follows:

* (Alan) implement a small variation on the -DPL_WXWIDGETS_IPC2=ON approach so that it also works with named semaphores.

* (You, assuming you like the code clarity of the -DPL_WXWIDGETS_IPC2=ON approach) implement a small variation of that named semaphores variant which allows it to work on Windows.

* (Alan, once those first two goals are completed and we are both happy with the results) Massive code cleanup to completely drop the present Unix + Windows -DPL_WXWIDGETS_IPC2=OFF approach.

Alan
From: Alan W. I. <ir...@be...> - 2017-02-25 05:32:19
|
On 2017-02-24 00:19-0800 Alan W. Irwin wrote:

> I am currently trying to track down a bug (uncaught exception on the
> wxPLViewer side) that shows up whenever there is an attempt to plot
> multiple pages. I tried to copy the code stanzas from the old
> wxplframe.cpp code to the #ifdef PL_WXWIDGETS_IPC2 case for each
> transmission type, but I may have missed something that is critical to
> the multipage case. Also for #ifdef PL_WXWIDGETS_IPC2, one call to
> ReadTransmission reads all pages of plbuf and should plot those pages.
> That's a big change in design from ReadTransmission which for the
> #ifndef PL_WXWIDGETS_IPC2 case reads and plots at most one page.
>
> Is there something more I have to do to accommodate that design change
> so that the multipage case is displayed properly by wxPLViewer
> when #ifdef PL_WXWIDGETS_IPC2?

Hi Phil:

You should ignore the above question because it turns out there is no general multipage issue for the -DPL_WXWIDGETS_IPC2=ON case! I discovered that result by doing extensive tests (running every single C example by hand using -dev wxwidgets) today. In fact, those tests showed everything working now for -DPL_WXWIDGETS_IPC2=ON with the following exceptions:

1. Examples 9 and 16 failed the first time I executed them with the throw message

"PLMemoryMap::moveBytesWriter: attempt to start transfer with semaphores not in correct blocked state."

That exception is thrown when m_twoSemaphores.areBothSemaphoresBlocked() fails when starting PLMemoryMap::moveBytesWriter. And the problem occurred on the first call to moveBytesWriter just after transfer of header information from wxPLViewer to -dev wxwidgets using moveBytesReaderReversed and moveBytesWriterReversed. But that failure (one or both semaphores are not blocked with zero values) should be impossible because both semaphores are initialized in that expected (blocked) state (that check succeeded for the initial call to moveBytesReaderReversed on the wxPLViewer side), and at the end of moveBytesReaderReversed when that header transfer succeeded (as measured on the -dev wxwidgets side) the check is made again that the semaphores are left in the proper blocked state. So I am pretty sure something must be clobbering the semaphores after the call to moveBytesReaderReversed is finished on the wxPLViewer side and before moveBytesWriter is called on the -dev wxwidgets side. But what?

What severely complicates debugging this issue is that examples 9 and 16 have run flawlessly today after the first attempt that generated the above message. And the exact same generic sequence (transfer header from wxPLViewer to -dev wxwidgets and start transferring data the other way with that call to moveBytesWriter) happens for every example with no occurrences of this error (at least so far). However, if something is getting clobbered (the only hypothesis that seems to make sense to me concerning the above results), then valgrind on -dev wxwidgets and/or wxPLViewer should be able to find what the trouble is.

2. I still have not implemented interactivity so I didn't bother to try example 1 with the -locate option. And for the same reason you should not expect example 20 to work (the only one of our examples that is interactive by default).

3. Examples 2 and 14 fail with wxPLViewer issuing the following message:

Caught unhandled unknown exception; terminating

This was the issue I incorrectly thought was a multipage issue yesterday, but instead it is confined to just these two examples and must be due to some different way these examples set up plots that exposes an actual reproducible bug in the present -DPL_WXWIDGETS_IPC2=ON case, which I am attempting to track down now. I cannot find this throw message anywhere in our own code, and indeed it instead appears (see <http://wxwidgets.10942.n7.nabble.com/Better-exception-handling-td87900.html>) that message is a generic wxwidgets response to uncaught exceptions. So my first step for this issue is to attempt to catch all exceptions in our code rather than passing them on, uncaught, to wxwidgets.

In sum, the -DPL_WXWIDGETS_IPC2=ON case has largely matured, with the only issues left being

1. an extremely elusive bug that ends up as an "impossible" "PLMemoryMap::moveBytesWriter: attempt to start transfer with semaphores not in correct blocked state." error message on rare and irreproducible occasions.

2. Interactivity not implemented.

3. Some easily reproduced bug exposed by examples 2 and 14.

I think 2 and 3 should be straightforward to deal with, and if my "something getting clobbered" hypothesis to explain 1 is correct, then with the help of valgrind that should be straightforward to solve as well.

So I am extremely pleased that today's results showed so many of the examples are actually working fine now, and I am hoping to get all of them (and also the -locate option for example 1) working soon with -DPL_WXWIDGETS_IPC2=ON code. So stay tuned....

Alan
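The first step mentioned in this message, catching all exceptions in our own code rather than letting them reach wxwidgets uncaught, follows a general C++ pattern that can be sketched as below. The helper runGuarded is hypothetical and does not appear in PLplot; the idea is simply that the body of an event handler is wrapped so that every exception is converted into a description that can be logged. The catch clause for const char * is included on the assumption that some code throws plain string messages; that is a guess about the code base, not something stated in the thread.

```cpp
#include <cassert>
#include <exception>
#include <functional>
#include <stdexcept>
#include <string>

// Hypothetical helper: run `work` and make sure no exception escapes.
// Returns an empty string on success, otherwise a description of what was thrown.
std::string runGuarded( const std::function<void()> &work )
{
    assert( work && "runGuarded needs a callable" );
    try
    {
        work();
        return "";
    }
    catch ( const std::exception &e )
    {
        return std::string( "std::exception: " ) + e.what();
    }
    catch ( const char *message )
    {
        // Assumption: some code may throw plain C strings.
        return std::string( "message: " ) + message;
    }
    catch ( ... )
    {
        // Last resort so that nothing propagates into the GUI toolkit.
        return "unknown exception";
    }
}
```

A handler body wrapped as runGuarded( [&]() { /* handler work */ } ) then yields a message that names the failing step, instead of the generic "Caught unhandled unknown exception; terminating" report.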
From: Alan W. I. <ir...@be...> - 2017-02-25 00:59:27
|
Hi Phil: My two-line fix definitely solved an uninitialized variable issue as reported by valgrind for wxPLViewer, but I would appreciate you reviewing this fix to make sure it does not subvert some wxwidgets purpose you had in mind when you wrote that code.

Alan