From: Alan W. I. <ir...@be...> - 2015-06-19 20:02:25
As release manager for the forthcoming release of 5.11.1, I would appreciate it if those who have further bug-fixing projects in mind for this release cycle leading up to 5.11.1 would inform me of those projects.

@ Phil: specifically what is on your agenda for wxwidgets bug fixing for the next several weeks? For example, I would dearly like to see the extreme slowness regression (introduced since 5.11.0) for wxwidgets on Linux fixed for this release, and the last I heard on that topic from you was that you could not build PLplot on Linux to investigate the matter further. At that point I asked for a comprehensive test report on that situation (see below), but I have not received that yet from you. Also, with regard to the concatenated file bug you found in our build system for a "spaced" build tree, I committed a second version of that fix after you reported the first one did not completely work, and I am currently waiting for your report of whether that second fix works.

My own agenda items for the remainder of this release cycle are as follows:

* Keep up with on-going resolution of bugs in our C/C++ source code. Those currently include the wxwidgets extreme slowness regression on Linux mentioned above, Jim's series of patches fixing the eop problem for interactive devices, and the on-going discussion of the notcrossed functionality with Phil. Please let me know if I forgot anything here that should be on my agenda during the rest of this release cycle.

* Fix build-system issues that are discovered via comprehensive testing by everyone lurking on this list who routinely builds PLplot from our git version. Arjen has been extremely helpful in this regard, and future builds of 5.11.1 on Cygwin should be much easier for our users as a result of his many tests, but I strongly encourage the rest of you to start running

    scripts/comprehensive_test.sh --do_test_interactive no

  on all platforms accessible to you on a routine basis. (That option is a convenience to make that script run without the babysitting required for the interactive comprehensive testing part of the script.) That script automatically collects a report tarball in ../comprehensive_test_disposeable/comprehensive_test.tar.gz that you should send to this list if you have any difficulties on a platform, since that tarball generally gives all the information I need to analyze the issue and find a build-system fix for it. Also, please send that report tarball when you have a (partial) success you want to see reported on our wiki, since the report tarball generally includes all necessary information for that wiki entry. Such comprehensive test results from a lot of you here will go a long way to ensure that 5.11.1 will have good build behaviour on all platforms.

* Remove everything to do with long-retired device drivers since the outdated information in those files simply confuses those who want to develop a modern PLplot device driver.

* Extend the TEST_DEVICE concept, e.g., for the svg device from ctest to the test_noninteractive target for the build tree, install tree, and traditional install tree.

* Improve exporting of PLplot targets following <http://www.cmake.org/cmake/help/git-master/manual/cmake-packages.7.html>.

* Update epa_build to the latest versions of cmake and all libraries.

* Investigate a report on plplot-general that "MinGW Makefiles" fails to build for 5.11.0 although it was fine for 5.10.0.

* One more try at a MinGW-w64/MSYS2 install on Wine to see if the latest development version of Wine has fixed the bugs that did not allow that before. However, because Wine is incredibly slow, I am hoping I will never have to do this and someone else here will adopt that platform for comprehensive testing (see above agenda item concerning comprehensive testing on all accessible platforms).

* Fix Ada language support for Cygwin.

In the interests of getting 5.11.1 out roughly a month from now rather than considerably later, I will likely have to put off the last three items until later. But I am pretty sure I can get everything else on the above agenda into 5.11.1, especially with cooperation from everyone lurking on this list on doing comprehensive testing and sending the report tarballs that are automatically generated by that script to this list.

Alan
__________________________
Alan W. Irwin

Astronomical research affiliation with Department of Physics and Astronomy, University of Victoria (astrowww.phys.uvic.ca).

Programming affiliations with the FreeEOS equation-of-state implementation for stellar interiors (freeeos.sf.net); the Time Ephemerides project (timeephem.sf.net); PLplot scientific plotting software package (plplot.sf.net); the libLASi project (unifont.org/lasi); the Loads of Linux Links project (loll.sf.net); and the Linux Brochure Project (lbproject.sf.net).
__________________________

Linux-powered Science
__________________________
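The routine testing requested in the message above could be wrapped in a small shell function. This is only a minimal sketch: the function name run_comprehensive_test is hypothetical, and it assumes that the tarball path quoted in the message is relative to the top of the PLplot source tree. Only the script path, the --do_test_interactive no option, and the tarball location come from the message itself.

```shell
#!/bin/sh
# Hypothetical helper (not part of PLplot): run the non-interactive
# comprehensive test and report where the tarball to send to the list is.
# $1 = top-level directory of the PLplot git source tree
run_comprehensive_test() {
    source_tree=$1
    # Assumption: the path quoted in the message is relative to the source tree.
    tarball=../comprehensive_test_disposeable/comprehensive_test.tar.gz
    (
        cd "$source_tree" || exit 1
        scripts/comprehensive_test.sh --do_test_interactive no
        status=$?
        # Look for the tarball whether or not the test succeeded, since the
        # message asks for it both on difficulties and on (partial) success.
        if [ -f "$tarball" ]; then
            echo "report tarball: $(cd "$(dirname "$tarball")" && pwd)/$(basename "$tarball")"
        else
            echo "no report tarball found at $tarball" >&2
        fi
        exit "$status"
    )
}
```

A call such as run_comprehensive_test "$HOME/plplot.git" (the path is only an example) returns the exit status of the test script, so it can be used from a cron job or a larger wrapper.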
From: Phil R. <p.d...@gm...> - 2015-06-20 10:46:17
Hi Alan

Regarding the tcl cmake bug, it is still there. Just to reiterate, the problem does not seem to be that the file that is being looked for has a space; the problem is that the file genuinely does not exist. So the problem must be elsewhere in the CMake logic, which could still be due to spaces I guess.

Here is the error I get from trying to build INSTALL

102> CMake Error at bindings/cmake_install.cmake:39 (file):
102> file INSTALL cannot find "D:/usr/local/src/plplot-plplot/build/Visual
102> Studio 11 64sd/bindings/pkgIndex.tcl".
102> Call Stack (most recent call first):
102> cmake_install.cmake:61 (include)
102>
102>
102>C:\Program Files (x86)\MSBuild\Microsoft.Cpp\v4.0\V110\Microsoft.CppCommon.targets(134,5): error MSB3073: The command "setlocal
102>C:\Program Files (x86)\MSBuild\Microsoft.Cpp\v4.0\V110\Microsoft.CppCommon.targets(134,5): error MSB3073: "C:\Program Files (x86)\CMake\bin\cmake.exe" -DBUILD_TYPE=Debug -P cmake_install.cmake
102>C:\Program Files (x86)\MSBuild\Microsoft.Cpp\v4.0\V110\Microsoft.CppCommon.targets(134,5): error MSB3073: if %errorlevel% neq 0 goto :cmEnd
102>C:\Program Files (x86)\MSBuild\Microsoft.Cpp\v4.0\V110\Microsoft.CppCommon.targets(134,5): error MSB3073: :cmEnd
102>C:\Program Files (x86)\MSBuild\Microsoft.Cpp\v4.0\V110\Microsoft.CppCommon.targets(134,5): error MSB3073: endlocal & call :cmErrorLevel %errorlevel% & goto :cmDone
102>C:\Program Files (x86)\MSBuild\Microsoft.Cpp\v4.0\V110\Microsoft.CppCommon.targets(134,5): error MSB3073: :cmErrorLevel
102>C:\Program Files (x86)\MSBuild\Microsoft.Cpp\v4.0\V110\Microsoft.CppCommon.targets(134,5): error MSB3073: exit /b %1
102>C:\Program Files (x86)\MSBuild\Microsoft.Cpp\v4.0\V110\Microsoft.CppCommon.targets(134,5): error MSB3073: :cmDone
102>C:\Program Files (x86)\MSBuild\Microsoft.Cpp\v4.0\V110\Microsoft.CppCommon.targets(134,5): error MSB3073: if %errorlevel% neq 0 goto :VCEnd
102>C:\Program Files (x86)\MSBuild\Microsoft.Cpp\v4.0\V110\Microsoft.CppCommon.targets(134,5): error MSB3073: :VCEnd" exited with code 1.

Perhaps if you try to build in a directory with a space you will be able to see more easily what the problem is?

Regarding the speed on Linux. I have a problem with my Ubuntu machine, which I will describe below, but I have just quickly tested example 3 with ssh over the internet connecting to my work CentOS PC. I think that was the example you used to highlight the problem. On Windows it takes about 3 seconds from selecting the wxWidgets driver and pressing enter to running the wxViewer and displaying the plot. By comparison the ssh over the internet test takes about 10 seconds. 6 of those seconds are the time up to the window being initially displayed, so they are probably the time required to load the executable and for wxWidgets to do its initial window setup. This leaves about 4 seconds to transfer the data to wxViewer and render. While I agree that this isn't brilliant, given all the extra overheads going on with that connection I think that is acceptable. What sort of rendering times do you see running on an actual Linux machine?

Now for my Ubuntu machine. I have hit a snag that has come from checking the text length. From the limited debugging I have done, I think that when I try to get the font metrics in the console part of the device, this causes a segfault within wxWidgets. This is therefore going to have to be rewritten. For some reason there is no problem on my CentOS machine; I think this is running wxWidgets 2.8 rather than 3.0. If anyone else can confirm this behaviour then it would be appreciated.

Regarding other bugs. Please see my trello page where I am currently tracking everything: https://trello.com/b/xBv7SJco/plplot-wxwidgets-plus-related-buffer-issues.

One item I would appreciate confirmation on: it seems like fixing the text size problem has fixed the bad transform of 3d text, at least to my eyes. If you could take a quick look and confirm you are happy, that would be good.

Phil
From: Phil R. <p.d...@gm...> - 2015-06-20 12:34:04
Oh there is the fill issue that I reported previously too. Once you have had a chance to consider if the change is okay let me know.

Phil
From: Alan W. I. <ir...@be...> - 2015-06-20 15:44:22
On 2015-06-20 13:33+0100 Phil Rosenberg wrote:

> Oh there is the fill issue that I reported previously too. Once you
> have had a chance to consider if the change is okay let me know.

Hi Phil:

That is (buried) on my agenda already as

>> On 19 June 2015 at 21:02, Alan W. Irwin <ir...@be...> wrote:
>>> [...]My own agenda items for the remainder of this release cycle are as follows:
>>>
>>> * Keep up with on-going resolution of bugs in our C/C++ source code.
>>> Those currently include [...] the on-going discussion of the
>>> notcrossed functionality with Phil.

The ball is in my court on that one, but I have had no chance to look at it yet due to everything else on my PLplot agenda.

Alan
From: Alan W. I. <ir...@be...> - 2015-06-20 20:00:22
Hi Phil:

I have changed the subject line for this particular topic to something less general. I am hoping my hypothesis that you are the victim of stale build results (see below) is correct so we can put this topic finally to rest.

On 2015-06-20 11:46+0100 Phil Rosenberg wrote:

> Hi Alan
>
> Regarding the tcl cmake bug it is still there. Just to reiterate the
> problem does not seem to be that the file that is being looked for has
> a space - the problem is that the file genuinely does not exist. So
> the problem must be elsewhere in the CMake logic, which could be due
> to spaces still I guess.
>
> Here is the error I get from trying to build INSTALL
>
> 102> CMake Error at bindings/cmake_install.cmake:39 (file):
>
> 102> file INSTALL cannot find "D:/usr/local/src/plplot-plplot/build/Visual
>
> 102> Studio 11 64sd/bindings/pkgIndex.tcl".

> [...]Perhaps if you try to build in a directory with a space you will be
> able to see more easily what the problem is?

I looked at that as follows (as you should be able to verify on your Linux machines):

software@raven> mkdir build\ dir
software@raven> cd build\ dir/
software@raven> cmake \
-DCMAKE_INSTALL_PREFIX=/home/software/plplot/installcmake \
-DENABLE_ada=OFF -DENABLE_ocaml=OFF \
-DBUILD_TEST=ON ../plplot.git >& cmake.out
software@raven> make -j4 install >& install.out

I had to drop Ada and Ocaml because there are "spaced" pathname issues with those components which I don't have time to deal with at the present time. But otherwise everything worked perfectly (thanks mostly to your work on spaced pathname issues in the past, but also because of my recent fixes). For example,

software@raven> grep pkgIndex install.out
Scanning dependencies of target concatenate_pkgIndex.tcl
Generating pkgIndex.tcl
Built target concatenate_pkgIndex.tcl
-- Installing: /home/software/plplot/installcmake/share/plplot5.11.0/pkgIndex.tcl

Since all the pkgIndex.tcl concatenation logic is working fine with a spaced build tree on Linux, I wonder if you are the victim of stale cached results. So please try the master tip version with an absolutely fresh build to see if that proves to be the case. Note you normally don't have to do a fresh build to get reliable results especially when there is just a source code change, but sometimes it makes a difference if there are build-system changes (as in this case).

Alan
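The spaced-tree check in the message above can be repeated from a build directory that is guaranteed to be fresh. What follows is a minimal sketch in POSIX shell, assuming cmake and make are on the PATH; the function name fresh_spaced_build and its three arguments are hypothetical, while the cmake options and the grep check are the ones quoted in the message. Double-quoting the directory variable takes the place of the backslash escaping used in the interactive session.

```shell
#!/bin/sh
# Hypothetical helper (not part of PLplot): configure, build, and install
# from an absolutely fresh build tree whose name may contain spaces.
# $1 = source tree, $2 = build directory, $3 = install prefix
fresh_spaced_build() {
    source_tree=$(cd "$1" && pwd) || return 1
    build_dir=$2
    prefix=$3
    [ -n "$build_dir" ] || return 1
    # Remove the whole build tree rather than only CMakeCache.txt, so that
    # no generated files survive from an earlier configuration.
    rm -rf "$build_dir"
    mkdir -p "$build_dir" || return 1
    (
        cd "$build_dir" || exit 1
        cmake "-DCMAKE_INSTALL_PREFIX=$prefix" \
            -DENABLE_ada=OFF -DENABLE_ocaml=OFF \
            -DBUILD_TEST=ON "$source_tree" > cmake.out 2>&1 || exit 1
        make -j4 install > install.out 2>&1 || exit 1
        # Same check as in the message: show the pkgIndex.tcl build and
        # install lines recorded in the log.
        grep pkgIndex install.out
    )
}
```

For instance, fresh_spaced_build ../plplot.git "build dir" /home/software/plplot/installcmake mirrors the session above; the function returns non-zero if configuration, the build, or the final grep fails.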
From: Alan W. I. <ir...@be...> - 2015-06-20 23:33:00
On 2015-06-20 11:46+0100 Phil Rosenberg wrote:

> Regarding the speed on Linux. I have a problem which I will describe
> below with my Ubuntu machine, but I have just quickly tested example 3
> with ssh over the internet connecting to my work CentOS PC. I think
> that was the example you used to highlight the problem. On Windows it takes
> about 3 seconds from selecting the wxWidgets driver and pressing enter
> to running the wxViewer and displaying the plot. By comparison the ssh
> over the internet test takes about 10 seconds. 6 of those seconds are
> the time up to the window being initially displayed so they are
> probably the time required to load the executable and for wxWidgets to
> do its initial window setup. This leaves about 4 seconds to transfer
> the data to wxViewer and render. While I agree that this isn't
> brilliant, given all the extra overheads going on with that connection
> I think that is acceptable. What sort of rendering times do you see
> running on an actual Linux machine?

Hi Phil:

I did some timing tests for 5.11.0 for comparison purposes which you should be able to replicate at least in part (the wxwidgets part) on your own Linux boxes. In each case, after building x00c (one of the simplest examples), wxwidgets, and other device drivers, I ran

    time examples/c/x00c -dev <device> -np

twice (where I used the second timing result since that tends to be more reliable with everything required loaded from disk into memory for that second try) for device = wxwidgets, xcairo, qtwidget, and xwin. The 5.11.0 results when run directly on my principal box (no ssh) were

wxwidgets:
real 0m0.421s
user 0m0.032s
sys 0m0.036s

In this (5.11.0) case, the -np option is ignored (I think), but the wxPLViewer app seems to disconnect as soon as the page is displayed, so the above time corresponds pretty closely to the time required to display that one-page example.

xcairo:
real 0m0.163s
user 0m0.016s
sys 0m0.008s

qtwidget:
real 0m0.323s
user 0m0.060s
sys 0m0.024s

xwin:
real 0m0.097s
user 0m0.004s
sys 0m0.000s

When I did the same timing tests from a thin client over ssh, the times went up by roughly a factor of two, and in no case did the time exceed 1 second for any of these devices.

In sum, wxwidgets is a bit slow for 5.11.0, but still acceptable in comparison with other heavy-duty interactive devices, and all of those heavy-duty ones are substantially slower than -dev xwin.

I also tested the current (2db68f6 Impliment use of nopause in the wxWidgets driver) master tip in the same way for the direct method of display. The timing results were similar (as expected) for xcairo, qtwidget, and xwin. However, the command

    time examples/c/x00c -dev wxwidgets -np

ended up with the wxPLViewer application frozen with no signs of life for up to 30 seconds before I gave up and killed it. If this test works for you on Windows and also your CentOS box then you should double check that you are really testing commit 2db68f6 rather than some local version with added changes, and if you are, perhaps this issue is due to some wxWidgets version issue? (I am currently testing the Debian wheezy version 2.8.12.1-12 of wxWidgets.)

Anyhow, I am anxious to complete the timing test on Linux for master tip -dev wxwidgets once you can figure out how to unfreeze the wxPLViewer application for my version of wxWidgets. In any case you should be doing similar timing tests yourself between master tip and 5.11.0 on the Windows and CentOS boxes accessible to you to make sure there have been no serious timing regressions introduced during the current release cycle for our wxwidgets device driver. (The additional timing results I reported above for the xcairo, qtwidget, and xwin devices are just to give context for these wxwidgets timing results, and there is no necessity for you to time any device other than wxwidgets on your various boxes for 5.11.0 and master tip.)

Alan
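To put the 5.11.0 "real" times from the message above on a common scale, they can be converted to seconds and expressed as multiples of the xwin time. This is a minimal sketch in POSIX shell with awk; the function names real_seconds and compare_to_xwin are hypothetical, and the four timings are copied from the message rather than measured again.

```shell
#!/bin/sh
# Convert a "real" value as printed by time (for example 0m0.421s) to seconds.
real_seconds() {
    echo "$1" | awk -F'm' '{ sub(/s$/, "", $2); printf "%.3f\n", $1 * 60 + $2 }'
}

# Print each device's real time in seconds and its ratio to the xwin time.
# The values are the 5.11.0 direct-display results quoted in the message.
compare_to_xwin() {
    xwin=$(real_seconds 0m0.097s)
    for entry in wxwidgets:0m0.421s xcairo:0m0.163s qtwidget:0m0.323s xwin:0m0.097s; do
        device=${entry%%:*}
        seconds=$(real_seconds "${entry#*:}")
        awk -v d="$device" -v s="$seconds" -v x="$xwin" \
            'BEGIN { printf "%-10s %.3f s  %.1fx\n", d, s, s / x }'
    done
}

compare_to_xwin
```

With these numbers wxwidgets comes out at roughly four times the xwin time, which is in line with the "a bit slow, but still acceptable" assessment above. The same two functions could be fed timings from master tip once the wxPLViewer freeze is resolved.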
From: Phil R. <p.d...@gm...> - 2015-06-22 08:54:29
|
Hi Alan

It seems you were correct. I deleted the cache, but obviously that
wasn't enough. A full clean build directory solved the problem, so you
can tick that off.

Phil

On 20 June 2015 at 21:00, Alan W. Irwin <ir...@be...> wrote:
> Hi Phil:
>
> I have changed the subject line for this particular topic to something
> less general. I am hoping my hypothesis that you are the victim of
> stale build results (see below) is correct so we can put this topic
> finally to rest.
>
> On 2015-06-20 11:46+0100 Phil Rosenberg wrote:
>
>> Hi Alan
>>
>> Regarding the tcl cmake bug it is still there. Just to reiterate the
>> problem does not seem to be that the file that is being looked for has
>> a space - the problem is that the file genuinely does not exist. So
>> the problem must be elsewhere in the CMake logic, which could be due
>> to spaces still I guess.
>>
>> Here is the error I get from trying to build INSTALL
>>
>> 102> CMake Error at bindings/cmake_install.cmake:39 (file):
>> 102> file INSTALL cannot find "D:/usr/local/src/plplot-plplot/build/Visual
>> 102> Studio 11 64sd/bindings/pkgIndex.tcl".
>
>> [...]Perhaps if you try to build in a directory with a space you will be
>> able to see more easily what the problem is?
>
> I looked at that as follows (as you should be able to verify on your Linux
> machines):
>
> software@raven> mkdir build\ dir
> software@raven> cd build\ dir/
> software@raven> cmake \
> -DCMAKE_INSTALL_PREFIX=/home/software/plplot/installcmake \
> -DENABLE_ada=OFF -DENABLE_ocaml=OFF \
> -DBUILD_TEST=ON ../plplot.git >& cmake.out
> software@raven> make -j4 install >& install.out
>
> I had to drop Ada and Ocaml because there are "spaced" pathname issues with
> those components which I don't have time to deal with
> at the present time. But otherwise everything worked perfectly (thanks
> mostly to your work on spaced pathname issues in the past, but also
> because of my recent fixes). For example,
>
> software@raven> grep pkgIndex install.out
> Scanning dependencies of target concatenate_pkgIndex.tcl
> Generating pkgIndex.tcl
> Built target concatenate_pkgIndex.tcl
> -- Installing:
> /home/software/plplot/installcmake/share/plplot5.11.0/pkgIndex.tcl
>
> Since all the pkgIndex.tcl concatenation logic is working fine with a
> spaced build tree on Linux, I wonder if you are the victim of stale
> cached results. So please try the master tip version with an absolutely
> fresh build to see if that proves to be the case. Note you normally
> don't have to do a fresh build to get reliable results especially when
> there is just a source code change, but sometimes it makes a difference
> if there are build-system changes (as in this case).
>
> Alan
|
From: Alan W. I. <ir...@be...> - 2015-06-22 19:37:27
|
On 2015-06-22 09:54+0100 Phil Rosenberg wrote:

> Hi Alan
> It seems you were correct, I deleted the cache, but obviously that
> wasn't enough. A full clean build directory solved the problem so you
> can tick that off.

Good. I think a pretty good rule of thumb is that if you have to remove
the cache whenever there is a build-system fix, then you might as well
remove the whole build tree too.

I am pretty casual about the extra compile time required when starting
with an empty build tree, but the reason for that is that I use ccache.
That likely doesn't do you any good with MSVC (although there might be
a Windows equivalent), but I do highly, highly recommend ccache for all
gcc users here (including the Windows variants of gcc). What it does is
keep track of all compilations in a database, and if the compiler,
options, source file, and headers are all identical, it delivers the
cached result of the previous compilation rather than running gcc to
generate that. ccache is lightning quick and so far extremely reliable,
which is why I recommend it.

To give you some idea of how fast ccache is, here are my time results
for building the plplot library.

software@raven> make clean
software@raven> time make -j4 plplot >& plplot.out

real 0m0.970s
user 0m0.684s
sys 0m0.424s

The first time you see such results (less than a second to rebuild the
plplot library) you hardly believe them, but it is true: ccache is
really that fast.

Alan
|
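The caching rule described in the message above (reuse a result when the compiler, options, source file, and headers are all identical) can be pictured with a toy model. This is a sketch of the idea only, not of how ccache is actually implemented; every name in it (`cache_key`, `compile_with_cache`, `fake_compile`) is hypothetical, and the "compiler" is a stand-in function.

```python
import hashlib

def cache_key(compiler, options, source_text, header_texts):
    """Hash every input that is assumed to influence the compiled output."""
    h = hashlib.sha256()
    for part in [compiler, *options, source_text, *header_texts]:
        h.update(part.encode("utf-8"))
        h.update(b"\0")  # separator so adjacent parts cannot run together
    return h.hexdigest()

def compile_with_cache(cache, compile_fn, compiler, options, source_text, header_texts):
    """Return (result, was_cache_hit); compile_fn runs only on a miss."""
    key = cache_key(compiler, options, source_text, header_texts)
    if key in cache:
        return cache[key], True
    result = compile_fn(source_text)
    cache[key] = result
    return result, False

compile_calls = []  # records every time the stand-in compiler really runs

def fake_compile(source_text):
    # Hypothetical stand-in for invoking the real compiler.
    compile_calls.append(source_text)
    return "object code for " + repr(source_text)

cache = {}
source = "int main(void) { return 0; }"
headers = ["/* contents of an included header */"]

first = compile_with_cache(cache, fake_compile, "gcc", ["-O3"], source, headers)
second = compile_with_cache(cache, fake_compile, "gcc", ["-O3"], source, headers)
# A changed option gives a different key, so the stand-in compiler runs again.
third = compile_with_cache(cache, fake_compile, "gcc", ["-O2"], source, headers)

print(first[1], second[1], third[1], len(compile_calls))  # False True False 2
```

In this model the second identical request is a dictionary lookup with no compile at all, which is the kind of saving that would account for a sub-second rebuild after `make clean`.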
From: Phil R. <p.d...@gm...> - 2015-06-22 09:21:37
|
Hi Alan

I now see times close to a minute to render both pages of example 2,
which is clearly a problem. This is the case with or without -np. I'm
not sure why. I will look into it.

Phil
|
From: Alan W. I. <ir...@be...> - 2015-06-23 16:01:59
|
On 2015-06-22 10:21+0100 Phil Rosenberg wrote:

> Hi Alan
> I now see times close to a minute to render both pages of example 2,
> which is clearly a problem. This is the case with or without -np. I'm
> not sure why. I will look into it.

Hi Phil:

I am glad you were able to verify the efficiency problem there since
issues that are only seen on platforms not accessible to the original
developer are a real devil to fix.

I wonder if the current efficiency regressions are due to the IPC
<https://en.wikipedia.org/wiki/Inter-process_communication> method you
currently use between applications and wxPLViewer not scaling well for
the extra burdens you are placing on that IPC method since 5.11.0? In
particular I would appreciate your comments on whether switching from
your current IPC method (whatever it is) to one of the many other
possibilities in that article might completely solve these efficiency
regressions.

Alan
|
From: Phil R. <p.d...@gm...> - 2015-06-24 08:24:56
|
Hi Alan

As far as the method is concerned, I am using shared memory, which
should scale very well. The implementation is a circular buffer, but
for most examples the first circuit is enough. I imagine that the issue
is related to the use of global semaphores to lock that memory and
prevent concurrent access. These are likely to have very different
instantiations on Linux and Windows, or even on different Linux
flavours or kernel versions. I may have been overcautious with their
use, as finding race condition bugs can be very painful. I will look to
see if this is the issue and if I can reduce their use.

Phil
|
From: Phil R. <p.d...@gm...> - 2015-06-24 11:47:17
|
Hi Alan

I've done two quick tests.

The first was to call on valgrind to do some profiling using

valgrind --tool=callgrind examples/c/x03c -dev wxwidgets

This indicated that the executable spent 75% of its time searching for
the wxPLViewer executable. I must confess that the method I used to
hunt it out was pretty slapdash in that I enumerated the entire build
directory. I have restricted the search to just the utils directory and
this is hugely improved.

However, I can only assume that the time reported by valgrind doesn't
include time spent "doing nothing". For example, if I have called sleep
and the OS does something else for a while, then I'm not sure valgrind
recognises this as time within the executable. I assume this because
most of the wall clock time of the examples occurs after the viewer
has appeared, and functions called after this point are only negligibly
represented in the valgrind results.

So I did a few more things to work out what was taking so long, and the
short story is that most of the time is spent by the console part of
the executable asking the viewer how long the rendered text is.

Unfortunately there is always going to be some delay when we have
inter-process communication, so this is always going to be a
bottleneck. However, I have removed a mutex which was not needed and
have introduced a delay before the viewer reduces its polling
frequency. This has sped things up significantly. The mutex in
particular has helped a lot, so it probably explains the speed
difference between Windows and Linux.

I'm not sure there is much else I can do to improve things further.

Phil
|
From: Alan W. I. <ir...@be...> - 2015-06-24 17:51:44
|
On 2015-06-24 12:47+0100 Phil Rosenberg wrote:

> Hi Alan
>
> I've done two quick tests.
>
> The first was to call on valgrind to do some profiling using valgrind
> --tool=callgrind examples/c/x03c -dev wxwidgets. This indicated that
> the executable spent 75% of its time searching for the wxPLViewer
> executable. I must confess that the method I used to hunt it out was
> pretty slapdash in that I enumerate the entire build directory. I have
> restricted the search to just the utils directory and this is hugely
> improved.
>
> However, I can only assume that the time reported by valgrind doesn't
> include time "doing nothing" for example if I have called sleep and
> the OS does something else for a while then I'm not sure valgrind
> recognises this as time within the executable. I assume this because
> most of the wall clock time of the examples occurs after the viewer
> has appeared and functions called after this point are only negligibly
> represented by the valgrind results.
>
> So I did a few more things to work out what was taking so long and the
> short story is that most of the time is spent by the console part of
> the executable asking the viewer how long the rendered text is.
>
> Unfortunately there is always going to be some delay when we have the
> inter-process communications so this is always going to be a
> bottleneck. However I have removed a mutex which was not needed and have
> introduced a delay before the viewer reduces its polling frequency.
> This has sped things up significantly. The mutex in particular has
> helped a lot so probably explains the speed difference between Windows
> and Linux.
>
> I'm not sure there is much else I can do to improve things further.

Hi Phil:

I have tested your changes, and I can now at least get the standard
examples to finish, but there is still a large efficiency regression
compared with what is possible with 5.11.0.

Also, my previous timing numbers for both master tip and 5.11.0 were
spuriously long for some reason. I suspect I was not actually running
directly on our principal box (an 8-year-old computer which was
high-end when we bought it). But I am doing that now directly from my
wife's desktop on that computer, as indicated by the "barbara@raven"
tag below.

In all cases the PLplot software was built using the following
compiler options:

CXXFLAGS=-O3 -fvisibility=hidden -Wuninitialized
CFLAGS=-O3 -fvisibility=hidden -Wuninitialized
FFLAGS=-O3 -Wuninitialized

5.11.0:

barbara@raven> time examples/c/x00c -dev xcairo -np

real 0m0.048s
user 0m0.012s
sys 0m0.016s

barbara@raven> time examples/c/x00c -dev qtwidget -np

real 0m0.119s
user 0m0.064s
sys 0m0.016s

barbara@raven> time examples/c/x00c -dev wxwidgets -np

real 0m0.222s
user 0m0.024s
sys 0m0.044s

For this case the -np option is a no-op for -dev wxwidgets, and x00c
relinquishes control of wxPLViewer and finishes the timing when the
first (and only) page of this example has been displayed. So it is a
pretty realistic timing result that can be compared directly with the
equivalent master tip result where the -np option works. But, of
course, multipage examples would give spurious timing results for
5.11.0 because the application finishes just as soon as the first page
is displayed by wxPLViewer.

master tip:

barbara@raven> time examples/c/x00c -dev xcairo -np

real 0m0.043s
user 0m0.016s
sys 0m0.008s

barbara@raven> time examples/c/x00c -dev qtwidget -np

real 0m0.108s
user 0m0.060s
sys 0m0.016s

barbara@raven> time examples/c/x00c -dev wxwidgets -np

real 0m3.474s
user 0m0.012s
sys 0m0.016s

So as expected the xcairo and qtwidget efficiency doesn't change that
much from 5.11.0 to master tip. However, the efficiency drops by a
factor of 16 for -dev wxwidgets, which changes the wxwidgets timing
from a factor of 2-5 slower than the rest for 5.11.0 to a factor of
30-80 slower than the rest.
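The factors quoted above follow directly from the `real` times. A quick check, using only the numbers reported in this message (the helper name `slowdown_vs_others` is just for illustration):

```python
# "real" times in seconds for x00c, copied from the timing results above.
real_5_11_0 = {"xcairo": 0.048, "qtwidget": 0.119, "wxwidgets": 0.222}
real_master = {"xcairo": 0.043, "qtwidget": 0.108, "wxwidgets": 3.474}

def slowdown_vs_others(times):
    """Ratio of the wxwidgets time to each other device's time."""
    wx = times["wxwidgets"]
    return {dev: wx / t for dev, t in times.items() if dev != "wxwidgets"}

# How much slower wxwidgets became between 5.11.0 and master tip.
regression = real_master["wxwidgets"] / real_5_11_0["wxwidgets"]

ratios_old = slowdown_vs_others(real_5_11_0)
ratios_new = slowdown_vs_others(real_master)

print("wxwidgets regression factor: %.1f" % regression)
print("5.11.0 wxwidgets vs others:", {d: round(r, 1) for d, r in ratios_old.items()})
print("master wxwidgets vs others:", {d: round(r, 1) for d, r in ratios_new.items()})
```

The regression factor comes out just under 16, and the per-device ratios land at roughly 2 and 5 for 5.11.0 and roughly 30 and 80 for master tip, in line with the ranges stated in the message.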
For master tip (where the -np option works for wxwidgets so we can
obtain reliable timings even for multipage examples) the slow speed of
wxwidgets is also confirmed for x08c:

barbara@raven> time examples/c/x08c -dev xcairo -np

real 0m1.453s
user 0m1.392s
sys 0m0.024s

barbara@raven> time examples/c/x08c -dev qtwidget -np

real 0m0.508s
user 0m0.424s
sys 0m0.016s

barbara@raven> time examples/c/x08c -dev wxwidgets -np

real 0m41.480s
user 0m0.148s
sys 0m0.084s

I looked at the top command results during that ~42 seconds, and x08c
and wxPLViewer rarely exceeded 10 per cent of cpu usage. In other
words they are spending the vast majority of that real time interval
waiting for each other.

I strongly recommend that your next step should be to confirm the
above time results for wxwidgets on your Linux box for both 5.11.0 and
master tip to prove they aren't some artifact of my own test
environment. Once confirmed, then it seems to me the most promising
avenue for you to explore is to try and figure out why both the
application and wxPLViewer spend such a large fraction of their time
waiting according to the above "top" results. Do you really think
such long waits are inevitable with the shared memory IPC method you
are currently using (which sounds like it should be quite efficient)?
Or is there some small detail of your current IPC method that needs to
be tweaked to substantially drop those wait times?

Alan
|
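The pattern in the x08c numbers above (a long `real` time with tiny `user` and `sys` times) is what a process that mostly waits looks like. It is also consistent with the suspicion quoted in that message that a profiler which counts executed work would under-represent time spent sleeping. A minimal sketch using only the Python standard library; `mostly_waiting` is a hypothetical stand-in, not PLplot code:

```python
import time

def mostly_waiting(sleep_seconds):
    """Do a little arithmetic, then block, like a process waiting on a peer."""
    total = sum(range(10000))   # a small amount of real CPU work
    time.sleep(sleep_seconds)   # blocked: uses wall-clock time, almost no CPU
    return total

wall_start = time.perf_counter()   # wall-clock timer
cpu_start = time.process_time()    # CPU time of this process only
mostly_waiting(0.3)
wall = time.perf_counter() - wall_start
cpu = time.process_time() - cpu_start

print("wall-clock: %.3f s, CPU: %.3f s" % (wall, cpu))

# The same comparison for the x08c run reported above. The values are copied
# from the `time` output and cover the x08c process only, not wxPLViewer.
x08c_real, x08c_user, x08c_sys = 41.480, 0.148, 0.084
x08c_cpu_fraction = (x08c_user + x08c_sys) / x08c_real
print("x08c CPU time as a fraction of real time: %.2f%%" % (100 * x08c_cpu_fraction))
```

For the x08c run the CPU fraction works out to well under 1 per cent, so nearly all of the 41 seconds was spent waiting rather than computing.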
From: Phil R. <p.d...@gm...> - 2015-07-03 12:53:58
|
Hi Alan

So I think you should now find the wxPLViewer speed has significantly
improved. The item I have dealt with now is that there is a timer at
the OS level (initiated by wxWidgets) which sends a message to the
viewer, and that message triggers a check for new commands from the
console app. At this point I used to check for a single message and
return after dealing with it. However, it seems that for whatever
reason (different OS, different wxWidgets version?) on some systems the
minimum time between the timer calls was quite long. This meant that
when we were dealing with text, which requires a lot of small
transmissions to the viewer, there was a huge overhead. I have now
modified things so that on every timer event I check for new
transmissions at least 100 times with a 1 millisecond sleep between
each. This means that for many small transmissions in a short time we
remove most of the overhead.

On my CentOS machine the execution time reported by time x08c -dev
wxWidgets -np has dropped from 40 s to about 2 s. Note that this only
times the console part, not the viewer, but still we have clearly
moved to something I would call acceptable.

Note that there will always be a significant overhead for the
transmission, so this will never compete with Cairo, but this is
unavoidable because wxWidgets cannot be run in a library without
running it as a separate thread and PLplot isn't thread safe. Note
also that when using the wxWidgets driver via the wxWidgets binding
(i.e. from within a wxWidgets app) this all becomes irrelevant as the
viewer is not used.

Anyway, I hope this fix closes the issue and you are happy, Alan.

Phil
|
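Why checking in a burst helps can be seen in a small timing model of the change described above. It is only a sketch under stated assumptions, not the wxPLViewer code: the console is assumed to send its small requests one at a time, each waiting to be picked up before the next is sent, and the 100 ms timer interval and the count of 400 requests are made-up illustrative values. The 100 checks with a 1 ms sleep are taken from the message.

```python
def next_poll_after(now, timer_interval, polls_per_event, poll_sleep):
    """Earliest viewer poll time strictly after `now` (all times in ms).

    Timer events fire at multiples of timer_interval. On each event the
    viewer polls polls_per_event times, poll_sleep ms apart. The model
    assumes one burst finishes before the next timer event fires.
    """
    event = now // timer_interval  # the timer event at or before `now`
    while True:
        start = event * timer_interval
        for k in range(polls_per_event):
            poll_time = start + k * poll_sleep
            if poll_time > now:
                return poll_time
        event += 1  # nothing left in this burst, wait for the next event

def total_time(n_requests, timer_interval, polls_per_event, poll_sleep=1):
    """Time (ms) until the last of n sequential requests is picked up."""
    now = 0
    for _ in range(n_requests):
        now = next_poll_after(now, timer_interval, polls_per_event, poll_sleep)
    return now

N_REQUESTS = 400  # hypothetical number of small text-size queries
TIMER_MS = 100    # hypothetical effective interval between timer events

one_per_event = total_time(N_REQUESTS, TIMER_MS, polls_per_event=1)
burst = total_time(N_REQUESTS, TIMER_MS, polls_per_event=100)

print("one check per timer event: %d ms" % one_per_event)
print("100 checks, 1 ms apart:    %d ms" % burst)
```

With one check per event, every request waits a full timer interval; with the burst, consecutive requests are picked up about 1 ms apart. The absolute numbers depend entirely on the assumed values, so only the ratio between the two schemes is meaningful here.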
From: Alan W. I. <ir...@be...> - 2015-07-03 19:33:00
|
To Phil and Andrew: @Andrew: I am addressing you directly (in addition to Phil) because I have questions for you below concerning whether you can replicate my weird timing results for wxwidgets and also concerning the issue of shared memory leaks on Linux. On 2015-07-03 13:53+0100 Phil Rosenberg wrote: > Hi Alan > So I think you should now find the wxPLViewer speed has significantly > improved. The item I have dealt with now was that there is a timer at > the OS level (initiated by wxWidgets) which sends a message to the > viewer which initiates checking for new commands from the console app. > At this point I used to check for a single message and return after > dealing with it. However it seems that for whatever reason (different > OS, different wxWidgets version?) on some systems the minimum time > between the timer calls was quite long. This meant that when we were > dealing with text which required a lot of small transmissions to the > viewer there was a huge overhead. I have now modified things so that > on every timer event I check for new transmissions at least 100 times > with a 1 millisecond sleep between each. This means that for many > small transmissions in a short time we remove most of the overhead. > > On my CentOS machine the execution time reported by time x08c -dev > wxWidgets -np has dropped from 40s to about 2 s. Note that this only > times the console part, not the viewer, but still we have clearly > moved to something I would call acceptable. > > Note that there will always be a significant overhead for the > transmission so this will never compete with Cairo, but this is > unavoidable because wxWidgets cannot be run in a library without > running it as a separate thread and PlPlot isn't thread safe. Note > also that when using the wxWidgets driver via the wxWidgets binding > (I.e. from within a wxWidgets app) this all becomes irrelevant as the > viewer is not used. > > Anyway I hope this fix closes the issue and you are happy Alan. 
@Phil:

The timing results for current master tip (commit id 5e74b6a6, "Modification to previous wxPLViewer optimisation") are much improved on first execution of examples. So big congratulations on that result!

However, an important issue still remains as shown by the following repeated timing results for both xcairo (as a typical benchmark) and wxwidgets:

software@raven> time examples/c/x00c -dev xcairo -np

real 0m0.168s
user 0m0.008s
sys 0m0.016s
software@raven> time examples/c/x00c -dev xcairo -np

real 0m0.129s
user 0m0.020s
sys 0m0.004s
software@raven> time examples/c/x00c -dev xcairo -np

real 0m0.114s
user 0m0.020s
sys 0m0.008s

software@raven> time examples/c/x00c -dev wxwidgets -np

real 0m0.365s
user 0m0.008s
sys 0m0.012s
software@raven> time examples/c/x00c -dev wxwidgets -np

real 0m1.231s
user 0m0.008s
sys 0m0.012s
software@raven> time examples/c/x00c -dev wxwidgets -np

real 0m5.587s
user 0m0.008s
sys 0m0.012s

So the xcairo case has subsequent time results that are up to 1.5 times faster (in real time, which is the most important component) than the first execution of the example, while the wxwidgets case has subsequent time results that are up to 15 (!) times slower. The initial fast wxwidgets result seems guaranteed if you wait a minute or so after your last attempt to use wxwidgets. I mostly tested the above wxwidgets time results with the -np option (for convenience), but I also tried the test a few times without -np and got the fast, slow, slow,... pattern for that case as well. Also, I usually get a consistent fast, slow, slow,... pattern for wxwidgets, but just now I tried it again, and I got the fast result for something like 10 times in a row, then the slow result. So the wxwidgets inconsistency in time results is sometimes inconsistent.
:-)

The above results for the xcairo case are quite typical of all non-wxwidgets cases I have ever investigated before, where real time results are significantly shorter on second and subsequent execution because of the well-known effect on Linux of the system caching all reasonably small files in memory to improve small file access times for subsequent use.

The very unusual much longer times that often but not always immediately appear on second and subsequent executions for the wxwidgets case above strongly suggest to me that a system resource is not being properly released by the first execution of the example, so getting that resource back again often takes a lot of extra time on the second and subsequent runs.

I brought up this issue with you off list much earlier this week. You did not respond at that time, but I hope you will do so now. The first step in your response should be to try the above bash commands for at least the wxwidgets case on your Windows and Linux platforms to (a) see if this issue also occurs on Windows, and (b) see if this issue occurs on all Linux systems accessible to you.

@Andrew: could you please try the above time tests as well on your Linux platforms?

@Both:

If the issue is a Linux resource which I am chronically short of because of my prior wxwidgets testing and a shared memory leak (see discussion below) in wxwidgets, then you might not be able to replicate the issue on Linux until you do sufficient wxwidgets testing. My Linux up time is currently 42 days, and if I reboot (which I won't do lightly because there are two users on the system and it is a bit of a pain for us to reset all desktops the way we like them), I might find the issue goes away for a while.

@Phil:

If the issue is a Linux-only one, then an obvious candidate for a system resource grabber is the shm_open calls in drivers/wxwidgets_comms.cpp that help to allow shared memory access between the wxwidgets device driver and wxPLViewer.
I have just now read the shm_overview and shm_open/shm_unlink man pages, and it appears that shm_unlink must be called by _both_ the wxwidgets device driver and wxPLViewer to properly release the shared memory. So a code review for those two cases to make sure that _always_ happens (i.e., there is no shared memory leak) is indicated.

@Andrew: will you comment further here please about whether we should be concerned about possible shared memory leaks? I now realize, for example, that both Phil's and my calling the IPC method that is used "shared memory" is likely not specific enough because it does not distinguish the old-fashioned shmget method (which we do not use) from the modern mmap-based method that we do use. See
<http://stackoverflow.com/questions/5656530/how-to-use-shared-memory-with-linux-in-c>
for comments on this important "shared-memory" distinction. Furthermore, from the discussion at
<http://stackoverflow.com/questions/22691621/how-to-avoid-shared-memory-leaks>
it appears the mmap-based method cannot actually leak memory. Is that your interpretation of those remarks as well?

@Phil:

One thing I noticed from reading those man pages is that it is suggested the name used for shm_open should always start with a "/" for the most portable results. It appears our current wxwidgets code does not follow that suggestion, so this may affect Unix platforms that are not Linux. So I suggest you will want to address that issue (even though it has nothing to do with timing).

Alan
__________________________
Alan W. Irwin

Astronomical research affiliation with Department of Physics and Astronomy,
University of Victoria (astrowww.phys.uvic.ca).
Programming affiliations with the FreeEOS equation-of-state implementation for stellar interiors (freeeos.sf.net); the Time Ephemerides project (timeephem.sf.net); PLplot scientific plotting software package (plplot.sf.net); the libLASi project (unifont.org/lasi); the Loads of Linux Links project (loll.sf.net); and the Linux Brochure Project (lbproject.sf.net). __________________________ Linux-powered Science __________________________ |
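For reference while reviewing the shm_open discussion above, here is a minimal C sketch of the POSIX shared memory lifecycle as documented in the shm_open, mmap and shm_unlink man pages. It is not taken from drivers/wxwidgets_comms.cpp; the helper names `shm_map_segment` and `shm_release_segment` and the segment name in the usage note are hypothetical. Per those man pages, each process removes its own mapping with munmap, shm_unlink removes the name, and the memory itself is freed once the name is unlinked and the last mapping is gone. On older glibc versions linking may need `-lrt`.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L   /* for shm_open, ftruncate */
#endif
#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Open a named POSIX shared memory object and map it read/write.
 * If create is non-zero the object is created and sized first.
 * The name should start with "/" for portability.
 * Returns the mapped address, or NULL on failure. */
void *shm_map_segment(const char *name, size_t size, int create)
{
    int flags = create ? (O_CREAT | O_RDWR) : O_RDWR;
    int fd = shm_open(name, flags, 0600);
    if (fd == -1)
        return NULL;

    if (create && ftruncate(fd, (off_t) size) == -1) {
        close(fd);
        shm_unlink(name);
        return NULL;
    }

    void *addr = mmap(NULL, size, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    close(fd);   /* the mapping remains valid after the descriptor is closed */
    return addr == MAP_FAILED ? NULL : addr;
}

/* Unmap this process's view. If unlink_name is non-zero, also remove
 * the name so that no new process can open the object.
 * Returns 0 on success, -1 if either step failed. */
int shm_release_segment(void *addr, size_t size, const char *name, int unlink_name)
{
    int rc = munmap(addr, size);
    if (unlink_name && shm_unlink(name) == -1)
        rc = -1;
    return rc;
}
```

In a two-process setup like the one discussed, one side would call `shm_map_segment("/some_name", size, 1)`, the other `shm_map_segment("/some_name", size, 0)`, and both would call `shm_release_segment` on exit, with one of them passing `unlink_name = 1`. A process that terminates through exit() has its mappings removed by the kernel, but a name that is never unlinked stays in /dev/shm on Linux until reboot.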
From: Phil R. <p.d...@gm...> - 2015-07-06 08:52:17
|
Hi Alan - this is now slightly out of context as I found it in my drafts folder unsent.

I can confirm similar odd timing results on my CentOS machine. I did have a couple of exit() calls in wxPLViewer, which to be honest was just lazy of me. One was used to close the app when -np was used. I have now replaced both of these with clean exit methods. This ensures all destructors are called, and I can now say that at least as far as I can tell on Windows the shared memory is _always_ released. It is certainly released when the window is closed using -np. This is on the viewer side. I presume the use of -np still gives a clean exit on the console side and doesn't invoke any exit calls anywhere?

Unfortunately the odd timings still persist. Note that the application only asks for 1 MB of shared memory, so even if not freed I wouldn't have thought it would slow anything down really after just one run.

I haven't got time right now to set up a decent debugging environment on my CentOS machine, but I will try to double check the shared memory is released on that system as soon as I can.

Phil |
From: Alan W. I. <ir...@be...> - 2015-07-04 17:51:49
|
Hi Phil:

Your recent efficiency improvements mean it is now a lot more convenient to test the wxwidgets device. So I have done that and noticed the following issues that appear to be caused by your efficiency improvements.

* The -np option no longer works properly. For example, before these efficiency improvements

examples/c/x08c -dev wxwidgets -np

would (extremely slowly) render each page of that example and then exit.

Now, the rendering for each page simply presents a black screen before the exit occurs. Sometimes, just as the example exits you will see a quick flash of the last page, but usually not.

In addition, the example now produces the following WARNING message for 5 of the pages

*** PLPLOT WARNING ***
Failed to get text size from wxPLViewer - timeout

Without the -np option, none of these problems occur, and the time to render all the example 8 pages has a noticeable efficiency improvement consistent with the length of time the black screen is rendered when the -np option is used.

* examples/c/x17c -dev wxwidgets shows huge efficiency improvements, but it has changed from an interactive plot showing all the intermediate results to a black screen until the very end which then shows the final results. In other words, the interactive nature of this example has been lost.

* examples/c/x26c -dev wxwidgets shows something is wrong with the delivery of string-length calculations between -dev wxwidgets and wxPLViewer. (For this case I am not sure whether this issue was introduced right when you implemented the new string-length calculation for wxwidgets or is the result of your recent efficiency changes.) The delivery of string-length information should be done independently for each page of this example (since the Russian text of the second page is longer than the English text of the first page).

What goes on here is that _for each page of the example_ the pllegend call internally calls plstrl which then requests the wxwidgets device to deliver plsc->string_length (used by pllegend to adjust legend-box size) before wxwidgets actually renders the string. Of course, the complication is that -dev wxwidgets measures string lengths and renders those strings indirectly via wxPLViewer. So obviously there has to be communication between -dev wxwidgets and wxPLViewer for every different page of example 26 to get that done properly.

Instead what happens now is that examples/c/x26c -dev wxwidgets incorrectly finishes (i.e., you get a command-line prompt and/or time results if requested) as soon as the first page of that example has been rendered by wxPLViewer. And when the second page is viewed by hitting the enter key for the wxPLViewer GUI, the legend box (which should be controlled by the string-length calculation for the Russian text of that page) appears to be the same size as the first page so that the longer Russian text for the second page overflows the legend box for that page.

Alan |
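The "Failed to get text size from wxPLViewer - timeout" warning above implies the driver waits a bounded time for the viewer's reply to a string-length request. A minimal C sketch of such a bounded wait follows, assuming a simple polling design. The names `wait_for_reply`, `reply_ready_fn` and `ready_after_n_checks` are hypothetical, and the actual wxwidgets driver may wait in a different way.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 199309L   /* for nanosleep */
#endif
#include <stddef.h>
#include <time.h>

/* Hypothetical callback: returns 1 once the viewer's reply (for example
 * a measured string length) is available, 0 otherwise. */
typedef int (*reply_ready_fn)(void *ctx);

/* Poll for a reply every poll_ms milliseconds until it arrives or
 * timeout_ms has elapsed. Returns 1 if the reply arrived, 0 on timeout,
 * in which case the caller would warn and fall back to an estimate. */
int wait_for_reply(reply_ready_fn ready, void *ctx, long timeout_ms, long poll_ms)
{
    struct timespec pause;

    if (poll_ms <= 0)
        poll_ms = 1;   /* guard against a busy loop that never advances */
    pause.tv_sec  = poll_ms / 1000;
    pause.tv_nsec = (poll_ms % 1000) * 1000000L;

    for (long waited = 0; ; waited += poll_ms) {
        if (ready(ctx))
            return 1;
        if (waited >= timeout_ms)
            return 0;
        nanosleep(&pause, NULL);
    }
}

/* Demo reply source (an assumption for this sketch): ctx points to the
 * number of checks that must fail before the reply is "available". */
int ready_after_n_checks(void *ctx)
{
    int *remaining = (int *) ctx;
    if (*remaining > 0) {
        --*remaining;
        return 0;
    }
    return 1;
}
```

The timeout is a trade-off: too short and a slow viewer triggers the warning on every page, too long and the console program hangs for that long whenever the viewer has crashed or been closed.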
From: Jim D. <ji...@di...> - 2015-07-04 18:58:10
|
The behavior with the -np flag is something I have grappled with while implementing the new windows driver. Example 17 is particularly challenging because of the way the stripchart function is implemented. Splitting out the wait for user input from the EOP handler is part of the fix. If that function is not defined, that might be part of the problem.

As for example 17, what should the behavior be on a resize event? Right now I play back the entire buffer, which shows all the rescalings. I think the correct behavior on a resize would be to render the most recent frame. |
From: Alan W. I. <ir...@be...> - 2015-07-04 20:49:54
|
On 2015-07-04 14:57-0400 Jim Dishaw wrote:

> As for example 17, what should the behavior be on a resize event? Right now I play back the entire buffer, which shows all the rescalings. I think the correct behavior on a resize would be to render the most recent frame.

Hi Jim:

I have (slightly) changed the subject line for this subtopic of your post.

The answer to your question is that you should model resizes for your new device on what happens for -dev xwin. In that case, a resize during the course of example 17 resizes the current version of the plot (which is what I assume you mean by "the most recent frame") and then continues from there.

As I have mentioned before, there is still a long-standing resize scaling bug for -dev xwin where the scaling is appropriate to the size of the last resize rather than the current resize. So small resizes look OK, but large ones do not. We can continue to live with this long-standing bug if necessary in -dev xwin, but I hope your new device does not have this issue so you will know the right fix for -dev xwin and be able to get that fix into the forthcoming 5.11.1.

The other resizing issue I have remarked on before concerns the xcairo device. The plD_eop_xcairo function currently reads as follows:

    void plD_eop_xcairo( PLStream *pls )
    {
        PLCairo *aStream;

        aStream = (PLCairo *) pls->dev;

        // Blit the offscreen image to the X window.
        blit_to_x( pls, 0.0, 0.0, pls->xlength, pls->ylength );

        if ( aStream->xdrawable_mode )
            return;
    }

and my concern is the last two statements in this function since they make no difference, i.e., the return occurs regardless of whether aStream->xdrawable_mode is true or not. Note that plD_bop_xcairo has a similar test of aStream->xdrawable_mode, but in that case if aStream->xdrawable_mode is true there is an immediate return and otherwise there is a call to XFlush( aStream->XDisplay ) before the return.
I am pretty sure that same logic should be used in the above case as well so I tried that change in hopes that it would solve the current resizing issues with xcairo, but it made no difference and resizing for xcairo continues to leave the scale and origin of the rendered graphics completely unaffected. Nevertheless, I think this fix should be part of the complete cure for this issue so I hope you will consider it when you go ahead with your plan to fix this issue.

Alan |
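The control flow Alan proposes for the end-of-page handler (return early in xdrawable mode, otherwise flush, as plD_bop_xcairo is described as doing) can be sketched as follows. This is a self-contained illustration only: `SketchStream`, `SketchCairo`, `sketch_blit_to_x` and `sketch_flush` are stand-ins invented here so the example compiles without PLplot or X11, and `sketch_flush` merely marks where the real code would call XFlush( aStream->XDisplay ). As noted above, Alan reports that this change alone did not cure the resize problem.

```c
/* Stand-in for the PLCairo fields used here; NOT the real definition. */
typedef struct {
    int xdrawable_mode;   /* non-zero when drawing into an external drawable */
    int flush_count;      /* counts calls that would be XFlush in real code */
} SketchCairo;

/* Stand-in for the PLStream fields used here; NOT the real definition. */
typedef struct {
    void *dev;
    int   xlength, ylength;
    int   blit_count;     /* counts blits of the offscreen image */
} SketchStream;

/* Placeholder for blit_to_x( pls, 0.0, 0.0, pls->xlength, pls->ylength ). */
void sketch_blit_to_x(SketchStream *pls)
{
    pls->blit_count++;
}

/* Placeholder for XFlush( aStream->XDisplay ). */
void sketch_flush(SketchCairo *aStream)
{
    aStream->flush_count++;
}

/* End-of-page handler with the early return made meaningful. */
void sketch_eop_xcairo(SketchStream *pls)
{
    SketchCairo *aStream = (SketchCairo *) pls->dev;

    /* Blit the offscreen image to the X window. */
    sketch_blit_to_x(pls);

    /* In xdrawable mode the host application owns the display. */
    if (aStream->xdrawable_mode)
        return;

    sketch_flush(aStream);
}
```

With this shape the test of xdrawable_mode changes behaviour: the flush happens only when the driver owns the window.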
From: Phil R. <p.d...@gm...> - 2015-07-05 08:33:21
|
Hi Jim

I fought with x17 too. Philosophically the only correct behaviour for a driver on resize is to replot the whole buffer. For x17, this is clearly much more plotting than necessary. However, the only alternative is for each driver to have a buffer parser which checks which part of the buffer to plot. This should not be included in the drivers, but could be a future optimisation in PLplot core code. Already I think the division of labour between drivers and core is more blurred than it should be, and including parsing in the drivers would worsen, not improve, this.

Phil |
From: Phil R. <p.d...@gm...> - 2015-07-05 08:25:52
|
Hi Alan

Could you describe exactly what the -np option should do, and also add that to the documentation? At the moment, for wxWidgets, as soon as a page is rendered we move to the next page, which, as you state, causes at most a brief flicker of a plot. But this is exactly what the documentation states it should do.

Regarding the warnings, I am not sure what to do. How long do you think it is sensible to wait for a response from the viewer? This is a balance, because if someone kills the viewer or it crashes, then the console will be waiting for a response that never comes. I will look at tuning this, but suggestions are welcome.

I will look into x17 and x26. I think I can guess the issue for x17, but unless x26 is doing something obscure I think it should have just worked.

Phil |
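The trade-off Phil raises (wait long enough for a slow viewer, but never hang on a dead one) can be sketched as a bounded wait that also checks whether the peer is still alive. This is only an illustration under stated assumptions: `wait_for_reply`, the two callback parameters, and the result codes are hypothetical names, not part of the wxwidgets driver or wxPLViewer, and the actual driver may signal replies quite differently.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <time.h>

/* Hypothetical callback type: returns true when its condition holds. */
typedef bool (*predicate_fn)(void *ctx);

enum wait_result { WAIT_OK = 0, WAIT_TIMEOUT = -1, WAIT_PEER_GONE = -2 };

/* Milliseconds elapsed since *start on the monotonic clock. */
static long elapsed_ms(const struct timespec *start)
{
    struct timespec now;
    clock_gettime(CLOCK_MONOTONIC, &now);
    return (long)(now.tv_sec - start->tv_sec) * 1000L
         + (now.tv_nsec - start->tv_nsec) / 1000000L;
}

/* Poll for a reply every millisecond until it arrives, the peer
 * disappears, or timeout_ms has elapsed, whichever comes first. */
enum wait_result wait_for_reply(predicate_fn reply_ready,
                                predicate_fn peer_alive,
                                void *ctx, long timeout_ms)
{
    const struct timespec nap = { 0, 1000000L }; /* 1 ms */
    struct timespec start;

    assert(reply_ready != NULL && peer_alive != NULL);
    clock_gettime(CLOCK_MONOTONIC, &start);
    for (;;) {
        if (reply_ready(ctx))
            return WAIT_OK;
        if (!peer_alive(ctx))
            return WAIT_PEER_GONE;      /* no point waiting any longer */
        if (elapsed_ms(&start) >= timeout_ms)
            return WAIT_TIMEOUT;
        nanosleep(&nap, NULL);
    }
}

/* Stand-in predicates used only to exercise the sketch. */
bool always_false(void *ctx) { (void)ctx; return false; }
bool always_true(void *ctx)  { (void)ctx; return true; }
bool true_after_countdown(void *ctx)
{
    int *remaining = ctx;
    return --*remaining <= 0;
}
```

With a separate liveness check, the timeout itself can be generous without risking a console that waits forever on a viewer that has been killed; how the console would actually detect a dead viewer is left open here.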
From: Alan W. I. <ir...@be...> - 2015-07-05 18:48:42
|
On 2015-07-05 09:25+0100 Phil Rosenberg wrote:

> Hi Alan
> Could you describe exactly what the -np option should do and also add that to the documentation. At the moment for wxWidgets as soon as a page is rendered we move to the next page, which as you state causes at most a brief flicker of a plot. But this is exactly what the documentation states it should do.

Hi Phil:

The -np option simply means "no pause" for human interaction (such as waiting for a human to hit the "enter" key to move from one page to the next or to terminate the plot if the last page is being displayed).

Let me ask you a question in return. What general method is used by wxPLViewer to actually render a page to the screen? (Note the question concerns the rendering part alone and not the steps leading up to the start of that rendering.) Is that general rendering process expected to be extremely short?

For example, I just checked

examples/c/x08c -dev wxwidgets -bg 0000FF

(note without the -np option), and for each page the associated wxPLViewer gives you a black screen for a while, and then an "instantaneous" render of the entire page including the blue PLplot background specified by the -bg 0000FF option above (chosen instead of the default black PLplot background so that the PLplot background is distinguishable from the black background of the wxPLViewer GUI). Such instantaneous rendering results are fine, but I am curious about the general method used to do that instantaneous rendering.

I have also done some experiments with

examples/c/x08c -dev xwin -bg 0000FF

The initial render of a page is pretty fast but still slow enough that you can see it happen. But on resizes (which I believe use the same general method as wxPLViewer) it appears to be instantaneous despite the complexities of that plot (which is fine).

If you confirm the rendering part should be essentially instantaneous for wxPLViewer, then it appears the wxwidgets device is handling the -np option correctly. However, I strongly suggest a change so that the -np option produces more meaningful-looking results on multipage examples: allow the previous page to remain displayed on the screen until the present page is ready to be rendered. So with this model, wxPLViewer would initialize the GUI (which presumably would create the black background), process a page to collect all the data needed for rendering, render the page, terminate the page, then continue with processing the next page without reinitializing the GUI, so you would never see the black "GUI" background after the first page was rendered. This change, although extremely useful for multipage examples, would still leave single-page examples rendered as a brief flicker, but I don't think that can be helped unless we were willing to put pauses in when PLplot exited, which I would think would be a bad idea.

> Regarding the warnings, I am not sure what to do. How long do you think is sensible to wait for a response from the viewer? This is a balance, because if someone kills the viewer or it crashes then the console will be waiting for a response that never comes. I will look at tuning this, but suggestions welcome.

Instead of treating symptoms here I would try to cure the disease, which is the rather long times that wxPLViewer and -dev wxwidgets currently spend waiting for each other. For example, if you cure why the second and subsequent runs of an example can take up to 16 times longer than the first run of an example, that cure might have implications for runs of single examples, i.e., their wait times might be substantially reduced so the above warning issue might just go away.

> I will look into x17 and x26.

Thanks, and I look forward to your conclusions concerning those examples.

Alan |
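The "keep the previous page on screen until the present page is ready" model suggested above amounts to double buffering. A minimal sketch follows, assuming a tiny fixed-size pixel array stands in for a real page; `page_view_t` and the `view_*` functions are hypothetical names for illustration and are not taken from wxPLViewer.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define PAGE_PIXELS 8   /* tiny stand-in for a real page buffer */

/* Hypothetical viewer state: one buffer is shown while the other is built. */
typedef struct {
    unsigned char buffers[2][PAGE_PIXELS];
    int front;  /* index of the buffer on screen, -1 before the first page */
    int back;   /* index of the buffer currently being drawn into */
} page_view_t;

/* Initialize once; nothing is visible yet (the bare GUI background). */
void view_init(page_view_t *v)
{
    memset(v->buffers, 0, sizeof v->buffers);
    v->front = -1;
    v->back = 0;
}

/* Start a new page: pick the buffer that is NOT on screen and clear it. */
unsigned char *view_begin_page(page_view_t *v)
{
    v->back = (v->front == 0) ? 1 : 0;
    memset(v->buffers[v->back], 0, PAGE_PIXELS);
    return v->buffers[v->back];
}

/* Finish the page: only now does the new page replace the old one. */
void view_end_page(page_view_t *v)
{
    v->front = v->back;
}

/* What the user currently sees; NULL means only the GUI background. */
const unsigned char *view_visible(const page_view_t *v)
{
    return (v->front < 0) ? NULL : v->buffers[v->front];
}
```

In this sketch the GUI background is visible only before the first `view_end_page` call; every later page swaps in over the previous one, which is the behaviour requested for multipage examples run with -np.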
From: Phil R. <p.d...@gm...> - 2015-07-05 21:14:54
|
Hi Alan

Regarding the rendering, this is buffered. Hence the bit that actually displays to the screen is a blit, which is close to instant; it should take significantly less than one screen refresh. Well, that would be the case if using an accelerated method. With GDI, GDI+ or GTK, which wxWidgets uses, it may take a bit longer. So in that sense the viewer is doing what it should, I think.

I will look at leaving the previous plot visible, but I think it will be low on my list of priorities at the moment.

Phil |
From: Phil R. <p.d...@gm...> - 2015-07-06 08:52:17
|
Hi Alan - this is now slightly out of context as I found it in my drafts folder unsent. I can confirm similar odd timing results on my CentOS machine. I did have a couple of exit() calls in wxPLViewer, which to be honest was just lazy of me. One was used to close the app when -np was used. I have now replaced both of these with clean exit methods. This ensures all destructors are called and I can now say that at least as far as I can tell on Windows the shared memory is _always_ released. It is certainly released when the window is closed using -np. This is on the viewer side. I presume the use of -np still gives a clean exit on the console side and doesn't invoke any exit calls anywhere? Unfortunately the odd timings still persist. Note that the application only asks for 1 MB of shared memory so even if not freed I wouldn't have thought it would slow anything down really after just one run. I haven't got time right now to set up a decent debugging environment on my CentOS machine, but I will try to double check the shared memory is released on that system as soon as I can. Phil On 3 July 2015 at 20:32, Alan W. Irwin <ir...@be...> wrote: > To Phil and Andrew: > > @Andrew: I am addressing you directly (in addition to Phil) because I > have questions for you below concerning whether you can replicate my > weird timing results for wxwidgets and also concerning the issue of > shared memory leaks on Linux. > > > On 2015-07-03 13:53+0100 Phil Rosenberg wrote: > >> Hi Alan >> So I think you should now find the wxPLViewer speed has significantly >> improved. The item I have dealt with now was that there is a timer at >> the OS level (initiated by wxWidgets) which sends a message to the >> viewer which initiates checking for new commands from the console app. >> At this point I used to check for a single message and return after >> dealing with it. However it seems that for whatever reason (different >> OS, different wxWidgets version?) 
on some systems the minimum time >> between the timer calls was quite long. This meant that when we were >> dealing with text which required a lot of small transmissions to the >> viewer there was a huge overhead. I have now modified things so that >> on every timer event I check for new transmissions at least 100 times >> with a 1 millisecond sleep between each. This means that for many >> small transmissions in a short time we remove most of the overhead. >> >> On my CentOS machine the execution time reported by time x08c -dev >> wxWidgets -np has dropped from 40s to about 2 s. Note that this only >> times the console part, not the viewer, but still we have clearly >> moved to something I would call acceptable. >> >> Note that there will always be a significant overhead for the >> transmission so this will never compete with Cairo, but this is >> unavoidable because wxWidgets cannot be run in a library without >> running it as a separate thread and PlPlot isn't thread safe. Note >> also that when using the wxWidgets driver via the wxWidgets binding >> (I.e. from within a wxWidgets app) this all becomes irrelevant as the >> viewer is not used. >> >> Anyway I hope this fix closes the issue and you are happy Alan. > > > @Phil: > > The timing results for current master tip (commit id 5e74b6a6, > "Modification to previous wxPLViewer optimisation") are much improved > on first execution of examples. So big congratulations on that result! 
> > However, an important issue still remains as shown by the following > repeated timing results for both xcairo (as a typical benchmark) and > wxwidgets: > > software@raven> time examples/c/x00c -dev xcairo -np > > real 0m0.168s > user 0m0.008s > sys 0m0.016s > software@raven> time examples/c/x00c -dev xcairo -np > > real 0m0.129s > user 0m0.020s > sys 0m0.004s > software@raven> time examples/c/x00c -dev xcairo -np > > real 0m0.114s > user 0m0.020s > sys 0m0.008s > > software@raven> time examples/c/x00c -dev wxwidgets -np > > real 0m0.365s > user 0m0.008s > sys 0m0.012s > software@raven> time examples/c/x00c -dev wxwidgets -np > > real 0m1.231s > user 0m0.008s > sys 0m0.012s > software@raven> time examples/c/x00c -dev wxwidgets -np > > real 0m5.587s > user 0m0.008s > sys 0m0.012s > > So the xcairo case has subsequent time results that are up to 1.5 > times faster (in real time which is the most important component) than > the first execution of the example while the wxwidgets case has > subsequent time results that are up to 15 (!) times slower. The > initial fast wxwidgets result seems guaranteed if you wait a minute or > so since your last attempt to use wxwidgets. I mostly tested the > above wxwidgets time results with the -np option (for convenience), > but I also tried the test a few times without -np and got the fast, > slow, slow,... pattern for that case as well. Also, I usually get a > consistent fast, slow, slow,.... pattern for wxwidgets, but just now I > tried it again, and I got the fast result for something like 10 times > in a row, then the slow result. So the wxwidgets inconsistency in time > results is sometimes inconsistent. 
:-) > > The above results for the xcairo case are quite typical of all > non-wxwidgets cases I have ever investigated before, where real time > results are significantly shorter on second and subsequent > execution because of the well-known effect on Linux of the system > caching all reasonably small files in memory to improve small file > access times for subsequent use. > > The very unusual much longer times that often but not always > immediately appear on second and subsequent executions that occur for > the wxwidgets case above strongly suggest to me a system resource is > not being properly released by the first execution of the example so > getting that resource back again often takes a lot of extra time on > the second and subsequent runs. > > I brought up this issue with you off list much earlier this week, but > you did not respond at that time, but I hope you do that now. The > first step in your response should be to try the above bash commands > for at least the wxwidgets case on your Windows and Linux platforms. > to (a) see if this issue also occurs on Windows, and (b) see if this > issue occurs on all Linux systems accessible to you. > > @Andrew: could you please try the above time tests as well > on your Linux platforms? > > @Both: > > If the issue is a Linux resource which I am chronically short of > because of my prior wxwidgets testing and a shared memory leak (see > discussion below) in wxwidgets, then you might not be able to > replicate the issue on Linux until you do sufficient wxwidgets > testing. My Linux up time is currently 42 days, and if I reboot > (which I won't do lightly because there are two users on the system > and it is a bit of a pain for us to reset all desktops the way we like > them), I might find the issue goes away for a while. 
> > @Phil: > > If the issue is a Linux-only one, then an obvious candidate for a > system resource grabber is the shm_open calls in > drivers/wxwidgets_comms.cpp that help to allow shared memory access > between the wxwidgets device driver and wxPLViewer. I have just now > read the shm_overview and shm_open/shm_unlink man pages, and it > appears that shm_unlink must be called by _both_ the wxwidgets device > driver and wxPLViewer to properly release the shared memory. So a > code review for those two cases to make sure that _always_ happens > (i.e., there is no shared memory leak) is indicated. > > @Andrew: will you comment further here please about whether we should > be concerned about possible shared memory leaks? I now realize, for > example, that both Phil's and my calling the IPC method that is used > "shared memory" is likely not specific enough because it does not > distinguish the old-fashioned shmget method (which we do not use) from > the modern mmap-based method that we do use. See > <http://stackoverflow.com/questions/5656530/how-to-use-shared-memory-with-linux-in-c> > for comments on this important "shared-memory" distinction. Furthermore, > from the discussion at > <http://stackoverflow.com/questions/22691621/how-to-avoid-shared-memory-leaks> > it appears the mmap-based method cannot actually leak memory. Is that > your interpretation of those remarks as well? > > @Phil: > > One thing I noticed from reading those man pages is it is suggested > the name used for shm_open should always start with a "/" for the most > portable results. It appears our current wxwidgets code does not > follow that suggestion so this may affect Unix platforms that are not > Linux. So I suggest you will want to address that issue (even though > it has nothing to do with timing). > > > Alan |
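The polling change described in the quoted 3 July message (on every timer event, check for new transmissions many times with a 1 millisecond sleep between checks, instead of handling one message and returning) can be sketched roughly as below. This is an assumption-laden illustration rather than the wxPLViewer code: `drain_on_timer_event`, `try_receive_fn`, and the fake queue are hypothetical names, and the real viewer reads its transmissions from the shared-memory area.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <time.h>

/* Hypothetical callback: tries to read and handle one transmission,
 * returning true if there was one and false if nothing was pending. */
typedef bool (*try_receive_fn)(void *ctx);

/* Called once per timer event. Rather than handling a single message
 * and returning, check `checks` times with a 1 ms sleep between checks,
 * so that a burst of small transmissions does not cost one timer
 * interval per message. Returns the number of messages handled. */
int drain_on_timer_event(try_receive_fn try_receive, void *ctx, int checks)
{
    const struct timespec one_ms = { 0, 1000000L };
    int handled = 0;

    assert(try_receive != NULL);
    for (int i = 0; i < checks; i++) {
        if (try_receive(ctx))
            handled++;
        if (i + 1 < checks)
            nanosleep(&one_ms, NULL);
    }
    return handled;
}

/* Stand-in receiver: ctx points to a count of pending messages. */
bool take_from_fake_queue(void *ctx)
{
    int *pending = ctx;
    if (*pending <= 0)
        return false;
    --*pending;
    return true;
}
```

With `checks` set to 100, as in the message, each timer event can absorb up to 100 small transmissions at roughly 1 ms apiece, at the cost of keeping the handler busy for about 100 ms per event.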
From: Alan W. I. <ir...@be...> - 2015-07-06 17:13:56
|
On 2015-07-06 09:52+0100 Phil Rosenberg wrote:

> Hi Alan - this is now slightly out of context as I found it in my
> drafts folder unsent.
> I can confirm similar odd timing results on my CentOS machine. I did
> have a couple of exit() calls in wxPLViewer, which to be honest was
> just lazy of me. One was used to close the app when -np was used. I
> have now replaced both of these with clean exit methods. This ensures all
> destructors are called and I can now say that at least as far as I can
> tell on Windows the shared memory is _always_ released. It is
> certainly released when the window is closed using -np. This is on the
> viewer side. I presume the use of -np still gives a clean exit on the
> console side and doesn't invoke any exit calls anywhere?
>
> Unfortunately the odd timings still persist. Note that the application
> only asks for 1 MB of shared memory so even if not freed I wouldn't
> have thought it would slow anything down really after just one run.
>
> I haven't got time right now to set up a decent debugging environment
> on my CentOS machine, but I will try to double check the shared memory
> is released on that system as soon as I can.

Hi Phil:

Thanks for confirming those odd time results for second and subsequent runs of the wxwidgets examples on CentOS. I hope that you and Andrew also try the same experiment on more modern Linux platforms, just to check whether the problem is due to an issue with mmapped shared memory in older Linux kernels (such as those for CentOS and Debian Wheezy) that has been fixed in modern Linux kernels (say those used for Debian testing and unstable and for the latest Ubuntu).

Another question to consider: does this odd timing issue occur immediately after a fresh reboot, or does the platform have to accumulate a relatively long uptime with lots of wxPLViewer use before the odd timing results appear? In the former case, the issue is unlikely to be due to some costly emergency response to some mmap resource exhaustion and will therefore be a lot easier to track down (assuming modern Linux kernels show the issue).

For the case where freshly booted modern Linux kernels show the problem and normal debug methods cannot find the source of it pretty quickly, then I think it is time to implement a really simple mmapped shared memory example following your present IPC method. The client should loop through nmax different requests for double precision values from a server and sum those values. The server should respond to those requests by delivering 0., 1., 2., etc. The client can compare the computed sum of those numbers with the known nmax*(nmax-1)/2 sum to double-check the server is delivering the values properly.

With such a simple mmapped shared memory example implemented (with nmax set at a large enough value so the example takes a second or so), you can test for odd timing issues like those above. If the simple example demonstrates that issue for a freshly booted modern Linux kernel, then the simple example will be a big help when preparing a bug report for the Linux kernel development team. (Note, I should probably be the one to present that bug report since I have fairly good contact with a RedHat employee who works every day on the Linux kernel, and from experience several years ago with a Linux kernel issue I was having back then, he is very accommodating about running simple tests for the very latest Linux kernel.)

Also, implementation of such a simple example of mmapped shared memory IPC should allow you to debug any excessive wait times in your IPC method, which should never occur for such a flood of requests for double precision values from the client.

Alan |
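A minimal sketch of the simple mmapped shared-memory test proposed above might look like the following. It makes several assumptions that are not from the PLplot sources: the `mailbox_t` layout and function names are hypothetical, the server is a forked child rather than a separate executable, and requests are signalled with a C11 atomic flag polled via `sched_yield` rather than whatever signalling the wxwidgets driver actually uses. The shared-memory name starts with "/" as the shm_open man page suggests.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <fcntl.h>
#include <sched.h>
#include <stdatomic.h>
#include <stdio.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

enum { IDLE = 0, REQUEST = 1, REPLY = 2, QUIT = 3 };

/* Hypothetical one-slot mailbox living in the shared mapping. */
typedef struct {
    atomic_int state;   /* one of IDLE, REQUEST, REPLY, QUIT */
    double     value;   /* the server's answer to the last request */
} mailbox_t;

/* Server loop: answer successive requests with 0., 1., 2., ... */
static void serve(mailbox_t *mb)
{
    double next = 0.0;
    for (;;) {
        int s = atomic_load(&mb->state);
        if (s == QUIT)
            return;
        if (s == REQUEST) {
            mb->value = next;
            next += 1.0;
            atomic_store(&mb->state, REPLY);
        } else {
            sched_yield();
        }
    }
}

/* Client: make nmax requests and return the sum of the replies,
 * or -1.0 if the shared memory could not be set up. */
double shm_roundtrip_sum(long nmax)
{
    char name[64];
    snprintf(name, sizeof name, "/shm_demo_%ld", (long)getpid());

    int fd = shm_open(name, O_CREAT | O_EXCL | O_RDWR, 0600);
    if (fd < 0)
        return -1.0;
    if (ftruncate(fd, (off_t)sizeof(mailbox_t)) != 0) {
        close(fd);
        shm_unlink(name);
        return -1.0;
    }
    mailbox_t *mb = mmap(NULL, sizeof *mb, PROT_READ | PROT_WRITE,
                         MAP_SHARED, fd, 0);
    close(fd);
    /* Removing the name does not destroy the memory; it is freed once
     * every process has unmapped it. Separate executables would instead
     * keep the name until the peer has opened it. */
    shm_unlink(name);
    if (mb == MAP_FAILED)
        return -1.0;
    atomic_init(&mb->state, IDLE);

    pid_t pid = fork();
    if (pid < 0) {
        munmap(mb, sizeof *mb);
        return -1.0;
    }
    if (pid == 0) {             /* child plays the server */
        serve(mb);
        _exit(0);
    }

    double sum = 0.0;
    for (long i = 0; i < nmax; i++) {
        atomic_store(&mb->state, REQUEST);
        while (atomic_load(&mb->state) != REPLY)
            sched_yield();
        sum += mb->value;
    }
    atomic_store(&mb->state, QUIT);
    waitpid(pid, NULL, 0);
    munmap(mb, sizeof *mb);
    return sum;
}
```

A small driver that calls `shm_roundtrip_sum(nmax)` with nmax large enough to run for about a second, compares the result with nmax*(nmax-1)/2, and is run repeatedly under `time` would give the repeated-run comparison described above. On older glibc versions, linking may need `-lrt` for `shm_open`.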