#800 SysStemSort very, very slow

Milestone: vX.X.X
Status: pending
Priority: 5
Updated: 2012-08-14
Created: 2009-08-28
Private: No

After a call to
SysFileTree(filespec, 'file.', 'SFL')
that returns, for example, 30,000 files, a subsequent call to
SysStemSort('file.',,,,,41)
takes dramatically longer than it did in V3.2.

V4.0: 10,000 files = 1 sec, 20,000 files = 9 sec, 30,000 files = 26 sec
V3.2: 10,000 files = 0.02 sec, 20,000 files = 0.05 sec, 30,000 files = 0.08 sec
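
A measurement of this kind can be reproduced with a short script along the following lines. This is a minimal sketch, not the reporter's actual program: the file specification is a placeholder, and the RxFuncAdd/SysLoadFuncs lines are included on the assumption that older interpreters need RexxUtil registered explicitly.

```rexx
/* Sketch: time SysStemSort on SysFileTree output.                  */
/* ASSUMPTION: the filespec below is a placeholder, not taken from  */
/* the report. Adjust it to a directory tree with many files.       */
call RxFuncAdd 'SysLoadFuncs', 'rexxutil', 'SysLoadFuncs'
call SysLoadFuncs

filespec = 'C:\*.*'
/* S = search subdirectories, F = files only, L = long date format  */
rc = SysFileTree(filespec, 'file.', 'SFL')
if rc \= 0 then
  say 'SysFileTree failed, rc =' rc
else do
  say 'File count:' file.0
  call time 'R'                           /* reset the elapsed timer */
  rc = SysStemSort('file.', , , , , 41)   /* 6th argument: firstcol  */
  say 'Elapsed time for sort:' time('E') '(rc =' rc')'
end
```

Running the same script under each interpreter version and comparing the elapsed times gives figures comparable to the ones quoted above.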

Hardware: Intel Quadcore Q6600, 4 GB RAM, ... + Win XP-Prof + SP3

Best regards, Zach

Discussion

  • Mark Miesfeld

    Mark Miesfeld - 2010-10-14

    Rick has made some recent changes in this area, and I ran the following tests on a relatively slow laptop.

    The difference between 3.2.0 and 4.2.0 is no longer nearly as dramatic, although with 50,000 files there is still a relatively large gap.

    For 30,000 files it looks like about 3:1, and for 50,000 files about 6:1.

    I was hoping the gap would be small enough to close this, but I guess not. Since it is not really a bug, maybe a Request for Enhancement would be better:

    C:\work>sysStemSortSlow.rex
    REXX-ooRexx_3.2.0(MT) 6.02 30 Oct 2007
    Searching for files ...
    File count: 36301
    Sorting files start ...
    Elapsed time for sort: 0.188000

    C:\work>sysStemSortSlow.rex
    REXX-ooRexx_3.2.0(MT) 6.02 30 Oct 2007
    Searching for files ...
    File count: 36301
    Sorting files start ...
    Elapsed time for sort: 0.172000

    C:\work>sysStemSortSlow.rex
    REXX-ooRexx_4.2.0(MT) 6.04 14 Oct 2010
    Searching for files ...
    File count: 36301
    Sorting files start ...
    Elapsed time for sort: 0.500000

    C:\work>sysStemSortSlow.rex
    REXX-ooRexx_4.2.0(MT) 6.04 14 Oct 2010
    Searching for files ...
    File count: 36301
    Sorting files start ...
    Elapsed time for sort: 0.454000

    C:\work>

    C:\Tools>sysStemSortSlow.rex
    REXX-ooRexx_4.2.0(MT) 6.04 14 Oct 2010
    Searching for files ...
    File count: 50071
    Sorting files start ...
    Elapsed time for sort: 1.547000

    C:\Tools>sysStemSortSlow.rex
    REXX-ooRexx_4.2.0(MT) 6.04 14 Oct 2010
    Searching for files ...
    File count: 50071
    Sorting files start ...
    Elapsed time for sort: 1.579000

    C:\Tools>sysStemSortSlow.rex
    REXX-ooRexx_3.2.0(MT) 6.02 30 Oct 2007
    Searching for files ...
    File count: 50071
    Sorting files start ...
    Elapsed time for sort: 0.265000

    C:\Tools>sysStemSortSlow.rex
    REXX-ooRexx_3.2.0(MT) 6.02 30 Oct 2007
    Searching for files ...
    File count: 50071
    Sorting files start ...
    Elapsed time for sort: 0.250000

    C:\Tools>

     
  • Zacharias Bumsty

    Redirected output from the SysFileTree function

     
  • Zacharias Bumsty

    I did some further investigation with V4.1 and see the same problem.

    The function 'SysStemSort' behaves erratically: sometimes it is as fast as in V3.2, but sometimes it is slower by a factor of 1000, or REXX even crashes.

    The problem seems to depend on the CONTENT of the stem being sorted.

    Therefore I created the following stable test case:

    Instead of calling 'SysFileTree', whose output differs from system to system, the test uses the attached file 'SysFileTreeList.txt', which contains redirected output from this function.

    This file is loaded into a stem and then sorted by 'SysStemSort' with the 'firstcol' parameter set to each value from 1 to 50 (the filename starts at column 41).

    It works without problems for firstcol 1 to 31, but from 32 on either REXX crashes or the runtime jumps from 40 msec to 40 sec (a factor of 1000).

    This looks to me like an indication of a memory leak or buffer overflow.

    I tested this case on 3 different systems (XP 2GB, Win7 Prof 32bit - 4GB and Win7 Prof 32bit - 2GB). All systems show the same symptoms.

    Just run 'SysStemSortBug.rex' and use the file 'SysFileTreeList.txt'
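
    The attached 'SysStemSortBug.rex' is not reproduced in this thread. A minimal sketch of a test like the one described might look like the following; the variable names and the copy loop are assumptions, and only the input file name and the 1 to 50 range come from the description.

    ```rexx
    /* Sketch of the described test, NOT the attached script.          */
    /* Load the saved SysFileTree output into the stem 'orig.'         */
    infile = 'SysFileTreeList.txt'
    n = 0
    do while lines(infile) > 0
      n = n + 1
      orig.n = linein(infile)
    end
    orig.0 = n
    call stream infile, 'C', 'CLOSE'
    say 'Lines loaded:' orig.0

    if orig.0 > 0 then do col = 1 to 50
      /* copy the input so every pass sorts the same unsorted data     */
      do i = 0 to orig.0
        work.i = orig.i
      end
      call time 'R'                           /* reset elapsed timer   */
      rc = SysStemSort('work.', , , , , col)  /* col is firstcol       */
      say 'firstcol =' right(col, 2) ' elapsed =' time('E') ' rc =' rc
    end
    ```

    Copying the stem before each pass matters, because sorting in place would otherwise hand already-sorted data to the next value of 'firstcol'.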

    Best regards, Zach

     
  • Zacharias Bumsty

    Testcase

     
  • Zacharias Bumsty

    Due to the upload limit of 256 kB, the already attached file 'SysFileTreeList' contains only 20,000 lines.

    With that file the test case shows the dramatic increase in sort time (only a factor of 300), but no REXX crashes.

    Therefore I attached the second part of the list, with 10,000+ lines.
    After unzipping, append it to the first part; then the REXX crashes should occur.

    Thanks and best regards, Zach

     
  • Zacharias Bumsty

    Second part of the SysFileTree output list

     
  • Mark Miesfeld

    Mark Miesfeld - 2011-01-21

    Zack,

    Thanks. Creating example programs like you did is the type of active participation that we'd like to see from more people.

    Note that the changes Rick made were not put into the 4.1.0 release.

    I ran your test case on 4.1.0 and see the dramatic slowdown after col = 31, even without using your extended list.

    In trunk, which is the current code base, 4.2.0 for now, the sort times are the same all the way through. They vary by about 30 ms, which is probably due to the system clock more than the actual sort time.

    So it looks to me like Rick's changes have greatly improved this.

    You can get a build of the current code base from the build machine if you are interested in taking a look:

    http://build.oorexx.org/

    or go straight to the download area:

    http://build.oorexx.org/builds/interpreter-main/

    then scroll down until you find the latest build (the one with the highest number).

    Remember, these are not release builds, but they are almost always stable.

    Thanks again

     
  • Rick McGuire

    Rick McGuire - 2011-01-21

    Mark beat me to it, but I get the same result. The trunk build is frequently sorting 30000 lines in less time than it takes 4.1.0 to sort 20000. We need to do a little more testing on this code before it can be released, but this will be fixed in a future release.

     
  • Mark Miesfeld

    Mark Miesfeld - 2012-07-16

    The fix for this item was in the 4.1.1 or 4.1.0 release.

     
  • Mark Miesfeld

    Mark Miesfeld - 2012-07-17

    Hi Zack,

    I closed this by mistake while trying to clean up the bug status. This is fixed in trunk, not in 4.1.1.

    I'm going to reopen this, move the data you put in the bug you just opened to this one, and close the new one. Sorry for the mix-up.

     
  • Mark Miesfeld

    Mark Miesfeld - 2012-07-17

    This is Zack's comment from 3545055, which I'm going to close:

    Hello,
    Sorry to say that although my previously opened problem about 'SysStemSort' (ID: 2846301) was just closed as 'fixed', the problem is still there in v4.1.1.

    The problem: because the output stem of 'SysFileTree' is not always sorted, 'SysStemSort' is used to sort it by filename, which starts at position 41 of the SysFileTree output.

    call SysStemSort "file.",,,,,41 (41 = firstcol parameter = qualified filename)

    If the number of items in the stem is very large, e.g. > 40,000, then either the sort time increases dramatically (by a factor > 300, 30 msec -> 10 sec) or REXX even crashes.

    Warning: with a lower item count in the stem (e.g. 5,000) the effect is much less obvious.

    But I also found a workaround, which works for me even with stems > 100,000 items:

    First sort the stem with the default parameters: call SysStemSort "file."
    and then with the desired 'firstcol': call SysStemSort "file.",,,,,41

    Of course this doubles the runtime, but only in the millisecond range.
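
    For illustration, the two-pass workaround can be tried on a small synthetic stem. This is only a sketch: the four entries are made up and padded so that the name starts at column 41, as in the 'SysFileTree' output, and a stem this small will not reproduce the slowdown itself.

    ```rexx
    /* Synthetic stand-in for SysFileTree output (made-up entries).    */
    names = 'delta.txt alpha.txt charlie.txt bravo.txt'
    file.0 = words(names)
    do i = 1 to file.0
      /* invented timestamp prefix, blank-padded to 40 columns         */
      file.i = left('2012-07-17 10:00:0' || i, 40) || word(names, i)
    end

    call SysStemSort 'file.'               /* pass 1: default parms    */
    call SysStemSort 'file.', , , , , 41   /* pass 2: firstcol = 41    */

    do i = 1 to file.0
      say substr(file.i, 41)               /* show the name column     */
    end
    ```

    After the second pass the entries should be ordered by the name column rather than by the leading timestamp.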

    To help your investigation, I created the following test case:

    'SysStemSortBug.rex' - with detailed comments.
    Unzip and use the file 'SysFileTreeList.txt' as input (20,000 lines).

    Note:
    Instead of calling 'SysFileTree' directly on your system, where its output would differ, the test uses the attached file 'SysFileTreeList.txt', which contains redirected output from this function (20,000 lines).

    This file is loaded into a stem and then sorted by 'SysStemSort' with the 'firstcol' parameter increasing from 1 to 50.

    It works without problems for firstcol=1...31, but from firstcol=32 on either REXX crashes or the runtime jumps from 30 msec to 10 sec (a factor > 300).

    This looks to me like an indication of a memory leak or buffer overflow.

    I tested this case on 2 different systems (Win7-Prof 32bit - 4GB and Win7-Prof 64bit - 8GB). Both systems show the same symptoms.

    BTW: REXX v3.2 doesn't show any problem with SysStemSort (just re-installed and verified).

     
