
Linux version PipeLen or MemLat?

Anonymous
2011-12-23
2013-04-24

  • Anonymous
    2011-12-23

    Is there a Linux version of PipeLen or MemLat?

     
  • Igor Pavlov
    2011-12-23

    You can try to compile source code.

     

  • Anonymous
    2011-12-23

    I compiled PipeLen with
    g++ Pipelen.cpp -D __fastcall
    and executed it,
    but the result is not what I expected.
    Any suggestions?

     
  • headking
    2011-12-23

    I have the same problem.
    Results from my embedded system:
    #Branch   #B/P       0       1     0-1  Random    Len1    Len2

         32      4 17.0946 48.3344 32.6991 31.6218 -2.1853 -2.1546
         64      8 16.9850 47.9356 32.2850 33.8473 2.7739 3.1246
        128     16 16.5308 47.7250 32.1190 29.9982 -4.2594 -4.2417
        256     32 16.4709 47.6635 31.9971 30.9543 -2.2257 -2.0855
        512     64 16.4009 47.5631 31.9656 33.3139 2.6637 2.6966
        1-K    128 16.6245 47.5318 31.9381 33.3522 2.5482 2.8282
        2-K    256 16.3413 47.4749 31.8763 33.4775 3.1386 3.2023
        4-K    512 16.5856 47.4697 32.0860 33.5203 2.9853 2.8688
        8-K    1-K 16.5262 47.5297 32.0103 33.0590 2.0621 2.0973
       16-K    2-K 16.4508 47.5173 32.0183 33.2499 2.5317 2.4631
       32-K    4-K 16.4952 47.5538 32.0637 33.0403 2.0316 1.9531
       64-K    8-K 16.5058 47.5564 32.0712 33.1871 2.3121 2.2317
      128-K   16-K 16.6404 47.6500 32.1447 33.1465 2.0025 2.0036
      256-K   32-K 16.6585 47.6907 32.1626 33.2997 2.2503 2.2743
      512-K   64-K 16.6774 47.7165 32.1542 33.2493 2.1049 2.1902

    Test # 2 ( 4): if (c & mask) { REP-N(c^=v) } REP-2(c^=v)
    Timer frequency  =    1000000 Hz
    CPU frequency    =     603.5 MHz
    Pipeline length v.1 =   2.10 stages - TEST
    Pipeline length v.2 =   2.19 stages - TEST
    Pipeline length     =   2.15 stages - TEST

    But the pipeline of my system has six or seven stages.
    Note that my CPU is a MIPS CPU!

     
  • Igor Pavlov
    2011-12-23

    testman:
    Show your results.

    headking:
    Give the full name of your CPU (or the name of the CPU core), so we can look at its PDF specifications.

     

  • Anonymous
    2011-12-23

    Here is part of my results:
    #Branch   #B/P       0       1     0-1  Random    Len1    Len2

         32      4 28.3070 2.0475 2.6941 3.5624 -23.2296 1.7366
         64      8 1.1838 86.2186 5.5229 4.2976 -78.8071 -2.4505
        128     16 29.5790 85.4033 57.4110 0.1382 -114.7060 -114.5456
        256     32 1.3057 10.1934 6.5794 2.8105 -5.8782 -7.5379
        512     64 0.9920 9.1184 57.7676 2.9028 -4.3049 -109.7297
        1-K    128 30.8148 85.8137 57.8050 4.1050 -108.4185 -107.3999
        2-K    256 0.2080 87.0189 4.0568 5.9086 -75.4097 3.7036
        4-K    512 30.8334 3.1051 3.9654 6.2714 -21.3957 4.6120
        8-K    1-K 1.3422 85.8403 3.8188 63.1906 39.1986 118.7435
       16-K    2-K 30.8347 9.8182 6.4124 68.8157 96.9786 124.8067
       32-K    4-K 30.7747 9.8289 5.6087 7.2098 -26.1839 3.2022
       64-K    8-K 30.3388 86.4798 58.3683 47.6817 -21.4552 -21.3731
      128-K   16-K 2.3420 86.8979 58.1908 21.9262 -45.3874 -72.5292
      256-K   32-K 13.0603 58.4440 29.2622 64.4906 57.4769 70.4569
      512-K   64-K 25.5547 77.9063 58.4145 58.6190 13.7769 0.4089

    Test # 2 ( 3): if (c & mask) { REP-N(c^=v) } REP-2(c^=v)
    Timer frequency  =    1000000 Hz
    CPU frequency    =    2786.07 MHz
    Pipeline length v.1 =  13.78 stages - TEST
    Pipeline length v.2 =   0.41 stages - TEST
    Pipeline length     =   7.09 stages - TEST

    #Branch   #B/P       0       1     0-1  Random    Len1    Len2

         32      4 28.3154 143.1968 3.3759 58.7248 -54.0626 110.6977
         64      8 0.3808 144.5569 4.6432 3.1276 -138.6825 -3.0312
        128     16 3.5603 146.1069 5.0949 1.4111 -146.8451 -7.3676
        256     32 1.6803 132.7839 5.3823 4.3707 -125.7227 -2.0231
        512     64 29.7568 76.1379 5.4489 7.1641 -91.5666 3.4303
        1-K    128 29.7880 99.8256 61.9952 6.6533 -116.3071 -110.6838
        2-K    256 0.9355 8.4553 7.5930 65.4186 121.4465 115.6513
        4-K    512 29.8082 11.0418 1.7409 67.9771 95.1041 132.4723
        8-K    1-K 2.8891 99.8546 3.5218 2.5033 -97.7372 -2.0371
       16-K    2-K 3.0741 8.6954 51.5287 30.2697 48.7700 -42.5180
       32-K    4-K 29.8067 68.6731 7.7249 27.5523 -43.3752 39.6548
       64-K    8-K 3.2063 101.1454 25.2614 77.6848 51.0180 104.8468
      128-K   16-K 30.1204 47.3818 65.7762 24.9073 -27.6877 -81.7379
      256-K   32-K 3.4807 75.7735 40.2462 70.0522 60.8502 59.6121
      512-K   64-K 27.8041 98.3966 53.3644 74.4300 22.6593 42.1313

    Test # 2 ( 4): if (c & mask) { REP-N(c^=v) } REP-2(c^=v)
    Timer frequency  =    1000000 Hz
    CPU frequency    =    2762.49 MHz
    Pipeline length v.1 =  22.66 stages - TEST
    Pipeline length v.2 =  42.13 stages - TEST
    Pipeline length     =  32.40 stages - TEST

     

  • Anonymous
    2011-12-23

    Note that my CPU is an Intel Core i5-2300 (2.8 GHz, 6 MB cache).

     
  • Igor Pavlov
    2011-12-23

    I don't know why the results are so unstable.
    You can try:
    1) increasing the number of iterations in the loops
    2) disabling the Turbo Boost feature in the BIOS.
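
One way to see why more iterations can help: with a 1 MHz timer, a short loop takes only a few timer ticks, so rounding and scheduling noise dominate the result. The sketch below is not PipeLen code. It is a minimal, hypothetical helper (MinNsPerIter) that times a body over many iterations and, as an additional technique not suggested in the thread, keeps the fastest of several runs.

```cpp
#include <chrono>
#include <cstdint>
#include <limits>

// Hypothetical helper, not part of PipeLen: run body() iters times,
// repeat that runs times, and return the lowest cost per iteration in ns.
template <class F>
double MinNsPerIter(F body, std::uint64_t iters, int runs)
{
  using Clock = std::chrono::steady_clock;
  if (iters == 0 || runs <= 0)
    return 0.0;
  double best = std::numeric_limits<double>::infinity();
  for (int r = 0; r < runs; r++)
  {
    const Clock::time_point start = Clock::now();
    for (std::uint64_t i = 0; i < iters; i++)
      body();
    const std::chrono::duration<double, std::nano> elapsed = Clock::now() - start;
    const double perIter = elapsed.count() / static_cast<double>(iters);
    if (perIter < best)
      best = perIter;  // keep the least disturbed run
  }
  return best;
}
```

A larger iters value amortizes the timer resolution over more work; taking the minimum over runs discards runs that were interrupted by other processes.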

     
  • Igor Pavlov
    2011-12-23

    Also, define the MY_SET_AFFINITY_NUMBER macro for your system
    in Benchmark.h
    in place of the current definition, which is empty on non-Windows systems:
    #if defined(ITS_WINDOWS) && !defined(UNDER_CE)
    #define MY_SET_AFFINITY_NUMBER(n) SetThreadAffinityMask(GetCurrentThread(), 1 << (n))
    #else
    #define MY_SET_AFFINITY_NUMBER(n)
    #endif
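
On Linux, one possible non-empty definition uses the glibc call sched_setaffinity. This is a sketch under assumptions, not the author's code: the helper name PinToCpu is hypothetical, and it assumes a glibc-based Linux system where <sched.h> exposes cpu_set_t (g++ defines _GNU_SOURCE by default).

```cpp
#ifndef _GNU_SOURCE
#define _GNU_SOURCE  // needed for cpu_set_t and CPU_* macros when compiling as C
#endif
#include <sched.h>

// Hypothetical helper: pin the calling thread to logical CPU n.
// Returns true on success, false if n is out of range or the call fails.
static inline bool PinToCpu(int n)
{
  if (n < 0 || n >= CPU_SETSIZE)
    return false;
  cpu_set_t set;
  CPU_ZERO(&set);
  CPU_SET(n, &set);
  // pid 0 means "the calling thread"
  return sched_setaffinity(0, sizeof(set), &set) == 0;
}

// Possible Linux counterpart of the Windows definition in Benchmark.h.
#undef MY_SET_AFFINITY_NUMBER
#define MY_SET_AFFINITY_NUMBER(n) ((void)PinToCpu(n))
```

Pinning the benchmark to one core keeps the scheduler from migrating it between cores during a measurement, which is one plausible source of unstable timings.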

     
  • headking
    2011-12-26

    Is __fastcall the key to the results?
    On Linux there is no __fastcall macro, so I just used "#define __fastcall"
    to avoid compile errors.
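
A guarded form of that workaround could look like the sketch below. It assumes that only MSVC-style compilers treat __fastcall as a keyword and that some GCC Windows targets predefine it as a macro; the function XorStep is a hypothetical stand-in for the real test functions.

```cpp
// Assumption: on compilers that neither know __fastcall as a keyword (MSVC)
// nor predefine it as a macro, it is safe to make it expand to nothing.
#if !defined(_MSC_VER) && !defined(__fastcall)
#define __fastcall
#endif

// Hypothetical function using the calling-convention keyword;
// on Linux the declaration becomes a plain function.
static unsigned __fastcall XorStep(unsigned c, unsigned v)
{
  return c ^ v;
}
```

With the empty definition the calling convention simply falls back to the platform default, so the code compiles unchanged.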

     
  • Igor Pavlov
    2011-12-26

    __fastcall has no effect on results.

     
  • headking
    2011-12-26

    I ran PipeLen on Windows and on Linux (running in VMware) separately, on the same hardware.
    Here is part of the results for Windows and Linux:

    Windows:

    #Branch   #B/P       0       1     0-1  Random    Len1    Len2

         32      4    9.51   26.83   17.64   16.66   -3.02   -1.95
         64      8   10.82   28.39   19.09   19.32   -0.58    0.45
        128     16   11.50   28.19   19.81   18.13   -3.43   -3.36
        256     32   11.82   28.56   20.19   19.05   -2.27   -2.27
        512     64   11.98   28.75   20.36   20.53    0.33    0.33
        1-K    128   12.06   28.85   20.45   20.65    0.39    0.40
        2-K    256   12.10   29.89   20.49   20.66   -0.66    0.34
        4-K    512   12.12   30.34   20.51   21.08   -0.29    1.14
        8-K    1-K   12.13   28.93   20.53   23.02    4.98    4.99
       16-K    2-K   12.13   29.28   20.53   26.28   11.16   11.51
       32-K    4-K   12.13   30.24   21.43   27.72   13.07   12.58
       64-K    8-K   12.14   29.86   20.54   27.92   13.84   14.76
      128-K   16-K   12.70   30.61   21.77   28.19   13.07   12.83
      256-K   32-K   12.75   30.80   21.95   29.04   14.53   14.18
      512-K   64-K   12.80   30.70   21.75   28.95   14.41   14.39

    Test # 2 ( 2): if (c & mask) { REP-N(c^=v) } REP-2(c^=v)
    Timer frequency  =    3579545 Hz
    CPU frequency    =    2793.64 MHz
    Pipeline length v.1 =  14.41 stages - TEST
    Pipeline length v.2 =  14.39 stages - TEST
    Pipeline length     =  14.40 stages - TEST

    Linux (running on VMware):

    #Branch   #B/P       0       1     0-1  Random    Len1    Len2

         32      4 28.8804 70.5272 51.2810 48.7443 -1.9191 -5.0734
         64      8 29.4061 72.8236 49.9230 51.8260 1.4222 3.8058
        128     16 23.7080 70.2309 50.4625 47.2834 0.6280 -6.3582
        256     32 30.1614 73.6441 51.1272 49.3271 -5.1512 -3.6002
        512     64 30.3987 73.9397 48.2184 54.0462 3.7540 11.6557
        1-K    128 30.8089 73.9357 48.5047 53.0931 1.4416 9.1768
        2-K    256 30.5191 74.0993 52.1848 52.1961 -0.2261 0.0227
        4-K    512 30.3931 74.7806 49.6891 53.8874 2.6012 8.3966
        8-K    1-K 30.7781 73.7257 51.6645 51.1533 -2.1972 -1.0224
       16-K    2-K 31.7227 72.5913 53.0493 58.8802 13.4464 11.6618
       32-K    4-K 30.5949 74.8967 52.1937 62.3241 19.1567 20.2608
       64-K    8-K 31.0304 74.4007 52.9004 65.2623 25.0935 24.7238

    Test # 2 ( 2): if (c & mask) { REP-N(c^=v) } REP-2(c^=v)
    Timer frequency  =    1000000 Hz
    CPU frequency    =    2792.46 MHz
    Pipeline length v.1 =  25.09 stages - TEST
    Pipeline length v.2 =  24.72 stages - TEST
    Pipeline length     =  24.91 stages - TEST

    Same hardware, different operating systems, and different results.

     
  • Igor Pavlov
    2011-12-26

    Maybe VMware doesn't work well with that code.
    You can try:
    1) Running it via Wine
    2) Comparing the results of
    7z b -mmt1
    in VMware and in Windows
    3) Disabling all CPU power-saving features in the system (and maybe in the BIOS).