assorted-commits Mailing List for Assorted projects (Page 60)
From: <yan...@us...> - 2008-02-15 02:29:41

Revision: 427
          http://assorted.svn.sourceforge.net/assorted/?rev=427&view=rev
Author:   yangzhang
Date:     2008-02-14 18:29:45 -0800 (Thu, 14 Feb 2008)

Log Message:
-----------
added interim pandocing/web publishing script

Added Paths:
-----------
    cpp-commons/trunk/publish.bash

Added: cpp-commons/trunk/publish.bash
===================================================================
--- cpp-commons/trunk/publish.bash                      (rev 0)
+++ cpp-commons/trunk/publish.bash  2008-02-15 02:29:45 UTC (rev 427)
@@ -0,0 +1,24 @@
+#!/usr/bin/env bash
+
+HTMLFRAG=../../assorted-site/trunk
+
+set -o errexit -o nounset
+
+pandoc -s -S --tab-stop=2 -c ../main.css -H $HTMLFRAG/header.html -A $HTMLFRAG/google-footer.html -o index.html README
+
+tmp=/tmp/cpp-commons-site
+rm -rf $tmp/
+mkdir -p $tmp
+cp -r doc/html /$tmp/doc
+cp index.html $tmp/
+
+tar czf - -C $tmp index.html doc |
+ssh shell-sf '
+  set -o errexit -o nounset
+  d=assorted/htdocs/cpp-commons
+  rm -rf $d/
+  mkdir -p $d
+  cd $d/
+  tar xzmf -
+'
+

Property changes on: cpp-commons/trunk/publish.bash
___________________________________________________________________
Name: svn:executable
   + *
From: <yan...@us...> - 2008-02-15 02:28:54

Revision: 426
          http://assorted.svn.sourceforge.net/assorted/?rev=426&view=rev
Author:   yangzhang
Date:     2008-02-14 18:28:59 -0800 (Thu, 14 Feb 2008)

Log Message:
-----------
moved hash functions to scala commons

Modified Paths:
--------------
    hash-dist/trunk/src/HashDist.scala

Modified: hash-dist/trunk/src/HashDist.scala
===================================================================
--- hash-dist/trunk/src/HashDist.scala  2008-02-15 02:28:31 UTC (rev 425)
+++ hash-dist/trunk/src/HashDist.scala  2008-02-15 02:28:59 UTC (rev 426)
@@ -3,40 +3,13 @@
 import Control._
 import Io._
 import Tree._
+import Hash._
 import scala.util._

 object HashDist {
   /**
-   * From libstdc++ 4.1 __stl_hash_string.
-   */
-  def hashStl(xs: Seq[Int]) = {
-    var h = 0
-    for (x <- xs) h = 5 * h + x
-    h
-  }
-
-  /**
-   * From Sun JDK6 String.hashCode.
-   */
-  def hashJava(xs: Seq[Int]) = {
-    var h = 0
-    for (x <- xs) h = 31 * h + x
-    h
-  }
-
-  /**
-   * From http://www.cse.yorku.ca/~oz/hash.html. Not sure if this is correct,
-   * since Int is signed.
-   */
-  def hashDjb2(xs: Seq[Int]) = {
-    var h = 5381
-    for (x <- xs) h = ((h << 5) + h) + x
-    h
-  }
-
-  /**
    * Hash function.
    */
   type Hasher = Seq[Int] => Int
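
The removed `hashDjb2` comment flags a real subtlety: djb2 is defined over bytes with unsigned wraparound, which Scala's signed `Int` only approximates. For comparison (a sketch, not code from this repository), here is the usual C++ formulation with an explicitly unsigned accumulator, following the djb2 page cited in the removed comment:

    #include <cstdint>
    #include <cstdio>

    // djb2: h = h * 33 + c, seeded with 5381. An unsigned accumulator makes
    // the overflow behavior well-defined (wraparound mod 2^32).
    uint32_t hash_djb2(const char* s) {
      uint32_t h = 5381;
      for (; *s; s++)
        h = ((h << 5) + h) + (unsigned char) *s;  // h * 33 + *s
      return h;
    }

    int main() {
      printf("%u\n", hash_djb2("hello, world!"));
      return 0;
    }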
From: <yan...@us...> - 2008-02-15 02:28:25

Revision: 425
          http://assorted.svn.sourceforge.net/assorted/?rev=425&view=rev
Author:   yangzhang
Date:     2008-02-14 18:28:31 -0800 (Thu, 14 Feb 2008)

Log Message:
-----------
added demo of using hash maps for strings

Added Paths:
-----------
    sandbox/trunk/src/cc/hash_map_strings.cc

Added: sandbox/trunk/src/cc/hash_map_strings.cc
===================================================================
--- sandbox/trunk/src/cc/hash_map_strings.cc            (rev 0)
+++ sandbox/trunk/src/cc/hash_map_strings.cc    2008-02-15 02:28:31 UTC (rev 425)
@@ -0,0 +1,52 @@
+#include <iostream>
+#include <string>
+
+// #include <ext/hash_fun.h>
+#include <ext/hash_map>
+#include <ext/hash_set>
+
+using namespace std;
+using namespace __gnu_cxx;
+
+struct eqstr
+{
+  bool operator()(const char* s1, const char* s2) const
+  {
+    return strcmp(s1, s2) == 0;
+  }
+};
+
+int
+main()
+{
+  {
+    // This doesn't compile because there is no hash function for strings.
+    // string s = "hello, world!";
+    // hash_set<string> h;
+    // h.insert(s);
+  }
+
+  {
+    // The two other arguments are required. Note that eqstr is custom-defined
+    // above.
+    hash_set< const char *, hash<const char*>, eqstr > h;
+
+    const char *s = "hello, world!";
+    h.insert(s);
+
+    const int nss = 20;
+    char *ss[nss];
+    // Duplicate s nss times into ss and h.
+    for (int i = 0; i < nss; i++) {
+      ss[i] = new char[strlen(s) + 1];
+      strcpy(ss[i], s);
+      cout << ss[i] << endl;
+      h.insert(ss[i]);
+    }
+
+    // This prints 1, which is what we want.
+    cout << h.size() << endl;
+  }
+
+  return 0;
+}
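
For contrast with the `__gnu_cxx::hash_set` workaround in this demo: since C++11, the standard containers hash `std::string` out of the box and compare by value, so neither the custom `eqstr` nor the manual `char*` duplication is needed. A minimal sketch of the equivalent program (not part of the sandbox):

    #include <iostream>
    #include <string>
    #include <unordered_set>

    int main() {
      std::unordered_set<std::string> h;   // std::hash<std::string> is built in
      for (int i = 0; i < 20; i++)
        h.insert("hello, world!");         // duplicates compare equal by value
      std::cout << h.size() << std::endl;  // prints 1, as in the demo above
      return 0;
    }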
From: <yan...@us...> - 2008-02-15 02:20:06

Revision: 424
          http://assorted.svn.sourceforge.net/assorted/?rev=424&view=rev
Author:   yangzhang
Date:     2008-02-14 18:20:03 -0800 (Thu, 14 Feb 2008)

Log Message:
-----------
updated to use new ga.js

Modified Paths:
--------------
    assorted-site/trunk/footer.html

Modified: assorted-site/trunk/footer.html
===================================================================
--- assorted-site/trunk/footer.html     2008-02-15 02:19:38 UTC (rev 423)
+++ assorted-site/trunk/footer.html     2008-02-15 02:20:03 UTC (rev 424)
@@ -1,8 +1,11 @@
-<script src="http://www.google-analytics.com/urchin.js" type="text/javascript">
+<script type="text/javascript">
+  var gaJsHost = (("https:" == document.location.protocol) ? "https://ssl." : "http://www.");
+  document.write(unescape("%3Cscript src='" + gaJsHost + "google-analytics.com/ga.js' type='text/javascript'%3E%3C/script%3E"));
 </script>
 <script type="text/javascript">
-_uacct = "UA-1322384-1";
-urchinTracker();
+  var pageTracker = _gat._getTracker("UA-1322384-1");
+  pageTracker._initData();
+  pageTracker._trackPageview();
 </script>

 <script src="http://pmetrics.performancing.com/100.js" type="text/javascript"></script>
From: <yan...@us...> - 2008-02-15 02:19:33

Revision: 423
          http://assorted.svn.sourceforge.net/assorted/?rev=423&view=rev
Author:   yangzhang
Date:     2008-02-14 18:19:38 -0800 (Thu, 14 Feb 2008)

Log Message:
-----------
added google tracking footer (for other web pages)

Modified Paths:
--------------
    assorted-site/trunk/index.txt

Added Paths:
-----------
    assorted-site/trunk/google-footer.html

Added: assorted-site/trunk/google-footer.html
===================================================================
--- assorted-site/trunk/google-footer.html              (rev 0)
+++ assorted-site/trunk/google-footer.html      2008-02-15 02:19:38 UTC (rev 423)
@@ -0,0 +1,9 @@
+<script type="text/javascript">
+  var gaJsHost = (("https:" == document.location.protocol) ? "https://ssl." : "http://www.");
+  document.write(unescape("%3Cscript src='" + gaJsHost + "google-analytics.com/ga.js' type='text/javascript'%3E%3C/script%3E"));
+</script>
+<script type="text/javascript">
+  var pageTracker = _gat._getTracker("UA-1322384-1");
+  pageTracker._initData();
+  pageTracker._trackPageview();
+</script>

Modified: assorted-site/trunk/index.txt
===================================================================
--- assorted-site/trunk/index.txt       2008-02-15 02:12:48 UTC (rev 422)
+++ assorted-site/trunk/index.txt       2008-02-15 02:19:38 UTC (rev 423)
@@ -11,8 +11,8 @@
   information to come later.

 - ZDB: simple object database with an emphasis on semantics (active)
-- General-purpose libraries ("commons") for various languages or platforms.
-  - [Python Commons](python-commons) (maintained)
+- General-purpose libraries ("commons") for various languages or platforms
+  - [Python Commons](python-commons) (passive)
   - [Scala Commons](scala-commons) (active)
   - Java Reactor: simple event loop for single-threaded asynchronous IO and
     task scheduling (done)
@@ -20,14 +20,14 @@
   - Haskell Commons (active)
   - TeX Commons (active)
   - Shell Tools: programs written in a variety of languages and
-    oriented toward shell scripting and systems management (maintained)
+    oriented toward shell scripting and systems management (passive)
   - AFX: extensions (e.g. threading support) for the AF asynchronous
     programming framework (active)
 - UI libraries
   - Scala TUI: a declarative reactive programming toolkit for constructing
-    text user interfaces (hiatus)
-  - JFX Table: an editable table (spreadsheet) widget in JavaFX (done)
-  - LZXGrid: an editable table (spreadsheet) widget in OpenLaszlo (done)
+    [ncurses]-based text user interfaces (hiatus)
+  - JFX Table: an editable table (spreadsheet) widget in [JavaFX] (done)
+  - LZXGrid: an editable table (spreadsheet) widget in [OpenLaszlo] (done)
 - System utilities
   - UDP Prober: small program that logs the RTTs of periodic UDP pings, and
     an exercise in using [`boost::asio`] (active)
@@ -36,17 +36,19 @@
 - Meta programming
   - Object code generation: currently targets Java serialization, emphasizing
     compactness, speed, and simplicity (done)
-  - TopCoder tools: crawl TopCoder rankings to analyze players. Currently only
+  - TopCoder tools: crawl [TopCoder] rankings to analyze players. Currently only
     produces language statistics. (done)
-  - Simple Pre-Processor (spp): tiny implementation of cpp's _object-like
-    macros_ (done)
+  - Simple Pre-Processor (spp): tiny implementation of the C preprocessor's
+    _object-like macros_ (done)
 - Tools for various websites or services
-  - [Facebook](facebook-tools): monitor changes in your Facebook network
+  - [Facebook](facebook-tools): monitor changes in your [Facebook] network
     (done)
-  - Myspace: crawl profiles within $n$ degrees of you for fast searches (done)
-  - O'Reilly Safari: cache text for offline reading (abandoned)
-  - Youtube: caches videos from your favorites, playlists, and subscriptions
-    (done)
+  - Myspace: crawl [MySpace] profiles within $n$ degrees of you for fast
+    searches (done)
+  - O'Reilly Safari: cache text from the [O'Reilly Safari] online bookshelf for
+    offline reading (abandoned)
+  - Youtube: caches [YouTube] videos from your favorites, playlists, and
+    subscriptions (done)
   - MovieLookup: given an [HBO](http://hbo.com/) schedule, look up movie
     ratings on [Rotten Tomatoes](http://rottentomatoes.com/), sort the movies
     by score, and aggregate the show times for those movies based on the
@@ -59,30 +61,36 @@
   - Wallpaper Tools: tools for managing wallpapers as they are being rotated
     through (done)
 - [BattleCode]
-  - BattleCode 2007, Team Little: [Greg] and my work for the 2007 competition (done)
-  - BattleCode 2008, Team Little: our (i.e. Greg's) work for the 2008
+  - [BattleCode 2007], Team Little: [Greg] and my work for the 2007 competition
+    (done)
+  - [BattleCode 2008], Team Little: our (i.e. Greg's) work for the 2008
     competition (done)
   - BattleCode Gene Pool: a parallel implementation of a genetic algorithm for
     optimizing parameters (done)
   - BattleCode Composer: express and mix strategies quickly (abandoned)
+- Exploration and experimentation
+  - Hash distribution: for observing the distribution of hash functions on
+    supplied data.
+  - Parallel hash join: for exploring the scalability of hash joins on
+    many-core systems (active)
+  - Sandbox: heap of small test cases to explore (mostly programming language
+    details, bugs, corner cases, features, etc.) (passive)
 - Miscellanea
   - Bibliography: my pan-paper BibTeX; i.e., stalling for ZDB (active)
   - Subtitle adjuster: for time-shifting SRTs (done)
   - Programming Problems: my workspace for solving programming puzzles (hiatus)
-  - Experimental Sandbox: heap of small test cases to explore (bugs, corner
-    cases, features, etc.) (maintained)
   - Source management: various tools for cleaning up and maintaining a source
     code repository, identifying things that might not belong (hiatus)
 - Websites
-  - [This website](http://assorted.sf.net/) (maintained)
-  - [My personal website](http://www.mit.edu/~y_z/) (maintained)
+  - [This website](http://assorted.sf.net/) (passive)
+  - [My personal website](http://www.mit.edu/~y_z/) (passive)

 What the statuses mean:

 - done: no more active development planned, but will generally maintain/fix
   issues
-- maintained: under continual but gradual growth
+- passive: under continual but gradual growth
 - active: development is happening at a faster pace
 - abandoned: incomplete; no plans to pick it up again
 - hitaus: incomplete; plan to resume development
@@ -97,9 +105,19 @@
 - [TinyOS](http://tinyos.net/): SF-hosted project I've been involved in

 [BattleCode]: http://battlecode.mit.edu/
+[BattleCode 2007]: http://battlecode.mit.edu/2007/
+[BattleCode 2008]: http://battlecode.mit.edu/2008/
+[JavaFX]: https://openjfx.dev.java.net/
+[ncurses]: http://www.gnu.org/software/ncurses/
+[OpenLaszlo]: http://www.openlaszlo.org/
 [`boost::asio`]: http://asio.sourceforge.net/
 [Greg]: http://people.csail.mit.edu/glittle/
 [the Subversion repository]: https://assorted.svn.sourceforge.net/svnroot/assorted
+[TopCoder]: http://www.topcoder.com/
+[O'Reilly Safari]: http://safari.oreilly.com/
+[Facebook]: http://www.facebook.com/
+[YouTube]: http://www.youtube.com/
+[MySpace]: http://www.myspace.com/
 [browse the repository]: http://assorted.svn.sourceforge.net/viewvc/assorted/

 <!--
From: <yan...@us...> - 2008-02-15 02:12:44

Revision: 422
          http://assorted.svn.sourceforge.net/assorted/?rev=422&view=rev
Author:   yangzhang
Date:     2008-02-14 18:12:48 -0800 (Thu, 14 Feb 2008)

Log Message:
-----------
added more read tests

Modified Paths:
--------------
    numa-bench/trunk/src/malloc.bash

Modified: numa-bench/trunk/src/malloc.bash
===================================================================
--- numa-bench/trunk/src/malloc.bash    2008-02-15 01:45:56 UTC (rev 421)
+++ numa-bench/trunk/src/malloc.bash    2008-02-15 02:12:48 UTC (rev 422)
@@ -24,8 +24,12 @@
 run 16 1000$MB 1 0 0 1 0 0 0
 run 16 100$MB 1 1 0 1 0 0 0

-for n in 1 4 8 12 16 ; do
+for n in 1 2 4 8 12 16 ; do
   echo par
+  run $n 10$MB 1 0 1 1 0 0 0
+  run $n 10$MB 1 1 1 1 0 0 0
+  run $n 10$MB 1 0 1 1 1 0 0
+  run $n 10$MB 1 1 1 1 1 0 0
   run $n 10$MB 1 0 1 1 0 1 0
   run $n 10$MB 1 1 1 1 0 1 0
   run $n 10$MB 1 0 1 1 1 1 0
From: <yan...@us...> - 2008-02-15 01:45:52

Revision: 421
          http://assorted.svn.sourceforge.net/assorted/?rev=421&view=rev
Author:   yangzhang
Date:     2008-02-14 17:45:56 -0800 (Thu, 14 Feb 2008)

Log Message:
-----------
adding tbb tests

Added Paths:
-----------
    numa-bench/trunk/src/tbb.cc

Added: numa-bench/trunk/src/tbb.cc
===================================================================
--- numa-bench/trunk/src/tbb.cc                         (rev 0)
+++ numa-bench/trunk/src/tbb.cc 2008-02-15 01:45:56 UTC (rev 421)
@@ -0,0 +1,7 @@
+#include <tbb/scalable_allocator.h>
+
+int
+main()
+{
+  return 0;
+}
From: <yan...@us...> - 2008-02-15 01:44:50

Revision: 420
          http://assorted.svn.sourceforge.net/assorted/?rev=420&view=rev
Author:   yangzhang
Date:     2008-02-14 17:44:56 -0800 (Thu, 14 Feb 2008)

Log Message:
-----------
filled out readme

Modified Paths:
--------------
    numa-bench/trunk/README

Modified: numa-bench/trunk/README
===================================================================
--- numa-bench/trunk/README     2008-02-15 01:44:42 UTC (rev 419)
+++ numa-bench/trunk/README     2008-02-15 01:44:56 UTC (rev 420)
@@ -1,6 +1,54 @@
+% NUMA Benchmarks
+% Yang Zhang
+
+Overview
+--------
+
 This is an assortment of microbenchmarks for understanding the performance
-behavior of NUMA systems and for exploring the Linux NUMA API.
+behavior of NUMA systems and for exploring the [Linux NUMA API]. Currently, the
+only full-featured test is the memory performance test (called `malloc`). This
+program performs a variety of operations on a large piece of memory to explore
+the performance behavior of memory intensive programs.

-Revelant materials include the [libnuma whitepaper].
+Results
+-------

-[libnuma whitepaper]: http://www.novell.com/collateral/4621437/4621437.pdf
+Here are some [results].
+
+Requirements
+------------
+
+- [C++ Commons]
+- [boost] 1.34
+- [g++] 4.2
+- [libstdc++] 4.2
+
+Building
+--------
+
+C++ Commons is a utility library I maintain; you'll need to make it visible to
+numa-bench:
+
+    $ svn --quiet co https://assorted.svn.sourceforge.net/svnroot/assorted/cpp-commons/trunk cpp-commons
+    $ svn --quiet co https://assorted.svn.sourceforge.net/svnroot/assorted/numa-bench/trunk numa-bench
+    $ ln -s "$PWD/cpp-commons/src/commons" numa-bench/src/
+    $ cd numa-bench/src/
+    $ make
+    $ ./malloc.bash
+
+Supporting Tools
+----------------
+
+`PlotHist` processes stdout concatenated from multiple runs of the program.
+This will produce the averaged time measurement histograms across cores per
+experiment and across experiments varying the number of cores.
+
+This tool depends on the [Scala Commons].
+
+[Linux NUMA API]: http://www.novell.com/collateral/4621437/4621437.pdf
+[C++ Commons]: http://assorted.sf.net/cpp-commons/
+[Scala Commons]: http://assorted.sf.net/scala-commons/
+[boost]: http://www.boost.org/
+[g++]: http://gcc.gnu.org/
+[libstdc++]: http://gcc.gnu.org/libstdc++/
+[results]: analysis.html
From: <yan...@us...> - 2008-02-15 01:44:38

Revision: 419
          http://assorted.svn.sourceforge.net/assorted/?rev=419&view=rev
Author:   yangzhang
Date:     2008-02-14 17:44:42 -0800 (Thu, 14 Feb 2008)

Log Message:
-----------
cleaned up content

Modified Paths:
--------------
    numa-bench/trunk/src/malloc.cc

Modified: numa-bench/trunk/src/malloc.cc
===================================================================
--- numa-bench/trunk/src/malloc.cc      2008-02-15 01:44:23 UTC (rev 418)
+++ numa-bench/trunk/src/malloc.cc      2008-02-15 01:44:42 UTC (rev 419)
@@ -1,24 +1,3 @@
-// Questions this program answers:
-//
-// - Does malloc tend to allocate locally?
-//   - TODO!
-// - How much does working from another node affect throughput?
-//   - A bit: 647x from local, 649x from neighbor, 651x from remote
-// - Is there difference from repeatedly fetching the same (large) area n times
-//   vs. fetching an area n times larger?
-//   - No. The times are identical for 1GB*1 and 100MB*10.
-// - How much difference is there between sequential scan and random access?
-//   - Huge difference. Also magnifies the locality effects more.
-//   - 1700 from local, 1990 from one neighbor, 2020 from another neighbor,
-//     and 2310 from remote.
-// - What's the difference between reading and writing?
-//   - TODO!
-// - Can we observe prefetching's effects? (Random access but chew the full
-//   cache line of data.)
-//   - TODO!
-
-// TODO: use real shuffling? or is rand ok?
-
 #include <cstdlib>
 #include <fstream>
 #include <iostream>
@@ -171,6 +150,8 @@
   }
   int barrier_result = pthread_barrier_wait(&cross_barrier);
   check(barrier_result == PTHREAD_BARRIER_SERIAL_THREAD || barrier_result == 0);
+  // TODO: make this more interesting than just a sequential traversal over
+  // the partitions.
   for (int i = 0; i < config.ncores; i++) {
     chew1(partitions[i][cpu], config, len);
   }
From: <yan...@us...> - 2008-02-15 01:44:16

Revision: 418
          http://assorted.svn.sourceforge.net/assorted/?rev=418&view=rev
Author:   yangzhang
Date:     2008-02-14 17:44:23 -0800 (Thu, 14 Feb 2008)

Log Message:
-----------
added analysis and publishing makefile

Added Paths:
-----------
    numa-bench/trunk/doc/
    numa-bench/trunk/doc/Makefile
    numa-bench/trunk/doc/analysis.txt

Added: numa-bench/trunk/doc/Makefile
===================================================================
--- numa-bench/trunk/doc/Makefile                       (rev 0)
+++ numa-bench/trunk/doc/Makefile       2008-02-15 01:44:23 UTC (rev 418)
@@ -0,0 +1,24 @@
+PROJECT := numa-bench
+WEBDIR := assorted/htdocs/$(PROJECT)
+HTMLFRAG := ../../../assorted-site/trunk
+PANDOC = pandoc -s -S --tab-stop=2 -c ../main.css -H $(HTMLFRAG)/header.html -A $(HTMLFRAG)/google-footer.html -o $@ $^
+
+all: index.html analysis.html
+
+index.html: ../README
+        $(PANDOC)
+
+analysis.html: analysis.txt
+        $(PANDOC)
+
+publish: analysis.html index.html
+        ssh shell-sf mkdir -p $(WEBDIR)/graphs/
+        scp $^ shell-sf:$(WEBDIR)/
+
+publish-data: ../tools/graphs/*.pdf
+        scp $^ shell-sf:$(WEBDIR)/graphs/
+
+clean:
+        rm -f index.html analysis.html
+
+.PHONY: clean publish publish-data

Added: numa-bench/trunk/doc/analysis.txt
===================================================================
--- numa-bench/trunk/doc/analysis.txt                   (rev 0)
+++ numa-bench/trunk/doc/analysis.txt   2008-02-15 01:44:23 UTC (rev 418)
@@ -0,0 +1,68 @@
+% NUMA Benchmarks Analysis
+% Yang Zhang
+
+The [graphs](graphs) show the results of running several different experiments. The
+results are averaged across three trials for each experiment. The experiments
+varied the following parameters:
+
+- number of threads (CPUs, 1-16, usually 16 if not testing scalability)
+- size of the memory buffer to operate on (10MB, 100MB, or 1GB)
+- number of times to repeat the operation (usually one)
+- whether to chew through the memory sequentially or using random access
+- whether to run operations in parallel on all the CPUs
+- whether to explicitly pin the threads to a CPU (usually we do)
+- whether to operate on a global buffer or on our own buffer (that we allocate
+  ourselves) or on buffers that all other nodes allocated (for
+  cross-communication)
+- whether to perform writes to the buffer, otherwise just read
+
+Here are some questions these results help answer:
+
+- How much does working from another node affect throughput?
+  - It doesn't make much difference for sequential scans - this shows hardware
+    prefetching (and caching) at work. It still makes [a bit of
+    difference](graphs/ncores-16-size-100000000-nreps-1-shuffle-0-par-0-pin-1-local-0-write-1-cross-0.pdf).
+  - However, for random accesses, the difference is much more
+    [pronounced](graphs/ncores-16-size-100000000-nreps-1-shuffle-1-par-0-pin-1-local-0-write-1-cross-0.pdf).
+- How much difference is there between sequential scan and random access?
+  - Substantial difference. Also magnifies NUMA effects. Compare
+    [a](graphs/ncores-16-size-100000000-nreps-1-shuffle-0-par-0-pin-1-local-0-write-1-cross-0.pdf)
+    and
+    [b](graphs/ncores-16-size-100000000-nreps-1-shuffle-1-par-0-pin-1-local-0-write-1-cross-0.pdf)
+- Read vs. write
+  - Substantial difference. Random writes are ~2x slower than random reads.
+  - Compare
+    [a](graphs/ncores-16-size-1000000000-nreps-1-shuffle-0-par-0-pin-1-local-0-write-0-cross-0.pdf)
+    and
+    [b](graphs/ncores-16-size-1000000000-nreps-1-shuffle-0-par-0-pin-1-local-0-write-1-cross-0.pdf)
+- Does `malloc` tend to allocate locally?
+  - Yes, because working with memory allocated from the current thread shows
+    improved times.
+- Scalability of: cross-node memory writes vs. shared memory writes vs. local node memory writes
+  - Graphs for each of these:
+    [a](graphs/scaling-size-10000000-nreps-1-shuffle-0-par-1-pin-1-local-0-write-1-cross-1.pdf)
+    vs.
+    [b](graphs/scaling-size-10000000-nreps-1-shuffle-0-par-1-pin-1-local-0-write-1-cross-0.pdf)
+    vs.
+    [c](graphs/scaling-size-10000000-nreps-1-shuffle-0-par-1-pin-1-local-1-write-1-cross-0.pdf)
+  - Local memory node access is best but still has problems scaling. The time
+    remains constant after some point. This is probably because increasing the
+    number of cores causes the load distribution to approach a more uniform
+    distribution.
+- Scalability of: cross-node memory reads vs. shared memory reads vs. local node memory reads
+  - Graphs for each of these:
+    [a](graphs/scaling-size-10000000-nreps-1-shuffle-0-par-1-pin-1-local-0-write-0-cross-1.pdf)
+    vs.
+    [b](graphs/scaling-size-10000000-nreps-1-shuffle-0-par-1-pin-1-local-0-write-0-cross-0.pdf)
+    vs.
+    [c](graphs/scaling-size-10000000-nreps-1-shuffle-0-par-1-pin-1-local-1-write-0-cross-0.pdf)
+  - Cross-communicating performs worse, and local memory node access performs
+    the same as shared memory access. This is expected, since we aren't
+    performing writes, so the data is freely replicated to all caches (same
+    reason that there is little difference between the non-parallel reads from
+    local vs. remote).
+
+There's still quite a bit of room to fill out this test suite. For instance,
+the experiments varying the number of cores all exercise the fewest number of
+chips; the results may be quite different for tests that distribute the loaded
+cores across all chips.
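
The sequential-vs-random gap that this analysis keeps returning to is easy to reproduce outside the harness. A minimal standalone sketch (buffer size and access counts are arbitrary choices, not the benchmark's configuration):

    #include <chrono>
    #include <cstdio>
    #include <cstdlib>
    #include <vector>

    int main() {
      // ~100MB of ints; assumes RAND_MAX covers the index range, as the
      // benchmark itself checks with checkmsg.
      const size_t n = 100 * 1000 * 1000 / sizeof(int);
      std::vector<int> buf(n, 1);
      using clk = std::chrono::steady_clock;

      long long sum = 0;
      auto t0 = clk::now();
      for (size_t i = 0; i < n; i++) sum += buf[i];          // sequential: prefetch-friendly
      auto t1 = clk::now();
      for (size_t i = 0; i < n; i++) sum += buf[rand() % n]; // random: stalls on most accesses
      auto t2 = clk::now();

      auto ms = [](clk::duration d) {
        return std::chrono::duration_cast<std::chrono::milliseconds>(d).count();
      };
      printf("seq %lld ms, rand %lld ms (sum %lld)\n",
             (long long) ms(t1 - t0), (long long) ms(t2 - t1), sum);
      return 0;
    }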
From: <yan...@us...> - 2008-02-15 01:43:47

Revision: 417
          http://assorted.svn.sourceforge.net/assorted/?rev=417&view=rev
Author:   yangzhang
Date:     2008-02-14 17:43:52 -0800 (Thu, 14 Feb 2008)

Log Message:
-----------
never added tools!

Added Paths:
-----------
    numa-bench/trunk/tools/
    numa-bench/trunk/tools/Makefile
    numa-bench/trunk/tools/PlotHist.scala
    numa-bench/trunk/tools/plot-hist.bash

Added: numa-bench/trunk/tools/Makefile
===================================================================
--- numa-bench/trunk/tools/Makefile                     (rev 0)
+++ numa-bench/trunk/tools/Makefile     2008-02-15 01:43:52 UTC (rev 417)
@@ -0,0 +1,12 @@
+COMMONS := $(wildcard commons/*.scala)
+
+all: out/PlotHist.class
+
+out/PlotHist.class: PlotHist.scala $(COMMONS)
+        mkdir -p out
+        fsc -d out $^
+
+clean:
+        rm -rf out
+
+.PHONY: clean

Added: numa-bench/trunk/tools/PlotHist.scala
===================================================================
--- numa-bench/trunk/tools/PlotHist.scala               (rev 0)
+++ numa-bench/trunk/tools/PlotHist.scala       2008-02-15 01:43:52 UTC (rev 417)
@@ -0,0 +1,90 @@
+import commons.Collections._
+import commons.Control._
+import commons.Io._
+import scala.util._
+object PlotHist {
+  def main(args: Array[String]) {
+    // The input consists of header lines describing the experiment
+    // configuration followed by body lines reporting the time measurements.
+    // Construct a map from configuration to bodies that were run under that
+    // config.
+    val lines = using (TextReader(Console.in)) (_.readLines.toArray)
+    val runs = separateHeads(groupByHeaders(lines)(_ startsWith "config: "))
+    val exps = multimap(
+      for ((config, lines) <- runs) yield {
+        val pairs =
+          for (line <- lines) yield {
+            val Seq(a,b) = line split ": "
+            (a.toInt, b.toInt)
+          }
+        (config.split(": ")(1), pairs)
+      }
+    )
+    // For each config, aggregate the bodies together by finding the average of
+    // all the measurements corresponding to the same core.
+    val graphs = for ((config, vs) <- exps) yield {
+      val vmap = multimap(vs flatMap (x=>x))
+      val agg = for ((x,ys) <- vmap) yield (x,mean(ys.toArray))
+      val arr = agg.toStream.toArray
+      Sorting quickSort arr
+      (config, arr)
+    }
+    // Also generate the scaling view of the data by grouping together by the
+    // first numeric parameter (in this case, the number of cores).
+    val scaling = multimap(
+      for ((config, points) <- graphs) yield {
+        val Seq(_, ncores, rest) = config split (" ", 3)
+        (rest, (ncores.toInt, Iterable.max(points map (_._2))))
+      }
+    )
+    val scalingGraphs = for ((k,vs) <- scaling; if vs.size > 1) yield {
+      val arr = vs.toStream.toArray
+      Sorting quickSort arr
+      (k, arr)
+    }
+    // Prepare the plotting.
+    val cmd = <p>
+      set style data histogram
+      # set style histogram clustered
+      set terminal pdf
+      set xlabel 'core'
+      set ylabel 'time (ms)'
+      set key off
+      {
+        // Generate the histograms.
+        for ((config, points) <- graphs) yield {
+          <p>
+          set title '{config}'
+          set output 'graphs/{spacedToHyphen(config)}.pdf'
+          plot '-' using 2:xticlabel(1)
+          {points map {case (a,b) => (a + " " + b)} mkString "\n"}
+          e
+          </p>.text
+        }
+      }
+      set style data linespoints
+      {
+        // Generate the time and speedup plots varying ncores.
+        for ((config, points) <- scalingGraphs) yield {
+          <p>
+          set title '{config}'
+
+          set output 'graphs/{"scaling-" + spacedToHyphen(config)}.pdf'
+          plot '-'
+          {points map {case (a,b) => (a + " " + b)} mkString "\n"}
+          e
+
+          set output 'graphs/{"speedup-" + spacedToHyphen(config)}.pdf'
+          plot '-'
+          {
+            val (_, base) = points(0)
+            points map {case (a,b) => (a + " " + (base.toDouble/b))
+          } mkString "\n"}
+          e
+          </p>.text
+        }
+      }
+    </p>.text
+    run("gnuplot", cmd)
+  }
+}

Added: numa-bench/trunk/tools/plot-hist.bash
===================================================================
--- numa-bench/trunk/tools/plot-hist.bash               (rev 0)
+++ numa-bench/trunk/tools/plot-hist.bash       2008-02-15 01:43:52 UTC (rev 417)
@@ -0,0 +1,4 @@
+#!/usr/bin/env bash
+set -o errexit -o nounset
+make -s
+egrep '^[[:digit:]]+: [[:digit:]]+|config' "$@" | scala -cp out PlotHist

Property changes on: numa-bench/trunk/tools/plot-hist.bash
___________________________________________________________________
Name: svn:executable
   + *
From: <yan...@us...> - 2008-02-15 01:42:02

Revision: 416
          http://assorted.svn.sourceforge.net/assorted/?rev=416&view=rev
Author:   yangzhang
Date:     2008-02-14 17:42:05 -0800 (Thu, 14 Feb 2008)

Log Message:
-----------
first steps to reducing cpu and using int ids

Modified Paths:
--------------
    hash-join/trunk/tools/DbPrep.scala
    hash-join/trunk/tools/LogProc.scala

Modified: hash-join/trunk/tools/DbPrep.scala
===================================================================
--- hash-join/trunk/tools/DbPrep.scala  2008-02-15 01:40:15 UTC (rev 415)
+++ hash-join/trunk/tools/DbPrep.scala  2008-02-15 01:42:05 UTC (rev 416)
@@ -17,6 +17,8 @@
   val pActress = Pattern compile """^([^\t]+)\t+([^\t]+)$"""
   val (doMovies, doActresses) = (true, true)
   val nreps = args(0).toInt
+  val title2id = new IdMapper[String]
+  def titleId(s: String) = serializeInt(title2id(s))
   using (TextWriter("movies.dat")) { wm =>
     using (TextWriter("actresses.dat")) { wa =>
       for (i <- 0 until nreps) {
@@ -33,6 +35,7 @@
             if (body && line != "") {
               val (title, release) = extract(pMovie, line)
               wm print (xform(title) + "\0" + release + "\0\0")
+              // wm print (titleId(title) + xform(title) + "\0" + release + "\0\0")
             }
             if (!body && (line contains "=======")) {
               body = true

Modified: hash-join/trunk/tools/LogProc.scala
===================================================================
--- hash-join/trunk/tools/LogProc.scala 2008-02-15 01:40:15 UTC (rev 415)
+++ hash-join/trunk/tools/LogProc.scala 2008-02-15 01:42:05 UTC (rev 416)
@@ -10,6 +10,7 @@
   type FieldMap = Map[String,Double]
   type MutFieldMap = HashMap[String,Double]
   type MutStatMap = HashMap[Int,ArrayBuffer[FieldMap]]
+  def dropPrefix(s: String, t: String) = if (t startsWith s) t drop (s.length) mkString else t
   def main(args: Array[String]) {
     val indexer = args(0)
     val lines = using (TextReader(Console.in)) (_.readLines.toArray)
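
`IdMapper` and `serializeInt` come from the author's Scala Commons, whose definitions are not shown in this diff. The underlying technique, interning strings as dense integer ids, is standard; here is a hypothetical sketch in C++ (all names invented for illustration):

    #include <iostream>
    #include <string>
    #include <unordered_map>
    #include <vector>

    // Assign each distinct string the next dense integer id, idempotently.
    class id_mapper {
      std::unordered_map<std::string, int> ids_;
      std::vector<std::string> names_;
    public:
      int operator()(const std::string& s) {
        auto it = ids_.find(s);
        if (it != ids_.end()) return it->second;
        int id = (int) names_.size();
        names_.push_back(s);
        ids_.emplace(s, id);
        return id;
      }
      const std::string& name(int id) const { return names_[id]; }
    };

    int main() {
      id_mapper title2id;
      std::cout << title2id("Alien") << " "    // 0
                << title2id("Gilda") << " "    // 1
                << title2id("Alien") << "\n";  // 0 again
      return 0;
    }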
From: <yan...@us...> - 2008-02-15 01:40:11

Revision: 415
          http://assorted.svn.sourceforge.net/assorted/?rev=415&view=rev
Author:   yangzhang
Date:     2008-02-14 17:40:15 -0800 (Thu, 14 Feb 2008)

Log Message:
-----------
updated analysis, readme, doc publishing

Modified Paths:
--------------
    hash-join/trunk/README
    hash-join/trunk/doc/Makefile
    hash-join/trunk/doc/analysis.txt

Modified: hash-join/trunk/README
===================================================================
--- hash-join/trunk/README      2008-02-14 20:33:35 UTC (rev 414)
+++ hash-join/trunk/README      2008-02-15 01:40:15 UTC (rev 415)
@@ -29,6 +29,9 @@
   there is a match, then emit the resulting joined tuple (movie title, movie
   release year, actress name).

+Results
+-------
+
 Here are some [results].

 Requirements
@@ -75,7 +78,7 @@
 this dataset and to observe the resulting distributions.

 [C++ Commons]: http://assorted.sf.net/cpp-commons/
-[HashDist]: http://assorted.sf.net/
+[HashDist]: http://assorted.svn.sourceforge.net/viewvc/assorted/hash-dist/trunk/
 [Multiprocessor Hash-Based Join Algorithms]: http://citeseer.ist.psu.edu/50143.html
 [Scala Commons]: http://assorted.sf.net/scala-commons/
 [g++]: http://gcc.gnu.org/

Modified: hash-join/trunk/doc/Makefile
===================================================================
--- hash-join/trunk/doc/Makefile        2008-02-14 20:33:35 UTC (rev 414)
+++ hash-join/trunk/doc/Makefile        2008-02-15 01:40:15 UTC (rev 415)
@@ -1,6 +1,7 @@
-PROJECT := hash-join
-WEBDIR := assorted/htdocs/$(PROJECT)
-PANDOC = pandoc -s -S --tab-stop=2 -c ../main.css -o $@ $^
+PROJECT  := hash-join
+WEBDIR   := assorted/htdocs/$(PROJECT)
+HTMLFRAG := ../../../assorted-site/trunk
+PANDOC   = pandoc -s -S --tab-stop=2 -c ../main.css -H $(HTMLFRAG)/header.html -A $(HTMLFRAG)/google-footer.html -o $@ $^

 all: index.html analysis.html

@@ -14,10 +15,10 @@
        ssh shell-sf mkdir -p $(WEBDIR)/
        scp $^ shell-sf:$(WEBDIR)/

-publish-data: times.pdf speedups.pdf
+publish-data: ../tools/data/*.pdf
        scp $^ shell-sf:$(WEBDIR)/

 clean:
        rm -f index.html analysis.html

-.PHONY: clean publish
+.PHONY: clean publish publish-data

Modified: hash-join/trunk/doc/analysis.txt
===================================================================
--- hash-join/trunk/doc/analysis.txt    2008-02-14 20:33:35 UTC (rev 414)
+++ hash-join/trunk/doc/analysis.txt    2008-02-15 01:40:15 UTC (rev 415)
@@ -1,4 +1,4 @@
-% Hash-Join Benchmarks
+% Hash-Join Analysis
 % Yang Zhang

 Here are the graphs from the latest experiments and implementation:
@@ -9,7 +9,7 @@
 This implementation was originally not scalable in the hashtable-building
 stage, which performed frequent allocations. The hashtable is stock from the
 SGI/libstdc++ implementation. I removed this bottleneck by providing a custom
-allocator that allocated from a non-freeing local memory arena.
+allocator that allocates from a non-freeing local memory arena.

 Profiling reveals that most of the time is spent in the hash functions and the
 function that performs the memcpy during hash-partitioning. `actdb::partition1`
@@ -27,11 +27,14 @@
     ...

 Now the hashtable construction phase is the most scalable part of the
-algorithm. The remaining bottlenecks appear to be due to the memory stalls.
+algorithm (despite its random access nature). The remaining bottlenecks appear
+to be due to memory stalls, but these are mostly masked by hardware
+prefetching.

-The program does not scale much beyond the 16 threads, though performance does
-improve slightly. This is due to the contention for cache capacity among
-multiple hardware threads per core.
+The program does not scale much beyond 16 threads, though performance does
+improve slightly. The inability to scale beyond 16 is most likely due to the
+contention for cache capacity among multiple hardware threads per core.

-This implementation is straightforward, with no fanciness in terms of custom
-scheduling and control over allocation, leaving many things up to the OS.
+I've tried to keep the implementation simple, with no fanciness in terms of
+custom task scheduling or control over allocation, leaving many things up to
+the OS.
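
The "non-freeing local memory arena" described in this analysis is a common allocator pattern: grab one large block up front, hand out bump-pointer slices, and make deallocation a no-op. A minimal sketch of the idea (the actual allocator in C++ Commons may differ):

    #include <cstddef>
    #include <cstdlib>

    // Bump-pointer arena: allocation is a pointer increment, deallocation is
    // a no-op; everything is reclaimed at once when the arena is destroyed.
    class arena {
      char *base_, *cur_, *end_;
    public:
      explicit arena(size_t cap)
        : base_((char*) malloc(cap)), cur_(base_), end_(base_ + cap) {}
      ~arena() { free(base_); }
      void* alloc(size_t n) {
        n = (n + 7) & ~size_t(7);          // keep 8-byte alignment
        if (cur_ + n > end_) return NULL;  // out of arena space
        void* p = cur_;
        cur_ += n;
        return p;
      }
    };

An STL-style allocator wrapper can then forward `allocate` to `alloc` and make `deallocate` a no-op, removing per-insert heap traffic from the hashtable-building stage.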
From: <yan...@us...> - 2008-02-14 20:33:30

Revision: 414
          http://assorted.svn.sourceforge.net/assorted/?rev=414&view=rev
Author:   yangzhang
Date:     2008-02-14 12:33:35 -0800 (Thu, 14 Feb 2008)

Log Message:
-----------
added more to readme

Modified Paths:
--------------
    hash-join/trunk/README

Modified: hash-join/trunk/README
===================================================================
--- hash-join/trunk/README      2008-02-14 20:33:16 UTC (rev 413)
+++ hash-join/trunk/README      2008-02-14 20:33:35 UTC (rev 414)
@@ -7,34 +7,79 @@
 This is a simple implementation of parallel hash joins. I'm using this as a
 first step in studying the performance problems in multicore systems
 programming. This implementation is tailored for a particular dataset, the IMDB
-`movies.list` and `actresses.list` files, which may be found [here].
+`movies.list` and `actresses.list` files, which may be found at [here].

+To learn more about this algorithm, have a look at [Multiprocessor Hash-Based
+Join Algorithms]. In short:
+
+- The task is to join the set of (movie title, movie release year) with
+  (actress name, movie title) on equal movie titles. For instance, we may be
+  trying to answer the question "What is the average duration of an actress'
+  career?" (Movie titles and actress names are unique.)
+- Each of the $n$ nodes begins with an equal fraction of both datasets.
+- Stage 1 (_hash-partition_): each node hash-partitions its fraction into $n$
+  buckets, where bucket $i$ is destined for node $i$. This is done for each
+  dataset.
+- Stage 2 (_build_): each node receives the movie buckets destined for it from
+  all nodes, and builds a hash table mapping the join key (movie titles) to
+  tuples. This is done for the smaller of the two datasets, which in our case
+  is the movies dataset.
+- Stage 3 (_probe_): each node receives the actress buckets destined for it
+  from all nodes, and probes into the hash table using the movie title. If
+  there is a match, then emit the resulting joined tuple (movie title, movie
+  release year, actress name).
+
+Here are some [results].
+
 Requirements
 ------------

-- [C++ Commons] svn r370+
-- [libstdc++] v4.1
+- [C++ Commons]
+- [g++] 4.2
+- [libstdc++] 4.2

+Building
+--------
+
+C++ Commons is a utility library I maintain; you'll need to make it visible to
+hash-join:
+
+    $ svn --quiet co https://assorted.svn.sourceforge.net/svnroot/assorted/cpp-commons/trunk cpp-commons
+    $ svn --quiet co https://assorted.svn.sourceforge.net/svnroot/assorted/hash-join/trunk hash-join
+    $ ln -s "$PWD/cpp-commons/src/commons" hash-join/src/
+    $ cd hash-join/src/
+    $ make opt
+    $ ./hashjoin-opt 16 $MOVIEDATA/{movies,actresses}.dat
+
 Supporting Tools
 ----------------

 `DbPrep` filters the `.list` files to prepare them to be parsed by the hash
-join.
+join. This can additionally inflate the dataset size while maintaining roughly
+the same join-selectivity of the original dataset. I have also prepared [some
+datasets] available for your use.

 `LogProc` processes stdout concatenated from multiple runs of the program.
 This will produce the time and speedup plots illustrating the scalability of
 the system. This has actually been made into a generic tool and will be moved to
-its own project directory later.
+a utility project later.

 `Titles` extracts the titles from the output of `DbPrep` on `movies.list`.

+These tools all depend on the [Scala Commons].
+
 Related
 -------

 I used [HashDist] to experiment with the chaining of various hash functions on
-this dataset and observe the distribution.
+this dataset and to observe the resulting distributions.

+[C++ Commons]: http://assorted.sf.net/cpp-commons/
+[HashDist]: http://assorted.sf.net/
+[Multiprocessor Hash-Based Join Algorithms]: http://citeseer.ist.psu.edu/50143.html
+[Scala Commons]: http://assorted.sf.net/scala-commons/
+[g++]: http://gcc.gnu.org/
 [here]: http://us.imdb.com/interfaces#plain
 [libstdc++]: http://gcc.gnu.org/libstdc++/
-[C++ Commons]: http://assorted.sf.net/
-[HashDist]: http://assorted.sf.net/
+[results]: analysis.html
+[some datasets]: http://people.csail.mit.edu/yang/movie-data/
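
A compact single-process rendering of the three stages described in this README, with toy records standing in for the IMDB data (illustrative only; the real implementation shards each stage across threads and uses custom allocation):

    #include <iostream>
    #include <string>
    #include <unordered_map>
    #include <vector>

    struct movie  { std::string title; int year; };
    struct credit { std::string actress, title; };
    struct joined { std::string title; int year; std::string actress; };

    int main() {
      std::vector<movie> movies = {{"Alien", 1979}, {"Gilda", 1946}};
      std::vector<credit> credits = {{"Sigourney Weaver", "Alien"},
                                     {"Rita Hayworth", "Gilda"}};
      const int n = 4;  // number of "nodes"

      // Stage 1 (hash-partition): route each tuple to bucket hash(title) % n.
      std::hash<std::string> h;
      std::vector<std::vector<movie>> mbuckets(n);
      std::vector<std::vector<credit>> cbuckets(n);
      for (auto& m : movies) mbuckets[h(m.title) % n].push_back(m);
      for (auto& c : credits) cbuckets[h(c.title) % n].push_back(c);

      // Stages 2 and 3, per node: build on the smaller side, then probe.
      std::vector<joined> out;
      for (int i = 0; i < n; i++) {
        std::unordered_map<std::string, int> built;            // build
        for (auto& m : mbuckets[i]) built[m.title] = m.year;
        for (auto& c : cbuckets[i]) {                          // probe
          auto it = built.find(c.title);
          if (it != built.end())
            out.push_back({c.title, it->second, c.actress});
        }
      }
      for (auto& j : out)
        std::cout << j.title << " " << j.year << " " << j.actress << "\n";
      return 0;
    }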
From: <yan...@us...> - 2008-02-14 20:33:13

Revision: 413
          http://assorted.svn.sourceforge.net/assorted/?rev=413&view=rev
Author:   yangzhang
Date:     2008-02-14 12:33:16 -0800 (Thu, 14 Feb 2008)

Log Message:
-----------
added more doc and web publishing

Added Paths:
-----------
    hash-join/trunk/doc/
    hash-join/trunk/doc/Makefile
    hash-join/trunk/doc/analysis.txt

Added: hash-join/trunk/doc/Makefile
===================================================================
--- hash-join/trunk/doc/Makefile                        (rev 0)
+++ hash-join/trunk/doc/Makefile        2008-02-14 20:33:16 UTC (rev 413)
@@ -0,0 +1,23 @@
+PROJECT := hash-join
+WEBDIR := assorted/htdocs/$(PROJECT)
+PANDOC = pandoc -s -S --tab-stop=2 -c ../main.css -o $@ $^
+
+all: index.html analysis.html
+
+index.html: ../README
+        $(PANDOC)
+
+analysis.html: analysis.txt
+        $(PANDOC)
+
+publish: analysis.html index.html
+        ssh shell-sf mkdir -p $(WEBDIR)/
+        scp $^ shell-sf:$(WEBDIR)/
+
+publish-data: times.pdf speedups.pdf
+        scp $^ shell-sf:$(WEBDIR)/
+
+clean:
+        rm -f index.html analysis.html
+
+.PHONY: clean publish

Added: hash-join/trunk/doc/analysis.txt
===================================================================
--- hash-join/trunk/doc/analysis.txt                    (rev 0)
+++ hash-join/trunk/doc/analysis.txt    2008-02-14 20:33:16 UTC (rev 413)
@@ -0,0 +1,37 @@
+% Hash-Join Benchmarks
+% Yang Zhang
+
+Here are the graphs from the latest experiments and implementation:
+
+- [times](times.pdf)
+- [speedups](speedups.pdf)
+
+This implementation was originally not scalable in the hashtable-building
+stage, which performed frequent allocations. The hashtable is stock from the
+SGI/libstdc++ implementation. I removed this bottleneck by providing a custom
+allocator that allocated from a non-freeing local memory arena.
+
+Profiling reveals that most of the time is spent in the hash functions and the
+function that performs the memcpy during hash-partitioning. `actdb::partition1`
+is the hash-partitioning function for actresses, and it calls `push_bucket` to
+copy tuples into buckets. `scan` is just a function to touch all the data from
+the file.
+
+      %   cumulative   self              self     total
+     time   seconds   seconds    calls   s/call   s/call  name
+    16.40      0.82     0.82  4547797     0.00     0.00  commons::hash_djb2(char const*)
+    14.80      1.56     0.74  4547797     0.00     0.00  __gnu_cxx::__stl_hash_string(char const*)
+    13.20      2.22     0.66  4547797     0.00     0.00  db::push_bucket(char**, bucket*, char const*, char const*, unsigned long)
+    12.80      2.86     0.64        2     0.32     0.32  commons::scan(void const*, unsigned long)
+    10.80      3.40     0.54        1     0.54     1.78  actdb::partition1(unsigned int, bucket*)
+    ...
+
+Now the hashtable construction phase is the most scalable part of the
+algorithm. The remaining bottlenecks appear to be due to the memory stalls.
+
+The program does not scale much beyond the 16 threads, though performance does
+improve slightly. This is due to the contention for cache capacity among
+multiple hardware threads per core.
+
+This implementation is straightforward, with no fanciness in terms of custom
+scheduling and control over allocation, leaving many things up to the OS.
From: <yan...@us...> - 2008-02-14 05:07:52

Revision: 412
          http://assorted.svn.sourceforge.net/assorted/?rev=412&view=rev
Author:   yangzhang
Date:     2008-02-13 21:07:52 -0800 (Wed, 13 Feb 2008)

Log Message:
-----------
thickened the test driver

Modified Paths:
--------------
    numa-bench/trunk/src/malloc.bash

Modified: numa-bench/trunk/src/malloc.bash
===================================================================
--- numa-bench/trunk/src/malloc.bash    2008-02-14 03:48:18 UTC (rev 411)
+++ numa-bench/trunk/src/malloc.bash    2008-02-14 05:07:52 UTC (rev 412)
@@ -5,7 +5,9 @@
 make -s malloc

 function run {
-  ./malloc "$@"
+  for i in {1..3}
+  do ./malloc "$@"
+  done
 }

 KB=000 MB=000000 GB=000000000
@@ -22,15 +24,16 @@
 run 16 1000$MB 1 0 0 1 0 0 0
 run 16 100$MB 1 1 0 1 0 0 0

-echo par
-run 16 10$MB 1 0 1 1 0 1 0
-run 16 10$MB 1 1 1 1 0 1 0
-run 16 10$MB 1 0 1 1 1 1 0
-run 16 10$MB 1 1 1 1 1 1 0
+for n in 1 4 8 12 16 ; do
+  echo par
+  run $n 10$MB 1 0 1 1 0 1 0
+  run $n 10$MB 1 1 1 1 0 1 0
+  run $n 10$MB 1 0 1 1 1 1 0
+  run $n 10$MB 1 1 1 1 1 1 0

-echo cross
-run 16 10$MB 1 0 1 1 0 0 1
-run 16 10$MB 1 1 1 1 0 0 1
-run 16 10$MB 1 0 1 1 0 1 1
-run 16 10$MB 1 1 1 1 0 1 1
-
+  echo cross
+  run $n 10$MB 1 0 1 1 0 0 1
+  run $n 10$MB 1 1 1 1 0 0 1
+  run $n 10$MB 1 0 1 1 0 1 1
+  run $n 10$MB 1 1 1 1 0 1 1
+done
From: <yan...@us...> - 2008-02-14 03:48:18

Revision: 411
          http://assorted.svn.sourceforge.net/assorted/?rev=411&view=rev
Author:   yangzhang
Date:     2008-02-13 19:48:18 -0800 (Wed, 13 Feb 2008)

Log Message:
-----------
added pattern matching bug

Added Paths:
-----------
    sandbox/trunk/src/scala/PatternMatchingBug.scala

Added: sandbox/trunk/src/scala/PatternMatchingBug.scala
===================================================================
--- sandbox/trunk/src/scala/PatternMatchingBug.scala            (rev 0)
+++ sandbox/trunk/src/scala/PatternMatchingBug.scala    2008-02-14 03:48:18 UTC (rev 411)
@@ -0,0 +1,9 @@
+object PatternMatchingBug {
+  def main(args: Array[String]) {
+    def r(xs: Seq[Int]): Stream[Int] = xs match {
+      case Seq() => Stream.empty
+      case Seq(y,ys@_*) => Stream.cons(y,r(ys))
+    }
+    println(r(Array(1,2,3,4,5)).toList)
+  }
+}
From: <yan...@us...> - 2008-02-14 00:10:05

Revision: 410
          http://assorted.svn.sourceforge.net/assorted/?rev=410&view=rev
Author:   yangzhang
Date:     2008-02-13 16:10:10 -0800 (Wed, 13 Feb 2008)

Log Message:
-----------
added pin_thread

Modified Paths:
--------------
    cpp-commons/trunk/src/commons/threads.h

Modified: cpp-commons/trunk/src/commons/threads.h
===================================================================
--- cpp-commons/trunk/src/commons/threads.h     2008-02-13 18:03:22 UTC (rev 409)
+++ cpp-commons/trunk/src/commons/threads.h     2008-02-14 00:10:10 UTC (rev 410)
@@ -22,6 +22,19 @@
   }

   /**
+   * Pin the thread to the given CPU number (as defined for sched_setaffinity).
+   */
+  void
+  pin_thread(int cpu)
+  {
+    pid_t pid = gettid();
+    cpu_set_t cs;
+    CPU_ZERO(&cs);
+    CPU_SET(cpu, &cs);
+    sched_setaffinity(pid, sizeof(cs), &cs);
+  }
+
+  /**
    * Wait for all the given threads to join, discarding their return values.
    */
   inline void
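
A hedged usage sketch of this kind of CPU pinning, self-contained with raw pthreads; it uses `pthread_setaffinity_np` on the current thread instead of the `gettid`/`sched_setaffinity` pair and the C++ Commons `spawn` helper used elsewhere in these commits:

    #include <pthread.h>
    #include <sched.h>
    #include <cstdio>

    // Pin the calling thread to one CPU, then do its work there.
    static void* worker(void* arg) {
      int cpu = (int)(long) arg;
      cpu_set_t cs;
      CPU_ZERO(&cs);
      CPU_SET(cpu, &cs);
      pthread_setaffinity_np(pthread_self(), sizeof(cs), &cs);
      printf("running on cpu %d\n", cpu);
      return NULL;
    }

    int main() {
      const int ncores = 4;  // arbitrary for the sketch
      pthread_t ts[ncores];
      for (long i = 0; i < ncores; i++)
        pthread_create(&ts[i], NULL, worker, (void*) i);
      for (int i = 0; i < ncores; i++)
        pthread_join(ts[i], NULL);
      return 0;
    }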
From: <yan...@us...> - 2008-02-13 18:04:06

Revision: 408
          http://assorted.svn.sourceforge.net/assorted/?rev=408&view=rev
Author:   yangzhang
Date:     2008-02-13 10:03:09 -0800 (Wed, 13 Feb 2008)

Log Message:
-----------
updated test driver

Modified Paths:
--------------
    numa-bench/trunk/src/malloc.bash

Modified: numa-bench/trunk/src/malloc.bash
===================================================================
--- numa-bench/trunk/src/malloc.bash    2008-02-13 17:53:30 UTC (rev 407)
+++ numa-bench/trunk/src/malloc.bash    2008-02-13 18:03:09 UTC (rev 408)
@@ -2,25 +2,35 @@

 set -o errexit -o nounset

+make -s malloc
+
 function run {
   ./malloc "$@"
 }

 KB=000 MB=000000 GB=000000000

-# ncores size nreps shuffle par pin local write
+# ncores size nreps shuffle par pin local write cross
+
 echo writes
-run 16 100$MB 1 0 0 1 0 1
-run 16 1000$MB 1 0 0 1 0 1
-run 16 100$MB 10 0 0 1 0 1
-run 16 100$MB 1 1 0 1 0 1
+run 16 100$MB 1 0 0 1 0 1 0
+run 16 1000$MB 1 0 0 1 0 1 0
+run 16 100$MB 10 0 0 1 0 1 0
+run 16 100$MB 1 1 0 1 0 1 0

 echo reads
-run 16 1000$MB 1 0 0 1 0 0
-run 16 100$MB 1 1 0 1 0 0
+run 16 1000$MB 1 0 0 1 0 0 0
+run 16 100$MB 1 1 0 1 0 0 0

 echo par
-run 16 100$MB 1 0 1 1 0 1
-run 16 100$MB 1 1 1 1 0 1
-run 16 100$MB 1 0 1 1 1 1
-run 16 100$MB 1 1 1 1 1 1
+run 16 10$MB 1 0 1 1 0 1 0
+run 16 10$MB 1 1 1 1 0 1 0
+run 16 10$MB 1 0 1 1 1 1 0
+run 16 10$MB 1 1 1 1 1 1 0
+
+echo cross
+run 16 10$MB 1 0 1 1 0 0 1
+run 16 10$MB 1 1 1 1 0 0 1
+run 16 10$MB 1 0 1 1 0 1 1
+run 16 10$MB 1 1 1 1 0 1 1
+
From: <yan...@us...> - 2008-02-13 18:04:06

Revision: 409
          http://assorted.svn.sourceforge.net/assorted/?rev=409&view=rev
Author:   yangzhang
Date:     2008-02-13 10:03:22 -0800 (Wed, 13 Feb 2008)

Log Message:
-----------
fixed warmup messages

Modified Paths:
--------------
    numa-bench/trunk/src/malloc.cc

Modified: numa-bench/trunk/src/malloc.cc
===================================================================
--- numa-bench/trunk/src/malloc.cc      2008-02-13 18:03:09 UTC (rev 408)
+++ numa-bench/trunk/src/malloc.cc      2008-02-13 18:03:22 UTC (rev 409)
@@ -179,7 +179,7 @@
   }

   // Print the elapsed time and "result".
-  if (warmup) cout << "warmup: " << endl;
+  if (warmup) cout << "warmup: ";
   cout << cpu;
   t.print();

@@ -252,7 +252,7 @@
   } else {
     // Chew the memory area from each core in sequence.
     for (int i = 0; i < config.ncores; i++) {
-      chew(p, i, config, "");
+      chew(p, i, config, false);
     }
   }
From: <yan...@us...> - 2008-02-13 17:53:35

Revision: 407
          http://assorted.svn.sourceforge.net/assorted/?rev=407&view=rev
Author:   yangzhang
Date:     2008-02-13 09:53:30 -0800 (Wed, 13 Feb 2008)

Log Message:
-----------
added cross-comm

Modified Paths:
--------------
    numa-bench/trunk/src/malloc.cc

Modified: numa-bench/trunk/src/malloc.cc
===================================================================
--- numa-bench/trunk/src/malloc.cc      2008-02-13 17:53:16 UTC (rev 406)
+++ numa-bench/trunk/src/malloc.cc      2008-02-13 17:53:30 UTC (rev 407)
@@ -20,6 +20,7 @@
 // TODO: use real shuffling? or is rand ok?

 #include <cstdlib>
+#include <fstream>
 #include <iostream>

 #include <sched.h>
@@ -35,6 +36,8 @@
 using namespace commons;
 using namespace std;

+pthread_barrier_t cross_barrier;
+
 struct config
 {
   /**
@@ -79,25 +82,27 @@
    * Do writes, otherwise just do reads.
    */
   const bool write;
+
+  /**
+   * Test cross-communication (use partitions), otherwise use either the
+   * global/local buffer.
+   */
+  const bool cross;
 };

+void*** partitions;
+int global_sum;
+
 /**
- * \param pp The start of the buffer to chew.
- * \param cpu Which CPU to pin our thread to.
- * \param config The experiment configuration parameters.
+ * \param p The buffer to chew.
+ * \param config The experiment configuration.
+ * \param len Length of the buffer.
  */
-void*
-chew(void* pp, unsigned int cpu, const config & config, const char* label)
+void
+chew1(void* pp, config config, size_t len)
 {
-  int* p = (int*) (config.local ? malloc(config.size) : pp);
-  const size_t count = config.size / sizeof(int);
-  timer t(": ");
-
-  // Pin this thread to cpu `cpu`.
-  if (config.pin) {
-    pin_thread(cpu);
-  }
-
+  int* p = (int*) pp;
+  const size_t count = len / sizeof(int);
   int sum = 0;
   if (config.write) {
     // Write to the region.
@@ -139,11 +144,44 @@
       }
     }
   }
+  global_sum += sum;
+}

+/**
+ * \param pp The start of the buffer to chew.
+ * \param cpu Which CPU to pin our thread to.
+ * \param config The experiment configuration parameters.
+ * \param label Prefix for the elapsed time output.
+ */
+void*
+chew(void* pp, unsigned int cpu, const config & config, bool warmup)
+{
+  // Pin this thread to cpu `cpu`.
+  if (config.pin) {
+    pin_thread(cpu);
+  }
+
+  void* p = config.local ? malloc(config.size) : pp;
+  timer t(": ");
+
+  if (!warmup && config.cross) {
+    size_t len = config.size / config.ncores;
+    for (int i = 0; i < config.ncores; i++) {
+      partitions[cpu][i] = new char[len];
+    }
+    int barrier_result = pthread_barrier_wait(&cross_barrier);
+    check(barrier_result == PTHREAD_BARRIER_SERIAL_THREAD || barrier_result == 0);
+    for (int i = 0; i < config.ncores; i++) {
+      chew1(partitions[i][cpu], config, len);
+    }
+  } else {
+    chew1(p, config, config.size);
+  }
+
   // Print the elapsed time and "result".
-  cout << label << cpu;
+  if (warmup) cout << "warmup: " << endl;
+  cout << cpu;
   t.print();
-  cout << "result: " << sum;

   if (config.local) free(p);
@@ -156,7 +194,7 @@
   // So that our global shared malloc takes place on the CPU 0's node.
   pin_thread(0);

-  if (argc < 9) {
+  if (argc < 10) {
     cerr << argv[0]
          << " <ncores> <size> <nreps> <shuffle> <par> <pin> <local> <write>" << endl;
     return 1;
@@ -171,7 +209,8 @@
     atoi(argv[5]),
     atoi(argv[6]),
     atoi(argv[7]),
-    atoi(argv[8])
+    atoi(argv[8]),
+    atoi(argv[9])
   };

   cout << "config:"
@@ -182,24 +221,34 @@
        << " par " << config.par
        << " pin " << config.pin
        << " local " << config.local
-       << " write " << config.write << endl;
+       << " write " << config.write
+       << " cross " << config.cross << endl;

   checkmsg(RAND_MAX > config.size / sizeof(int),
            "PRNG range not large enough");
   void *p = malloc(config.size);
+  check(p != NULL);

+  if (config.cross) {
+    partitions = new void**[config.ncores];
+    for (unsigned int i = 0; i < config.ncores; i++)
+      partitions[i] = new void*[config.ncores];
+  }
+
   // Warmup.
-  chew(p, 0, config, "warmup: ");
+  chew(p, 0, config, true);

   if (config.par) {
     // Chew the memory area from each core in parallel (and also chew own).
     pthread_t ts[config.ncores];
+    check(0 == pthread_barrier_init(&cross_barrier, NULL, config.ncores));
     for (int i = 0; i < config.ncores; i++) {
-      ts[i] = spawn(bind(chew, p, i, ref(config), ""));
+      ts[i] = spawn(bind(chew, p, i, ref(config), false));
     }
     for (int i = 0; i < config.ncores; i++) {
       check(pthread_join(ts[i], NULL) == 0);
     }
+    check(0 == pthread_barrier_destroy(&cross_barrier));
   } else {
     // Chew the memory area from each core in sequence.
     for (int i = 0; i < config.ncores; i++) {
@@ -208,6 +257,8 @@
   }

   free(p);
+  ofstream trash("/dev/null");
+  trash << "result: " << global_sum << endl;

   return 0;
 }
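
The barrier calls added here follow the standard POSIX pattern: `pthread_barrier_wait` returns `PTHREAD_BARRIER_SERIAL_THREAD` in exactly one thread and 0 in the rest. Isolated as a minimal sketch (not the benchmark's code):

    #include <pthread.h>
    #include <cstdio>

    pthread_barrier_t barrier;

    static void* worker(void* arg) {
      long id = (long) arg;
      printf("thread %ld: phase 1 (e.g. allocate partitions)\n", id);
      // Every thread blocks here until all n threads have arrived.
      int r = pthread_barrier_wait(&barrier);
      if (r != 0 && r != PTHREAD_BARRIER_SERIAL_THREAD) return NULL;  // error
      printf("thread %ld: phase 2 (e.g. chew partitions)\n", id);
      return NULL;
    }

    int main() {
      const int n = 4;
      pthread_barrier_init(&barrier, NULL, n);
      pthread_t ts[n];
      for (long i = 0; i < n; i++) pthread_create(&ts[i], NULL, worker, (void*) i);
      for (int i = 0; i < n; i++) pthread_join(ts[i], NULL);
      pthread_barrier_destroy(&barrier);
      return 0;
    }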
From: <yan...@us...> - 2008-02-13 17:53:13

Revision: 406
          http://assorted.svn.sourceforge.net/assorted/?rev=406&view=rev
Author:   yangzhang
Date:     2008-02-13 09:53:16 -0800 (Wed, 13 Feb 2008)

Log Message:
-----------
added debug

Modified Paths:
--------------
    numa-bench/trunk/src/Makefile

Modified: numa-bench/trunk/src/Makefile
===================================================================
--- numa-bench/trunk/src/Makefile       2008-02-13 08:14:57 UTC (rev 405)
+++ numa-bench/trunk/src/Makefile       2008-02-13 17:53:16 UTC (rev 406)
@@ -10,6 +10,9 @@
 cache: cache.cc $(COMMONS)
        $(CXX)

+malloc-dbg: malloc.cc $(COMMONS)
+        $(CXX) -g3 -O0
+
 malloc: malloc.cc $(COMMONS)
        $(CXX)
From: <yan...@us...> - 2008-02-13 08:14:52

Revision: 405
          http://assorted.svn.sourceforge.net/assorted/?rev=405&view=rev
Author:   yangzhang
Date:     2008-02-13 00:14:57 -0800 (Wed, 13 Feb 2008)

Log Message:
-----------
new driver

Modified Paths:
--------------
    numa-bench/trunk/src/malloc.bash

Modified: numa-bench/trunk/src/malloc.bash
===================================================================
--- numa-bench/trunk/src/malloc.bash    2008-02-13 08:14:10 UTC (rev 404)
+++ numa-bench/trunk/src/malloc.bash    2008-02-13 08:14:57 UTC (rev 405)
@@ -3,17 +3,24 @@
 set -o errexit -o nounset

 function run {
-  ./malloc 16 $*
+  ./malloc "$@"
 }

-#echo '--- 100MB * 1, seq ---'
-#run 100000000 1 0
-#
-#echo '--- 1GB * 1, seq ---'
-#run 1000000000 1 0
-#
-#echo '--- 100MB * 10, seq ---'
-#run 100000000 10 0
+KB=000 MB=000000 GB=000000000

-echo '--- 100MB * 1, rand ---'
-run 100000000 1 1
+# ncores size nreps shuffle par pin local write
+echo writes
+run 16 100$MB 1 0 0 1 0 1
+run 16 1000$MB 1 0 0 1 0 1
+run 16 100$MB 10 0 0 1 0 1
+run 16 100$MB 1 1 0 1 0 1
+
+echo reads
+run 16 1000$MB 1 0 0 1 0 0
+run 16 100$MB 1 1 0 1 0 0
+
+echo par
+run 16 100$MB 1 0 1 1 0 1
+run 16 100$MB 1 1 1 1 0 1
+run 16 100$MB 1 0 1 1 1 1
+run 16 100$MB 1 1 1 1 1 1
From: <yan...@us...> - 2008-02-13 08:14:07

Revision: 404
          http://assorted.svn.sourceforge.net/assorted/?rev=404&view=rev
Author:   yangzhang
Date:     2008-02-13 00:14:10 -0800 (Wed, 13 Feb 2008)

Log Message:
-----------
added config logging; added result (sum) printing

Modified Paths:
--------------
    numa-bench/trunk/src/malloc.cc

Modified: numa-bench/trunk/src/malloc.cc
===================================================================
--- numa-bench/trunk/src/malloc.cc      2008-02-13 07:59:10 UTC (rev 403)
+++ numa-bench/trunk/src/malloc.cc      2008-02-13 08:14:10 UTC (rev 404)
@@ -98,6 +98,7 @@
     pin_thread(cpu);
   }

+  int sum = 0;
   if (config.write) {
     // Write to the region.
     if (config.shuffle) {
@@ -107,20 +108,19 @@
         // NOTE: Using r as the index assumes that rand generates large-enough
         // values.
         int r = rand();
-        p[r % count] += r;
+        sum += p[r % count] += r;
       }
     }
   } else {
     // Sequential scan through the memory region.
     for (unsigned int c = 0; c < config.nreps; c++) {
       for (size_t i = 0; i < count; i++) {
-        p[i] += rand();
+        sum += p[i] += rand();
       }
     }
   }
   } else {
     // Only read from the region.
-    int sum = 0;
     if (config.shuffle) {
       // Random access into the memory region.
       for (unsigned int c = 0; c < config.nreps; c++) {
@@ -138,12 +138,12 @@
       }
     }
   }
-    cout << sum << endl;
   }

-  // Print the elapsed time.
+  // Print the elapsed time and "result".
   cout << label << cpu;
   t.print();
+  cout << "result: " << sum;

   if (config.local) free(p);
@@ -174,6 +174,16 @@
     atoi(argv[8])
   };

+  cout << "config:"
+       << " ncores " << config.ncores
+       << " size " << config.size
+       << " nreps " << config.nreps
+       << " shuffle " << config.shuffle
+       << " par " << config.par
+       << " pin " << config.pin
+       << " local " << config.local
+       << " write " << config.write << endl;
+
   checkmsg(RAND_MAX > config.size / sizeof(int),
            "PRNG range not large enough");
   void *p = malloc(config.size);
From: <yan...@us...> - 2008-02-13 07:59:08

Revision: 403
          http://assorted.svn.sourceforge.net/assorted/?rev=403&view=rev
Author:   yangzhang
Date:     2008-02-12 23:59:10 -0800 (Tue, 12 Feb 2008)

Log Message:
-----------
tweak

Modified Paths:
--------------
    numa-bench/trunk/src/malloc.cc

Modified: numa-bench/trunk/src/malloc.cc
===================================================================
--- numa-bench/trunk/src/malloc.cc      2008-02-13 07:58:40 UTC (rev 402)
+++ numa-bench/trunk/src/malloc.cc      2008-02-13 07:59:10 UTC (rev 403)
@@ -21,7 +21,6 @@

 #include <cstdlib>
 #include <iostream>
-#include <iomanip>

 #include <sched.h>