Date: 2010-04-27 00:21:10 +0000 (Tue, 27 Apr 2010)
update the readme_admin, adding the description about cuda_affinity_test
--- trunk/readme_admin.txt 2010-04-26 19:35:33 UTC (rev 55)
+++ trunk/readme_admin.txt 2010-04-27 00:21:10 UTC (rev 56)
@@ -73,6 +73,11 @@
cuda_memscrubber: This utility scrubs the contents of designated GPUs.
It is usually used by the post-allocation process to remove
data left in the GPUs by the previous job.
+cuda_affinity_test: This utility automatically detects the GPU NUMA
+ affinity with CPU cores by running a bandwidth test for
+ every GPU/CPU socket (node) combination and picking the
+ best socket (node) for each GPU.
For details on how to use each utility program, run it with --help
@@ -261,26 +266,21 @@
affinity 3 1,3
The second column above would be the physical GPU devices, the third column
-would be the CPU cores that show good performance with that GPU. Work is
-underway to develop an automated utility to create the CPU-GPU map as above.
-Until that exists, there are a couple of ways:
+would be the CPU cores that show good performance with that GPU.
+The utility cuda_affinity_test can be used to generate the GPU/CPU affinity.
+It does so by running exhaustive bandwidth tests over all CPU socket (node) and GPU
+combinations. You can run the utility to print the affinity to stdout
-Memory Bandwidth Performance tests:
-In the CUDA SDK, there is a utility available for testing memory bandwidth
-between the host and gpu device. You can use the "taskset" command in Linux
-to bind that bandwidth test to a given core. Ideally, you can identify which
-cores have optimal performance with which GPUs by targeting the full matrix
-of cores and gpus with MxN runs of the bandwidth test.
-On a system with 2 GPUs and 4 CPU cores, the full matrix of tests would involve
-taskset -c [0-3] ./bandwidthTest --device=[0-1] --htod --memory=pinned --noprompt
--or an additional 8 commands for the other direction-
-taskset -c [0-3] ./bandwidthTest --device=[0-1] --dtoh --memory=pinned --noprompt
-Nested for loops would obviously make this process easier.
-*CAUTION* Be sure to run these commands either without the wrapper pre-loaded,
-as root, or with CUDA_WRAPPER_PASSTHRU set!! Otherwise, you will just be
-targeting the virtualized device ID, not knowing the true physical device
+Or you can write the affinity into a config file (which can be used by
+$./cuda_affinity_test -o cuda_wrapper_config.output
+If the output file does not exist, it will be created. Otherwise the affinity
+section will be overwritten while the other parts of the file will not be
+changed. For more detail, run
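The manual procedure described in the removed readme text (one bandwidthTest run per core/GPU pair, in both directions) can be sketched with nested loops. This is only an illustrative sketch, not part of the commit: the function name bandwidth_matrix and the RUN switch are hypothetical, the value 1 for CUDA_WRAPPER_PASSTHRU is an assumption (the readme only says the variable must be set), and ./bandwidthTest is assumed to be the CUDA SDK binary in the current directory. By default the function only prints the commands so they can be reviewed first.

```shell
#!/bin/sh
# Hypothetical helper: bandwidth_matrix NUM_GPUS NUM_CORES
# Prints one pinned-memory bandwidth test command per core/GPU pair for
# each transfer direction. With RUN=1 the commands are executed instead.
bandwidth_matrix() {
    num_gpus=$1
    num_cores=$2
    for direction in htod dtoh; do
        gpu=0
        while [ "$gpu" -lt "$num_gpus" ]; do
            core=0
            while [ "$core" -lt "$num_cores" ]; do
                cmd="taskset -c $core ./bandwidthTest --device=$gpu --$direction --memory=pinned --noprompt"
                if [ "${RUN:-0}" = "1" ]; then
                    # Assumption: any non-empty value enables passthrough, so
                    # the test targets the physical device ID rather than the
                    # virtualized one.
                    CUDA_WRAPPER_PASSTHRU=1 $cmd
                else
                    echo "$cmd"
                fi
                core=$((core + 1))
            done
            gpu=$((gpu + 1))
        done
    done
}
```

For the 2-GPU, 4-core system mentioned in the readme, "bandwidth_matrix 2 4" lists 16 commands (8 per direction); "RUN=1 bandwidth_matrix 2 4" would run them. The cuda_affinity_test utility added in this revision is intended to replace this manual sweep.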