| Name | Modified | Size | Downloads / Week |
|---|---|---|---|
| condorRelease10.tar | 2009-04-08 | 829.4 kB | |
| matrixMul | 2009-04-08 | 119.5 kB | |
| deviceQuery | 2009-04-08 | 114.5 kB | |
| gpuQuery.submit | 2009-04-08 | 675 Bytes | |
| condor_config.local | 2009-04-08 | 1.1 kB | |
| gpu.sh | 2009-04-08 | 2.9 kB | |
| cudaQuery | 2009-04-08 | 80.9 kB | |
| README | 2009-04-08 | 2.9 kB | |
This README documents how to add graphics card (GPU) support to a
High Throughput Computing environment managed by Condor.
CONTENTS
=================================
/condor_config.local    (Sample condor_config.local file)
/README                 (This README)
/samples/               (Sample Condor submission files)
    /gpuQuery.submit
    /gpuQueryLogs/      (Contains log files for gpuQuery.submit)
/tests/                 (Graphics card tests provided by CUDA)
    /deviceQuery        (test obtaining information through CUDA)
                        (params: none)
    /matrixMul          (test performing matrix multiplication)
                        (params: size of matrix x, size of matrix y)
/script/                (Contains the scripts and executable needed to
    /gpu.sh             (script to get information about Graphics Cards)
    /cudaQuery          (executable to obtain information through CUDA)
INSTALLATION
=================================
_______LINUX_________
Adding graphics card discovery to Condor:
1. Test the script, which is located at script/gpu.sh. To obtain basic
information about graphics cards, the condor user must be able to run the
command lspci. To obtain detailed information about NVIDIA CUDA-capable
graphics cards, the condor user must also have write access to the graphics
card.
Sample script output:
HasGpu = True
NGpu = 2
Gpu0 = "Quadro FX 3700"
Gpu0CudaCapable = True
Gpu0Mem = 536150016
Gpu0Procs = 14
Gpu0Cores = 112
Gpu1 = "Quadro FX 3700"
Gpu1CudaCapable = True
Gpu1Mem = 536608768
Gpu1Procs = 14
Gpu1Cores = 112
HasCuda = True
CudaRelease = V1.1
CudaVersion = V0.2.1221
-
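The sample output above is a list of "Name = Value" attribute lines. As a minimal sketch of how such output could be checked before wiring the script into Condor, the following Python snippet parses the lines into a dictionary. The function name parse_gpu_attributes is hypothetical, and treating a lone "-" line as an end-of-record marker is an assumption based on the sample output, not something this README states.

```python
def parse_gpu_attributes(text):
    """Parse 'Name = Value' lines into a dict.

    Assumption: a line consisting only of '-' ends the record.
    """
    attrs = {}
    for raw in text.splitlines():
        line = raw.strip()
        if line == "-":
            break
        if not line or "=" not in line:
            continue
        name, value = (part.strip() for part in line.split("=", 1))
        if value in ("True", "False"):
            attrs[name] = (value == "True")
        elif len(value) >= 2 and value.startswith('"') and value.endswith('"'):
            attrs[name] = value[1:-1]          # quoted string
        else:
            try:
                attrs[name] = int(value)       # e.g. memory in bytes
            except ValueError:
                attrs[name] = value            # e.g. V1.1 stays a string
    return attrs


# A shortened copy of the sample output shown above.
SAMPLE = """\
HasGpu = True
NGpu = 2
Gpu0 = "Quadro FX 3700"
Gpu0Mem = 536150016
Gpu1 = "Quadro FX 3700"
Gpu1Mem = 536608768
CudaRelease = V1.1
-
"""

attrs = parse_gpu_attributes(SAMPLE)
# Sum the memory of every GPU reported by the script.
total_mem = sum(attrs[f"Gpu{i}Mem"] for i in range(attrs["NGpu"]))
print(attrs["NGpu"], total_mem)
```

A check like this can catch a malformed attribute line before the script is scheduled as a cron job.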
2. "condor_config.local" contains the configuration entries that register
gpu.sh as a cron job of the Condor startd. Copy these entries into the
machine's local Condor configuration file, which is located by default at:
/var/lib/condor/condor_config.local
Cronjob code (replace /DIRECTORY/TO/SCRIPT with the directory containing gpu.sh):
STARTD_CRON_JOBLIST = $(STARTD_CRON_JOBLIST), UPDATEGPUINFO
STARTD_CRON_UPDATEGPUINFO_EXECUTABLE = /DIRECTORY/TO/SCRIPT/gpu.sh
STARTD_CRON_UPDATEGPUINFO_PERIOD = 1m
STARTD_CRON_UPDATEGPUINFO_MODE = Periodic
STARTD_CRON_UPDATEGPUINFO_KILL = True
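When the same entries have to be written on several machines, generating them can avoid copy-and-paste mistakes in the script path. The following is only a sketch: the helper render_gpu_cron_config is hypothetical, and it emits exactly the five entries listed above, with the script path and period as parameters.

```python
import os


def render_gpu_cron_config(script_path, period="1m", job="UPDATEGPUINFO"):
    """Return the startd cron entries for the GPU discovery script as text."""
    if not os.path.isabs(script_path):
        # The entries above use an absolute path to gpu.sh.
        raise ValueError("script_path must be an absolute path")
    prefix = f"STARTD_CRON_{job}"
    lines = [
        f"STARTD_CRON_JOBLIST = $(STARTD_CRON_JOBLIST), {job}",
        f"{prefix}_EXECUTABLE = {script_path}",
        f"{prefix}_PERIOD = {period}",
        f"{prefix}_MODE = Periodic",
        f"{prefix}_KILL = True",
    ]
    return "\n".join(lines) + "\n"


# /DIRECTORY/TO/SCRIPT is the placeholder used in this README.
config_text = render_gpu_cron_config("/DIRECTORY/TO/SCRIPT/gpu.sh")
print(config_text, end="")
```

The printed text can be appended to the local configuration file by hand, which leaves the rest of the file untouched.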
3. Restart the Condor daemons on the local machine with the command:
`/sbin/service condor restart`
Restarting the daemons registers the cron job, after which information about
the GPUs is published in the machine's ClassAd.
4. To check that the information has reached the condor_collector, run the command:
`condor_status -constraint HasGpu`
This command lists only the machines whose ClassAd satisfies the constraint HasGpu.
Note: It may take a few minutes for the machine's ClassAd to be sent.
5. To view the full ClassAd of a machine, run the command:
`condor_status (MACHINE ADDRESS) -long`
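The two condor_status commands from steps 4 and 5 can also be assembled in a script, for example to check several machines in turn. This is a minimal sketch: the function condor_status_args and the host name node1.example.org are hypothetical, and only the -constraint and -long options shown above are used. The snippet builds and prints the command lines; it does not run them.

```python
import shlex


def condor_status_args(machine=None, constraint=None, long_format=False):
    """Build the argument list for a condor_status call."""
    args = ["condor_status"]
    if machine:
        args.append(machine)
    if constraint:
        args += ["-constraint", constraint]
    if long_format:
        args.append("-long")
    return args


# Step 4: list the machines that advertise HasGpu.
list_cmd = condor_status_args(constraint="HasGpu")
# Step 5: full ClassAd of one machine (hypothetical host name).
long_cmd = condor_status_args(machine="node1.example.org", long_format=True)

print(shlex.join(list_cmd))
print(shlex.join(long_cmd))
```

On a machine with Condor installed, either list could be passed to subprocess.run to execute the command and capture its output.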
Preparing Condor to run CUDA jobs:
TESTING
=================================
EXAMPLES
=================================