Name | Modified | Size | Downloads / Week |
---|---|---|---|
condorRelease10.tar | 2009-04-08 | 829.4 kB | |
matrixMul | 2009-04-08 | 119.5 kB | |
deviceQuery | 2009-04-08 | 114.5 kB | |
gpuQuery.submit | 2009-04-08 | 675 Bytes | |
condor_config.local | 2009-04-08 | 1.1 kB | |
gpu.sh | 2009-04-08 | 2.9 kB | |
cudaQuery | 2009-04-08 | 80.9 kB | |
README | 2009-04-08 | 2.9 kB | |
Totals: 8 Items | | 1.2 MB | 0 |
This provides documentation on adding graphics support to a High Throughput Computing environment managed by Condor.

CONTENTS
=================================
/condor_config.local     (Sample condor_config.local file)
/README                  (This README)
/samples/                (Sample Condor submission files)
    /gpuQuery.submit
    /gpuQueryLogs/       (Contains log files for gpuQuery.submit)
/tests/                  (Graphics card tests provided by CUDA)
    /deviceQuery         (test obtaining information through CUDA)
                         (params: none)
    /matrixMul           (test performing matrix multiplication)
                         (params: size of matrix x, size of matrix y)
/script/                 (Contains the scripts and executable needed to
    /gpu.sh              (script to get information about Graphics Cards)
    /cudaQuery           (executable to obtain information through CUDA)

INSTALLATION
=================================

_______LINUX_________

Adding graphics card discovery to Condor:

1. Test the script, which is located in script/gpu.sh.
   To obtain information about graphics cards, the condor user must be
   able to run the command lspci. For detailed information about NVIDIA
   CUDA capable graphics cards, the condor user must also be granted
   write access to the graphics card.

   Sample script output:

       HasGpu = True
       NGpu = 2
       Gpu0 = "Quadro FX 3700"
       Gpu0CudaCapable = True
       Gpu0Mem = 536150016
       Gpu0Procs = 14
       Gpu0Cores = 112
       Gpu1 = "Quadro FX 3700"
       Gpu1CudaCapable = True
       Gpu1Mem = 536608768
       Gpu1Procs = 14
       Gpu1Cores = 112
       HasCuda = True
       CudaRelease = V1.1
       CudaVersion = V0.2.1221
       -

2. "condor_config.local" contains the cron job settings to add to the
   machine's local Condor configuration file. Copy the cron job code
   into that file, which is located by default at:

       /var/lib/condor/condor_config.local

   Cron job code:

       STARTD_CRON_JOBLIST = $(STARTD_CRON_JOBLIST), UPDATEGPUINFO
       STARTD_CRON_UPDATEGPUINFO_EXECUTABLE = /DIRECTORY/TO/SCRIPT/gpu.sh
       STARTD_CRON_UPDATEGPUINFO_PERIOD = 1m
       STARTD_CRON_UPDATEGPUINFO_MODE = Periodic
       STARTD_CRON_UPDATEGPUINFO_KILL = True

3. Restart the Condor daemons on the local machine with the command:
   `/sbin/service condor restart`
   Restarting the daemons adds the cron job, after which information
   about the GPUs should be sent in ClassAd form.

4. To check that the information has reached the condor_collector, run
   the command:
   `condor_status -constraint HasGpu`
   This command lists the machines that satisfy the constraint HasGpu.
   Note: It may take a few minutes for the machine's ClassAd to be sent.

5. To view the ClassAds, type the command:
   `condor_status (MACHINE ADDRESS) -long`

Preparing Condor to run CUDA jobs:

TESTING
=================================

EXAMPLES
=================================
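
The check in installation step 1 (testing the script before handing it to Condor) can be partly automated. What follows is a minimal sketch in Python, not part of this package: the function names parse_gpu_attrs and check_gpu_attrs are hypothetical, and the consistency rules are assumptions inferred only from the sample output above (HasGpu agrees with NGpu, every listed card has a name, and every CUDA capable card reports Mem, Procs and Cores). The trailing "-" line is assumed to mark the end of the script's output.

```python
# Sketch only: sanity-check "Name = Value" lines such as those printed by gpu.sh.
# SAMPLE is copied from the sample script output in this README.
SAMPLE = """\
HasGpu = True
NGpu = 2
Gpu0 = "Quadro FX 3700"
Gpu0CudaCapable = True
Gpu0Mem = 536150016
Gpu0Procs = 14
Gpu0Cores = 112
Gpu1 = "Quadro FX 3700"
Gpu1CudaCapable = True
Gpu1Mem = 536608768
Gpu1Procs = 14
Gpu1Cores = 112
HasCuda = True
CudaRelease = V1.1
CudaVersion = V0.2.1221
-
"""


def parse_gpu_attrs(text):
    """Turn 'Name = Value' lines into a dict with Python-typed values."""
    attrs = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line.startswith("-"):
            break  # assumption: a dash line ends the script's output
        name, sep, value = line.partition("=")
        if not sep:
            raise ValueError(f"not an attribute line: {raw!r}")
        name, value = name.strip(), value.strip()
        if value.lower() in ("true", "false"):
            attrs[name] = value.lower() == "true"
        elif len(value) >= 2 and value[0] == '"' and value[-1] == '"':
            attrs[name] = value[1:-1]
        else:
            try:
                attrs[name] = int(value)
            except ValueError:
                attrs[name] = value  # e.g. the unquoted V1.1
    return attrs


def check_gpu_attrs(attrs):
    """Return a list of problems; an empty list means the rules all hold."""
    n = attrs.get("NGpu")
    if isinstance(n, bool) or not isinstance(n, int):
        return ["NGpu is missing or not an integer"]
    problems = []
    if attrs.get("HasGpu") is not (n > 0):
        problems.append("HasGpu does not agree with NGpu")
    for i in range(n):
        if not isinstance(attrs.get(f"Gpu{i}"), str):
            problems.append(f"Gpu{i} name is missing")
        if attrs.get(f"Gpu{i}CudaCapable") is True:
            for suffix in ("Mem", "Procs", "Cores"):
                if not isinstance(attrs.get(f"Gpu{i}{suffix}"), int):
                    problems.append(f"Gpu{i}{suffix} is missing or not an integer")
    return problems


if __name__ == "__main__":
    for problem in check_gpu_attrs(parse_gpu_attrs(SAMPLE)):
        print(problem)
```

To check a real machine, the SAMPLE text would be replaced by the captured output of script/gpu.sh. An empty problem list only means the output is internally consistent; it does not confirm that Condor has published the attributes, which is what steps 4 and 5 verify.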