Name | Modified | Size | Downloads / Week |
---|---|---|---|
README | 2024-05-06 | 2.3 kB | |
UPDATE | 2024-04-10 | 393 Bytes | |
patched_files_ge2011.11.p1.0.2.tar.gz | 2024-04-10 | 98.2 kB | |
patched_files_ge2011.11.p1.0.1.tar.gz | 2018-10-15 | 97.6 kB | |
whole_patches_gpu_ge_update.tar.gz | 2018-04-25 | 125.3 kB | |
Totals: 5 Items | | 323.8 kB | 0 |
04/10/2024
For patched Son of Grid Engine 8.1.9, check github /prod-feng/songe

Improved support for fractional requests such as -l ngpus=0.5. This is useful for multithreaded GPU jobs that need multiple CPU cores but only one or a few GPUs:

    -pe openmp 10    # requests 10 CPU cores
    -l ngpus=0.2     # 10 x 0.2 = 2 GPUs for this job, on the same node

Added multithreading (MT) protection for setups with multiple worker threads. Alternatively, set $SGE_ROOT/default/common/bootstrap to:

    listener_threads 1
    worker_threads 1

Check github /prod-feng/sge-gpu/tree/master

NO Guarantee!
=====================
10/13/2018
Added patched_files_ge2011.11.p1.0.1.tar.gz. Fixes a bug so that GPU array jobs are supported properly. Only for GE2011.p1 for now.

NO Guarantee!
======================
04/24/2018
Bugfix: whole_patches_gpu_ge_update.tar.gz

NO Guarantee!
======================
The file whole_patches_gpu_ge.tar.gz contains the 2 patched versions, for GE2011 and SonGE 8.18. The patch for Son of Grid Engine is available now.

This is a patch to SGE 2011.p1 (Grid Engine) that enables scheduling of multiple GPUs. It schedules GPUs to jobs, or to the processes of MPI jobs (through the file "environment" on the work nodes).

You need to recompile the source. You also need to set up a consumable named "ngpus" (the name is hard coded in the patched files) and assign a value for it to each node.

To submit a GPU job, run:

    >qsub -l ngpus=1 ...

This also works for parallel jobs:

    >qsub -pe openmpi 4 -l ngpus=1 ...

Here, "-l ngpus=1" requests 1 GPU for each process.

Scheduling multiple GPUs on one node is supported as well. For example, suppose node001 has 4 GPUs installed, JobA uses GPU0, and JobB uses GPU2. If JobC then requests 2 GPUs, the patched SGE can dispatch GPU1 and GPU3 to JobC and set the environment for the job on node001:

    CUDA_VISIBLE_DEVICES=1,3

For non-GPU jobs, CUDA_VISIBLE_DEVICES is set to empty.

With this patch, you no longer need any wrapper tools or load sensor scripts.

Download the tar file and expand it to get the patched source files. The tar file contains a script named "apply_patch.sh". Run it to copy the patched files to the corresponding folders, replacing the original ones. Then recompile the whole package.

Developed on CentOS 6.2, Kernel 2.6.32-220.2.1.el6.x86_64, GCC 4.4.6.

This patch has only been partially tested on a small simulated environment.

NO Guarantee!
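
The GPU selection described in the node001 example, together with the "10 x 0.2 = 2" arithmetic for fractional requests, can be illustrated with a small Python sketch. This is not the patch's code (the patch modifies the Grid Engine C sources). The function names are hypothetical, and two behaviours are assumptions made only for illustration: free GPUs are handed out lowest index first, and a per-slot request is multiplied by the slot count and rounded to the nearest whole GPU.

```python
def gpus_for_job(slots, ngpus_per_slot):
    """Total GPUs implied by a per-slot request, e.g. 10 slots x 0.2 = 2.

    Assumption: the product is rounded to the nearest integer. The README
    does not say how the patch treats products that are not whole numbers.
    """
    return round(slots * ngpus_per_slot)


def pick_free_gpus(total_gpus, in_use, wanted):
    """Return `wanted` free GPU ids on one node, or None if too few are free.

    Assumption: free GPUs are taken in ascending order of their index.
    """
    free = [g for g in range(total_gpus) if g not in in_use]
    if len(free) < wanted:
        return None
    return free[:wanted]


def cuda_visible_devices(gpu_ids):
    """Format GPU ids as a CUDA_VISIBLE_DEVICES value (empty for non-GPU jobs)."""
    return ",".join(str(g) for g in gpu_ids)


if __name__ == "__main__":
    # node001 has 4 GPUs; JobA holds GPU0 and JobB holds GPU2.
    # JobC is submitted with "-pe openmp 10 -l ngpus=0.2", i.e. 2 GPUs.
    wanted = gpus_for_job(10, 0.2)
    ids = pick_free_gpus(4, {0, 2}, wanted)
    print("CUDA_VISIBLE_DEVICES=" + cuda_visible_devices(ids))
    # prints: CUDA_VISIBLE_DEVICES=1,3
```

A request that cannot be met on the node (for example 2 GPUs when only GPU3 is free) returns None in this sketch; in a real scheduler the job would simply not be dispatched to that node.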