Patch to SGE 2011.p1, Son of Grid Engine 8.18, to enable multiple GPU scheduling.

It assigns GPUs to jobs, or to individual processes for MPI jobs (the assignment is written to the "environment" file on the worker nodes).

You need to recompile the source. You also need to define a consumable named "ngpus" and assign a value for it on each node.

To submit a GPU job, run:

>qsub -l ngpus=1 ...

This also works for parallel jobs.

>qsub -pe openmpi 4 -l ngpus=1 ...

Here, "-l ngpus=1" requests 1 GPU per process.
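Because the request is per process, the total number of GPUs a parallel job asks for is the slot count times the ngpus value. The following is a minimal sketch of that arithmetic, not part of the patch: the function name total_gpus is hypothetical, and it assumes a fixed slot count (no slot ranges such as 4-8).

```python
import shlex

def total_gpus(qsub_args: str) -> int:
    """Estimate the total GPUs requested by a qsub option string.

    Hypothetical helper for illustration only; assumes "-pe <name> <slots>"
    with a fixed integer slot count and "-l ngpus=<n>" counted per process.
    """
    tokens = shlex.split(qsub_args)
    slots = 1        # a serial job runs a single process
    per_process = 0  # no "-l ngpus=..." means a non-GPU job
    i = 0
    while i < len(tokens):
        if tokens[i] == "-pe" and i + 2 < len(tokens):
            slots = int(tokens[i + 2])  # "-pe <pe_name> <slots>"
            i += 3
        elif tokens[i] == "-l" and i + 1 < len(tokens):
            # A resource list may hold several comma-separated requests.
            for item in tokens[i + 1].split(","):
                name, _, value = item.partition("=")
                if name == "ngpus":
                    per_process = int(value)
            i += 2
        else:
            i += 1
    return slots * per_process

print(total_gpus("-l ngpus=1"))                # prints 1
print(total_gpus("-pe openmpi 4 -l ngpus=1"))  # prints 4
```

Under this reading, the parallel example above needs 4 free GPUs across the nodes that host its 4 processes.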

It also supports scheduling multiple GPUs on a single node. For example, suppose node001 has 4 GPUs installed, JobA uses GPU0, and JobB uses GPU2. If JobC then requests 2 GPUs, the scheduler can dispatch GPU1 and GPU3 to JobC and set the following in the job's environment on node001:

CUDA_VISIBLE_DEVICES=1,3
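The node001 example can be sketched as a small allocation routine. This is an illustration only and not the patch's actual code: the function names are hypothetical, and picking the lowest-numbered free GPUs is an assumption that happens to match the example.

```python
def allocate_gpus(total, in_use, requested):
    """Pick `requested` free GPU indices on a node that has `total` GPUs.

    Assumed policy: take the lowest-numbered free devices first.
    Returns None when the node does not have enough free GPUs.
    """
    free = [i for i in range(total) if i not in in_use]
    if requested > len(free):
        return None
    return free[:requested]

def visible_devices(indices):
    """Format GPU indices the way CUDA_VISIBLE_DEVICES expects them."""
    return ",".join(str(i) for i in indices)

# node001: 4 GPUs, JobA holds GPU0, JobB holds GPU2, JobC asks for 2.
chosen = allocate_gpus(total=4, in_use={0, 2}, requested=2)
print(visible_devices(chosen))  # prints 1,3
```

The resulting string is the value that would be exported as CUDA_VISIBLE_DEVICES.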

For non-GPU jobs, CUDA_VISIBLE_DEVICES is set to an empty value.
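Inside a job, a script can read CUDA_VISIBLE_DEVICES to see which GPUs it was given. A minimal sketch, assuming the variable holds comma-separated integer indices as in the example above; the helper name assigned_gpus is hypothetical:

```python
import os

def assigned_gpus(environ=os.environ):
    """Return the GPU indices listed in CUDA_VISIBLE_DEVICES.

    An empty or missing variable yields an empty list, which is what
    a non-GPU job would see.
    """
    raw = environ.get("CUDA_VISIBLE_DEVICES", "")
    return [int(tok) for tok in raw.split(",") if tok.strip()]

print(assigned_gpus({"CUDA_VISIBLE_DEVICES": "1,3"}))  # prints [1, 3]
print(assigned_gpus({"CUDA_VISIBLE_DEVICES": ""}))     # prints []
```

Passing a plain dict instead of os.environ keeps the helper easy to check outside the cluster.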

No load sensor is needed.

Developed on CentOS 6.2, Kernel 2.6.32-220.2.1.el6.x86_64

This patch has only been partially tested, in a small simulated environment. NO guarantee!

Features

  • Built-in support for multi-GPU scheduling
  • SGE
  • Grid Engine
  • No load sensor needed

Additional Project Details

Registered

2015-10-09