#8 h_vmem enforcement misbehavior in single node smp
Status: open
Owner: nobody
Labels: None
Priority: 5
Updated: 2012-09-04
Created: 2012-09-04
Creator: Anonymous
Private: No
I set h_vmem limits for a simple SMP configuration with a single execution node.
The queued script launches 4 independent processes, each of which allocates 1 MB every 10 ms until killed by sge_shepherd.
Allocates memory at a fixed rate until killed.
Sorry for the incomplete submission.
I am using GE 6.2u5, installed from the Debian squeeze packages following
http://helms-deep.cable.nu/~rwh/blog/?p=159
I set up a very simple single node configuration, with 1 queue and with h_vmem limit enforced.
The attached testmem.c is compiled with gcc and run with the script
===== runme.sh ====================================
#!/bin/sh
# job name in the queue
#$ -N provamarc
#
# execution shell
#$ -S /bin/sh
#
# Make sure that the .e and .o file arrive in the
# working directory
#$ -cwd
#
#Merge the standard out and standard error to one file
#$ -j y
./a.out > ./out.0.txt &
./a.out > ./out.1.txt &
./a.out > ./out.2.txt &
./a.out > ./out.3.txt
===================================================
Hence we have 4 independent processes per submission run, and 4 output files in which to check at (roughly) which allocation level each process is killed.
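One way to read off the allocation level at kill time is to print the last line of each output file. This is only a sketch: it assumes a.out writes one progress line per allocation step (the format of that line is not shown in this report), and `report_last` is a hypothetical helper name.

```sh
#!/bin/sh
# Print the last line of each per-process output file.
# Assumption: a.out writes one progress line per allocation step, so the
# last line of each file approximates the allocation level at kill time.
report_last() {
    for f in "$@"; do
        # Skip files that do not exist (e.g. before the first run).
        [ -f "$f" ] || continue
        printf '%s: %s\n' "$f" "$(tail -n 1 "$f")"
    done
}

report_last ./out.0.txt ./out.1.txt ./out.2.txt ./out.3.txt
```

Because every run writes to the same four file names, the files would need to be copied aside between runs to compare the four cases.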
I submit the script four times, one run at a time (never concurrently), with the following commands:
1) qsub -pe smp 4 -l h_vmem=3.9G ./runme.sh
2) qsub -pe smp 4 -l h_vmem=4G ./runme.sh
3) qsub -pe smp 2 -l h_vmem=7.8G ./runme.sh
4) qsub -pe smp 2 -l h_vmem=8G ./runme.sh
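The four submissions above could be driven from a small wrapper so that they never overlap. This is a sketch, not the procedure actually used for this report: the `submit` helper and the `DRY_RUN` variable are hypothetical, and adding `-sync y` (which makes qsub wait until the job has finished) is an assumption about how to serialize the runs.

```sh
#!/bin/sh
# Submit the four test cases one at a time.
# DRY_RUN=1 (the default here) only prints the commands; set DRY_RUN=0 on a
# host where qsub is available to actually submit the jobs.
DRY_RUN="${DRY_RUN:-1}"

submit() {  # usage: submit SLOTS H_VMEM
    # -sync y makes qsub block until the job ends, so runs do not overlap.
    cmd="qsub -sync y -pe smp $1 -l h_vmem=$2 ./runme.sh"
    if [ "$DRY_RUN" = 1 ]; then
        echo "$cmd"
    else
        $cmd
    fi
}

submit 4 3.9G
submit 4 4G
submit 2 7.8G
submit 2 8G
```

With the default dry-run setting the script only echoes the four qsub command lines, which is a convenient way to double-check the slot and h_vmem combinations before submitting.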
sge_shepherd is expected to kill the processes as soon as the SUM of the virtual memory used by the processes reaches the [slots]*[h_vmem] limit.
This behavior is observed in runs 1) and 3), but not in 2) and 4), where the processes are killed only when each of them INDIVIDUALLY reaches the [slots]*[h_vmem] limit.
In cases 2) and 4), by the time the 4 processes are killed the machine has already allocated 4*16 GB of memory.
The behavior seems to switch consistently at a threshold of (roughly) 4 GB per process: enforcement works below it and fails above it.
Could you please check whether the behavior can be reproduced?
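The arithmetic behind the four runs can be sketched as follows. The script computes the job-wide limit [slots]*[h_vmem] and the size each process would have at kill time if the limit were applied to the sum. It assumes that the four a.out processes grow at the same rate, so each holds a quarter of the total; `expected` is a hypothetical helper name, and awk is used only for the floating-point arithmetic.

```sh
#!/bin/sh
# Expected job-wide limit and per-process size at kill time for the four runs.
# Assumption: the four a.out processes grow at the same rate, so each one
# holds a quarter of the job total when the summed limit is reached.
NPROC=4

expected() {  # usage: expected SLOTS H_VMEM_IN_G
    awk -v s="$1" -v h="$2" -v n="$NPROC" 'BEGIN {
        total = s * h
        printf "slots=%d h_vmem=%.1fG job_limit=%.1fG per_process_at_kill=%.1fG\n", s, h, total, total / n
    }'
}

expected 4 3.9
expected 4 4
expected 2 7.8
expected 2 8
```

Under these assumptions, runs 1) and 3) should end with each process at about 3.9G, while runs 2) and 4) should end with each process at 4.0G. The two runs reported as misbehaving are therefore the ones in which each process would have to reach 4G before the summed limit triggers. This matches the threshold described above but does not by itself identify the cause.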
script, non-inlined version
testmem.c was compiled with gcc version 4.4.5. The machine runs 64-bit Debian squeeze and has 96 GB of RAM (plenty) and 12 slots (cores).
Last edit: Anonymous 2014-06-10