#8 h_vmem enforcement misbehavior in single node smp

Status: open

Owner: nobody

Labels: None

Priority: 5

Updated: 2012-09-04

Created: 2012-09-04

Creator: Anonymous

Private: No

I set h_vmem limits for a simple SMP configuration with a single execution node.
The queued script launches 4 independent processes, each of which allocates 1 MB every 10 ms until killed by sge_shepherd.

Discussion

Anonymous - 2012-09-04

allocates memory at fixed rate, until killed

Attachment: testmem.c

Anonymous - 2012-09-04

Sorry for the incomplete submission.

I am using GE 6.2u5, installed from the Debian squeeze packages following
http://helms-deep.cable.nu/~rwh/blog/?p=159

I set up a very simple single-node configuration, with 1 queue and with the h_vmem limit enforced.
The attached testmem.c is compiled with gcc and run with the following script:
===== runme.sh ====================================
#!/bin/sh
# job name in the queue
#$ -N provamarc
#
# execution shell
#$ -S /bin/sh
#
# Make sure that the .e and .o file arrive in the
# working directory
#$ -cwd
#
#Merge the standard out and standard error to one file
#$ -j y

./a.out > ./out.0.txt &
./a.out > ./out.1.txt &
./a.out > ./out.2.txt &
./a.out > ./out.3.txt
===================================================

Hence each submission runs 4 independent processes and produces 4 output files, which show (roughly) the allocation level at which each process was killed.
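As a rough aid for reading those files, a minimal sketch along the following lines could be used. It assumes that testmem.c writes one progress line per allocation step (the program's output format is not shown in this report), and `last_levels` is a hypothetical helper name, not part of the attached files.

```sh
#!/bin/sh
# Print the last line written to each of the four output files.
# Assumption: testmem.c prints one progress line per allocation step,
# so the last line approximates the allocation level at kill time.
last_levels() {
    for i in 0 1 2 3; do
        f="./out.$i.txt"
        if [ -r "$f" ]; then
            printf '%s: %s\n' "$f" "$(tail -n 1 "$f")"
        else
            printf '%s: missing\n' "$f"
        fi
    done
}

last_levels
```

Since stdout is redirected to a file, the C library normally buffers it in blocks, so the last few lines may be lost when the process is killed unless the program flushes after each write.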

I submit the script (one job at a time, not concurrently) with the following commands:
1) qsub -pe smp 4 -l h_vmem=3.9G ./runme.sh
2) qsub -pe smp 4 -l h_vmem=4G ./runme.sh
3) qsub -pe smp 2 -l h_vmem=7.8G ./runme.sh
4) qsub -pe smp 2 -l h_vmem=8G ./runme.sh

sge_shepherd is expected to kill the processes as soon as the SUM of the virtual memory used by the processes reaches the [slots]*[h_vmem] limit.

This is what happens in runs 1) and 3), but not in runs 2) and 4), where the processes are killed only when each of them, TAKEN INDIVIDUALLY, reaches the [slots]*[h_vmem] limit.

In cases 2) and 4), by the time the 4 processes are killed, the machine has already allocated 4*16 GB of memory.

The two behaviors appear consistent below and above a threshold of roughly 4 GB per process.

Please check whether the behavior can be reproduced.
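For reference, the expected job-wide limit for each of the four submissions can be worked out with a small sketch like this one. It only restates the [slots]*[h_vmem] arithmetic from this report, in units of G; `expected_limit` is a hypothetical helper name and nothing here queries Grid Engine itself.

```sh
#!/bin/sh
# Expected job-wide limit = slots * h_vmem, in the same unit as h_vmem (G).
# $1 = slots requested with -pe smp, $2 = h_vmem value in G
expected_limit() {
    awk -v slots="$1" -v vmem="$2" 'BEGIN { printf "%.1f\n", slots * vmem }'
}

# The four submissions from this report: "slots h_vmem"
for run in "4 3.9" "4 4" "2 7.8" "2 8"; do
    set -- $run
    echo "slots=$1 h_vmem=${2}G -> expected job limit $(expected_limit "$1" "$2")G"
done
```

With 4 processes per job, the two well-behaved runs correspond to an expected limit of 15.6G for the whole job, and the two misbehaving runs to 16G.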

Anonymous - 2012-09-04

script, non-inlined version

Attachment: runme.sh

Anonymous - 2012-09-04

testmem.c was compiled with gcc version 4.4.5. The machine runs 64-bit Debian squeeze and has 96 GB of RAM (plenty) and 12 slots (cores).

Last edit: Anonymous 2014-06-10
