[Gridscheduler-users] no free queue for job 27378
From: Matt M. <ma...@to...> - 2011-04-08 08:22:38
Hi,

I'm not sure whether this is the best place to find help for Grid Engine, thanks to Oracle.

Lately we have been getting this error:

"main|pace11|E|no free queue for job 27378 of user plee@pace (localhost = pace11)"

We are not 100% sure why we are getting this error, but it is intermittent. We are running openmpi (typically 48 processes on 24 machines) through Grid Engine, and sometimes Grid Engine reports the error above in the spool directory, along with the error below:

error: executing task of job 27378 failed: execution daemon on host "pace10" didn't accept task
[pace:10619] ERROR: A daemon on node pace10 failed to start as expected.
[pace:10619] ERROR: There may be more information available from
[pace:10619] ERROR: the 'qstat -t' command on the Grid Engine tasks.
[pace:10619] ERROR: If the problem persists, please restart the
[pace:10619] ERROR: Grid Engine PE job
[pace:10619] ERROR: The daemon exited unexpectedly with status 1.

Any ideas as to why this is happening, or how to find out why it is happening, would be most appreciated.

Thanks for your time,
Best regards
Matt