[Osgmm-discuss] Condor Negotiator Crashing
From: Peter D. <do...@cr...> - 2009-06-22 19:02:20
I don't know what's going on here, but my jobs submitted to the MatchMaker aren't being matched, and I found out the condor negotiator keeps crashing. If I shut down osgmm, the negotiator keeps running, but then if I start up osgmm, the negotiator crashes when it starts to match one of my jobs.

Here are some of the errors I'm getting. I'm not sure where to start with this.

Thanks
--Peter

NegotiatorLog:

6/22 14:41:25 ******************************************************
6/22 14:41:25 ** condor_negotiator (CONDOR_NEGOTIATOR) STARTING UP
6/22 14:41:25 ** /opt/osg-shared/se/app/site/condor-7.2.1/sbin/condor_negotiator
6/22 14:41:25 ** SubsystemInfo: name=NEGOTIATOR type=NEGOTIATOR(4) class=DAEMON(1)
6/22 14:41:25 ** Configuration: subsystem:NEGOTIATOR local:<NONE> class:DAEMON
6/22 14:41:25 ** $CondorVersion: 7.2.1 Feb 18 2009 BuildID: 133382 $
6/22 14:41:25 ** $CondorPlatform: X86_64-LINUX_RHEL5 $
6/22 14:41:25 ** PID = 4322
6/22 14:41:25 ** Log last touched 6/22 14:36:34
6/22 14:41:25 ******************************************************
6/22 14:41:25 Using config source: /opt/osg-shared/se/app/site/condor/etc/condor_config
6/22 14:41:25 Using local config sources:
6/22 14:41:25 /opt/osg-local/condor/condor_config.local
6/22 14:41:25 DaemonCore: Command Socket at <10.0.10.39:51423>
6/22 14:41:25 About to rotate ClassAd log /opt/osg-local/condor/spool/Accountantnew.log
6/22 14:41:25 NEGOTIATOR_SOCKET_CACHE_SIZE = 16
6/22 14:41:25 PREEMPTION_REQUIREMENTS = ( (CurrentTime - EnteredCurrentState) > (1 * (60 * 60)) && RemoteUserPrio > SubmittorPrio * 1.2 ) || (MY.NiceUser == True)
6/22 14:41:25 ACCOUNTANT_HOST = None (local)
6/22 14:41:25 NEGOTIATOR_INTERVAL = 25 sec
6/22 14:41:25 NEGOTIATOR_TIMEOUT = 30 sec
6/22 14:41:25 MAX_TIME_PER_SUBMITTER = 31536000 sec
6/22 14:41:25 MAX_TIME_PER_PIESPIN = 31536000 sec
6/22 14:41:25 PREEMPTION_RANK = (RemoteUserPrio * 1000000) - TARGET.ImageSize
6/22 14:41:25 NEGOTIATOR_PRE_JOB_RANK = RemoteOwner =?= UNDEFINED
6/22 14:41:25 NEGOTIATOR_POST_JOB_RANK = None
6/22 14:41:25 ---------- Started Negotiation Cycle ----------
6/22 14:41:25 Phase 1: Obtaining ads from collector ...
6/22 14:41:25 Getting all public ads ...
6/22 14:41:25 Sorting 175 ads ...
6/22 14:41:25 Can't evaluate STARTD_AD_REEVAL_EXPR target.UpdateSequenceNumber > my.UpdateSequenceNumber as a bool, treating as TRUE
6/22 14:41:25 Can't evaluate STARTD_AD_REEVAL_EXPR target.UpdateSequenceNumber > my.UpdateSequenceNumber as a bool, treating as TRUE
6/22 14:41:25 Can't evaluate STARTD_AD_REEVAL_EXPR target.UpdateSequenceNumber > my.UpdateSequenceNumber as a bool, treating as TRUE
6/22 14:41:25 Can't evaluate STARTD_AD_REEVAL_EXPR target.UpdateSequenceNumber > my.UpdateSequenceNumber as a bool, treating as TRUE
6/22 14:41:25 Can't evaluate STARTD_AD_REEVAL_EXPR target.UpdateSequenceNumber > my.UpdateSequenceNumber as a bool, treating as TRUE
6/22 14:41:25 Can't evaluate STARTD_AD_REEVAL_EXPR target.UpdateSequenceNumber > my.UpdateSequenceNumber as a bool, treating as TRUE
6/22 14:41:25 Can't evaluate STARTD_AD_REEVAL_EXPR target.UpdateSequenceNumber > my.UpdateSequenceNumber as a bool, treating as TRUE
6/22 14:41:25 Can't evaluate STARTD_AD_REEVAL_EXPR target.UpdateSequenceNumber > my.UpdateSequenceNumber as a bool, treating as TRUE
6/22 14:41:25 Can't evaluate STARTD_AD_REEVAL_EXPR target.UpdateSequenceNumber > my.UpdateSequenceNumber as a bool, treating as TRUE
6/22 14:41:25 Can't evaluate STARTD_AD_REEVAL_EXPR target.UpdateSequenceNumber > my.UpdateSequenceNumber as a bool, treating as TRUE
6/22 14:41:25 Can't evaluate STARTD_AD_REEVAL_EXPR target.UpdateSequenceNumber > my.UpdateSequenceNumber as a bool, treating as TRUE
6/22 14:41:25 Can't evaluate STARTD_AD_REEVAL_EXPR target.UpdateSequenceNumber > my.UpdateSequenceNumber as a bool, treating as TRUE
6/22 14:41:25 Can't evaluate STARTD_AD_REEVAL_EXPR target.UpdateSequenceNumber > my.UpdateSequenceNumber as a bool, treating as TRUE
6/22 14:41:25 Getting startd private ads ...
6/22 14:41:25 Got ads: 175 public and 123 private
6/22 14:41:25 Public ads include 6 submitter, 137 startd
6/22 14:41:25 Phase 2: Performing accounting ...
6/22 14:41:25 ERROR "Assertion ERROR on (resource_hash.insert( ResourceName, ResourceAd ) == 0)" at line 785 in file Accountant.cpp

After starting up osgmm:

[root@abitibi condor]# /etc/init.d/osgmm start
Starting up OSGMM
[root@abitibi condor]# Exception in thread "Thread-1" java.lang.StringIndexOutOfBoundsException: String index out of range: -1
        at java.lang.String.substring(String.java:1768)
        at org.renci.osgmm.Site.getHostName(Site.java:141)
        at org.renci.osgmm.Sites.addSite(Sites.java:106)
        at org.renci.osgmm.ReSS.processReSSAd(ReSS.java:228)
        at org.renci.osgmm.ReSS.pullReSS(ReSS.java:178)
        at org.renci.osgmm.ReSS.run(ReSS.java:102)
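
For reference, a StringIndexOutOfBoundsException raised from String.substring with an index of -1 is often what happens when the result of indexOf is passed on without checking for "not found". The Java sketch below only illustrates that general pattern. It is an assumption, not the actual code of org.renci.osgmm.Site.getHostName, which is not shown in this message; the class name, both method names, and the "host:port" input format are hypothetical.

```java
public class HostNameSketch {
    // Hypothetical helper (NOT OSGMM's Site.getHostName): returns the text
    // before the first ':' in a contact string such as "host.example.org:2119".
    public static String unsafeHostName(String contact) {
        // indexOf returns -1 when ':' is absent; substring(0, -1) then throws
        // StringIndexOutOfBoundsException.
        return contact.substring(0, contact.indexOf(':'));
    }

    // Same idea, with the missing-separator case handled explicitly.
    public static String safeHostName(String contact) {
        int colon = contact.indexOf(':');
        return colon < 0 ? contact : contact.substring(0, colon);
    }

    public static void main(String[] args) {
        System.out.println(safeHostName("host.example.org:2119"));
        System.out.println(safeHostName("host.example.org"));
        try {
            unsafeHostName("host.example.org");
        } catch (StringIndexOutOfBoundsException e) {
            // The exact message text varies between Java versions.
            System.out.println("caught: " + e.getMessage());
        }
    }
}
```

If something like this is the cause, the input to look at would be whichever ReSS ad was being processed when the thread died, since the trace runs through ReSS.processReSSAd and Sites.addSite before reaching getHostName.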