Re: [Osgmm-discuss] Condor Negotiator Crashing
From: Mats R. <ry...@re...> - 2009-06-22 20:18:12
Peter Doherty wrote:
>
> The osgmm.log file is entirely filled with the permission change
> attempts. 4000 jobs in the queue, and it checks every 2 seconds on a
> different job.
> the osgmm.log.1.gz is a day old, and the osgmm.log.1 file has binary
> data in it. Is that normal?

No, I have never seen binary data in there. A possibly related issue is
the ReSS server changing its IP address; that is why you have so many
"site has been dropped from ReSS" messages.

> Hmm... suddenly I'm wondering if that's a clue. I used 'strings' on
> the osgmm.log.1 file, and the last entry is at 10:42 this morning.
> That's about when things started to go wrong. I'll have to check if I
> did something that would have caused something like that. I've
> restarted the match maker a couple times with no success.
> I'm attaching the log file anyhow. It's 5MB. It got rejected by the
> mailing list...
> so it's available here:
> http://abitibi.sbgrid.org/osgmm.log.1

The permissions problem might lead to lower success rates. If OSGMM
can't read the log files, some errors will not be picked up, and you
might end up sending a lot of jobs to a broken site.

--
Mats Rynge
Renaissance Computing Institute <http://www.renci.org>