From: Malcolm T. <mt...@wu...> - 2009-12-01 19:02:05
I'd like to be able to collect some statistics for the jobs running on our Opal server (e.g. run time, memory usage, some info about the input parameters) and am wondering about the best way to go about this. One important detail is that I'm using the DRMAA Job Manager.

There seem to be multiple approaches, but I think the cleanest might be to parse the temporary directories (where I'd have direct access to the input files). The problem is that I'm not sure how to correlate the Opal job number (app1259160756494) with the job number that the queuing system assigns to the job. If I had the latter, I could query the queuing system to find the run time and memory usage.

I noticed that this information does show up in the Tomcat logs, but never in the same log message as the Opal job number:

2009-11-25 08:52:36,501 DEBUG edu.sdsc.nbcr.opal.manager.DRMAAJobManager.launchJob(DRMAAJobManager.java:210) - Working directory: /export/home/opal/jakarta-tomcat-5.0.30/webapps/ROOT/app1259160756494/
...
2009-11-25 08:52:36,545 INFO edu.sdsc.nbcr.opal.manager.DRMAAJobManager.launchJob(DRMAAJobManager.java:233) - DRMAA job has been submitted with id 4549

I could assume that the line after the launchJob message contains the queuing system job id, but I can imagine this getting screwed up if multiple jobs are running at the same time, which is entirely possible.

Can anyone recommend a better solution? Would it be possible to write the queuing system job id into the temporary directory somehow (say in a file called jobid)?

Thanks in advance,
Malcolm

--
Malcolm Tobias
314.362.1594
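
For illustration only, the log-based correlation described above could be sketched roughly as follows in Python. This is a heuristic under stated assumptions, not anything Opal provides: the two regular expressions are modeled solely on the two log lines quoted above, and the function name correlate is hypothetical. It pairs a "Working directory" line with the next "submitted with id" line, and refuses to pair when more than one job directory is waiting, since that is the simplest sign that concurrent submissions have interleaved. It cannot detect every interleaving, so the results should be treated as best-effort.

```python
import re

# Patterns modeled on the two log lines quoted in the message; the layout of
# any other Opal/Tomcat log lines is an assumption.
WORKDIR_RE = re.compile(r"Working directory: \S*/(app\d+)")
SUBMIT_RE = re.compile(r"DRMAA job has been submitted with id (\S+)")


def correlate(lines):
    """Return (pairs, ambiguous).

    pairs maps an Opal job ID to the queuing-system ID from the next
    "submitted" line. ambiguous holds Opal job IDs that were skipped
    because several working directories were pending at once.
    """
    pairs = {}
    ambiguous = set()
    pending = []  # Opal job IDs seen but not yet matched to a DRMAA id
    for line in lines:
        m = WORKDIR_RE.search(line)
        if m:
            pending.append(m.group(1))
            continue
        m = SUBMIT_RE.search(line)
        if not m:
            continue
        if len(pending) == 1:
            pairs[pending.pop()] = m.group(1)
        else:
            # Zero or several candidates: do not guess.
            ambiguous.update(pending)
            pending.clear()
    return pairs, ambiguous
```

Called on the two quoted log lines, this would associate app1259160756494 with 4549. Having the job manager record the ID inside the job's working directory, as suggested above, would avoid the guesswork entirely.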