
#4 icps should not fix temporary filenames

accepted
None
5
2017-08-31
2012-07-17
seismick
No

If a job is launched from cfe by clicking ICPS, some temporary files are created in the working directory.
One of these is always named %HISTORY_RECORDS.
Hence, it is not possible to run another instance of cfe, unless it is in another directory.
This limits the potential for multi-tasking somewhat.
Any files created should be distinguished by a suffix generated from the process ID, the time, the job name, or something similar.
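
A minimal shell sketch of the suffix idea, assuming the caller can choose the temporary file names. The helper name `unique_name` is hypothetical and is not part of cfe or icps; it simply appends the process ID of the running shell (`$$`) to a base name.

```shell
#!/bin/sh
# Hypothetical helper: build a per-process file name from a base name.
# $$ expands to the process ID of the running shell, so two jobs started
# from separate shells in the same directory get different names.
unique_name() {
    printf '%s_%s\n' "$1" "$$"
}

history_file=$(unique_name "%HISTORY_RECORDS")
stamp_file=$(unique_name "time_stamp")
echo "history records: $history_file"
echo "time stamp:      $stamp_file"
```

A time-based or job-name suffix could be substituted in the same way, although a process ID is the simplest value that differs between two jobs running at the same moment on one host.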

Discussion

  • seismick

    seismick - 2012-07-18

    There is also a file named "time_stamp" in the working directory when a job is run. This needs a suffix as well.

     
  • seismick

    seismick - 2017-07-15

    icps_script deletes several files that are created during the job, namely:
    %trin_filenames* %HISTORY_RECORDS time_stamp
    This prevents running concurrent jobs in the same directory. Even if the user runs
    the icps program instead of the script, temporary files are still created with
    the same names. Parallel jobs use individual temporary directories, so no
    conflict occurs there. By modifying a few files, concurrent icps jobs can run
    without clobbering each other.
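
The per-job directory approach mentioned for parallel jobs can be illustrated with a short shell sketch. This is not how CPSeis sets up its parallel jobs; it only shows why fixed file names are harmless when each job has its own directory. It assumes `mktemp -d` is available, as on typical Linux systems. The function name `run_job` and the directory prefix `icps_job` are made up for the example, and `touch` stands in for a real job.

```shell
#!/bin/sh
# Each job gets its own scratch directory, so fixed file names inside it
# cannot collide with those of another job.
run_job() {
    jobdir=$(mktemp -d "${TMPDIR:-/tmp}/icps_job.XXXXXX") || return 1
    # Stand-ins for the files a real job would create.
    touch "$jobdir/%HISTORY_RECORDS" "$jobdir/time_stamp"
    echo "$jobdir"
}

dir_a=$(run_job)
dir_b=$(run_job)
echo "job A scratch: $dir_a"
echo "job B scratch: $dir_b"

# Clean up both scratch directories when done.
rm -rf "$dir_a" "$dir_b"
```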

    manhist.f90 creates 1 or 3 history records files, but does not delete the last
    file. So an extra subroutine is needed to delete it:

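      ! In the module's declaration section:
      public :: manhist_mopup

      ! Close the remaining history records file (unit lun) and delete it.
      ! kstat returns the IOSTAT value of the close (zero on success).
      subroutine manhist_mopup(kstat)
        integer,intent(out) :: kstat
        close(lun,status='delete',IOSTAT=kstat)
      end subroutine manhist_mopup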
    

    Then this must be called near the end of subroutine cps_finish_processing (in cps.f90).
    It must be called after the last call to subroutine manhist_phist.
    So the code would look like this:

          if (history_opt .ne. 'NONE') then
            call manhist_phist(history_opt)
          endif
        endif
        call manhist_mopup(istat) ! delete history records file
        !--------- gather and print statistics

    Then the fixed filename %HISTORY_RECORDS must be replaced by unique filenames.
    In subroutine manhist_init, change the file open to this:

      ! Note: I6.6 assumes the PID has at most 6 digits; a larger value
      ! would be written as asterisks.
      write(histrec,'(A,I6.6)') 'tEmp1_',getsys_pid()
      open(lun,access='direct',recl=length8,status='replace',iostat=istat,&
           file=histrec)

    In subroutine manhist_realloc, change the file opens to this:

      write(histrec,'(A,I6.6)') 'teMp2_',getsys_pid()
      open(templun,access='direct',recl=length,status='new',iostat=istat,&
           file=histrec)

      write(histrec,'(A,I6.6)') 'temP3_',getsys_pid()
      open(lun,access='direct',recl=length,status='new',iostat=istat,&
           file=histrec)
    

    Also change the error messages following each file open.

    Note that you also need to disable the time_stamp file by setting the time stamp
    increment to zero in JOB_DATA. I doubt that most users would look at this
    file anyway. I propose changing the default obj%tstamp_inc to zero in
    subroutine job_data_initialize within job_data.f90.

    Lastly, the %trin_filenames files are only created so TTROT can write velocity
    files. In case somebody gets TTROT going, the deletion of this file could
    be made more selective as described below.

    Thus modify icps_script by adding "& export P_ID=$!" to get the process ID of icps:

      ${CPSEIS_INSTALL_DIR}/platforms/${CPSEIS_ARCH}/bin/icps $WORKFILE & export P_ID=$!

    and change the rm command at the end to this:

      wait
      rm -f ._license_splash core hs_err_pid.log
      rm -f %trin_filenames_${HOSTNAME}*${P_ID}

    This will delete the file created by TRIN in that job. There is a tiny chance it
    could also delete another job's file when that job's process ID ends in the same
    digits (for example, 1234 and 11234, which differ by 10,000).
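
The wait-then-clean-up pattern above can be tried in isolation with a stand-in background command. In this sketch `sleep` replaces icps, and the scratch file names (`trin_demo_host_...`) are hypothetical, since the exact names TRIN writes are not given here beyond the pattern above. It also shows the collision case: a glob ending in `*${P_ID}` matches any name that ends in the same digits, whereas a pattern that requires a delimiter directly before the process ID does not.

```shell
#!/bin/sh
# Work in a throwaway directory so the demo touches nothing else.
demo=$(mktemp -d "${TMPDIR:-/tmp}/cleanup_demo.XXXXXX") || exit 1
cd "$demo" || exit 1

sleep 1 &            # stand-in for the icps command
P_ID=$!              # process ID of the background job just started

# Hypothetical file names: one for this job, one for a job whose PID
# merely ends in the same digits.
touch "trin_demo_host_${P_ID}" "trin_demo_host_9${P_ID}"

wait                 # block until the background job has finished

# Loose pattern, as in "*${P_ID}": count how many names it would match.
set -- trin_demo_host*"${P_ID}"
loose_matches=$#

# Tighter pattern: require the delimiter right before the PID.
rm -f "trin_demo_host_${P_ID}"
set -- trin_demo_host_*
remaining=$1

echo "loose pattern matched $loose_matches files; left after tight rm: $remaining"
cd / && rm -rf "$demo"
```

Whether the tighter pattern can be used in icps_script depends on how TRIN actually builds its file names, which would need to be checked in the source.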

    I have run two concurrent icps jobs in the same directory, using various history
    options, and found no corruption.

     
  • Bill Menger

    Bill Menger - 2017-08-31
    • status: open --> accepted
    • assigned_to: Bill Menger
    • Group: --> Next Release (example)
     
