From: ADOLFO J. B. <ba...@fa...> - 2018-05-19 01:07:55
Hi,

As a follow-up: if I compile with ifort version 18 using the static option, the program runs fine, but with the WARNING about being statically linked. This confirms that the problem lies in the Intel *libiomp5.so* library.

If the problem cannot be solved, what are the drawbacks of statically linking when using DMTCP?

regards,

adolfo

2018-05-18 19:21 GMT-03:00 Kapil Arya <kap...@gm...>:

> I just tried the example compiled with gcc-fortran and don't see any
> issues:
>
> $ export OMP_NUM_THREADS="3"
> $ dmtcp_launch ./omp_test.x
> [42000] NOTE at socketconnlist.cpp:220 in scanForPreExisting;
> REASON='found pre-existing socket... will not be restored'
> fd = 30
> device = socket:[1530774]
> [42000] WARNING at socketconnection.cpp:236 in TcpConnection;
> REASON='JWARNING((domain == AF_INET || domain == AF_UNIX || domain ==
> AF_INET6) && (type & 077) == SOCK_STREAM) failed'
> domain = 0
> type = 0
> protocol = 0
> [42000] NOTE at socketconnlist.cpp:220 in scanForPreExisting;
> REASON='found pre-existing socket... will not be restored'
> fd = 31
> device = socket:[1530775]
> [42000] WARNING at socketconnection.cpp:236 in TcpConnection;
> REASON='JWARNING((domain == AF_INET || domain == AF_UNIX || domain ==
> AF_INET6) && (type & 077) == SOCK_STREAM) failed'
> domain = 0
> type = 0
> protocol = 0
> [42000] NOTE at socketconnlist.cpp:220 in scanForPreExisting;
> REASON='found pre-existing socket... will not be restored'
> fd = 39
> device = socket:[1536308]
> [42000] WARNING at socketconnection.cpp:236 in TcpConnection;
> REASON='JWARNING((domain == AF_INET || domain == AF_UNIX || domain ==
> AF_INET6) && (type & 077) == SOCK_STREAM) failed'
> domain = 0
> type = 0
> protocol = 0
> [42000] NOTE at socketconnlist.cpp:220 in scanForPreExisting;
> REASON='found pre-existing socket...
will not be restored' > fd = 40 > device = socket:[1536309] > [42000] WARNING at socketconnection.cpp:236 in TcpConnection; > REASON='JWARNING((domain == AF_INET || domain == AF_UNIX || domain == > AF_INET6) && (type & 077) == SOCK_STREAM) failed' > domain = 0 > type = 0 > protocol = 0 > Hello ... > num threads = 622879781 > 0 / 0 -- > 8 > 0 / 0 -- > 8 > 0 / 0 -- > 8 > 0 / 0 -- > 8 > 0 / 0 -- > 8 > 0 / 0 -- > 8 > 0 / 0 -- > 8 > 0 / 0 -- > 8 > 0 / 0 -- > 8 > 0 / 0 -- > 9 > 0 / 0 -- > 9 > 0 / 0 -- > 9 > 0 / 0 -- > 9 > 0 / 0 -- > 9 > 0 / 0 -- > 9 > 0 / 0 -- > 9 > 0 / 0 -- > 9 > 0 / 0 -- > 9 > 0 / 0 -- > 9 > 0 / 0 -- > 9 > 0 / 0 -- > 9 > 0 / 0 -- > 9 > 0 / 0 -- > 9 > 0 / 0 -- > 9 > 0 / 0 -- > 9 > 0 / 0 -- > 9 > 0 / 0 -- > 9 > 0 / 0 -- > 9 > 0 / 0 -- > 9 > 0 / 0 -- > 9 > $ > > On Fri, May 18, 2018 at 4:00 PM Kapil Arya <kap...@gm...> > wrote: > >> Hi Adolfo, >> >> Can you also provide instructions to compile this code? >> >> Kapil >> >> On Fri, May 18, 2018 at 3:53 PM ADOLFO JAVIER BANCHIO < >> ba...@fa...> wrote: >> >>> >>> >>> Hi all, >>> >>> After having googled quite a lot without success and also having >>> checked archive posts, I still can not run fortran compiled openmp codes >>> using dmtcp_launch. >>> >>> I have installed on a Rocks 7 (CENTOS 7) cluster dmtcp version 2.5.2 >>> (from rpm and also compiled with --enable-openm flag), >>> and I still can not run openmp executables produced by ifort compilded >>> f90 codes. >>> >>> I run: >>> >>> in *shell 1* >>> >>> /export/added_soft/dmtcp/dmtcp-2.5.2/bin/dmtcp_coordinator >>> >>> >>> and in *shell 2* >>> >>> export OMP_NUM_THREADS=3 >>> >>> /export/added_soft/dmtcp/dmtcp-2.5.2/bin/dmtcp_launch ./omp_test.x >>> >>> >>> output in *shell 1 *is: >>> >>> $ /export/added_soft/dmtcp/dmtcp-2.5.2/bin/dmtcp_coordinator >>> dmtcp_coordinator starting... >>> Host: bandurria.fis.uncor.edu (0.0.0.0) >>> Port: 7779 >>> Checkpoint Interval: disabled (checkpoint manually instead) >>> Exit on last client: 0 >>> Type '?' for help. 
>>>
>>> [28865] NOTE at dmtcp_coordinator.cpp:1368 in updateCheckpointInterval;
>>> REASON='CheckpointInterval updated (for this computation only)'
>>> oldInterval = 0
>>> theCheckpointInterval = 0
>>> [28865] NOTE at dmtcp_coordinator.cpp:917 in onConnect; REASON='worker
>>> connected'
>>> hello_remote.from = 1ba5f63f5ba22d27-29111-99b9e2da0f18
>>> [28865] NOTE at dmtcp_coordinator.cpp:667 in onData; REASON='Updating
>>> process Information after exec()'
>>> progname = omp_test.x
>>> msg.from = 1ba5f63f5ba22d27-40000-99b9e3d17fe2
>>> client->identity() = 1ba5f63f5ba22d27-29111-99b9e2da0f18
>>>
>>> And in *shell 2*, the code starts. According to top, it runs with one
>>> thread only, using 100% of the CPU, and does not seem to spawn the other
>>> threads. It seems to get stuck when it reaches a parallel section (the
>>> part of the code before the parallel block is actually executed).
>>>
>>> Thank you in advance for any help.
>>> I am new to dmtcp (coming from blcr), so my apologies if this is
>>> a stupid issue ...
>>>
>>> regards,
>>>
>>> adolfo
>>>
>>> P.S.: the code I am using for testing (other real codes fail in the same
>>> way)
>>>
>>> program omp_test
>>>   implicit none
>>>   integer(8) :: i,j
>>>   integer :: nt,tn,omp_get_num_threads,omp_get_thread_num
>>>
>>>   write(*,*) "Hello ..."
>>>
>>>   !nt = omp_get_num_threads()
>>>   write(*,*) 'num threads = ',nt
>>>
>>>   !$OMP PARALLEL PRIVATE(i,tn,nt)
>>>   do i = 1, 10**9
>>>     j = int( sqrt( log( real(i)/real(i**2.4) ) ) )
>>>     if (mod(i,10**8) == 0) then
>>>       ! nt = omp_get_num_threads()
>>>       ! tn = omp_get_thread_num()
>>>       write(*,*) tn, '/',nt,' -- > ', nint( log(real(i))/log(10.) )
>>>     endif
>>>   enddo
>>>   !$OMP END PARALLEL
>>>
>>> end program
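
Regarding the suspicion in this thread that the dynamically linked Intel *libiomp5.so* is what makes the ifort build hang while the static build runs, one way to check how a given build of the test binary is linked is to inspect its shared-library dependencies with ldd. This is only a minimal sketch: the helper name `uses_shared_lib` is hypothetical (it is not part of DMTCP or the Intel tools), and it assumes `ldd` and `grep` are available on the node.

```shell
#!/bin/sh
# uses_shared_lib BINARY LIBNAME   (hypothetical helper, not part of DMTCP)
# Succeeds (exit 0) if ldd lists a shared library whose name contains LIBNAME.
# Fails for static executables, missing files, or when the library is absent.
uses_shared_lib() {
    ldd "$1" 2>/dev/null | grep -q -- "$2"
}

# Example: report how ./omp_test.x (the test binary from this thread) is linked.
# The check is skipped if the binary is not present in the current directory.
if [ -x ./omp_test.x ]; then
    if uses_shared_lib ./omp_test.x libiomp5; then
        echo "omp_test.x loads libiomp5 dynamically"
    else
        echo "omp_test.x does not list libiomp5 (static build or another OpenMP runtime)"
    fi
fi
```

Running this against both the dynamic and the static ifort builds shows which of them pulls in the shared Intel OpenMP runtime; it does not by itself explain why that runtime misbehaves under dmtcp_launch.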