# Recent changes to OpenMP Fortran

Wiki page: https://sourceforge.net/p/ichnaea/wiki/OpenMP%20Fortran/

## OpenMP Fortran modified by Iain Miller (Thu, 30 Jan 2014 11:00:02 -0000)

~~~~
--- v2
+++ v3
@@ -29,7 +29,7 @@

     ! Launch the parallel region. Most importantly the loop_timer needs to be "private".

-    $omp parallel default(none), private(err_code , loop_timer), reduction(+:res)
+    !$omp parallel default(none), private(err_code , loop_timer), reduction(+:res)

     ! Running parallel, create a timer in each thread and start it counting. So, if    
     ! OMP_NUM_THREADS=16 for example, there would be 16 timers running after these two
@@ -39,16 +39,16 @@
     call PMTM_timer_start(loop_timer) 

     ! Run the parallel loop.
-    $omp do
+    !$omp do
     do loop_idx = 1, N
         res = res + loop_idx
     end do
-    $omp end do
+    !$omp end do

     ! Stop all the timers
     call PMTM_timer_stop(loop_timer) 

-    $omp end parallel
+    !$omp end parallel

     ! Only one thread will exist here since the "end parallel" will have stopped all but 
     ! one. Call finalize in series which will tidy up and report.
~~~~
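
The change above matters because, in free-form Fortran, an OpenMP directive must begin with the `!$omp` sentinel: the leading `!` makes the directive an ordinary comment to compilers that ignore OpenMP, while an unprefixed `$omp` is not valid Fortran at all. A minimal, self-contained sketch (independent of the wiki example) of the correct sentinel:

~~~~
program sentinel_demo
    implicit none

    ! In free-form Fortran an OpenMP directive is written as a comment that
    ! begins with the "!$omp" sentinel. A compiler without OpenMP support
    ! treats these lines as ordinary comments and runs the code serially,
    ! whereas a bare "$omp" (as in the old revision above) is a syntax error.
    !$omp parallel
    print *, "hello from an OpenMP thread"
    !$omp end parallel

end program sentinel_demo
~~~~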

## OpenMP Fortran modified by Iain Miller (Thu, 30 Jan 2014 10:39:43 -0000)

~~~~
--- v1
+++ v2
@@ -57,3 +57,4 @@
     call MPI_finalize(err_code)

 end program
+~~~~
~~~~

## OpenMP Fortran modified by Iain Miller (Thu, 30 Jan 2014 10:37:25 -0000)

### OpenMP Fortran
The following example shows how to measure each thread's contribution to a parallel loop. The key point is the use of separate **parallel** and **do** directives, which opens a gap after the threads are spawned and before the loop starts in which the timing calls can be placed. Other placements behave differently. Placing the timer calls before the **parallel** directive would leave a single timer capturing the end-to-end time of the loop with no visibility of the individual threads (a sketch of this placement appears after the listing below). Moving the timer calls into the loop body, inside the **do** directive, would give per-thread timings, but each timer would only accumulate the time spent in the loop body itself and miss the loop overhead; it would also add a pair of timer calls to every iteration, a noticeable cost for a light loop such as the one below. Note that the **PMTM_create_timer** call cannot be moved inside the loop, as each timer may only be created once.
&lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span class="k"&gt;program &lt;/span&gt;&lt;span class="nv"&gt;example&lt;/span&gt; 

    &lt;span class="k"&gt;use &lt;/span&gt;&lt;span class="nv"&gt;PMTM&lt;/span&gt;
    &lt;span class="k"&gt;use &lt;/span&gt;&lt;span class="nv"&gt;MPI&lt;/span&gt; 
    &lt;span class="k"&gt;implicit none&lt;/span&gt;

&lt;span class="k"&gt;    &lt;/span&gt;&lt;span class="kt"&gt;integer&lt;/span&gt; &lt;span class="kd"&gt;::&lt;/span&gt; &lt;span class="nv"&gt;err_code&lt;/span&gt; 
    &lt;span class="k"&gt;type&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;pmtm_timer&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="kd"&gt;::&lt;/span&gt; &lt;span class="nv"&gt;loop_timer&lt;/span&gt; 
    &lt;span class="kt"&gt;integer&lt;/span&gt; &lt;span class="kd"&gt;::&lt;/span&gt; &lt;span class="nv"&gt;loop_idx&lt;/span&gt;
    &lt;span class="kt"&gt;real&lt;/span&gt; &lt;span class="kd"&gt;::&lt;/span&gt; &lt;span class="nv"&gt;res&lt;/span&gt;
    &lt;span class="kt"&gt;integer&lt;/span&gt; &lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="k"&gt;parameter&lt;/span&gt; &lt;span class="kd"&gt;::&lt;/span&gt; &lt;span class="nv"&gt;N&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;10000&lt;/span&gt;

    &lt;span class="k"&gt;call &lt;/span&gt;&lt;span class="nv"&gt;MPI_Init&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;err_code&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="c"&gt;! These calls should only occur once, hence they appear before the parallel region.&lt;/span&gt;

    &lt;span class="k"&gt;call &lt;/span&gt;&lt;span class="nv"&gt;PMTM_init&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;example_file_&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;&amp;quot;Example Application&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;&amp;amp;&lt;/span&gt; &lt;span class="nv"&gt;err_code&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;call &lt;/span&gt;&lt;span class="nv"&gt;PMTM_parameter_output&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;PMTM_DEFAULT_INSTANCE&lt;/span&gt; &lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;&amp;amp;&lt;/span&gt;
              &lt;span class="s2"&gt;&amp;quot;Loop Count&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nv"&gt;PMTM_OUTPUT_ALWAYS&lt;/span&gt; &lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nb"&gt;.false.&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nv"&gt;N&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nv"&gt;err_code&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="c"&gt;! Launch the parallel region. Most importantly the loop_timer needs to be &amp;quot;private&amp;quot;.&lt;/span&gt;

    &lt;span class="err"&gt;$&lt;/span&gt;&lt;span class="nv"&gt;omp&lt;/span&gt; &lt;span class="nv"&gt;parallel&lt;/span&gt; &lt;span class="nv"&gt;default&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;none&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="k"&gt;private&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;err_code&lt;/span&gt; &lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nv"&gt;loop_timer&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="nv"&gt;reduction&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;+&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="nv"&gt;res&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="c"&gt;! Running parallel, create a timer in each thread and start it counting. So, if    &lt;/span&gt;
    &lt;span class="c"&gt;! OMP_NUM_THREADS=16 for example, there would be 16 timers running after these two&lt;/span&gt;
    &lt;span class="c"&gt;! calls.&lt;/span&gt;

    &lt;span class="k"&gt;call &lt;/span&gt;&lt;span class="nv"&gt;PMTM_create_timer&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;PMTM_DEFAULT_GROUP&lt;/span&gt; &lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;&amp;quot;Loop Timer&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nv"&gt;PMTM_TIMER_ALL&lt;/span&gt; &lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nv"&gt;err_code&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;call &lt;/span&gt;&lt;span class="nv"&gt;PMTM_timer_start&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;loop_timer&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; 

    &lt;span class="c"&gt;! Run the parallel loop.&lt;/span&gt;
    &lt;span class="err"&gt;$&lt;/span&gt;&lt;span class="nv"&gt;omp&lt;/span&gt; &lt;span class="k"&gt;do&lt;/span&gt;
&lt;span class="k"&gt;    do &lt;/span&gt;&lt;span class="nv"&gt;loop_idx&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nv"&gt;N&lt;/span&gt;
        &lt;span class="nv"&gt;res&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nv"&gt;res&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="nv"&gt;loop_idx&lt;/span&gt;
    &lt;span class="k"&gt;end do&lt;/span&gt;
    &lt;span class="err"&gt;$&lt;/span&gt;&lt;span class="nv"&gt;omp&lt;/span&gt; &lt;span class="k"&gt;end do&lt;/span&gt;

    &lt;span class="c"&gt;! Stop all the timers&lt;/span&gt;
    &lt;span class="k"&gt;call &lt;/span&gt;&lt;span class="nv"&gt;PMTM_timer_stop&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;loop_timer&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; 

    &lt;span class="err"&gt;$&lt;/span&gt;&lt;span class="nv"&gt;omp&lt;/span&gt; &lt;span class="k"&gt;end &lt;/span&gt;&lt;span class="nv"&gt;parallel&lt;/span&gt;

    &lt;span class="c"&gt;! Only one thread will exist here since the &amp;quot;end parallel&amp;quot; will have stopped all but &lt;/span&gt;
    &lt;span class="c"&gt;! one. Call finalize in series which will tidy up and report.&lt;/span&gt;

    &lt;span class="k"&gt;call &lt;/span&gt;&lt;span class="nv"&gt;PMTM_finalize&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;err_code&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;call &lt;/span&gt;&lt;span class="nv"&gt;MPI_finalize&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;err_code&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; 

&lt;span class="k"&gt;end program&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;
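
For contrast, here is a minimal sketch of the first alternative described above, with the timer calls placed before and after the **parallel** region. It uses the standard `omp_get_wtime` routine instead of PMTM, purely so the sketch is self-contained; the placement, not the timing library, is the point. A single measurement on the initial thread covers the whole loop, including thread start-up and the implicit barrier at `end parallel`, but gives no per-thread breakdown.

~~~~
program end_to_end_timing

    ! Portable OpenMP timing routine, used here instead of PMTM purely to
    ! keep the sketch self-contained; the loop matches the wiki example.
    use omp_lib
    implicit none

    integer :: loop_idx
    real :: res
    integer, parameter :: N = 10000
    double precision :: t_start, t_end

    res = 0.0

    ! Timer started before the parallel region: one measurement on the
    ! initial thread, covering thread start-up, the loop and the implicit
    ! barrier at "end parallel", with no per-thread breakdown.
    t_start = omp_get_wtime()

    !$omp parallel default(none), private(loop_idx), reduction(+:res)
    !$omp do
    do loop_idx = 1, N
        res = res + loop_idx
    end do
    !$omp end do
    !$omp end parallel

    t_end = omp_get_wtime()

    print *, "end-to-end loop time (s):", t_end - t_start, " res =", res

end program end_to_end_timing
~~~~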