From: Andreas Z. <zw...@ki...> - 2016-03-04 12:02:40
Attachments:
signature.asc
The x10.util.concurrent.Fences implementation looks reasonable for C++,
but the Java implementation seems weird to me.

For example, loadLoadBarrier is implemented via two volatile reads. Why
*two*? I could ask more questions ...

The code was added by David Grove in 2009, it seems:
https://github.com/x10-lang/x10/commit/f4f5420ce476f2538f358509ad13ef52b8c2c4c1

Any idea where that came from? Any sources? Any reasons why it is the
way it is?

--
Andreas Zwinkau

KIT IPD Snelting
Web: http://pp.ipd.kit.edu/personhp/andreas_zwinkau.php
From: David P G. <gr...@us...> - 2016-03-04 16:30:38
Hi Andreas,

As best I can remember, the Java code that is there was based on
discussions in 2009 about how to get memory fences, given the
definition of the Java memory model and java.util.concurrent as they
existed at the time.

--dave
From: Andreas Z. <zw...@ki...> - 2016-03-11 12:31:18
Attachments:
signature.asc
I compared it with the JMM Cookbook [0] and it follows it very directly.
For example, "volatile load then volatile store means loadStore barrier".

The problem is that the JMM considers each volatile variable separately,
so the JVM can optimize volatiles. If the JVM figures out that the
volatile variables in FencesUtils.java are never accessed concurrently,
then they can be optimized away. I made a toy example, and
loadLoadBarrier compiled to this assembly:

    # {method} 'loadLoadBarrier' '()V' in 'FencesUtils'
    #           [sp+0x20]  (sp of caller)
    0x00007fa3f905f5c0: sub    $0x18,%rsp
    0x00007fa3f905f5c7: mov    %rbp,0x10(%rsp)    ;*synchronization entry
                                                  ; - FencesUtils::loadLoadBarrier@-1 (line 38)
    0x00007fa3f905f5cc: mov    $0x7acd4eee8,%r10  ; {oop(a 'java/lang/Class' = 'FencesUtils')}
    0x00007fa3f905f5d6: mov    0x58(%r10),%r8d
    0x00007fa3f905f5da: mov    %r8d,0x60(%r10)    ;*getstatic v1
                                                  ; - FencesUtils::loadLoadBarrier@0 (line 38)
    0x00007fa3f905f5de: mov    0x5c(%r10),%r11d
    0x00007fa3f905f5e2: mov    %r11d,0x64(%r10)   ;*getstatic v2
                                                  ; - FencesUtils::loadLoadBarrier@6 (line 39)
    0x00007fa3f905f5e6: add    $0x10,%rsp
    0x00007fa3f905f5ea: pop    %rbp
    0x00007fa3f905f5eb: test   %eax,0xc967a0f(%rip)   # 0x00007fa4059c7000
                                                  ;   {poll_return}
    0x00007fa3f905f5f1: retq

No synchronization instruction. The other three barriers contained a
"lock addl" instruction.

I think this might sometimes work, but not reliably. AFAIK there is no
way to correctly implement FencesUtils.java. This implies
x10.util.concurrent.Fences cannot be provided with the Java backend.

[0] http://gee.cs.oswego.edu/dl/jmm/cookbook.html

--
Andreas Zwinkau

KIT IPD Snelting
Web: http://pp.ipd.kit.edu/personhp/andreas_zwinkau.php
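For readers who want to reproduce the experiment described above, here is a minimal sketch of what a cookbook-style toy class might look like. It is a reconstruction under assumptions, not the actual X10 FencesUtils.java or the toy example used in the message. The class name FencesSketch, the sink fields, and the warmUp helper are hypothetical. Only the field names v1 and v2 and the "two volatile reads" shape of loadLoadBarrier come from the thread. The other three pairings follow the volatile-access table in the cookbook [0].

```java
// Hypothetical reconstruction of the cookbook pattern; NOT the actual X10 source.
final class FencesSketch {
    // Dummy volatile fields; the names v1/v2 match the getstatic comments in the dump.
    static volatile int v1;
    static volatile int v2;

    // Plain fields that receive loaded values (assumed: the dump shows a store after each load).
    static int sink1;
    static int sink2;

    private FencesSketch() {}

    // LoadLoad: a volatile load followed by a volatile load.
    static void loadLoadBarrier() {
        sink1 = v1;
        sink2 = v2;
    }

    // LoadStore: a volatile load followed by a volatile store.
    static void loadStoreBarrier() {
        sink1 = v1;
        v2 = 0;
    }

    // StoreStore: a volatile store followed by a volatile store.
    static void storeStoreBarrier() {
        v1 = 0;
        v2 = 0;
    }

    // StoreLoad: a volatile store followed by a volatile load.
    static void storeLoadBarrier() {
        v1 = 0;
        sink2 = v2;
    }

    // Calls every barrier repeatedly so that HotSpot is likely to JIT-compile them.
    // Returns the number of barrier calls made.
    static int warmUp(int iterations) {
        int calls = 0;
        for (int i = 0; i < iterations; i++) {
            loadLoadBarrier();
            loadStoreBarrier();
            storeStoreBarrier();
            storeLoadBarrier();
            calls += 4;
        }
        return calls;
    }
}
```

Calling FencesSketch.warmUp with a large iteration count from a main method and running HotSpot with -XX:+UnlockDiagnosticVMOptions -XX:+PrintAssembly (which needs the hsdis disassembler plugin) should print dumps comparable to the one above. One caveat when reading such dumps: the cookbook's processor table lists LoadLoad as a no-op on x86, so a loadLoadBarrier without a fence instruction on that platform is not by itself evidence that the JIT removed something required.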
From: David P G. <gr...@us...> - 2016-03-11 17:59:07
Would it make a difference if the static fields were declared public
instead of having their current package-level visibility? Then it might
be a little harder for the JIT to justify "knowing" that the variables
were never accessed concurrently.

--dave
From: Andreas Z. <zw...@ki...> - 2016-03-18 13:02:58
Attachments:
signature.asc
On 11.03.2016 at 18:57, David P Grove wrote:
> Would it make a difference if the static fields were declared to be public
> instead of their current package-level visibility? Then it might be a
> little harder for the JIT to justify "knowing" that the variables were
> never accessed concurrently.

Harder yes, but not guaranteed.

--
Andreas Zwinkau

KIT IPD Snelting
Web: http://pp.ipd.kit.edu/personhp/andreas_zwinkau.php
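The thread ends without a reliable way to express fences through volatile accesses on the JDKs of the time. For readers on Java 9 or later, which postdates this discussion, java.lang.invoke.VarHandle provides static fence methods that do not depend on dummy fields. The sketch below shows how the four barrier names from the thread could be mapped onto them. The class name ExplicitFences, the callAll helper, and the mapping itself are assumptions made for illustration, not anything X10 ships.

```java
import java.lang.invoke.VarHandle;

// Sketch only: requires Java 9+. Class name and mapping are hypothetical.
final class ExplicitFences {
    private ExplicitFences() {}

    // Loads before the fence are not reordered with loads after it.
    static void loadLoadBarrier() {
        VarHandle.loadLoadFence();
    }

    // Stores before the fence are not reordered with stores after it.
    static void storeStoreBarrier() {
        VarHandle.storeStoreFence();
    }

    // acquireFence orders prior loads before later loads AND stores,
    // so it is stronger than a pure LoadStore barrier.
    static void loadStoreBarrier() {
        VarHandle.acquireFence();
    }

    // fullFence orders all prior loads and stores before all later ones,
    // which covers the StoreLoad case.
    static void storeLoadBarrier() {
        VarHandle.fullFence();
    }

    // Invokes each barrier once and returns how many were called.
    static int callAll() {
        loadLoadBarrier();
        storeStoreBarrier();
        loadStoreBarrier();
        storeLoadBarrier();
        return 4;
    }
}
```

Because these fences are not tied to any particular field, the concern that a JIT could prove a dummy volatile is never shared does not apply to them in the same way. Whether they fit the X10 Java backend is a separate question that this sketch does not answer.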