From: Timo B. <tim...@gm...> - 2019-03-14 00:50:09
Hi,

I have pinned down the next failed test. It still seems related to the
multi-indexing even with your bugfixed version. The corresponding gist is
here:

https://gist.github.com/tbetcke/0bf7e12a2f3ab8032339cc38b8441b6e

At the end of the kernel all entries in shapeIntegral should have the value
1.0. However, while shapeIntegral[0][0] is correct, shapeIntegral[1][0] is
not. If I move the second print statement for shapeIntegral[1][0] into the
for loop the variables are correctly updated.

Just something for context. The actual kernel from which this example is
derived, is doing a finite element integral on a triangle. The test values
are from the test space and the trial values from the domain space. Via C
Macros I am adapting the dimensions of the arrays to the actual number of
test and trial functions. The crash happens for trial dimension 1 and test
dimension 3.

Thanks again for your help. I am excited about getting Pocl to work with
our software.

Best wishes

Timo

On Wed, 13 Mar 2019 at 23:23, Timo Betcke <tim...@gm...> wrote:

> Hi Michal,
>
> thanks for the bugfix. The crashes have now disappeared and more tests are
> passing with your bugfix version. However, several unit tests still fail
> that work with AMD and Intel. Briefly looking at the results I see lots of
> nan entries in the pocl output. I will try to pin this down more and then
> report back to you.
>
> Best wishes
>
> Timo
>
> On Mon, 11 Mar 2019 at 10:50, Michal Babej (TAU) <mic...@tu...>
> wrote:
>
>> Hello,
>>
>>
>> I remember trying to fix this bug last year, but then i got sidetracked
>> by other things. (BTW it would be preferable if you reported bugs as github
>> issues in the future)
>>
>>
>> Anyway, i've hopefully fixed it.
>> Can you test your program with master
>> branch from https://github.com/franz/pocl
>>
>>
>> Regards,
>>
>> -- mb
>> ------------------------------
>> *From:* Timo Betcke <tim...@gm...>
>> *Sent:* Friday, March 8, 2019 3:48:34 AM
>> *To:* Portable Computing Language development discussion
>> *Subject:* Re: [pocl-devel] POCL Crash in vmovaps operation
>>
>> Dear Pekka,
>>
>> I have now cooked up a small example that crashes in vmovaps. The gist is
>> available here (uses PyOpenCL to run):
>>
>> https://gist.github.com/tbetcke/b4da01465b587e85cc88801aafdced0a
>>
>> The example is fairly nonsensical and was derived by reducing a crashing
>> kernel as far as possible while retaining the crash.
>> It runs fine under Intel CPU OpenCL on a Xeon and Rocm OpenCL on an AMD
>> GPU. My platform is Ubuntu 18.04 with llvm 6. If necessary
>> I can create an environment with updated llvm, but would like to avoid it
>> (unless it is llvm 6 related). Pocl is the most recent git master.
>>
>> The code crashes at the following assembler instructions:
>>
>>    0x00007fffe02575e3 <+195>: xor r9d,r9d
>>    0x00007fffe02575e6 <+198>: xor r10d,r10d
>>    0x00007fffe02575e9 <+201>: nop DWORD PTR [rax+0x0]
>>    0x00007fffe02575f0 <+208>: mov QWORD PTR [rdx+r9*1],0x0
>> => 0x00007fffe02575f8 <+216>: vmovaps XMMWORD PTR [rdi+r9*1-0x10],xmm0
>>    0x00007fffe02575ff <+223>: mov QWORD PTR [rdi+r9*1],0x0
>>    0x00007fffe0257607 <+231>: vmovaps XMMWORD PTR [rdx+r9*1-0x10],xmm0
>>    0x00007fffe025760e <+238>: vmovupd xmm1,XMMWORD PTR [rdi+r9*1-0x8]
>>    0x00007fffe0257615 <+245>: vaddpd xmm1,xmm1,XMMWORD PTR [rdx+r9*1-0x8]
>>    0x00007fffe025761c <+252>: vmovupd XMMWORD PTR [rdx+r9*1-0x8],xmm1
>>    0x00007fffe0257623 <+259>: mov r8,r11
>>    0x00007fffe0257626 <+262>: sar r8,0x20
>>    0x00007fffe025762a <+266>: lea rsi,[r8+r8*2]
>>
>> Removing any of the for loops or the localResult variable (or removing
>> its __local attribute) leads to the kernel working on Pocl.
>> It would be great to get to the source of this.
>> Please let me know if you
>> need more information from me.
>>
>> Best wishes
>>
>> Timo
>>
>>
>> On Wed, 6 Mar 2019 at 21:21, Timo Betcke <tim...@gm...> wrote:
>>
>> Hi Pekka,
>>
>> thanks for your hints and the link. I had one buffer in the kernel call
>> that had a cast from a float type to a vector type. I have fixed this. But
>> the segfault remains. In the next few days I will try to cook up a simple
>> example that produces the segfault. Fortunately, the kernel itself is not
>> too complicated, so should be able to reduce it.
>>
>> Best wishes
>>
>> Timo
>>
>> On Wed, 6 Mar 2019 at 10:20, Pekka Jääskeläinen (TAU) <
>> pek...@tu...> wrote:
>>
>> Yes, now that I look at it more closely,
>> your stack trace looks _very_ much to the common data alignment
>> issues people have. I think this might be worth a FAQ item somewhere.
>>
>>
>> https://stackoverflow.com/questions/5983389/how-to-align-stack-at-32-byte-boundary-in-gcc
>>
>> On 6.3.2019 8.45, Pekka Jääskeläinen (TAU) wrote:
>> > Hi Timo,
>> >
>> > Shooting in the dark here, but since just yesterday I debugged a
>> > similar looking issue which was caused by an illegal cast in the
>> > source code from float* to float4*. It trusted the alignment is still
>> > fine, which it wasn't after vectorization. A very target specific
>> > programming error which many ocl targets can easily hide.
>> >
>> > If this is something else, we need a test case, smaller the better, to
>> > help you here.
>> > Before opening an issue though, please with the latest master and
>> > LLVM 8.
>> >
>> > Pekka
>> >
>> > ------------------------------------------------------------------------
>> > *From:* Timo Betcke <tim...@gm...>
>> > *Sent:* Tuesday, March 5, 2019 11:27:12 PM
>> > *To:* Portable Computing Language development discussion
>> > *Subject:* [pocl-devel] POCL Crash in vmovaps operation
>> > Dear Pocl community,
>> >
>> > I was just testing the newest Pocl Version (github master branch) with
>> > our software.
>> > During execution of one of our kernels Pocl crashed.
>> > Disassembling the crash shows the following operations during the crash:
>> >
>> > ------------------
>> >    0x00007fffb81efdd8 <+664>: vmulpd xmm2,xmm2,xmm6
>> >    0x00007fffb81efddc <+668>: vsubpd xmm2,xmm5,xmm2
>> >    0x00007fffb81efde0 <+672>: vpermilpd xmm5,xmm4,0x1
>> >    0x00007fffb81efde6 <+678>: vmulsd xmm3,xmm3,xmm5
>> >    0x00007fffb81efdea <+682>: vmulsd xmm4,xmm15,xmm4
>> >    0x00007fffb81efdee <+686>: vsubsd xmm3,xmm3,xmm4
>> >    0x00007fffb81efdf2 <+690>: vpermilpd xmm1,xmm1,0x1
>> >    0x00007fffb81efdf8 <+696>: vmulpd xmm0,xmm0,xmm1
>> >    0x00007fffb81efdfc <+700>: vpermilpd xmm1,xmm0,0x1
>> >    0x00007fffb81efe02 <+706>: vsubsd xmm0,xmm0,xmm1
>> >    0x00007fffb81efe06 <+710>: lea rsi,[rdx+rdx*2]
>> >    0x00007fffb81efe0a <+714>: mov rdx,QWORD PTR [rbx+0x38]
>> > => 0x00007fffb81efe0e <+718>: vmovaps XMMWORD PTR [rdx+rsi*8],xmm12
>> > ---Type <return> to continue, or q <return> to quit---
>> >    0x00007fffb81efe13 <+723>: mov QWORD PTR [rbx+0x40],rsi
>> >    0x00007fffb81efe17 <+727>: mov QWORD PTR [rdx+rsi*8+0x10],0x0
>> >    0x00007fffb81efe20 <+736>: vinsertf32x4 ymm1,ymm16,xmm0,0x1
>> > -----------------------------
>> > This seems to be a similar bug that I discussed a year ago on the
>> > mailing list. See the thread here:
>> > https://www.mail-archive.com/poc...@li.../msg01087.html.
>> >
>> > In summary, the issue was related to us using arrays of arrays within
>> > our kernels and pocl creating wrong code for it.
>> >
>> > During that time a gist was suggested for Pocl, which I tested but did
>> > not improve things. Afterwards I let it drop for a while as we were in
>> > early development and had loads of building sites. But our software is
>> > now close to release ready and it would be great to get it working with
>> > pocl.
>> >
>> > Any help would be greatly appreciated.
>> >
>> > Best wishes
>> >
>> > Timo
>> >
>> > --
>> > Timo Betcke
>> > Professor of Computational Mathematics
>> > University College London
>> > Department of Mathematics
>> > E-Mail: t.b...@uc... <mailto:t.b...@uc...>
>> > Tel.: +44 (0) 20-3108-4068
>> >
>> >
>> > _______________________________________________
>> > pocl-devel mailing list
>> > poc...@li...
>> > https://lists.sourceforge.net/lists/listinfo/pocl-devel
>> >
>>
>> --
>> Pekka
>>
>>
>> _______________________________________________
>> pocl-devel mailing list
>> poc...@li...
>> https://lists.sourceforge.net/lists/listinfo/pocl-devel
>>
>>
>>
>> --
>> Timo Betcke
>> Professor of Computational Mathematics
>> University College London
>> Department of Mathematics
>> E-Mail: t.b...@uc...
>> Tel.: +44 (0) 20-3108-4068
>>
>>
>>
>> --
>> Timo Betcke
>> Professor of Computational Mathematics
>> University College London
>> Department of Mathematics
>> E-Mail: t.b...@uc...
>> Tel.: +44 (0) 20-3108-4068
>> _______________________________________________
>> pocl-devel mailing list
>> poc...@li...
>> https://lists.sourceforge.net/lists/listinfo/pocl-devel
>>
>
>
> --
> Timo Betcke
> Professor of Computational Mathematics
> University College London
> Department of Mathematics
> E-Mail: t.b...@uc...
> Tel.: +44 (0) 20-3108-4068
>

--
Timo Betcke
Professor of Computational Mathematics
University College London
Department of Mathematics
E-Mail: t.b...@uc...
Tel.: +44 (0) 20-3108-4068