
Seg-fault on compilation of custom ViennaCL code through ocl::backend

Steve B
2017-08-21
  • Steve B

    Steve B - 2017-08-21

    TL;DR: I'm getting a segfault as soon as the executable starts. The backtrace goes through ViennaCL's ocl::backend into libstdc++.so.6. It's unclear to me whether this is a ViennaCL issue, a package-management issue on my side, or something else.

    I'm currently attempting to add GPU support to the Stan-math project. The easiest way to do this has been to use ViennaCL's decompositions and custom kernel features. Everything works and all the tests pass when I compile Stan's GPU tests with clang++. But when I switch over to g++, the test binary segfaults at startup.

    You can view the project here and check out the branch gpu_choleksy to run the following example:

    (The part of the PR relevant to this discussion begins here)

    g++ -I . -isystem lib/eigen_3.3.3 -isystem lib/boost_1.62.0 -isystemlib/cvodes_2.9.0/include -Wall -DBOOST_RESULT_OF_USE_TR1 -DBOOST_NO_DECLTYPE -DBOOST_DISABLE_ASSERTS -DNO_FPRINTF_OUTPUT -pipe   -c -O0 -g -DGTEST_USE_OWN_TR1_TUPLE -fPIC -DGTEST_HAS_PTHREAD=0 -isystem lib/gtest_1.7.0/include -isystem lib/gtest_1.7.0 lib/gtest_1.7.0/src/gtest-all.cc -o test/gtest.o
    ar rv test/libgtest.a test/gtest.o
    
    g++ -I . -isystem lib/eigen_3.3.3 -isystem lib/boost_1.62.0 -isystemlib/cvodes_2.9.0/include -isystem lib/viennacl_1.7.1 -Wall -DBOOST_RESULT_OF_USE_TR1 -DBOOST_NO_DECLTYPE -DBOOST_DISABLE_ASSERTS -DNO_FPRINTF_OUTPUT -pipe   -c -O0 -g -DGTEST_USE_OWN_TR1_TUPLE -DGTEST_HAS_PTHREAD=0 -D STAN_GPU -isystem lib/gtest_1.7.0/include -isystem lib/gtest_1.7.0 test/unit/math/rev/mat/fun/cholesky_decompose_gpu_test.cpp -o test/unit/math/rev/mat/fun/cholesky_decompose_gpu_test.o -lOpenCL
    
    g++ -I . -isystem lib/eigen_3.3.3 -isystem lib/boost_1.62.0 -isystemlib/cvodes_2.9.0/include -Wall -DBOOST_RESULT_OF_USE_TR1 -DBOOST_NO_DECLTYPE -DBOOST_DISABLE_ASSERTS -DNO_FPRINTF_OUTPUT -pipe  -lpthread -D STAN_GPU  -O0 -g lib/gtest_1.7.0/src/gtest_main.cc test/unit/math/rev/mat/fun/cholesky_decompose_gpu_test.o -DGTEST_USE_OWN_TR1_TUPLE -DGTEST_HAS_PTHREAD=0 -isystem lib/gtest_1.7.0/include -isystem lib/gtest_1.7.0 -o test/unit/math/rev/mat/fun/cholesky_decompose_gpu_test -lOpenCL test/libgtest.a  lib/cvodes_2.9.0/lib/libsundials_nvecserial.a lib/cvodes_2.9.0/lib/libsundials_cvodes.a
    
    rm test/unit/math/rev/mat/fun/cholesky_decompose_gpu_test.o 
    
    gdb test/unit/math/rev/mat/fun/cholesky_decompose_gpu_test
    ...
    gdb stuff
    ...
    (gdb) start
    Temporary breakpoint 1 at 0xfbdf: file lib/gtest_1.7.0/src/gtest_main.cc, line 35.
    Starting program: /home/steve/open_source/open_projects/stan/math/test/unit/math/rev/mat/fun/cholesky_decompose_gpu_test 
    
    Program received signal SIGSEGV, Segmentation fault.
    0x00007ffff78eb3fa in ?? () from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
    (gdb) bt
    #0  0x00007ffff78eb3fa in ?? () from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
    #1  0x00005555555f87e3 in std::_Rb_tree_iterator<std::pair<long const, bool> >::operator-- (this=0x7fffffffccb0) at /usr/include/c++/6/bits/stl_tree.h:224
    #2  0x00005555555f8732 in std::_Rb_tree<long, std::pair<long const, bool>, std::_Select1st<std::pair<long const, bool> >, std::less<long>, std::allocator<std::pair<long const, bool> > >::_M_get_insert_unique_pos (this=0x55555595d380 <viennacl::ocl::backend<false>::initialized_>, __k=@0x555555982280: 0) at /usr/include/c++/6/bits/stl_tree.h:1845
    #3  0x00005555555d9bc2 in std::_Rb_tree<long, std::pair<long const, bool>, std::_Select1st<std::pair<long const, bool> >, std::less<long>, std::allocator<std::pair<long const, bool> > >::_M_get_insert_hint_unique_pos (this=0x55555595d380 <viennacl::ocl::backend<false>::initialized_>, __position={first = 0, second = false}, __k=@0x555555982280: 0)
        at /usr/include/c++/6/bits/stl_tree.h:1942
    #4  0x00005555555c1def in std::_Rb_tree<long, std::pair<long const, bool>, std::_Select1st<std::pair<long const, bool> >, std::less<long>, std::allocator<std::pair<long const, bool> > >::_M_emplace_hint_unique<std::piecewise_construct_t const&, std::tuple<long const&>, std::tuple<> >(std::_Rb_tree_const_iterator<std::pair<long const, bool> >, std::piecewise_construct_t const&, std::tuple<long const&>&&, std::tuple<>&&) (this=0x55555595d380 <viennacl::ocl::backend<false>::initialized_>, __pos={first = 0, second = false}, __args#0=..., 
        __args#1=<unknown type in /home/steve/open_source/open_projects/stan/math/test/unit/math/rev/mat/fun/cholesky_decompose_gpu_test, CU 0x63b9, DIE 0x25b550>, 
        __args#2=<unknown type in /home/steve/open_source/open_projects/stan/math/test/unit/math/rev/mat/fun/cholesky_decompose_gpu_test, CU 0x63b9, DIE 0x2a1b0d>)
        at /usr/include/c++/6/bits/stl_tree.h:2200
    #5  0x00005555555b25a1 in std::map<long, bool, std::less<long>, std::allocator<std::pair<long const, bool> > >::operator[] (
        this=0x55555595d380 <viennacl::ocl::backend<false>::initialized_>, __k=@0x7fffffffce68: 0) at /usr/include/c++/6/bits/stl_map.h:483
    #6  0x000055555559dc1b in viennacl::ocl::backend<false>::context (id=0) at lib/viennacl_1.7.1/viennacl/ocl/backend.hpp:51
    #7  0x000055555559dbe1 in viennacl::ocl::backend<false>::current_context () at lib/viennacl_1.7.1/viennacl/ocl/backend.hpp:81
    #8  0x00005555555793db in viennacl::ocl::current_context () at lib/viennacl_1.7.1/viennacl/ocl/backend.hpp:215
    #9  0x00005555555700b0 in __static_initialization_and_destruction_0 (__initialize_p=1, __priority=65535) at ./stan/math/prim/mat/fun/ViennaCL.hpp:30
    #10 0x0000555555570ef6 in _GLOBAL__sub_I_in () at test/unit/math/rev/mat/fun/cholesky_decompose_gpu_test.cpp:368
    #11 0x00005555556ab9ed in __libc_csu_init ()
    #12 0x00007ffff6f7d380 in __libc_start_main (main=0x555555563bd0 <main(int, char**)>, argc=1, argv=0x7fffffffe018, init=0x5555556ab9a0 <__libc_csu_init>, fini=<optimized out>, 
        rtld_fini=<optimized out>, stack_end=0x7fffffffe008) at ../csu/libc-start.c:247
    #13 0x0000555555563aca in _start ()
    

    I'm fairly new to this level of depth in terms of 'things going very wrong'.

    1. Does this appear to be a ViennaCL issue or a separate compilation issue?
    2. Is there any other information I can give you that can help you help me resolve this?

    I understand you are all very busy and I appreciate any time you can give to look at this.

    Regards,

    Steve Bronder

     
  • Karl Rupp

    Karl Rupp - 2017-08-22

    Hi Steve,
    from what you describe it looks a lot like there's a problem with a singleton. More precisely, we use static member variables to hold references to the OpenCL contexts. Apparently there is some problem when you link the object files together. I don't know why the error shows up with GCC, but not with Clang.

    Possible workaround: Is it possible for you to put all code that deals with ViennaCL into a single compilation unit?
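
    To illustrate the singleton problem described above: frames #9 and #10 of the backtrace show viennacl::ocl::current_context() being called from a global initializer (ViennaCL.hpp:30), before main() starts. One possible reading, not confirmed in this thread, is that the static map backend<false>::initialized_ had not been constructed yet at that point; C++ does not order the initialization of a class template's static data members relative to other globals. The sketch below shows the usual construct-on-first-use idiom that avoids this kind of ordering dependency. All names in it (initialized_registry, mark_context_initialized, default_context_ready) are hypothetical and not part of ViennaCL.

```cpp
// Sketch of the construct-on-first-use idiom. Every name here is hypothetical
// and chosen for illustration; none of it is ViennaCL's actual API.
#include <map>

// Returns the registry, constructing it on the first call. A function-local
// static is initialized the first time control passes through its declaration,
// so the map is valid even when this is called from another global's
// initializer.
inline std::map<long, bool>& initialized_registry() {
  static std::map<long, bool> registry;
  return registry;
}

// Marks context `id` as initialized and returns the stored flag.
inline bool mark_context_initialized(long id) {
  initialized_registry()[id] = true;
  return initialized_registry()[id];
}

// A namespace-scope object whose initializer runs before main(), similar in
// spirit to the global at ViennaCL.hpp:30 in the backtrace. It is safe here
// because the registry is created on demand instead of relying on the order
// in which globals are initialized.
const bool default_context_ready = mark_context_initialized(0);
```

    Applied to the case above, this would suggest either deferring the current_context() call until main() has started, or keeping the global state behind an accessor function like this one. Whether that is practical for the Stan headers is an open question.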

    Best regards,
    Karli

     
