From: Amit B. <ami...@gm...> - 2015-03-31 11:51:45
Thanks Jan and Dan for the help.

On Mon, Mar 30, 2015 at 8:24 PM, Daniel Povey <dp...@gm...> wrote:

> The configure script is supposed to work with the libraries named
> libsatlas.so.3 and libtatlas.so.3, but perhaps it hasn't been set up to
> work with just the .so extension. These versions of the libraries are
> called "fat" libraries. See the part linux_configure_redhat_fat of the
> script. What is your OS? Red Hat? Ubuntu? Debian? And what is the full
> pathname of your installed libraries?

I see this, but I'm using Ubuntu and the installed library is under
/usr/local/atlas/lib/libtatlas.so, so linux_configure_redhat_fat doesn't
find it, and neither does linux_configure_dynamic with ATLASROOT. I'm
patching it now and will submit a commit to handle cases like mine.

> As an aside, sometimes there are shell variables that you have to set in
> order to enable threaded operation - you can find them by searching online.
> Just linking against a threaded library is not always sufficient.

I didn't know that, but because my C3 setup worked fine with the same
procedure, I thought it might work on C4 as well. I'll keep this in mind.

> Dan
>
>> Re the problems with compiling atlas -- libsatlas and libtatlas mean that
>> you compiled both the sequential and the threaded versions. Probably you
>> could symlink one or the other as libatlas? Anyway, I think this is
>> probably caused by either the ATLAS configure or the Linux system setup.
>> Usually we advise users just to use the distribution-provided versions
>> of atlas, as the optimizing and compilation is usually a fairly complex
>> process that takes a long time (and usually the speedup gains are not
>> really huge).

You can't replace the "thin" libraries with the "fat" ones, because when
using fat libraries there is no liblapack.so (or the others), only one fat
shared object. So if I change the configure script to look for either all
the thin libraries or one fat library, I believe it will work.
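To make the proposed lookup concrete, here is a rough sketch of the "one fat library, else all the thin ones" check, written in Python rather than in the configure script's shell. The function names, the list of thin library stems, and the accepted suffixes are assumptions for illustration only; the actual configure script may look for a different set.

```python
import os
import tempfile

# Assumed names, for illustration only; the real configure script may
# look for a different set of libraries and suffixes.
FAT_NAMES = ("libtatlas", "libsatlas")              # threaded, sequential
THIN_NAMES = ("liblapack", "libcblas", "libatlas")  # hypothetical thin set
SUFFIXES = (".so", ".so.3")


def find_shared(libdir, stem):
    """Return the first existing path for `stem` with a known suffix, else None."""
    for suffix in SUFFIXES:
        path = os.path.join(libdir, stem + suffix)
        if os.path.isfile(path):
            return path
    return None


def find_atlas_libs(libdir, prefer_threaded=True):
    """Return ("fat", [path]) or ("thin", [paths]); None if neither set is complete."""
    order = FAT_NAMES if prefer_threaded else tuple(reversed(FAT_NAMES))
    # A single fat shared object is enough on its own.
    for stem in order:
        path = find_shared(libdir, stem)
        if path:
            return ("fat", [path])
    # Otherwise every thin library must be present.
    thin = [find_shared(libdir, stem) for stem in THIN_NAMES]
    if all(thin):
        return ("thin", thin)
    return None


if __name__ == "__main__":
    # Small demonstration against a throwaway directory.
    with tempfile.TemporaryDirectory() as libdir:
        open(os.path.join(libdir, "libtatlas.so"), "w").close()
        print(find_atlas_libs(libdir))
```

Checking for the fat library first means a directory like /usr/local/atlas/lib, which holds libtatlas.so but no liblapack.so, is accepted instead of falling through to the static libraries.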
Regarding the distribution-provided version: it performs much worse for
some reason, so I prefer to try and succeed in building it locally :)

>> Kaldi also supports MKL, which in my experience is usually as fast as
>> (sometimes faster than) the open-source alternatives and might be more
>> robust performance-wise.

I'll check it out, thanks.

>> Re the latency: I'm not sure about this. Usually, the BLAS libraries are
>> optimized for performance (GFLOPS), especially for huge matrices, and I'm
>> not sure how this correlates with latency. In an ideal world there would
>> be a 1:1 correspondence, but kaldi typically uses small to medium-size
>> matrices (depending on what part of kaldi you are using).
>> Also, I'm not really convinced this is a BLAS issue -- did you try
>> OpenBLAS on C3? You can always use gprof to help you figure out where
>> the delay really is.

I've tried OpenBLAS on both C3 and C4, and both give approximately the same
results (C4 is slightly faster), but they are inferior to ATLAS.

>> y.
>>
>> On Mon, Mar 30, 2015 at 7:12 AM, Amit Beka <ami...@gm...> wrote:
>>
>>> Hi,
>>>
>>> I'm trying to use kaldi with ATLAS/OpenBLAS on two EC2 machines:
>>> c3.large and c4.large (both have 2 cores, c3 with 2.8GHz and c4 with
>>> 2.9GHz).
>>>
>>> I have compiled ATLAS locally on both machines, and on C3 I have a great
>>> latency of ~130ms for a response (after the last chunk of an utterance is
>>> sent).
>>> On C4 I have a problem compiling ATLAS (both stable and development
>>> versions), and even when I do succeed, I can't seem to compile Kaldi with
>>> it - it only finds the static library, because the shared libraries are
>>> named lib*s*atlas.so and lib*t*atlas.so, and I get this error:
>>>
>>> /usr/local/atlas/lib/liblapack.a(clapack_dgetri.o): In function `clapack_dgetri':
>>> clapack_dgetri.c:(.text+0x2d): undefined reference to `ATL_dGetNB'
>>> /usr/local/atlas/lib/liblapack.a(clapack_sgetri.o): In function `clapack_sgetri':
>>> clapack_sgetri.c:(.text+0x2d): undefined reference to `ATL_sGetNB'
>>>
>>> When I compile with OpenBLAS (using tools/Makefile), I get a latency
>>> which is much worse, around 320ms per utterance, with both serial and
>>> threaded versions (I compiled twice with USE_THREADS=0/1).
>>>
>>> So I wonder - should OpenBLAS and ATLAS have the same performance
>>> when used through kaldi? Am I missing something?
>>>
>>> I configure kaldi with --shared and --threaded-math=yes (+ the mathlib
>>> flags)
>>>
>>> Thanks in advance,
>>> Beka
>>>
>>> ------------------------------------------------------------------------------
>>> Dive into the World of Parallel Programming The Go Parallel Website,
>>> sponsored by Intel and developed in partnership with Slashdot Media, is
>>> your hub for all things parallel software development, from weekly
>>> thought leadership blogs to news, videos, case studies, tutorials and
>>> more. Take a look and join the conversation now.
>>> http://goparallel.sourceforge.net/
>>> _______________________________________________
>>> Kaldi-users mailing list
>>> Kal...@li...
>>> https://lists.sourceforge.net/lists/listinfo/kaldi-users
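On Dan's aside about shell variables for threaded operation: the thread does not name them, so the following is only a sketch under assumptions. OPENBLAS_NUM_THREADS, OMP_NUM_THREADS and MKL_NUM_THREADS are the thread-count variables commonly read by OpenBLAS, OpenMP runtimes and MKL respectively; whether any of them applies depends on which math library the binary was actually linked against. The helper name and the decoder binary in the usage comment are hypothetical.

```python
import os

# Thread-count variables read by OpenBLAS, OpenMP runtimes, and MKL.
# Which one (if any) matters depends on the library actually linked in.
THREAD_VARS = ("OPENBLAS_NUM_THREADS", "OMP_NUM_THREADS", "MKL_NUM_THREADS")


def threaded_env(num_threads, base_env=None):
    """Return a copy of base_env (default: os.environ) with thread counts set."""
    if num_threads < 1:
        raise ValueError("num_threads must be at least 1")
    env = dict(os.environ if base_env is None else base_env)
    for name in THREAD_VARS:
        env[name] = str(num_threads)
    return env


# Usage (hypothetical binary and config names):
#   import subprocess
#   subprocess.run(["./my-decoder", "--config=decode.conf"],
#                  env=threaded_env(2), check=True)
```

Running the same binary once with threaded_env(1) and once with threaded_env(2) is a quick way to see whether the threaded build is actually using both cores on a 2-core instance.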