From: Yajie M. <yaj...@gm...> - 2014-12-14 00:01:21
Hi Jan,

This is very nice work! In our PDNN toolkit, we also have simple Python
wrappers to read and write Kaldi features, mainly for DNN training. Your
implementation looks like a more comprehensive version.

Do you have functions or commands for feature splicing? I ask because we
found splicing on the fly in Python to be highly expensive. That's why we
still stick to PFiles instead of Kaldi features (.scp, .ark) for DNN
training. I am very interested to know the efficiency of your splicing
implementation.

Thanks,
Yajie

On Sat, Dec 13, 2014 at 5:59 PM, Daniel Povey <dp...@gm...> wrote:
> OK, thanks.
> cc'ing Yajie in case he wants to comment.
> Dan
>
> On Sat, Dec 13, 2014 at 2:31 PM, Jan Chorowski <jan...@gm...> wrote:
> > Hi All,
> >
> > the wrapper is built during Kaldi compilation. I build it using the
> > provided Makefile. The build depends on:
> > 1. Python and numpy (by default it queries the python interpreter
> >    found on the path for the header file location)
> > 2. Boost with the Boost::Python library. It is quite heavy to build,
> >    but most Linux distributions ship it. Boost python doesn't require
> >    any code generation steps; the wrapper is defined in a normal C++
> >    code file.
> >
> > During the build, the Python and Boost libraries and the Kaldi object
> > files are linked into a CPython extension module,
> > kaldi/src/python/kaldi_io_internal.so. It works with both static and
> > shared Kaldi builds. Further usage requires that python finds
> > kaldi_io.py and kaldi_io_internal.so on the PYTHONPATH - their
> > directory can, for example, be added to the PYTHONPATH variable in
> > the path.sh script of a recipe.
> >
> > Jan
> >
> > On 12/13/2014 3:33 PM, Daniel Povey wrote:
> >> Also, Jan- could you send us an email explaining how this works-
> >> How does Python "see" the C++ headers? Do you have to invoke some
> >> special program, like swig? Do you have to write some special kind of
> >> header that shows how the C++ objects are to be interpreted by python?
> >> A brief example would be helpful, if so.
> >> How is the resulting program linked, if at all? If you require
> >> functions from C++ libraries, are these obtained from the .a or .so
> >> files at runtime, or compiled into some kind of executable-like blob
> >> at compile time? Does your framework require that Kaldi be compiled
> >> using dynamic (.so) libraries?
> >>
> >> Dan
> >>
> >> On Sat, Dec 13, 2014 at 12:04 PM, Jan Chorowski <jan...@gm...> wrote:
> >>> Hello Dan,
> >>>
> >>> thank you for the comments. I tried to make it in the Kaldi spirit;
> >>> consistency is important. Of course, the scripts can be removed and
> >>> replaced with some more useful examples. I don't have too much
> >>> experience with bridging Python to C++, so any critique of the
> >>> wrappers and the approach taken is welcome.
> >>>
> >>> Jan
> >>>
> >>> On 12/13/2014 2:55 PM, Daniel Povey wrote:
> >>>> Hi all.
> >>>> From a first look, it does look very impressive, and nicely
> >>>> documented.
> >>>> I would appreciate it if people on the list who have Python
> >>>> experience would comment on this- you can either reply to this
> >>>> thread, or to me. I don't know if this has been done in the
> >>>> "natural" way, or if there is some reason why people in the future
> >>>> will say, "why did you do it this way, you should have done XXX".
> >>>>
> >>>> Jan:
> >>>> in the scripts/ directory you seem to have some examples of how you
> >>>> can create python programs that behave very much like Kaldi
> >>>> command-line programs, using your framework. This is very useful.
> >>>> However, the programs
> >>>>   apply-global-cmvn.py
> >>>>   compute-global-cmvn-stats.py
> >>>> are perhaps a little confusing because they provide the same
> >>>> functionality that you could get with "compute-cmvn-stats ->
> >>>> matrix-sum" and "apply-cmvn" on the output of that command; and they
> >>>> do so using different formats for the CMVN information. I know the
> >>>> format of storing the CMVN stats in a two-row matrix is perhaps not
> >>>> perfectly ideal, but it's a standard within Kaldi and it would be
> >>>> confusing to deviate from that standard.
> >>>> Of course, this is a very minor issue that doesn't affect the
> >>>> validity of the framework as a whole. I am just pointing this out;
> >>>> the main discussion should be about the framework and whether
> >>>> people feel it's the "right" way to do this.
> >>>>
> >>>> Dan
> >>>>
> >>>> On Sat, Dec 13, 2014 at 6:28 AM, Jan Chorowski <jan...@gm...> wrote:
> >>>>> Hi all!
> >>>>>
> >>>>> I've written wrappers to access Kaldi data files from within Python
> >>>>> using boost::python (the code is on github
> >>>>> https://github.com/janchorowski/kaldi-git/tree/python/src/python).
> >>>>> If you think this would be an interesting addition please instruct
> >>>>> me how to contribute.
> >>>>>
> >>>>> Best Regards,
> >>>>> Jan Chorowski
> >>>>>
> >>>>> _______________________________________________
> >>>>> Kaldi-developers mailing list
> >>>>> Kal...@li...
> >>>>> https://lists.sourceforge.net/lists/listinfo/kaldi-developers
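
For reference, the on-the-fly feature splicing asked about at the top of
this thread can be sketched in plain NumPy. This is only an illustration
under stated assumptions, not code from PDNN or from Jan's wrappers: the
function name splice_frames and its left/right parameters are
hypothetical, and repeating the first and last frames at the utterance
edges is an assumed padding choice.

```python
import numpy as np


def splice_frames(feats, left=4, right=4):
    """Stack every frame with its `left` preceding and `right` following frames.

    `feats` is a (num_frames, dim) array holding one utterance; the result
    has shape (num_frames, dim * (left + 1 + right)).
    Hypothetical helper for illustration only.
    """
    feats = np.asarray(feats)
    num_frames = feats.shape[0]
    # Assumption: pad by repeating the first and last frames so that edge
    # frames still get a full context window.
    padded = np.pad(feats, ((left, right), (0, 0)), mode="edge")
    # Slice k holds, for every output frame t, the padded frame t + k,
    # i.e. the original frame at offset k - left from t.
    windows = [padded[k:k + num_frames] for k in range(left + right + 1)]
    return np.concatenate(windows, axis=1)
```

With 3 frames of dimension 2 and left=1, right=1, the result has shape
(3, 6). The work is done by a handful of array slices and a single
concatenate call, not by a Python loop over frames, which is usually
where the per-frame cost in pure Python comes from.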