From: Ondrej P. <ond...@gm...> - 2014-12-19 11:48:20
Hi Matthew,

I made some subjective comments below.

PS: Note that I like the proposed wrappers, but I am not sure how easy
boost::python is to install on all supported platforms.

On Fri, Dec 19, 2014 at 9:30 AM, Matthew Aylett <mat...@gm...> wrote:
>
> Hi
>
> Apologies, I've been snowed under here.
>
> I haven't had a chance to look over your work. I also don't have any views
> on the 'right' way to do it. My thoughts on this are in a previous thread.
> See subject "Using SWIG to wrap kaldi for python" where I discussed this
> with Ondrej Platek and Vassil Panayotov.
>
> In the idlak branch there is an example of python wrappers that I put
> together some time ago. These are based on SWIG. In the end I didn't need
> this at this stage because in the build system command line executables
> work very well. It's at run time that wrappers are very useful. The
> advantage with SWIG is that much of the same work will also contribute to
> C#, Java, Perl wrappers as well. In my experience the most important were
> Java wrappers to help produce a library for Android. I have no experience
> with C# and moved to Python from Perl so only use Perl in legacy code ;-).
>
> So some questions to consider:
>
> 1. Why is python wrapping required for training? Using sys.Process to run
> command lines, structured output directories etc mirrors the current Perl
> recipes, what is the added benefit in this case?
>

Well, bash and Perl are the current scripting languages for Kaldi. I, for
example, prefer to use Python instead of both of them.

> 2. If it's for run time decoding shouldn't we create a cross-platform C
> API? Perhaps things have changed but C++ APIs were never cross compiler
> compatible in the past so you couldn't do stuff like compile using gnu and
> link in MSN. With a C interface you can distribute libraries. But I am
> possibly out of date on this.
>

Well, I tried that and gave it up, since Kaldi nicely uses OpenFST and I was
not able to wrap OpenFST with just plain C (it may be possible). I used
Cython and pyfst mainly because pyfst solved the problem of wrapping OpenFST
for me, and I am really glad that 99% of the work of wrapping the OpenFST
templates was carried out by somebody else (Victor Chahuneau).

>
> 3. If 2 is correct shouldn't we define our API and wrap that? Producing a
> formal list of functionality that should be exposed to things like client
> and server applications?
>
> I would encourage some care here. Unconstrained wrapping can lead to
> systems which HAVE to use the scripting language (We can already see how
> difficult it is to move away from the Perl scripting if you wish to). Also
> never, never, never reverse wrap (i.e. call python from within C++), yes it
> can be done but that way lies madness.
>
> v best
>
> Matthew
>
>
> On Thu, Dec 18, 2014 at 11:37 PM, Daniel Povey <dp...@gm...> wrote:
>>
>> Jan-
>> I haven't seen any objections to your setup. I'd say we should plan
>> to include it in Kaldi at some point (e.g. within the next few
>> months), but in the meantime hopefully you can continue to work on it,
>> and maybe come up with some other examples of how it's useful to do
>> the interfacing with Python- e.g. some kind of application level or
>> service-level thing?
>> Dan
>>
>>
>> On Sat, Dec 13, 2014 at 4:01 PM, Yajie Miao <yaj...@gm...> wrote:
>> > Hi Jan,
>> > This is very nice work! In our PDNN toolkit, we also have simple python
>> > wrappers to read and write Kaldi features, mainly for DNN training. Your
>> > implementation looks like a more comprehensive version.
>> >
>> > Do you have the functions/commands to do feature splicing? I ask this
>> > because we found doing splicing on the fly with Python highly expensive.
>> > That's why we still stick to PFiles instead of Kaldi features (.scp .ark)
>> > for DNN training.
>> > I am very interested to know the efficiency of your
>> > splicing implementation.
>> >
>> > Thanks,
>> > Yajie
>> >
>> > On Sat, Dec 13, 2014 at 5:59 PM, Daniel Povey <dp...@gm...> wrote:
>> >>
>> >> OK, thanks.
>> >> cc'ing Yajie in case he wants to comment.
>> >> Dan
>> >>
>> >>
>> >> On Sat, Dec 13, 2014 at 2:31 PM, Jan Chorowski <jan...@gm...> wrote:
>> >> > Hi All,
>> >> >
>> >> > the wrapper is built during Kaldi compilation. I build it using the
>> >> > provided Makefile. The build depends on:
>> >> > 1. Python and numpy (by default it queries the python interpreter
>> >> > found on the path for header file location)
>> >> > 2. Boost with Boost::Python library. It is quite heavy to build, but
>> >> > most Linux distributions ship it. Boost python doesn't require any
>> >> > code generation steps, the wrapper is defined in a normal c++ code
>> >> > file.
>> >> >
>> >> > During build Python and Boost libraries and Kaldi object files are
>> >> > linked into a CPython extension module,
>> >> > kaldi/src/python/kaldi_io_internal.so. It works with both static and
>> >> > shared Kaldi builds. Further usage requires that python finds
>> >> > kaldi_io.py and kaldi_io_internal.so on the PYTHONPATH - it can be
>> >> > for example added to the PYTHONPATH variable in the path.sh script
>> >> > of a recipe.
>> >> >
>> >> > Jan
>> >> >
>> >> >
>> >> > On 12/13/2014 3:33 PM, Daniel Povey wrote:
>> >> >>
>> >> >> Also, Jan- could you send us an email explaining how this works-
>> >> >> How does Python "see" the C++ headers? Do you have to invoke some
>> >> >> special program, like swig? Do you have to write some special kind
>> >> >> of header that shows how the C++ objects are to be interpreted by
>> >> >> python? A brief example would be helpful, if so.
>> >> >> How is the resulting program linked, if at all?
>> >> >> If you require functions from C++ libraries, are these obtained
>> >> >> from the .a or .so files at runtime, or compiled into some kind of
>> >> >> executable-like blob at compile time? Does your framework require
>> >> >> that Kaldi be compiled using dynamic (.so) libraries?
>> >> >>
>> >> >> Dan
>> >> >>
>> >> >>
>> >> >> On Sat, Dec 13, 2014 at 12:04 PM, Jan Chorowski <jan...@gm...>
>> >> >> wrote:
>> >> >>>
>> >> >>> Hello Dan,
>> >> >>>
>> >> >>> thank you for the comments. I tried to make it in the Kaldi spirit,
>> >> >>> consistency is important. Of course, the scripts can be removed and
>> >> >>> replaced with some more useful examples. I don't have too much
>> >> >>> experience with bridging Python to C++, so any critique on the
>> >> >>> wrappers and the approach taken is welcome.
>> >> >>>
>> >> >>> Jan
>> >> >>>
>> >> >>>
>> >> >>> On 12/13/2014 2:55 PM, Daniel Povey wrote:
>> >> >>>>
>> >> >>>> Hi all.
>> >> >>>> From a first look, it does look very impressive, and nicely
>> >> >>>> documented.
>> >> >>>> I would appreciate it if people on the list who have Python
>> >> >>>> experience would comment on this- you can either reply to this
>> >> >>>> thread, or to me. I don't know if this has been done in the
>> >> >>>> "natural" way, or if there is some reason why people in the
>> >> >>>> future will say, "why did you do it this way, you should have
>> >> >>>> done XXX".
>> >> >>>>
>> >> >>>> Jan:
>> >> >>>> in the scripts/ directory you seem to have some examples of how
>> >> >>>> you can create python programs that behave very much like Kaldi
>> >> >>>> command-line programs, using your framework. This is very useful.
>> >> >>>> However, the programs
>> >> >>>> apply-global-cmvn.py
>> >> >>>> compute-global-cmvn-stats.py
>> >> >>>> are perhaps a little confusing because they provide the same
>> >> >>>> functionality that you could get with "compute-cmvn-stats ->
>> >> >>>> matrix-sum" and "apply-cmvn" on the output of that command; and
>> >> >>>> they do so using different formats for the CMVN information. I
>> >> >>>> know the format of storing the CMVN stats in a two-row matrix is
>> >> >>>> perhaps not perfectly ideal, but it's a standard within Kaldi and
>> >> >>>> it would be confusing to deviate from that standard.
>> >> >>>> Of course, this is a very minor issue that doesn't affect the
>> >> >>>> validity of the framework as a whole. I am just pointing this
>> >> >>>> out; the main discussion should be about the framework and
>> >> >>>> whether people feel it's the "right" way to do this.
>> >> >>>>
>> >> >>>> Dan
>> >> >>>>
>> >> >>>> On Sat, Dec 13, 2014 at 6:28 AM, Jan Chorowski <jan...@gm...>
>> >> >>>> wrote:
>> >> >>>>>
>> >> >>>>> Hi all!
>> >> >>>>>
>> >> >>>>> I've written wrappers to access Kaldi data files from within
>> >> >>>>> Python using boost::python (the code is on github
>> >> >>>>> https://github.com/janchorowski/kaldi-git/tree/python/src/python).
>> >> >>>>> If you think this would be an interesting addition please
>> >> >>>>> instruct me how to contribute.
>> >> >>>>>
>> >> >>>>> Best Regards,
>> >> >>>>> Jan Chorowski
>> >> >>>>>
>> >> >>>>> ------------------------------------------------------------------------------
>> >> >>>>> Download BIRT iHub F-Type - The Free Enterprise-Grade BIRT Server
>> >> >>>>> from Actuate!
>> >> >>>>> Instantly Supercharge Your Business Reports and Dashboards
>> >> >>>>> with Interactivity, Sharing, Native Excel Exports, App Integration & more
>> >> >>>>> Get technology previously reserved for billion-dollar corporations, FREE
>> >> >>>>>
>> >> >>>>> http://pubads.g.doubleclick.net/gampad/clk?id=164703151&iu=/4140/ostg.clktrk
>> >> >>>>> _______________________________________________
>> >> >>>>> Kaldi-developers mailing list
>> >> >>>>> Kal...@li...
>> >> >>>>> https://lists.sourceforge.net/lists/listinfo/kaldi-developers

--
Ondřej Plátek, +420 737 758 650, skype:ondrejplatek, ond...@gm...
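
On question 1 in the thread (running command lines from Python in the way the Perl recipes do), here is a minimal sketch of what that could look like with the standard `subprocess` module. It assumes that the "sys.Process" mentioned above means something of this kind. `run_pipeline` is a hypothetical helper, not part of Kaldi or of the wrappers discussed here, and the two stand-in commands exist only so that the example runs without Kaldi installed.

```python
import subprocess
import sys


def run_pipeline(commands, input_bytes=b""):
    """Run argv lists one after another, feeding each stdout to the next stdin.

    Behaves like `cmd1 | cmd2 | ...`, but buffers every stage in memory,
    which is fine for a sketch and too wasteful for large feature archives.
    """
    data = input_bytes
    for argv in commands:
        # check=True raises CalledProcessError on a non-zero exit status,
        # similar to `set -e` in a bash recipe.
        completed = subprocess.run(
            argv, input=data, stdout=subprocess.PIPE, check=True
        )
        data = completed.stdout
    return data


# Stand-in commands (hypothetical); in a recipe these would be argv lists
# for Kaldi command-line programs instead.
upper = [sys.executable, "-c",
         "import sys; sys.stdout.write(sys.stdin.read().upper())"]
reverse = [sys.executable, "-c",
           "import sys; sys.stdout.write(sys.stdin.read()[::-1])"]

result = run_pipeline([upper, reverse], b"kaldi")
print(result.decode())
```

This covers only the "glue" use case. It does not give Python direct access to Kaldi matrices, which is what the boost::python wrappers are meant to add.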