From: Daniel P. <dp...@gm...> - 2013-12-11 18:27:40
Sorry, changing the list recipient to kaldi-users, which is a real, existing list.
Dan

On Wed, Dec 11, 2013 at 1:16 PM, Daniel Povey <dp...@gm...> wrote:
> Matthew,
>
> This is interesting. People frequently ask me about how to wrap Kaldi in
> scripting languages such as Python.
> I'm cc'ing kaldi-discuss, so it gets archived for web search, and a couple
> more people who I think may be interested.
> For people's reference, the change you just checked in is here:
> http://sourceforge.net/p/kaldi/code/3297/
> Ondrej Platek has another method, based on CFFI, that can be found
> at ^/sandbox/oplatek2/src/pykaldi
> (where ^ is the top of the Kaldi source tree, i.e.
> svn://svn.code.sf.net/p/kaldi/code/), and Danijel Korzinek has used Java, which
> can be found at ^/sandbox/korzinek/src/onlinebin/java-online-audio-client/
> At this point, I'm not sure that I want to settle on one "official" way to
> wrap Kaldi, I'd rather have people demonstrate various ways to do it and
> people can choose what they want for their application.
>
> Dan
>
> On Wed, Dec 11, 2013 at 11:06 AM, Matthew Aylett <mat...@gm...> wrote:
>
>> Hi all
>>
>> I've checked in some code which is a fairly vanilla use of SWIG to wrap
>> kaldi functionality.
>>
>> The logic for this for me was to allow me to easily convert kaldi trees
>> into HTK ones to test v HTS (not a long term functionality so maybe not
>> the best example).
>>
>> The idea was to wrap the io code to avoid rewriting scripts to read in
>> and parse ascii outputs.
>>
>> In effect you give python functions to load the object and traverse it, as
>> well as to access the info that is required.
>>
>> In order to do this I had to add some accessor functions on event-map.h
>>
>> If you check out the idlak branch you will find:
>>
>> idlak-voice-build/pythonlib
>>
>> In here we have:
>>
>> Makefile:
>>
>> This is to wrap the api using SWIG.
>> There is an issue of where it gets
>> the binary swig from (versions can behave differently and error
>> differently). At present it just picks it up from the path which is not
>> ideal.
>>
>> Also it needs to find the python library. This varies by platform (it is
>> present on Mac but not completely obvious). I've only tested this for linux.
>>
>> idlakapi.i:
>>
>> This is the swig interface file. As you will see I don't do anything much
>> in this apart from include headers. Although if you need to add callbacks
>> etc there are ways of doing this here to take into account the python
>> Global Interpreter Lock.
>>
>> idlakapi.h:
>>
>> This is the API I am wrapping. In effect I am adding another layer which
>> then indirectly accesses kaldi structures. I do this because I don't like
>> wrapping .h files that haven't been written for the purpose. In my
>> experience the dependencies get out of hand and you end up wrapping tons
>> more stuff than you want. In this approach you add functions as you require
>> them for real python (or java etc) applications.
>>
>> idlakapi.cc:
>>
>> This is the code which directly accesses the kaldi library functions.
>>
>> example.py:
>>
>> A little python program which reads in a context dependency tree and
>> prints it out. In effect duplicating copy-tree --binary=false in.tree -
>>
>> (well almost, it adds a CR at the end of the file)
>>
>> However this does it by traversing and accessing the EventMaps in the
>> tree. Currently only works on binary trees.
>>
>> Questions:
>>
>> Why wrap at all?
>>
>> Well, handy for things like realtime decoders. For me I find it a very
>> quick way of developing new stuff because you can jump into python
>> and use a whole load of available stuff and jump back to C++, then you write
>> the C++ once it works.
>>
>> However it is a bit dangerous with kaldi because you have a nice command
>> line style format and if mis-handled it could lead to a much dirtier looking
>> code base.
>> The last thing I'd want to see is the way festival used Scheme.
>> So an absolute ban on scary things like C++ code calling python programs.
>>
>> For the voice building process it will speed up development for me but if
>> you guys want me to replace this with command line versions I can do so.
>> Probably should in the long run anyway.
>>
>> Why wrap like this?
>>
>> Well I think SWIG is good because it allows you to wrap in multiple
>> languages (even Perl ;-)
>>
>> I like it to be clear what is being wrapped and to keep it separate from
>> the main code base. I would suggest the same thing with producing a C API
>> as well.
>>
>> But there are a lot of ways to skin this and it's really up to the
>> community to decide which way they want to do it (if at all). Here is an
>> example. If the consensus is to do it a different way I'm happy to change
>> it but I think it's useful to have a working example. I am by no means a
>> Makefile expert either so this may need changing, it is basically assuming
>> a load of static libraries and pretending the API is like a bin program.
>>
>> By the way is anyone now keeping my branch up to date with the trunk?
>> Arnab was doing so but he may not have time now. I can do it if you remind
>> me what the commands are.
>>
>> Happy Xmas etc.
>>
>> Matthew
>>
>> On Tue, Jul 16, 2013 at 4:00 PM, ondrej platek <ond...@se...> wrote:
>>
>>> Thanks, for a very positive answer and useful information. I should
>>> learn SWIG probably :)
>>>
>>> I am very interested in the example SWIG wrapping into idlak.
>>>
>>> Cheers
>>>
>>> Ondra
>>>
>>> On Tue, Jul 16, 2013 at 4:27 PM, Matthew Aylett <mat...@ce...> wrote:
>>>
>>>> Hi
>>>>
>>>> See below
>>>>
>>>>> I will add very quick notes about SWIG in the following pros and cons list:
>>>>>
>>>>> Pros:
>>>>> - Multiple languages - not interesting for me
>>>>
>>>> Fair enough but might be useful for kaldi community
>>>>
>>>>> - Once the makefiles are set up it works - interesting for me
>>>>
>>>> Well that ain't so easy but there should be examples. I have tended to
>>>> use cmake to do the heavy lifting for me here but it's not ideal and not
>>>> used in kaldi.
>>>>
>>>>> Cons
>>>>> - If the C/C++ API is changed you need to change it in 3 places (1
>>>>> more place for *.i)
>>>>
>>>> 1 place in the C API. The header file is included into SWIG not copied.
>>>> However...
>>>>
>>>>> What's worse, if the API changes just in C++ and somebody forgets to
>>>>> update the interface file,
>>>>
>>>> Yes this is an issue. But one that comes with a C API. So not specific
>>>> to SWIG. Maintaining a separate C API requires keeping it updated with the
>>>> underlying C++ libs. This can be more problematic if you, say, define a
>>>> struct once for an internal C header and then again for an external one
>>>> (typically the compiler won't spot that either so you are in trouble).
>>>>
>>>> I would say that having the external C API is good discipline if we are
>>>> serious about cross platform support but maybe not really an issue at this time.
>>>>
>>>>> I am unable to debug it (with cffi I successfully used gdb already)
>>>>> -> if you can I will be really interested.
>>>>
>>>> No, that's fine. You just gdb python and run the code in there. Can be
>>>> messy if the bug is in the interface (very unusual) but fine for the
>>>> underlying C code. Hence my use of this approach for rapid prototyping.
>>>>
>>>>> - Heavier dependency than cffi-python (I guess it is arguable).
>>>>
>>>> I think SWIG is self contained but could be wrong.
>>>>
>>>>> I use quite a lot of machines (5-6) without sudo access where there is
>>>>> a very old SWIG.
>>>>
>>>> It is easy to download and compile locally without sudo. You just need
>>>> it on your path to run.
>>>>
>>>>> I can install cffi very easily. Actually I wrote an installation script
>>>>> for it (True, it requires Python headers and libffi. Hopefully the libffi
>>>>> dependency disappears in future.)
>>>>>
>>>>> To conclude, the main objection against SWIG is
>>>>> that for developing interfaces (which change quickly) SWIG is a heavy
>>>>> choice for me.
>>>>> With cffi-python it seems to me that I can work much faster.
>>>>
>>>> I think if you are building a clean C API then it is easy to switch to
>>>> SWIG at a later date if required anyway. I could use SWIG for idlak and we
>>>> could compare and then go with whichever the community thought best in the
>>>> long run.
>>>>
>>>> At some point I'll check in an example SWIG wrapping into idlak for
>>>> people to look at.
>>>>
>>>> All the very best
>>>>
>>>> M.
>>>>
>>>>> Best
>>>>>
>>>>> Ondra
>>>>>
>>>>> PS: With cffi-python the docs are quite good but the included demos
>>>>> give you a better idea how it can be used:
>>>>> https://bitbucket.org/cffi/cffi/src/ab9e53ebcfb97eb8bee8d15cac57c3f41d42f6f4/demo?at=default
>>>>> PS2: I would like to learn how to use SWIG effectively. I was just
>>>>> incapable of doing it so far :)
>>>>>
>>>>> On Tue, Jul 16, 2013 at 2:54 PM, Matthew Aylett <mat...@gm...> wrote:
>>>>> Hi
>>>>> No, I prefer to be involved because I will want to wrap Idlak too and will
>>>>> need to do it the same way.
>>>>>
>>>>> I very much agree that if we wrap code it should be round a clear C
>>>>> API. As such how it's wrapped becomes less of an issue. Wrapping C++ tends
>>>>> to make memory management less explicit.
>>>>>
>>>>> Doing this is most problematic if 1.
>>>>> you want to return anything other than values or 2. if you need callbacks.
>>>>>
>>>>> On returning things like char * for example you need to decide on a
>>>>> standard approach. What I do is make sure there is a structure or class
>>>>> which manages the memory, then have a formal python function to create it
>>>>> into a python pointer and formally delete it when done. Then use C API
>>>>> calls to access any data.
>>>>>
>>>>> For callbacks the approach is more problematic because of things like
>>>>> the Global Interpreter Lock in Python.
>>>>>
>>>>> If you want to write the callback in python you end up with Python
>>>>> calls C calls Python returns to C returns to Python. However I'm guessing
>>>>> we don't use callbacks much in kaldi (although if you write a realtime
>>>>> decoder that might well change).
>>>>>
>>>>> Funnily enough Ondrej, your point about it all being in C as a
>>>>> positive is not so much for me. My experience of SWIG has been very good
>>>>> (once you get the makefiles to work). For a standard C API just running it
>>>>> over the header file is enough to generate everything you need. (Care
>>>>> required for 1 and yes SWIG specific code required for 2.) So you don't
>>>>> generally need to know anything about the SWIG language to wrap 99% of
>>>>> everything.
>>>>>
>>>>> Because it produces python versions of the functions you can use
>>>>> command line completion (which I can't generally live without in python).
>>>>>
>>>>> However the best thing about it is that wrapping Java, C# is just as
>>>>> easy (with the caveat about 2. I had to write java specific code for
>>>>> callbacks, in particular to support the Android compiler.)
>>>>>
>>>>> However 90% of the work is making the C style API so if that's done it's
>>>>> easy enough to change to SWIG if you want to wrap Java etc.
>>>>>
>>>>> I can see where you are coming from but in my experience when you wrap
>>>>> code it is precisely because the user DOESN'T know C.
>>>>> Also the SWIG wrapping is so fast and easy I have used it often in
>>>>> development to try out C routines and code, hacking python for all the
>>>>> IO display stuff.
>>>>>
>>>>> So for SWIG an example may be the following in the header:
>>>>>
>>>>> typedef struct CPRC_buf CPRC_buf;
>>>>>
>>>>> /** Create a buffer. (if max_sz == -1 unlimited size)*/
>>>>> extern CPRC_buf * CPRC_buf_new(int min_sz, int max_sz);
>>>>> /** Remove a buffer and its associated memory. */
>>>>> extern void CPRC_buf_delete(CPRC_buf * buf);
>>>>> /** Add a string emsg to the buffer. */
>>>>> extern void CPRC_buf_add(CPRC_buf *buf, const char * emsg);
>>>>> /** Empty the buffer. */
>>>>> extern const char * CPRC_buf_clear(CPRC_buf * buf);
>>>>> /** Return a pointer to the contents of the buffer. */
>>>>> extern const char * CPRC_buf_get(CPRC_buf * buf);
>>>>>
>>>>> and the .i file:
>>>>>
>>>>> /*file: mymod.i */
>>>>> %feature("autodoc", "1");
>>>>>
>>>>> %module mymod
>>>>> %{
>>>>> extern "C" {
>>>>> #include "mymod.h"
>>>>> }
>>>>> %}
>>>>>
>>>>> extern "C" {
>>>>> #include "mymod.h"
>>>>> }
>>>>> %include "mymod.h"
>>>>>
>>>>> That's it. Add another function to the header and it just automatically
>>>>> wraps it.
>>>>>
>>>>> On saying all this the big issue is still the vanilla C API and
>>>>> consistency. I just think you may have been a bit quick to dismiss SWIG.
>>>>>
>>>>> However I have a very poor grasp of most of the code you are
>>>>> discussing so take my comments with that in mind.
>>>>>
>>>>> Very best
>>>>>
>>>>> Matthew
>>>>>
>>>>> On Tue, Jul 16, 2013 at 9:39 AM, Vassil Panayotov <vas...@gm...> wrote:
>>>>> Hi,
>>>>>
>>>>> On Tue, Jul 16, 2013 at 1:34 AM, ondrej platek <ond...@se...> wrote:
>>>>> > I am CCopying Vassil, because he might be interested in the Python/C
>>>>> > interface of his online decoder.
>>>>>
>>>>> Ondra, thanks for adding me.
>>>>>
>>>>> I think I was somewhat confused at first about the scope of your
>>>>> project.
>>>>> I thought that you were working on a fine grained Python API, which
>>>>> gives direct access to the core classes in the online decoder. I only
>>>>> glanced through your code, but by looking at KaldiDecoderWrapper it
>>>>> seems to me it's more like the user gets access to a commonly used
>>>>> decoder configuration - i.e. KaldiDecoderWrapper::Setup() is basically
>>>>> the same as the main() functions of the sample binaries in onlinebin/.
>>>>> It seems to me that, in that regard, the functionality provided by
>>>>> your code is very similar to that of Tanel Alumae's gst-plugin
>>>>> (GStreamer has Python bindings). Just an observation - not saying it
>>>>> as if it's something bad...
>>>>>
>>>>> By the way I would prefer to keep the core online decoder code as
>>>>> compact as possible, and I think the code that is used only by the
>>>>> extensions should be placed in their respective directories. I think
>>>>> OnlineBlockSource should be placed in separate .h/.cc files, like this
>>>>> is the case with GstBufferSource (in gst-plugin/gst-audio-source.h).
>>>>>
>>>>> PS: Matthew, I know you are working mostly on Idlak, but since you said
>>>>> you have plenty of experience with wrapping code for interfacing with
>>>>> Python (and I don't have such experience) I decided against removing
>>>>> you from this discussion, which I realize may not be very interesting
>>>>> to you.
>>>>>
>>>>> Vassil
>>>>>
>>>>> > See the second half: Concrete example.
>>>>> >
>>>>> > What is CFFI?
>>>>> > There is a CFFI library known for interfacing C to Lisp.
>>>>> > However, there is also a quite new Python library completely unrelated to
>>>>> > the Lisp one.
>>>>> > http://cffi.readthedocs.org/en/release-0.6/
>>>>> > I liked it a lot since it interfaces the languages Python and C without any
>>>>> > intermediate language
>>>>> > and it does it in a really elegant way.
>>>>> >
>>>>> > I used swig several times and it is a way, but not for me after trying
>>>>> > python-cffi.
>>>>> > Mainly because the swig intermediate language is not a standard language with
>>>>> > ready tools for debugging, lints and other helper stuff for developing.
>>>>> > Debugging the language interface is not really fun.
>>>>> >
>>>>> > I have to admit I am highly biased toward python-cffi against swig.
>>>>> > Mainly, because I am able to write things in python-cffi very easily which I
>>>>> > was unable to do in Swig.
>>>>> >
>>>>> > I wrap the C++ interface by the C interface and then, using python-cffi, I
>>>>> > access the C code from Python (and vice versa).
>>>>> > I highly recommend restricting the C++ to a C interface, because the
>>>>> > simplification with
>>>>> >
>>>>> > a) NO function overloading/virtual function tables (forced by the C syntax)
>>>>> > b) NO problems with mangling (if you define a correct C interface)
>>>>> >
>>>>> > pays back very early.
>>>>> >
>>>>> > Note that python-cffi is not able to interface C++ code and I am quite glad
>>>>> > for it.
>>>>> > I guess it pays off to use as simple an interface as possible between any
>>>>> > languages, also for Swig.
>>>>> >
>>>>> > =================
>>>>> > Concrete example:
>>>>> >
>>>>> > Last week I was working on a C interface for the online-gmm-faster decoder.
>>>>> > I wrapped the functionality of the binary
>>>>> > kaldi/trunk/src/online-gmm-decode-faster to the C++ class KaldiDecoderWrapper.
>>>>> > In addition I added a C wrapper to the C++ class, so it supports C linkage.
>>>>> >
>>>>> > Both the C and C++ interface can be found at
>>>>> > kaldi/src/pykaldi/online-python-gmm-decode-faster.h
>>>>> > either in sandbox/oplatek2 or at github
>>>>> > (concretely at:
>>>>> > https://github.com/oplatek/pykaldi/blob/master/src/pykaldi/online-python-gmm-decode-faster.h)
>>>>> >
>>>>> > How python-cffi is used against the C linkage can be seen at
>>>>> > OnlineDecoder:
>>>>> > https://github.com/oplatek/pykaldi/blob/master/src/pykaldi/decoders/kaldi_decoders.py#L50
>>>>> >
>>>>> > How to try the demo for OnlineDecoder, which works for me under Linux, is
>>>>> > described at:
>>>>> > https://github.com/oplatek/pykaldi/tree/master/src/pykaldi/binutils#how-to-run-the-demo
>>>>> >
>>>>> > Let me know if you have problems running the demo (I did not test it well,
>>>>> > but I wanted to reply with something real).
>>>>> >
>>>>> > PS: Thanks Vassil for updating the online-audio interface. The Read function
>>>>> > without timeout makes better sense to me.
>>>>> >
>>>>> > On Mon, Jul 15, 2013 at 9:22 PM, Daniel Povey <dp...@gm...> wrote:
>>>>> >>
>>>>> >> Attaching Ondrej, he may have some comments.
>>>>> >> Dan
>>>>> >>
>>>>> >> On Mon, Jul 15, 2013 at 3:22 AM, Matthew Aylett <mat...@gm...>
>>>>> >> wrote:
>>>>> >> > Hi
>>>>> >> >
>>>>> >> > CFFI -
>>>>> >> > Common Foreign Function Interface, purports to be a portable foreign
>>>>> >> > function interface for Common Lisp.
>>>>> >> >
>>>>> >> > Eh?
>>>>> >> >
>>>>> >> > I have quite a lot of experience wrapping for Python. My advice is:
>>>>> >> >
>>>>> >> > 1. Define a C API for wrapping. Avoid blanket wrapping of headers with
>>>>> >> > something like SWIG and define and justify which functions and structures
>>>>> >> > should be wrapped.
>>>>> >> > Do not wrap C++ objects (fine for dev work, but very hard
>>>>> >> > to control in a release environment if we ever want to release kaldi as a
>>>>> >> > binary library). In addition the C API becomes the basic API for such a library,
>>>>> >> > avoiding C++ name mangling issues. The same API is then used to wrap for C#,
>>>>> >> > Java etc.
>>>>> >> >
>>>>> >> > 2. Use something like SWIG to automatically generate python/C#/Java API
>>>>> >> > code from the C API.
>>>>> >> >
>>>>> >> > Of course when you find a stable way to do something you just end up
>>>>> >> > sticking with it so it's perfectly possible there are better ways and all
>>>>> >> > this has been explored etc. (Arnab mentioned they had already looked at
>>>>> >> > SWIG but perhaps only in the context of wrapping already written headers, not
>>>>> >> > based on writing a formal API header.) I'm guessing the CFFI is more or
>>>>> >> > less 1. above. Anyway happy to discuss my experiences with Ondrej if it's any
>>>>> >> > help.
>>>>> >> >
>>>>> >> > Best
>>>>> >> >
>>>>> >> > M.
>>>>> >> >
>>>>> >> > On Mon, Jul 15, 2013 at 6:27 AM, Daniel Povey <dp...@gm...> wrote:
>>>>> >> >>
>>>>> >> >> Hi everyone,
>>>>> >> >> I have been a bit lax in the last few weeks in sending out updates.
>>>>> >> >>
>>>>> >> >> People might have noticed a change in the build process. We
>>>>> >> >> reorganized the setup to use dynamic libraries (so Kaldi does not take
>>>>> >> >> up so much space on disk) and to consolidate the shared parts of the
>>>>> >> >> Makefiles. Thanks to Jan Trmal and Ondrej Platek who did most of the
>>>>> >> >> work.
>>>>> >> >>
>>>>> >> >> There have been some contributions from new people, regarding various
>>>>> >> >> ways to call Kaldi from other languages and frameworks:
>>>>> >> >>
>>>>> >> >> - Tanel Alumae has created an example Kaldi plugin for the
>>>>> >> >> GStreamer framework (see http://en.wikipedia.org/wiki/GStreamer for
>>>>> >> >> what that is). see src/gst-plugin/
>>>>> >> >> - Danijel Korzinek has written some Java code that can function as
>>>>> >> >> a source of audio for the online decoder (there is some info about
>>>>> >> >> this at http://kaldi.sourceforge.net/online_programs.html)
>>>>> >> >> - Ondrej Platek is currently working on ways to call Kaldi from
>>>>> >> >> Python (this is not yet merged to the trunk, it's still in
>>>>> >> >> ^/sandbox/oplatek). It's based on "CFFI" (the Common Foreign Function
>>>>> >> >> Interface).
>>>>> >> >>
>>>>> >> >> Vassil Panayotov has been helping to test these additions and has been
>>>>> >> >> improving the online-decoding code.
>>>>> >> >>
>>>>> >> >> I have added a recipe for the Fisher English database.
>>>>> >> >>
>>>>> >> >> Gilles Boulianne improved the (time and memory) efficiency of the
>>>>> >> >> arpa2fst program.
>>>>> >> >>
>>>>> >> >> Karel Vesely continues to improve his neural-network training setup
>>>>> >> >> and the associated recipes.
>>>>> >> >>
>>>>> >> >> I had been hoping by now to have already committed changes regarding
>>>>> >> >> the use of ReLUs (rectified linear units) to my neural-network setup,
>>>>> >> >> but it's not ready yet. ReLUs seemed to be helpful on RM and WSJ-84,
>>>>> >> >> but when I ran the recipe on the WSJ full SI284 data I didn't see a
>>>>> >> >> clear improvement.
>>>>> >> >> Right now I am working on larger changes to that
>>>>> >> >> setup, including a rewrite of the scripts that's intended to make the
>>>>> >> >> training more efficient, but it will probably be at least a few weeks
>>>>> >> >> before I can check it in. Also a CUDA-enabled version of my parallel
>>>>> >> >> neural network training setup is in the works (i.e. training on
>>>>> >> >> multiple CUDA cards) but this will be at least a couple months. I did
>>>>> >> >> not get time (as I had hoped I would) to add any speaker-id stuff to
>>>>> >> >> Kaldi.
>>>>> >> >>
>>>>> >> >> Dan
>>>>> >> >>
>>>>> >> >> --
>>>>> >> >>
>>>>> >> >> ---
>>>>> >> >> You received this message because you are subscribed to the Google Groups
>>>>> >> >> "but10" group.
>>>>> >> >> To unsubscribe from this group and stop receiving emails from it, send an
>>>>> >> >> email to but...@go....
>>>>> >> >> For more options, visit https://groups.google.com/groups/opt_out.