I am working on a project where we are looking to possibly put Kaldi behind a REST API.
I was wondering if there are suggested/preferred ways of doing this.
The documentation suggests using GridEngine to run Kaldi in parallel.
Would this be considered "production ready"? I.e., could you deploy Kaldi on GridEngine and build a REST API to wrap it?
Or are you better off just having X number of slaves, each with their own instance of Kaldi, and wrapping that with an API?
Is there any reason not to use GridEngine in a production environment?
I suppose what I am really looking for here is advice on how best to deploy Kaldi in a cloud-type environment so that it can handle a large volume of requests and is both scalable and performant.
Thanks
Robert
I'd say the Kaldi SGE interface is production ready. It is used on a daily basis at many research and commercial sites.
SGE itself is not being actively developed (AFAIK) -- Sun Grid Engine was bought by Oracle (and renamed to Oracle Grid Engine). Oracle, for some reason, lost interest in it and either gave, sold, or otherwise transferred the rights to Univa, so now there is Univa Grid Engine. Univa is the company to talk to if you want/need commercial support -- I think they provide support for SGE and OGE as well.
SGE's predecessors were PBS (Portable Batch System) and OpenPBS (I'm not sure of their exact relationship). There is an open source project, Torque, which builds on top of the OpenPBS codebase to add new features and bug fixes.
I'm mentioning it because I believe the job submission interface is largely the same, so the Kaldi SGE interface could work on OpenPBS/Torque as well. I'm stressing the 'could' -- I have no experience with that. Perhaps someone else could confirm or deny this?
For some reason, a lot of HPC sites use SLURM nowadays. We have a SLURM interface as well. It's not used that much (as far as I know it is/was used at ICSI), but SLURM itself is actively developed and you can buy commercial support for it as well.
To let the job submission interface (I guess you meant queue.pl) work on PBS (Portable Batch System), only a few minimal changes are required:
- task-index variable: $SGE_TASK_ID -> $PBS_ARRAYID
- qsub flags: qsub in PBS doesn't support "-cwd" and "-j y"; just remove them when jobs are submitted on PBS
- qstat command: qstat -j $sge_job_id -> qstat -t $sge_job_id
- the cue queue.pl looks for when parsing $queue_logfile:
  SGE qsub output -> Your job job_id *** has been submitted
  PBS qsub output -> job_id
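To make those differences concrete, here is a minimal, hypothetical sketch in Python (not actual Kaldi code -- the real logic lives in queue.pl, which is Perl) of how a submission wrapper might branch between the two schedulers. The flag handling and output parsing follow the points above; the -t array syntax is assumed to work on both:

```python
import subprocess

def submit_array_job(script, num_jobs, backend="sge"):
    # Hypothetical sketch of the SGE/PBS differences listed above;
    # Kaldi's real submission logic lives in queue.pl and is more involved.
    if backend == "sge":
        # SGE: -cwd runs the job in the current directory, -j y merges
        # stderr into stdout; tasks see their index in $SGE_TASK_ID.
        cmd = ["qsub", "-cwd", "-j", "y", "-t", f"1-{num_jobs}", script]
    elif backend == "pbs":
        # PBS/Torque: no -cwd or -j y; tasks see their index in $PBS_ARRAYID.
        cmd = ["qsub", "-t", f"1-{num_jobs}", script]
    else:
        raise ValueError(f"unknown backend: {backend}")

    out = subprocess.run(cmd, capture_output=True, text=True,
                         check=True).stdout
    if backend == "sge":
        # SGE prints e.g. 'Your job-array 12345.1-8:1 ("job.sh") has been
        # submitted'; the job id is the third whitespace-separated token.
        return out.split()[2]
    # PBS prints just the job id on a line by itself.
    return out.strip()
```

Polling then differs the same way: qstat -j $job_id on SGE versus qstat -t $job_id on PBS/Torque.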
Thanks for the feedback and advice.
I have been playing around with StarCluster to manage cluster deployment with SGE and it seems fairly stable to me at present. I'll continue to investigate this.
Separately, would you have any other suggestions on how you would deploy Kaldi in a cloud-type environment?
Are there specific ways of deploying it, or am I realistically looking at spinning up AWS instances, for example (possibly from a custom image with Kaldi pre-installed), building a web service wrapper on top of this, and managing the instance list in the application itself?
Thanks
Robert
I think StarCluster is a good solution. There is Rocks Cluster as well, but AFAIK it does not have the AWS dynamicity. Dan wrote some scripts and documentation dealing with Amazon EC2, SGE, and Kaldi -- you can find them here: https://sourceforge.net/projects/kluster/
But I think StarCluster has much of that functionality already built in.
After that, you are on your own, I guess. There is something called DRMAA. I don't know the meaning of the abbreviation, but you can see it as a "job API" which allows you to submit jobs and query their properties without running shell commands. Both SGE and Torque (and I think SLURM as well) support that API (again, I don't know how complete or how good the support is), but I think this might be the way to go -- there are bindings for most scripting languages, including PHP and Java, I think, so you could avoid some of the complexities of shell calls in your setup.
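(For the record, DRMAA stands for Distributed Resource Management Application API.) As a rough illustration of the "job API" idea, here is a minimal sketch using the python-drmaa bindings -- the decoding script and its arguments are made up for the example, and it assumes a scheduler with working DRMAA support and libdrmaa installed:

```python
import drmaa

# Minimal sketch: submit one job through DRMAA and wait for it, without
# shelling out to qsub/qstat. Script name and args are hypothetical.
with drmaa.Session() as session:
    jt = session.createJobTemplate()
    jt.remoteCommand = "./decode.sh"          # hypothetical Kaldi wrapper
    jt.args = ["--config", "decode.conf"]
    jt.joinFiles = True                       # merge stderr into stdout

    job_id = session.runJob(jt)
    print(f"submitted job {job_id}")

    # Block until the job finishes, then inspect its exit status.
    info = session.wait(job_id, drmaa.Session.TIMEOUT_WAIT_FOREVER)
    print(f"job {info.jobId} exited with status {info.exitStatus}")

    session.deleteJobTemplate(jt)
```

The same template/submit/wait pattern should carry over between SGE, Torque, and SLURM back ends, to whatever extent each one's DRMAA support is complete, which is what makes it attractive as the back end of a REST wrapper.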
Great, thanks for all the help.
Really appreciate it.
Guys, in my mind SGE is something that's useful for training models, but
you'd probably want a completely different solution for scalable
recognition in a cloud environment.
Dan