Hi,
My university project team and I are new to Kaldi, so the problem we have run into may not be that hard to solve:
What we are supposed to do:
We get a WFSA out of an RNN that describes a lattice for one utterance, and we want to decode it with the Kaldi tools. The WFSA has one state per frame and, out of each state, one arc per phoneme. Each arc's weight is the probability of that phoneme in that frame/state. The decoding step should include pruning.
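For illustration, here is a minimal sketch of how such a per-frame WFSA could be written in OpenFst's text acceptor format (one `src dst label weight` line per arc, then a line naming the final state). The function name `posteriors_to_fst_text`, the `beam` parameter, and the choice of negative log probabilities as tropical-semiring costs are assumptions made for this example and are not taken from the post.

```python
import math

def posteriors_to_fst_text(posteriors, symbols, beam=None):
    """Turn per-frame phoneme probabilities into OpenFst text-format lines.

    posteriors: list of frames, each a list of probabilities (one per symbol).
    symbols:    list of phoneme names, same order as the probabilities.
    beam:       optional pruning beam in negative-log units (hypothetical
                parameter): arcs whose cost exceeds the best cost in the
                frame by more than `beam` are dropped.
    Assumes every frame has at least one non-zero probability.
    """
    lines = []
    for t, frame in enumerate(posteriors):
        # Cost of an arc is -log(p); zero-probability arcs are skipped.
        costs = [(-math.log(p), s) for p, s in zip(frame, symbols) if p > 0.0]
        best = min(c for c, _ in costs)
        for cost, sym in costs:
            if beam is not None and cost - best > beam:
                continue  # pruned: too far from the best arc of this frame
            # State t is the start of frame t, state t + 1 its end.
            lines.append(f"{t} {t + 1} {sym} {cost:.4f}")
    # The state after the last frame is the only final state.
    lines.append(str(len(posteriors)))
    return lines

if __name__ == "__main__":
    toy = [[0.5, 0.25, 0.25], [0.9, 0.0999, 0.0001]]
    print("\n".join(posteriors_to_fst_text(toy, ["a", "b", "c"], beam=5.0)))
```

Text like this can be turned into a binary FST with OpenFst's `fstcompile --acceptor --isymbols=<symbol table>`, given a symbol table that lists every phoneme name used.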
What we are trying to do:
Having studied Kaldi for a few weeks now, our strategy is to convert our WFSA into a Kaldi lattice, compose it with Kaldi's LG graph (L o G) using lattice-compose, and then decode with lattice-best-path.
Where we got stuck:
We see a tool "lattice-to-fst" but nothing for the reverse direction. We need something similar to what happens in lattice-compose when it is given an FST as its second argument (in particular via fst::MapFst<StdArc, LatticeArc, fst::StdToLatticeMapper<BaseFloat> >). However, that mapping stores the FST's weights as graph weights and, as far as we understand, the utterance's weights should be the acoustic weights, since the utterance comes out of our RNN, which acts as the acoustic model. Writing a toy lattice by hand into a .txt file doesn't work either, because tools like lattice-compose don't recognize that file as a lattice.
Why we don't use Kaldi's neural network implementation:
Our RNN implements CTC (Connectionist Temporal Classification). Thus our utterance contains a lot of blanks, which we still need to figure out how to deal with, but that's another issue. It is mandatory for our project to use our RNN.
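As background on the blanks: the usual CTC convention maps a frame-level label sequence to an output sequence by first merging consecutive repeats and then dropping blanks. The sketch below shows only that rule. The function name `ctc_collapse` and the blank symbol `<blk>` are hypothetical, and the post does not say how the blanks will actually be handled inside the decoding graph.

```python
def ctc_collapse(labels, blank="<blk>"):
    """Apply the standard CTC collapsing rule to a frame-level label sequence.

    Consecutive repeats are merged first, then blanks are removed, so a
    blank between two identical labels keeps them as two separate outputs.
    """
    out = []
    prev = None
    for lab in labels:
        # Emit a label only when it starts a new run and is not the blank.
        if lab != prev and lab != blank:
            out.append(lab)
        prev = lab
    return out

if __name__ == "__main__":
    frames = ["a", "a", "<blk>", "a", "b", "b", "<blk>"]
    print(ctc_collapse(frames))
```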
Does anyone know a simple way to convert an FST to a lattice and store its weights as the lattice's acoustic weights?
Is there perhaps a much easier way to accomplish what we need to do?
Every answer is highly appreciated,
Wilhelm K.
You don't need to use Kaldi's lattice format; you can use standard FST
tools such as fstcompose and fstshortestpath.
However, you might need to remove the disambiguation symbols from LG.
You could use fstrmsymbols (a Kaldi program) or write a script.
Read more at openfst.org; and search for hbka.pdf and read the
decoding-graph construction section at kaldi.sf.net to understand the
decoding framework.
Dan
On Tue, Jul 14, 2015 at 8:26 AM, Will Kirchgaessner
wilhelmk-upb@users.sf.net wrote: