Search discussion: Long Audio alignment

 
1 2 3 .. 10 > >> (Page 1 of 10)

Post by bezude on Failed to align audio to transcript when running bw for accoustic model (Post)
to adapt with? Here is one wav each of the kind that work (short) and the kind that don't (long). Also
Last updated: 2017-02-01

Post by bezude on Failed to align audio to transcript when running bw for accoustic model (Post)
all the longer recordings complain: ERROR: "backward.c", line 421: Failed to align audio to trancript
Last updated: 2017-01-31

Post by gorinars on word_align.pl failed with error code 65280 at /usr (Post)
can happen sometimes if some words are not pronounced well / the audio has too long a silence or something like
Last updated: 2016-12-03

Post by latylus on Live long text alignment with CMU Sphinx (Post)
Thank you for your quick answer. Unfortunately the AlignerDemo solution doesn't support live long
Last updated: 2016-10-27

Post by nshmyrev on Live long text alignment with CMU Sphinx (Post)
Aligner demo in sphinx4 should do long audio to text alignment https://github.com/cmusphinx/sphinx4
Last updated: 2016-10-27

Post by latylus on Live long text alignment with CMU Sphinx (Post)
Hello, I'm trying to use Sphinx for live long audio-text alignment [ie. if I understand correctly
Last updated: 2016-10-27

Post by ferhuntaylan on What is the maximum length of audio file for convert to text? (Post)
What should the length of the audio file be? Can we increase it? (I'm using the long audio aligner.) Regards
Last updated: 2016-10-15

Post by andymakespasta on pocket sphinx phenome recognition / alignment for seperation (Post)
to do "alignment" on the audio to get better segmentations? The audio files are only 5 - 20 s long
Last updated: 2016-10-08

Post by srdhm on CMU Sphinx 5prealpha alignment issue (Post)
displaying result with : List<WordResult> wr = aligner.align(audioUrl, text); for (WordResult result : wr
Last updated: 2016-09-08

Post by saxenauts on How to find timestamps of words in an audio (Post)
I have no experience in Java Programming and I need to run the aligner to get word timestamps
Last updated: 2016-07-26

Post by gaga001 on SENTENCE ERROR 100% Training Sphinx4 (Post)
after 10 iterations In log file details: ERROR: "backward.c" line 421: Failed to align audio
Last updated: 2016-07-12

Post by lupomuc on Long audio alignment in Pocketsphinx (Post)
I'm giving up. If I had a few spare days, I'd love to implement the clean solution: Add a new function to ngram_model_trie.c that takes normalized text, extracts 1..n-grams, calculates probabilities and backoff weights, then creates an ngram_model_trie_t from them. Sadly, I just don't have the time right now. I have already implemented all but the last step in C++, so I'm going to choose the hacky route, export the LM to a temporary ARPA file (that's trivial), then read it back using ngram_model_read.
Last updated: 2016-06-07
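
The first step lupomuc describes above (taking normalized text and extracting 1..n-grams) can be sketched as follows. This is a minimal Python illustration, not the poster's C++ code and not a sphinxbase API. The function name `count_ngrams`, the whitespace tokenization, and the omission of sentence markers are all assumptions made for the example. Computing probabilities and backoff weights from these counts is a separate step that is not shown.

```python
from collections import Counter


def count_ngrams(text, max_order=3):
    """Count every 1..max_order-gram in whitespace-tokenized text.

    Hypothetical helper for illustration; assumes the text is already
    normalized (lowercased, punctuation stripped, and so on).
    """
    tokens = text.split()
    counts = Counter()
    for n in range(1, max_order + 1):
        # Slide a window of width n over the token sequence.
        for i in range(len(tokens) - n + 1):
            counts[tuple(tokens[i:i + n])] += 1
    return counts


counts = count_ngrams("a b a b c", max_order=2)
```

Keying the counter by word tuples keeps all orders in one structure, and the order of an entry can be recovered from the tuple length.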

Long audio alignment in Pocketsphinx (Thread)
Long audio alignment in Pocketsphinx
Last updated: 2016-06-07

Post by lupomuc on Long audio alignment in Pocketsphinx (Post)
I'll take that as a 'yes': they have to be sorted in the end?
Last updated: 2016-06-07

Post by nshmyrev on Long audio alignment in Pocketsphinx (Post)
Since your LM is small and you do not need very efficient storage, you can use an unsorted list of ngrams_raw structures and then simply sort it with qsort.
Last updated: 2016-06-07

Post by lupomuc on Long audio alignment in Pocketsphinx (Post)
I'll give it a try. I can't make any promises, though -- I'm more at home with C++ than with plain C. One question in advance: ARPA models have all their n-grams in alphabetical order, so reading them automatically populates the ngram_model_t sub-structures in alphabetical order. Is this a requirement, or can I use any order?
Last updated: 2016-06-07

Post by nshmyrev on Long audio alignment in Pocketsphinx (Post)
Unfortunately there is no way to do that yet; you are welcome to submit a patch. We'd also be interested in an n-gram model that can be initialized from raw text.
Last updated: 2016-06-06

Post by lupomuc on Long audio alignment in Pocketsphinx (Post)
I just realized that these functions are static as well. So I cannot use them at all. Is there any way to create an ngram_model_t instance from code?
Last updated: 2016-06-05

Post by lupomuc on Long audio alignment in Pocketsphinx (Post)
I managed to calculate n-gram probabilities and backoff weights on the fly in C++. Now I'd like to create an ngram_model_t instance directly from this data (rather than writing it to a file and reading it back via ngram_model_read). I've hit a little problem: To initialize an ngram_model_t, I need to call ngram_model_init, which is declared in ngram_model_internal.h. This function takes an ngram_funcs_t* value as an argument. So I need an instance of this type to pass along. ngram_model_trie.c defines a static instance of this type, but I don't see a way to access this value. I could try to define an identical value myself, but its definition uses the functions ngram_model_trie_free, trie_apply_weights and four others. All these functions are defined directly within ngram_model_trie.c and not declared in any header file. So the only way I see is to declare these functions myself, have the linker use the definitions in ngram_model_trie.c, and define my own instance of type ngram_funcs_t*. Or is there a better way?
Last updated: 2016-06-05

Post by lupomuc on Long audio alignment in Pocketsphinx (Post)
Thanks -- I'll have a look at it!
Last updated: 2016-05-06

Post by nshmyrev on Long audio alignment in Pocketsphinx (Post)
You calculate the probability with one model, then calculate the probability with another model, and then simply take the weighted average. Sphinxbase has the ngram_model_set class for that; see ngram_model_set_init
Last updated: 2016-05-06
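
The weighted average nshmyrev describes above can be illustrated with a small sketch. It assumes both models report base-10 log probabilities (as ARPA files do) and that the average is taken over the probabilities themselves, not over the logs. The function name `interpolate_log10` and the example numbers are hypothetical. The sketch does not call ngram_model_set_init, whose arguments are not shown in the post.

```python
import math


def interpolate_log10(logp_a, logp_b, weight_a):
    """Return log10(weight_a * P_a + (1 - weight_a) * P_b).

    logp_a and logp_b are log10 probabilities of the same word in the
    same context under two different models (assumed inputs).
    """
    if not 0.0 <= weight_a <= 1.0:
        raise ValueError("weight_a must be between 0 and 1")
    # Leave the log domain, mix linearly, then go back to log10.
    p = weight_a * 10.0 ** logp_a + (1.0 - weight_a) * 10.0 ** logp_b
    return math.log10(p)


# Example: a word with probability 0.2 in one model and 0.4 in the other,
# mixed with equal weights.
mixed = interpolate_log10(math.log10(0.2), math.log10(0.4), 0.5)
```

Raising the weight of the model built from the transcript biases the combined model toward the expected text, which appears to be the goal in this thread.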

Post by lupomuc on Long audio alignment in Pocketsphinx (Post)
Thanks for the links! The 2nd one was great for understanding the theory, the 1st one for an actual working example. Now I need to learn how to merge two existing language models into a single, biased one. Do you have any articles or actual code that I can look at?
Last updated: 2016-05-06

Post by nshmyrev on Long audio alignment in Pocketsphinx (Post)
You can read a comment in the beginning of quick_lm.pl here: http://www.speech.cs.cmu.edu/tools/download/quick_lm.pl and also http://www.speech.sri.com/projects/srilm/manpages/pdfs/chen-goodman-tr-10-98.pdf
Last updated: 2016-05-03

Post by lupomuc on Long audio alignment in Pocketsphinx (Post)
Thanks Nickolay! I didn't fully understand the role of the language model. I've now done some research and things start to make sense. I've experimented with the Sphinx Knowledge Base Tool and there are two concepts I don't understand yet: discount mass and the ratio method for backoffs. Maybe you can help me? I've noticed that the 1-gram probabilities generated by the Sphinx Knowledge Base Tool add up to 0.5, not to 1. A comment says, 'The (fixed) discount mass is 0.5.', so my guess is that this is intentional. What is a discount mass and why is it used? Another comment says, 'The backoffs are computed using the ratio method.' What is this ratio method? It would be great if you could explain these concepts. Maybe you have a link?
Last updated: 2016-05-03
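
One reading that is consistent with the observation in the post above (1-gram probabilities summing to 0.5 with a fixed discount mass of 0.5) is that each relative frequency is scaled by one minus the discount mass, which leaves the remaining mass for events reached through backoff. The sketch below illustrates only that arithmetic. It is an assumption for illustration, not code taken from the Sphinx Knowledge Base Tool or quick_lm.pl, and it does not implement the ratio method for backoff weights. The name `discounted_unigrams` is hypothetical.

```python
from collections import Counter


def discounted_unigrams(tokens, discount_mass=0.5):
    """Scale relative frequencies so they sum to 1 - discount_mass.

    Assumed scheme for illustration only; the real tool may differ.
    """
    counts = Counter(tokens)
    total = sum(counts.values())
    keep = 1.0 - discount_mass  # share of mass kept by seen words
    return {word: (count / total) * keep for word, count in counts.items()}


probs = discounted_unigrams("a b a c".split())
total_mass = sum(probs.values())
```

With the default discount mass of 0.5, `total_mass` comes out at 0.5 rather than 1, matching the behaviour the poster noticed.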

Showing results of 228
