Creating topic-specific language models from Wikipedia (dissertation)

2012-12-22
2012-12-24
  • Stephen Marquard

    Hi all,

    I have recently completed a dissertation on creating topic-specific language models from English Wikipedia to improve recognition of specialist vocabulary in ASR of lectures. I used Sphinx4 to evaluate the results against the reference HUB4 open-source language model. It may be of interest to anyone working on language model adaptation.

    The abstract is here:

    http://trulymadlywordly.blogspot.com/2012/12/improving-searchability-of.html

    and the full text:

    http://pubs.cs.uct.ac.za/archive/00000846/01/MPhil-Dissertation-StephenMarquard.pdf

    I'd like to thank all the Sphinx4 contributors and developers, and especially Nickolay for his always-generous help in answering questions about Sphinx4 and speech recognition in general.

    Regards
    Stephen

     
  • Nickolay V. Shmyrev

    Great research, Stephen, thanks for sharing it!

    Congratulations on your graduation!

     
