CMU Sphinx and the Google Summer of Code

By Community Team

18% of all Google Summer of Code projects were SourceForge projects. One of those projects was the CMU Sphinx project. I asked them about their experience with the GSoC, and what their students were working on this year.

Rich: Give us a brief overview of your project. (Yes, I know, I’ve interviewed you guys before, but some of our readers might not have caught that.)

Evandro: CMU Sphinx is a speech recognition project. It provides all the components needed to build a speech recognition system, with a license that allows for commercial use. The project started as a research project at Carnegie Mellon University. It became an open source project around the year 2000 and has been hosted on SourceForge.net ever since.

Rich: What were your student(s) working on for this year’s GSoC?

Evandro: CMU Sphinx had several students during this year’s GSoC. The successful projects dealt with pronunciation evaluation, grapheme to phoneme conversion, and enlarging in-domain data automatically.

Pronunciation evaluation answers the question of how good a person’s pronunciation is, compared to the language’s standard. It is useful, for example, in automating language learning or assessing reading skills. Grapheme to phoneme conversion is the task of finding how a word is pronounced from its spelling, i.e. what sequence of phonemes is produced when a native speaker utters that word. Enlarging in-domain data automatically is useful when building a language model. A language model is the knowledge base used by a speech recognition system that assigns a probability to any possible sequence of words in a language.
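
To make the idea of a language model concrete, here is a minimal sketch of one that assigns a probability to a sequence of words. It is not CMU Sphinx code and does not use any Sphinx API: the function names, the toy corpus, and the choice of a bigram model with add-one smoothing are all assumptions made for illustration. Production systems are trained on far more data and typically use more refined smoothing.

```python
from collections import Counter
import math

# Hypothetical helper names; this is an illustration, not the Sphinx toolkit.

def train_bigram_model(sentences):
    """Count word pairs in a small corpus of whitespace-separated sentences."""
    context_counts = Counter()  # how often each word appears as a left context
    bigram_counts = Counter()   # how often each (previous, current) pair appears
    vocab = set()
    for sentence in sentences:
        tokens = ["<s>"] + sentence.lower().split() + ["</s>"]
        vocab.update(tokens)
        context_counts.update(tokens[:-1])
        bigram_counts.update(zip(tokens[:-1], tokens[1:]))
    return context_counts, bigram_counts, len(vocab)

def sentence_log_prob(sentence, model):
    """Return the log-probability of a sentence under the bigram model.

    Add-one smoothing (an assumption of this sketch) keeps unseen word
    pairs from receiving zero probability.
    """
    context_counts, bigram_counts, vocab_size = model
    tokens = ["<s>"] + sentence.lower().split() + ["</s>"]
    log_prob = 0.0
    for prev, word in zip(tokens[:-1], tokens[1:]):
        numerator = bigram_counts[(prev, word)] + 1
        denominator = context_counts[prev] + vocab_size
        log_prob += math.log(numerator / denominator)
    return log_prob

# Toy in-domain corpus, invented for this example.
corpus = [
    "recognize speech",
    "recognize speech with sphinx",
    "speech is hard to recognize",
]
model = train_bigram_model(corpus)
likely = sentence_log_prob("recognize speech", model)
unlikely = sentence_log_prob("speech recognize", model)
print(likely > unlikely)
```

With this toy corpus, the word order that occurs in the training sentences scores higher than the reversed order. This is also why enlarging in-domain data matters: the more representative text the counts are drawn from, the better the model’s probabilities reflect what users in that domain are likely to say.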

Rich: What did you learn from the process? What would you do differently next year?

Evandro: For the students, it is a fantastic learning opportunity. They interact directly with people who have been working in the area for years, some of whom have had the experience of creating successful companies. I would not be surprised if the students ended the summer being hired by their mentors’ companies.

On the other hand, working from a distance inherently has some major difficulties, and the communication channels have to be used well. For comparison’s sake, imagine a situation where the student works directly with the mentor, in the same office. A lot more could be accomplished, simply because the interaction would be much easier.

Interaction by e-mail/IM ends up being somewhat impersonal. Ideally, a face-to-face meeting would be great, but this is quite impractical. The next best thing is periodic video conference calls, using Google Plus Hangouts or Skype. Hearing the other person and seeing their face really does wonders for the interaction.

Rich: What advice can you give to other projects considering GSoC for next year?

Evandro: GSoC is great for the project, as it gets new people involved. Mentoring students brings a lot of satisfaction, but also demands time and effort. Having a good plan, so that the students don’t spend too much time “learning the ropes” but can get productive work done as quickly as possible, makes everyone happy.