sokol - 2004-10-28

I'm trying to create my own language model and acoustic model for Sphinx 3 (or Sphinx 4).
How can I calculate how much audio data (in hours) is required?
Where can I find the relevant documentation?