Hi,
During the voxforge recipe, there are very long time periods in which I am seeing very low CPU usage.
I am running on a single CPU with 12 cores, using run.pl, and I made njobs=24.
According to logs, the low CPU is in the loop below right after the "$0: aligning data" print.
Am I doing something wrong?
Thanks!
while [ $x -lt $num_iters ]; do
  echo "$0: training pass $x"
  if [ $stage -le $x ]; then
    if echo $realign_iters | grep -w $x >/dev/null; then
      echo "$0: aligning data"
      mdl="gmm-boost-silence --boost=$boost_silence `cat $lang/phones/optional_silence.csl` $dir/$x.mdl - |"
      $cmd JOB=1:$nj $dir/log/align.$x.JOB.log \
        gmm-align-compiled $scale_opts --beam=$beam --retry-beam=$retry_beam "$mdl" \
        "ark:gunzip -c $dir/fsts.JOB.gz|" "$feats" \
        "ark:|gzip -c >$dir/ali.JOB.gz" || exit 1;
    fi
    $cmd JOB=1:$nj $dir/log/acc.$x.JOB.log \
      gmm-acc-stats-ali $dir/$x.mdl "$feats" \
      "ark,s,cs:gunzip -c $dir/ali.JOB.gz|" $dir/$x.JOB.acc || exit 1;
    $cmd $dir/log/update.$x.log \
      gmm-est --mix-up=$numgauss --power=$power \
      --write-occs=$dir/$[$x+1].occs $dir/$x.mdl \
      "gmm-sum-accs - $dir/$x.*.acc |" $dir/$[$x+1].mdl || exit 1;
    rm $dir/$x.mdl $dir/$x.*.acc
    rm $dir/$x.occs
  fi
  [ $x -le $max_iter_inc ] && numgauss=$[$numgauss+$incgauss];
  x=$[$x+1];
done
It could be an IO bottleneck, or the memory of the machine is exhausted and the processes are getting swapped in or out.
On 7 August 2014 10:57, Tony Esposito antonioesposito@users.sf.net wrote:
Seeing very low CPU usage https://sourceforge.net/p/kaldi/discussion/1355348/thread/0498f821/?limit=25#9b0d
Thanks for your answer.
I think that it's definitely not memory.
This machine has 16 GB of memory, and only 1.8 GB were used during the execution.
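To double-check the memory observation, swap activity can be measured directly rather than inferred from the amount of memory in use. The following is a minimal sketch, assuming a Linux machine where /proc/vmstat exposes the pswpin and pswpout counters. The function names swap_delta and sample_swap are hypothetical helpers for illustration and are not part of Kaldi.

```shell
# swap_delta BEFORE AFTER
# BEFORE and AFTER are two snapshots of /proc/vmstat contents.
# Prints how many pages were swapped in or out between the snapshots.
swap_delta() {
  printf '%s\n---\n%s\n' "$1" "$2" | awk '
    $1 == "---" { second = 1; next }          # separator between snapshots
    $1 == "pswpin" || $1 == "pswpout" {
      if (second) after += $2; else before += $2
    }
    END { print after - before }'
}

# sample_swap [SECONDS]
# Assumption: run on Linux while the alignment stage is active.
sample_swap() {
  interval=${1:-5}
  snap_before=$(cat /proc/vmstat)
  sleep "$interval"
  snap_after=$(cat /proc/vmstat)
  swap_delta "$snap_before" "$snap_after"
}
```

Calling sample_swap 10 during the slow phase and getting 0 would support the view that the processes are not being swapped.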
Likely an I/O bottleneck.
Dan
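One way to test the I/O hypothesis is to measure how much CPU time is spent waiting on I/O while the alignment jobs run. The sketch below assumes Linux, where the aggregate "cpu" line of /proc/stat lists user, nice, system, idle, iowait, irq, softirq and steal counters in that order. The names iowait_pct and sample_iowait are hypothetical helpers, not Kaldi tools.

```shell
# iowait_pct LINE1 LINE2
# LINE1 and LINE2 are two samples of the "cpu" line from /proc/stat.
# Prints the integer percentage of elapsed CPU time spent in iowait.
iowait_pct() {
  printf '%s\n%s\n' "$1" "$2" | awk '
    NR == 1 { for (i = 2; i <= 9 && i <= NF; i++) first[i] = $i }
    NR == 2 {
      total = 0
      # Fields 2..9: user nice system idle iowait irq softirq steal
      for (i = 2; i <= 9 && i <= NF; i++) total += $i - first[i]
      iowait = $6 - first[6]
      if (total <= 0) { print 0; exit }
      printf "%d\n", 100 * iowait / total
    }'
}

# sample_iowait [SECONDS]
# Assumption: run on Linux while the "aligning data" stage is active.
sample_iowait() {
  interval=${1:-5}
  first_line=$(grep '^cpu ' /proc/stat)
  sleep "$interval"
  second_line=$(grep '^cpu ' /proc/stat)
  iowait_pct "$first_line" "$second_line"
}
```

A consistently high value from sample_iowait 5 during the low-CPU periods would be consistent with an I/O bottleneck; a value near zero would suggest looking elsewhere.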
On Thu, Aug 7, 2014 at 10:41 AM, Tony Esposito <antonioesposito@users.sf.net> wrote: