O.k., now the bugs are out, some polish. First cut of faster C routines
for quantization. The job is not completely done yet - need to tidy up
parameter usage vs. globals and do intra as well as non_intra.
However, it works and it's a lot quicker (mainly due to avoiding
the dreaded integer divide, positively glacial on the x86).
The code is now in good shape for MMX-ification. I'll probably do
it when I merge the remaining MMX goodies from bbmpeg.
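
For anyone wondering what "avoiding the integer divide" looks like,
here is a minimal sketch of the general trick, not the actual routine
in the encoder. The names (RECIP_SHIFT, build_recip_table, quant_block)
are made up for illustration. The idea is to precompute a fixed-point
reciprocal for each quantizer entry once, then replace each per-
coefficient divide with a multiply and a shift. With a 16-bit shift the
result matches a truncating divide as long as |coeff| * quant stays
below 65536; larger products can come out one too high, so a real
routine would need a wider reciprocal or a fix-up step.

```c
#include <stdint.h>
#include <stdlib.h>

#define RECIP_SHIFT 16  /* hypothetical: fixed-point precision of the reciprocals */

/* Done once per quantizer table: recip[i] = ceil(2^16 / quant[i]).
   This is the only place a real divide happens. quant[i] must be >= 1. */
static void build_recip_table(const uint16_t *quant, uint32_t *recip, int n)
{
    for (int i = 0; i < n; i++)
        recip[i] = ((1u << RECIP_SHIFT) + quant[i] - 1u) / quant[i];
}

/* Per-block work: multiply by the reciprocal and shift, no divide.
   Works on the magnitude and restores the sign afterwards, so the
   result truncates toward zero like C's '/' operator. */
static void quant_block(const int16_t *src, int16_t *dst,
                        const uint32_t *recip, int n)
{
    for (int i = 0; i < n; i++) {
        uint32_t mag = (uint32_t)abs(src[i]);
        uint32_t q = (mag * recip[i]) >> RECIP_SHIFT;
        dst[i] = (int16_t)(src[i] < 0 ? -(int32_t)q : (int32_t)q);
    }
}
```

The inner loop is just a multiply, a shift and a sign fix-up per
coefficient, which is also the kind of shape that maps onto packed
16-bit multiplies.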