use CPUID intrinsic instead of inline assembler
use new FFmpeg API
add new CPU capability checking code using CPUID ...