char sl(char a)
{
    return a << 7;
}
This will be compiled to seven sll instructions, which is inefficient.
It could be optimized to something like:
swap a
and a, #0xf0
sll a
sll a
sll a
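This patch uses swap and swapw to optimize left shifts when shift_count >= size * 8 / 2, that is, when the shift count is at least half the operand's width in bits (4 or more for a 1-byte operand, 8 or more for a 2-byte operand).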
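
The equivalence the optimization relies on can be checked with a small C model. This is only a sketch, not the patch's code: it assumes swap exchanges the two nibbles of an 8-bit value and swapw exchanges the two bytes of a 16-bit value, and the helper names (swap8, swap16, shl8_via_swap, shl16_via_swap) are hypothetical.

```c
#include <stdint.h>

/* Hypothetical model of `swap`: assumed to exchange the two nibbles of a byte. */
static uint8_t swap8(uint8_t a)
{
    return (uint8_t)((a << 4) | (a >> 4));
}

/* Hypothetical model of `swapw`: assumed to exchange the two bytes of a word. */
static uint16_t swap16(uint16_t x)
{
    return (uint16_t)((x << 8) | (x >> 8));
}

/* 8-bit left shift by n, valid for 4 <= n <= 7:
 * one swap, one mask, then (n - 4) single-bit shifts. */
static uint8_t shl8_via_swap(uint8_t a, unsigned n)
{
    a = swap8(a) & 0xF0;            /* same result as a << 4 */
    for (unsigned i = 4; i < n; i++)
        a = (uint8_t)(a << 1);      /* remaining single shifts */
    return a;
}

/* 16-bit left shift by n, valid for 8 <= n <= 15:
 * one byte swap, one mask, then (n - 8) single-bit shifts. */
static uint16_t shl16_via_swap(uint16_t x, unsigned n)
{
    x = swap16(x) & 0xFF00;         /* same result as x << 8 */
    for (unsigned i = 8; i < n; i++)
        x = (uint16_t)(x << 1);     /* remaining single shifts */
    return x;
}
```

Under these assumptions, a shift by 7 on a byte takes one swap, one mask and three single shifts instead of seven single shifts. The saving is largest when the shift count is exactly half the operand width.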