Coding CAI encoding
Coding experimental BPE
Minor speed gains
Fixes memory usage in grouped convolutions
Adding AddCompressedTransformerBlockCAI