Fixing departing branches count
Fixing DotProducts departing branches
Closer implementation of transformer decoder
Closer implementation of transformer encoder
Fixing embedding learning direction
Removing min learning rate
Removing extra transpose from transformers