Transformer-from-scratch neural machine translation (NMT) task, English to French, using different numbers of encoder and decoder layers (3 and 5) and different numbers of attention heads.
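
As a rough illustration of the experiment grid described above, the sketch below enumerates layer/head combinations in plain Python. The layer counts (3 and 5) come from the description; the head counts (4 and 8), the model width `d_model = 512`, the use of one depth for both encoder and decoder, and the names `TransformerConfig` and `build_grid` are all assumptions made for the example, not values taken from the project.

```python
from dataclasses import dataclass
from itertools import product


@dataclass(frozen=True)
class TransformerConfig:
    """Hypothetical container for one experiment configuration."""
    num_layers: int      # assumed to apply to both the encoder and decoder stacks
    num_heads: int       # number of attention heads per layer
    d_model: int = 512   # assumed model width; not stated in the description

    @property
    def head_dim(self) -> int:
        # Multi-head attention splits d_model evenly across the heads,
        # so d_model must be divisible by num_heads.
        if self.d_model % self.num_heads != 0:
            raise ValueError("d_model must be divisible by num_heads")
        return self.d_model // self.num_heads


LAYER_OPTIONS = (3, 5)  # from the description
HEAD_OPTIONS = (4, 8)   # hypothetical values, chosen only for illustration


def build_grid(layer_options=LAYER_OPTIONS, head_options=HEAD_OPTIONS):
    """Return one config per (layers, heads) combination."""
    return [
        TransformerConfig(num_layers=layers, num_heads=heads)
        for layers, heads in product(layer_options, head_options)
    ]


if __name__ == "__main__":
    for cfg in build_grid():
        print(cfg.num_layers, cfg.num_heads, cfg.head_dim)
```

Keeping the divisibility check inside the config means an invalid head count fails when `head_dim` is read, before any model is built.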