Transformer Decoder Model, Neural Network, Backpropagation, Gradient Descent, Masked Attention, Positional Encoding