Microsoft Interview Question

Explain the Transformer architecture and implement self-attention from scratch.
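
One possible starting point for the coding half of the question is a minimal sketch of single-head scaled dot-product self-attention, softmax(Q K^T / sqrt(d_k)) V, written in plain Python with no libraries. The function names (matmul, transpose, softmax, self_attention), the list-of-rows matrix layout, and the optional causal flag are assumptions made for illustration; they are not part of the original question. Multi-head splitting, batching, and the output projection are left out.

```python
import math


def matmul(a, b):
    """Multiply an (n x k) matrix by a (k x m) matrix, both given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(a):
    """Swap rows and columns of a matrix given as a list of rows."""
    return [list(col) for col in zip(*a)]


def softmax(row):
    """Numerically stable softmax over one row of scores."""
    m = max(row)  # subtract the max so exp() cannot overflow
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(x, w_q, w_k, w_v, causal=False):
    """Single-head self-attention.

    x:   (seq_len x d_model) token embeddings
    w_*: (d_model x d_k) projection matrices (hypothetical names)
    Returns (output, attention_weights).
    """
    q = matmul(x, w_q)  # queries
    k = matmul(x, w_k)  # keys
    v = matmul(x, w_v)  # values

    # Similarity of every query with every key, scaled by sqrt(d_k).
    scale = math.sqrt(len(k[0]))
    scores = [[s / scale for s in row] for row in matmul(q, transpose(k))]

    if causal:
        # Position i may only attend to positions j <= i.
        scores = [
            [s if j <= i else float("-inf") for j, s in enumerate(row)]
            for i, row in enumerate(scores)
        ]

    weights = [softmax(row) for row in scores]  # each row sums to 1
    return matmul(weights, v), weights


if __name__ == "__main__":
    # Three tokens with 2-dimensional embeddings; identity projections for clarity.
    x = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
    eye = [[1.0, 0.0], [0.0, 1.0]]
    out, weights = self_attention(x, eye, eye, eye, causal=True)
    for row in weights:
        print([round(w, 3) for w in row])
```

With the causal flag set, the first token can only attend to itself, so its attention row is [1.0, 0.0, 0.0] and its output equals its own value vector. In a full Transformer this block would be repeated across several heads and followed by an output projection, residual connections, layer normalization, and a position-wise feed-forward network.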