Transformer
-
Reverse-engineering large language models’ computation circuits to understand their decision-making processes
7 min read -
Implementing CPTR (CaPtion TransformeR) from scratch with PyTorch
33 min read -
On the differences between the Transformer and CNNs, why the Transformer matters, and what its weaknesses are.
24 min read -
Introduction to neural machine translation (NMT) with sequence-to-sequence architectures and the Transformer
19 min read