Author: Shirley Li

DeepSeek-V3 Explained 1: Multi-head Latent Attention
Deep Learning

Key architecture innovation behind DeepSeek-V2 and DeepSeek-V3 for faster inference

Shirley Li

January 31, 2025

9 min read

Understanding the Evolution of ChatGPT: Part 3- Insights from Codex and InstructGPT
ChatGPT

Mastering the art of fine-tuning: Learnings for training your own LLMs.

Shirley Li

January 21, 2025

22 min read

Understanding the Evolution of ChatGPT: Part 2 – GPT-2 and GPT-3
Deep Learning

Scaling from 117M to 175B: Insights into GPT-2 and GPT-3.

Shirley Li

January 13, 2025

10 min read

Understanding the Evolution of ChatGPT: Part 1-An In-Depth Look at GPT-1 and What Inspired It
Deep Learning

Tracing the roots of ChatGPT: GPT-1, the foundation of OpenAI’s LLMs

Shirley Li

January 6, 2025

11 min read