Reference¶
Papers¶
The original GPT-3 paper Language Models are Few-Shot Learners: the definitive resource for understanding the new techniques applied by OpenAI.
The GPT-2 paper Language Models are Unsupervised Multitask Learners: the GPT-3 model remains largely the same as its GPT-2 counterpart.
Attention Is All You Need: introduced the encoder-decoder transformer, the trickiest part of the GPT model.
Implementation¶
The Annotated GPT-2 is an annotated version of the GPT-2 paper with plenty of PyTorch code.
This GitHub repo is a PyTorch implementation of GPT-2 by Hugging Face.
minGPT is a minimal PyTorch re-implementation.
Yet another GPT-2 implementation in PyTorch.
The Annotated Transformer explains in code how the transformer is implemented, and is endorsed by the author of “The Annotated GPT-2”.
The PyTorch tutorial on training a sequence-to-sequence model with the nn.Transformer module; a minimal usage sketch follows below.
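As a quick orientation for that tutorial, the sketch below calls nn.Transformer directly on toy tensors; the model dimensions and sequence lengths are illustrative assumptions, not values taken from the tutorial.

```python
import torch
import torch.nn as nn

# Minimal sketch: an encoder-decoder transformer with toy dimensions.
model = nn.Transformer(d_model=512, nhead=8,
                       num_encoder_layers=6, num_decoder_layers=6)

src = torch.rand(10, 32, 512)  # (source length, batch size, d_model)
tgt = torch.rand(20, 32, 512)  # (target length, batch size, d_model)

out = model(src, tgt)          # (target length, batch size, d_model)
print(out.shape)               # torch.Size([20, 32, 512])
```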
APIs¶
OpenAI API¶
The documentation [1] of the official OpenAI library covers the following capabilities (a minimal completion call is sketched after this list):
Text Completion
Edit / Correct Inputs
Similarity Comparison
Classification
Text Comprehension
Embedding
Fine-tuning
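The snippet below is a minimal sketch of the first capability, text completion, assuming the legacy openai Python package (pre-1.0 interface); the engine name, prompt, and parameters are placeholders for illustration only.

```python
import openai

openai.api_key = "YOUR_API_KEY"  # placeholder, not a real key

# Text completion: engine and sampling parameters are illustrative assumptions.
response = openai.Completion.create(
    engine="davinci",
    prompt="Translate 'bonjour' into English:",
    max_tokens=5,
    temperature=0,
)
print(response["choices"][0]["text"])
```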
Transformers¶
The Transformers library from Hugging Face provides APIs to download and train pre-trained models, including GPT-2 and GPT Neo.
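A minimal sketch of that API using the pipeline helper; the "gpt2" checkpoint name is used here only as an example of a downloadable pre-trained model.

```python
from transformers import pipeline

# Download a pre-trained GPT-2 checkpoint and generate text from a prompt.
generator = pipeline("text-generation", model="gpt2")
print(generator("The transformer architecture", max_length=30, num_return_sequences=1))
```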
Fine-Tuning¶
This 150 KB text file contains transcripts of Elon Musk's podcast appearances.
This post shows how to retrain the GPT-2 model.
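The sketch below outlines one way to retrain GPT-2 on such a plain-text file with the Hugging Face Trainer; the file path "transcripts.txt", the training arguments, and the choice of TextDataset (deprecated in newer library versions) are assumptions, not the exact recipe from the post.

```python
from transformers import (GPT2LMHeadModel, GPT2Tokenizer, TextDataset,
                          DataCollatorForLanguageModeling, Trainer, TrainingArguments)

tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")

# Hypothetical path to the transcript corpus, split into fixed-size blocks.
dataset = TextDataset(tokenizer=tokenizer, file_path="transcripts.txt", block_size=128)
collator = DataCollatorForLanguageModeling(tokenizer=tokenizer, mlm=False)

args = TrainingArguments(output_dir="gpt2-finetuned",
                         num_train_epochs=1,
                         per_device_train_batch_size=2)

Trainer(model=model, args=args, data_collator=collator,
        train_dataset=dataset).train()
```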
Other Sources¶
The post The GPT-3 Architecture, on a Napkin explains the GPT-3 architecture in as much detail as possible, and is extremely useful.
This article is an entry point to several GPT-3-related resources, including application tutorials.
Alberto Romero’s Medium
Jay Alammar’s blog
GPT-J