Date Lecture Readings Logistics
Tue 01/13/26 Lecture #1:
  • Course Introduction
  • Logistics
[ slides ]

Thu 01/15/26 Lecture #2:
  • Word embeddings and vector semantics
[ slides ]
Main readings:
  • Jurafsky & Martin Chapter 6

Tue 01/20/26 Lecture #3:
  • Word embeddings and vector semantics (cont.)
  • Sparse representations
  • Dense representations
[ slides ]
Main readings:
  • Jurafsky & Martin Chapter 6
Optional readings:
  • Distributed Representations of Words and Phrases and their Compositionality (Mikolov et al., 2013) [link]
  • Efficient Estimation of Word Representations in Vector Space (Mikolov et al., 2013) [link]
  • word2vec Explained: Deriving Mikolov et al.'s Negative-Sampling Word-Embedding Method (Goldberg and Levy, 2014) [link]
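
A minimal sketch contrasting the two representations covered in this lecture: a sparse co-occurrence count vector versus a dense embedding, both compared with cosine similarity. All numbers below are invented purely for illustration.

    import numpy as np

    # Sparse representation: one |V|-dimensional row of a co-occurrence
    # matrix (counts of context words near "cat" / "dog"); mostly zeros.
    vocab = ["the", "cat", "sat", "mat", "dog", "purred"]
    cat_sparse = np.array([12.0, 0.0, 3.0, 2.0, 1.0, 4.0])
    dog_sparse = np.array([11.0, 1.0, 2.0, 1.0, 0.0, 0.0])

    # Dense representation: a low-dimensional embedding such as word2vec
    # would learn (values made up here, not from any trained model).
    cat_dense = np.array([0.21, -0.43, 0.77])
    dog_dense = np.array([0.19, -0.40, 0.70])

    def cosine(u, v):
        """Cosine similarity, the standard comparison for both kinds of vectors."""
        return u @ v / (np.linalg.norm(u) * np.linalg.norm(v))

    print(cosine(cat_sparse, dog_sparse))  # similarity from raw counts
    print(cosine(cat_dense, dog_dense))    # similarity from dense embeddings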

HW 1 out

Thu 01/22/26 Lecture #4:
  • Deriving the gradient of Word2vec
  • Evaluation of word embeddings
[ slides ]
Main readings:
  • Jurafsky & Martin Chapter 6
  • Distributed Representations of Words and Phrases and their Compositionality (Mikolov et al., 2013) [link]
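
Since this lecture derives the word2vec gradient, here is a worked version of the skip-gram negative-sampling gradient from Mikolov et al. (2013), in LaTeX. The notation (v_c for the center-word vector, u_o for a context vector, k sampled negatives, sigma the logistic function) is one common convention, not necessarily the one used in lecture.

    % Skip-gram with negative sampling: loss for one (center c, context o)
    % pair with k sampled negatives w_1, ..., w_k, where
    % \sigma(x) = 1 / (1 + e^{-x}).
    \[
    J \;=\; -\log \sigma(u_o^\top v_c) \;-\; \sum_{i=1}^{k} \log \sigma(-u_{w_i}^\top v_c)
    \]
    % Using \frac{d}{dx}\log\sigma(x) = 1 - \sigma(x):
    \[
    \frac{\partial J}{\partial v_c}
      = -\bigl(1 - \sigma(u_o^\top v_c)\bigr)\, u_o
        + \sum_{i=1}^{k} \sigma(u_{w_i}^\top v_c)\, u_{w_i}
    \]
    \[
    \frac{\partial J}{\partial u_o} = -\bigl(1 - \sigma(u_o^\top v_c)\bigr)\, v_c,
    \qquad
    \frac{\partial J}{\partial u_{w_i}} = \sigma(u_{w_i}^\top v_c)\, v_c
    \]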

Tue 01/27/26 Lecture #5:
  • N-Gram Language Models
  • Smoothing
  • Evaluation of Language Models
Main readings:
  • Jurafsky & Martin Chapter 7

Thu 01/29/26 Lecture #6:
  • Neural network basics
  • Autograd
Main readings:
  • The Matrix Calculus You Need For Deep Learning (Terence Parr and Jeremy Howard) [link]
  • The Little Book of Deep Learning (François Fleuret), Chapters 3 and 4
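
The autograd topic boils down to reverse-mode differentiation over a computation graph. Below is a deliberately tiny scalar sketch of that idea, in the spirit of toy autograd implementations rather than any specific library.

    # Each Value records its inputs and a local backward rule;
    # backward() applies the chain rule in reverse topological order.
    class Value:
        def __init__(self, data, parents=(), backward_fn=lambda: None):
            self.data, self.grad = data, 0.0
            self._parents, self._backward = parents, backward_fn

        def __add__(self, other):
            out = Value(self.data + other.data, (self, other))
            def backward_fn():
                self.grad += out.grad           # d(a+b)/da = 1
                other.grad += out.grad          # d(a+b)/db = 1
            out._backward = backward_fn
            return out

        def __mul__(self, other):
            out = Value(self.data * other.data, (self, other))
            def backward_fn():
                self.grad += other.data * out.grad   # d(a*b)/da = b
                other.grad += self.data * out.grad   # d(a*b)/db = a
            out._backward = backward_fn
            return out

        def backward(self):
            order, seen = [], set()
            def topo(v):
                if v not in seen:
                    seen.add(v)
                    for parent in v._parents:
                        topo(parent)
                    order.append(v)
            topo(self)
            self.grad = 1.0
            for v in reversed(order):
                v._backward()

    x, y = Value(2.0), Value(3.0)
    z = x * y + x          # z = xy + x, so dz/dx = y + 1, dz/dy = x
    z.backward()
    print(x.grad, y.grad)  # 4.0 2.0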

Tue 02/03/26 Lecture #7:
  • Autograd (contd.)
  • Building blocks of Deep Learning for Language Modeling
  • CNNs
Main readings:
  • Goldberg Chapter 9

Thu 02/05/26 Lecture #8:
  • CNNs (contd.)
  • RNNs
  • Task specific neural network architectures
  • Machine translation
Main readings:
  • Understanding LSTM Networks (Christopher Olah) [link]
  • Eisenstein, Chapter 18
Optional readings:
  • Neural Machine Translation and Sequence-to-sequence Models: A Tutorial (Graham Neubig) [link]
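
For the RNN topic, the whole mechanism fits in one line of math: h_t = tanh(W_xh x_t + W_hh h_{t-1} + b). A minimal NumPy sketch with random placeholder weights, purely illustrative:

    import numpy as np

    rng = np.random.default_rng(0)
    d_in, d_h = 5, 8
    W_xh = rng.normal(scale=0.1, size=(d_h, d_in))
    W_hh = rng.normal(scale=0.1, size=(d_h, d_h))
    b = np.zeros(d_h)

    def rnn_step(x_t, h_prev):
        """One vanilla (Elman) RNN step: the hidden state summarizes the prefix."""
        return np.tanh(W_xh @ x_t + W_hh @ h_prev + b)

    h = np.zeros(d_h)
    for x_t in rng.normal(size=(3, d_in)):   # a toy sequence of 3 inputs
        h = rnn_step(x_t, h)
    print(h.shape)  # (8,)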

Tue 02/10/26 Lecture #9:
  • RNNs (contd.)
  • Training sequence models
  • Machine translation (contd.)
Main readings:
  • Statistical Machine Translation (Koehn) [link]
  • Neural Machine Translation and Sequence-to-sequence Models: A Tutorial (Graham Neubig) [link]
  • Neural Machine Translation by Jointly Learning to Align and Translate (Bahdanau et al., 2015) [link]
  • Effective Approaches to Attention-based Neural Machine Translation (Luong et al., 2015) [link]
  • Attention is All You Need (Vaswani et al., 2017) [link]
  • The Illustrated Transformer (Jay Alammar) [link]

Project teams due on 02/08

HW 1 due

Thu 02/12/26 Lecture #10:
  • Attention
  • Transformers
Main readings:
  • Neural Machine Translation and Sequence-to-sequence Models: A Tutorial (Graham Neubig) [link]
  • Neural Machine Translation by Jointly Learning to Align and Translate (Bahdanau et al., 2015) [link]
  • Effective Approaches to Attention-based Neural Machine Translation (Luong et al., 2015) [link]
  • Attention is All You Need (Vaswani et al., 2017) [link]
  • The Illustrated Transformer (Jay Alammar) [link]
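
The core operation behind this lecture's attention topic is scaled dot-product attention, Attention(Q, K, V) = softmax(QK^T / sqrt(d_k)) V from Vaswani et al. (2017). A minimal single-head NumPy sketch, with random matrices standing in for real query/key/value projections:

    import numpy as np

    def softmax(x, axis=-1):
        x = x - x.max(axis=axis, keepdims=True)  # subtract max for stability
        e = np.exp(x)
        return e / e.sum(axis=axis, keepdims=True)

    def scaled_dot_product_attention(Q, K, V):
        """softmax(Q K^T / sqrt(d_k)) V, as in Vaswani et al. (2017)."""
        d_k = Q.shape[-1]
        scores = Q @ K.T / np.sqrt(d_k)      # (n_q, n_k) query-key similarities
        weights = softmax(scores, axis=-1)   # each query's distribution over keys
        return weights @ V                   # weighted average of the values

    rng = np.random.default_rng(0)
    Q = rng.normal(size=(2, 4))   # 2 queries, d_k = 4
    K = rng.normal(size=(3, 4))   # 3 keys
    V = rng.normal(size=(3, 4))   # 3 values
    print(scaled_dot_product_attention(Q, K, V).shape)  # (2, 4)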

HW 2 out

Tue 02/17/26 Lecture #11:
  • Transformers (contd.)
  • Language modeling with Transformers
  • Transfer Learning
Main readings:
  • The Illustrated Transformer (Jay Alammar) [link]
  • Attention is All You Need (Vaswani et al., 2017) [link]
  • The Annotated Transformer (Harvard NLP) [link]
  • GPT-2 paper: Language Models are Unsupervised Multitask Learners (Radford et al., 2019) [link]

Thu 02/19/26 Lecture #12:
  • Transfer Learning (contd.)
  • Objective functions for pre-training
  • Encoder-decoder pretrained models
  • Architecture and pretraining objectives
Main readings:
  • The Illustrated BERT, ELMo, and co. (Jay Alammar) [link]
  • BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding (Devlin et al., 2018) [link]
  • GPT-2 paper: Language Models are Unsupervised Multitask Learners (Radford et al., 2019) [link]
  • T5: Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer (Raffel et al., 2020) [link]
  • BART: Denoising Sequence-to-Sequence Pre-training for Natural Language Generation, Translation, and Comprehension (Lewis et al., 2019) [link]
  • What Language Model Architecture and Pretraining Objective Work Best for Zero-Shot Generalization? (Wang et al., 2022) [link]

Tue 02/24/26 Lecture #13:
  • Decoding and generation
  • Large language models and impact of scale
  • In-context learning and prompting
Main readings:
  • The Curious Case of Neural Text Degeneration (Holtzman et al., 2019) [link]
  • How to generate text: using different decoding methods for language generation with Transformers (Hugging Face blog) [link]
  • Scaling Laws for Neural Language Models (Kaplan et al., 2020) [link]
  • Training Compute-Optimal Large Language Models (Hoffmann et al., 2022) [link]
  • GPT-3 paper: Language Models are Few-Shot Learners (Brown et al., 2020) [link]
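
For the decoding topic, a small sketch of nucleus (top-p) sampling in the sense of Holtzman et al. (2019), with a temperature knob. The logits are made up, and the function name is mine:

    import numpy as np

    def sample_top_p(logits, p=0.9, temperature=1.0, rng=None):
        """Sample from the smallest set of tokens whose cumulative
        probability exceeds p (the "nucleus"), after temperature scaling."""
        rng = rng or np.random.default_rng()
        scaled = logits / temperature
        probs = np.exp(scaled - scaled.max())     # softmax, stabilized
        probs /= probs.sum()
        order = np.argsort(probs)[::-1]           # tokens, most probable first
        cum = np.cumsum(probs[order])
        cutoff = int(np.searchsorted(cum, p)) + 1 # keep only the nucleus
        keep = order[:cutoff]
        return rng.choice(keep, p=probs[keep] / probs[keep].sum())

    logits = np.array([4.0, 3.5, 1.0, 0.2, -1.0])  # made-up next-token scores
    print(sample_top_p(logits, p=0.9))

Setting p = 1.0 recovers plain temperature sampling, and a small p approaches greedy decoding, which is the trade-off the Holtzman et al. reading explores.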

Thu 02/26/26 Midterm Exam 1

Tue 03/03/26 Lecture #14:
  • Guest Lecture 1: TBD

HW 2 due

Thu 03/05/26 Lecture #15:
  • Post-training
  • Supervised Finetuning
  • Instruction Following
Main readings:
  • Multitask Prompted Training Enables Zero-Shot Task Generalization (Sanh et al., 2021) [link]
  • Scaling Instruction-Finetuned Language Models (Chung et al., 2022) [link]
  • Are Emergent Abilities of Large Language Models a Mirage? (Schaeffer et al., 2023) [link]
  • Emergent Abilities of Large Language Models (Wei et al., 2022) [link]

Project proposals due on 03/06

03/06/26 - 03/22/26 Spring recess - No classes

Tue 03/24/26 Lecture #16:
  • Guest Lecture 2: TBD

Thu 03/26/26 Lecture #17:
  • Post-training
  • Reinforcement learning from Human Feedback
  • Alignment
Main readings:
  • Training language models to follow instructions with human feedback (Ouyang et al., 2022) [link]
  • Fine-Tuning Language Models from Human Preferences (Ziegler et al., 2019) [link]
  • Direct Preference Optimization: Your Language Model is Secretly a Reward Model (Rafailov et al., 2023) [link]
  • RLAIF: Scaling Reinforcement Learning from Human Feedback with AI Feedback (Lee et al., 2023) [link]
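
As a pointer into the DPO reading, its per-pair loss fits in a few lines: -log sigma(beta * [(log pi(y_w) - log pi_ref(y_w)) - (log pi(y_l) - log pi_ref(y_l))]), where y_w is the preferred and y_l the rejected response. The log-probabilities below are invented, and the variable names are mine, not the paper's:

    import numpy as np

    def dpo_loss(logp_w, logp_l, ref_logp_w, ref_logp_l, beta=0.1):
        """DPO loss (Rafailov et al., 2023) for one preference pair.
        logp_* are the policy's sequence log-probs of the chosen (w) and
        rejected (l) responses; ref_logp_* come from the frozen reference."""
        margin = (logp_w - ref_logp_w) - (logp_l - ref_logp_l)
        return -np.log(1.0 / (1.0 + np.exp(-beta * margin)))  # -log sigma(...)

    # Made-up log-probabilities for a single preference pair:
    print(dpo_loss(logp_w=-12.0, logp_l=-15.0,
                   ref_logp_w=-13.0, ref_logp_l=-14.0))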

Tue 03/31/26 Lecture #18:
  • Post-training (contd.)

HW 3 out

Thu 04/02/26 Lecture #19:
  • Guest Lecture 3: TBD

Tue 04/07/26 Lecture #20:
  • Retrieval Augmented Generation (RAG)
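
A minimal retrieve-then-read sketch of the RAG pattern. Here embed() and generate() are hypothetical stand-ins (a toy hashed bag-of-words encoder and a placeholder for a real LLM call), so the script runs end to end without any model:

    import numpy as np

    docs = ["The course has two midterms.",
            "HW 3 is the final homework.",
            "Project reports are due at the end of April."]

    def embed(text, dim=64):
        """Toy embedding: hashed bag of words (stand-in for a learned encoder)."""
        v = np.zeros(dim)
        for tok in text.lower().split():
            v[hash(tok) % dim] += 1.0
        return v / (np.linalg.norm(v) + 1e-9)

    index = np.stack([embed(d) for d in docs])   # precomputed document vectors

    def retrieve(query, k=2):
        scores = index @ embed(query)            # cosine similarity (unit vectors)
        return [docs[i] for i in np.argsort(scores)[::-1][:k]]

    def generate(prompt):
        """Hypothetical LLM call; a real system would query a language model."""
        return f"[LLM would answer from a prompt of {len(prompt)} chars]"

    query = "When are project reports due?"
    context = "\n".join(retrieve(query))
    print(generate(f"Context:\n{context}\n\nQuestion: {query}\nAnswer:"))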

Thu 04/09/26 Midterm Exam 2

Tue 04/14/26 Lecture #21:
  • Guest Lecture 4: TBD

Thu 04/16/26 Lecture #22:
  • RAG (contd.)
  • Introduction to agent-based systems

Tue 04/21/26 Lecture #23:
  • Project presentations session 1

Final project presentations

Thu 04/23/26 Lecture #24:
  • Project presentations session 2

Final project presentations, HW 3 due on 04/27,
Final project reports due on 04/30