Welcome to CPSC 4770/5770 – Large Language Models, From Foundations to Modern Practice!

This year we have updated the syllabus to focus heavily on language modeling and recent advances in the field. The curriculum spans both foundational concepts and cutting-edge developments, including Large Language Models (LLMs). The course begins with core neural network concepts in NLP, covering word embeddings, sequence modeling, and attention mechanisms; students will build a strong understanding of these building blocks while learning their practical implementations. Building on these foundations, we explore transformer architectures and their evolution, including landmark language models such as GPT, and examine how these models enable sophisticated language understanding and generation through pre-training and transfer learning. The latter portion of the course covers contemporary advances: LLMs, parameter-efficient fine-tuning, post-training, and efficiency techniques. We'll analyze the capabilities and limitations of current systems while discussing emerging research directions.

Prerequisites:

Intro to ML or Intro to AI is required.

Important Note: We strongly advise against taking this course if you do not meet the prerequisites.

Resources

  • Dan Jurafsky and James H. Martin. Speech and Language Processing (2024 pre-release)
  • Yoav Goldberg. A Primer on Neural Network Models for Natural Language Processing

We will also use papers from major conferences in the field, including ACL, EMNLP, NAACL, ICLR, and NeurIPS.