All homework assignments should be submitted through Gradescope.

Homework 2

Part 1: Implementing a Transformer model (50 points)

In this part, your goal is to design and implement a Transformer-based model. You are expected to gain a deeper understanding of the inner workings of the Transformer by building its core components from scratch.
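As a rough illustration of one such core component, here is a minimal sketch of single-head scaled dot-product attention in plain Python. The function names and the list-of-lists input format are assumptions made for this example; the notebook may use a different interface, tensors, masking, and multiple heads.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def scaled_dot_product_attention(queries, keys, values):
    """Single-head attention over lists of vectors (hypothetical helper, no masking)."""
    d_k = len(keys[0])
    outputs = []
    for q in queries:
        # Similarity of this query to every key, scaled by sqrt(d_k).
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d_k) for k in keys]
        weights = softmax(scores)
        # Weighted average of the value vectors.
        out = [sum(w * v[j] for w, v in zip(weights, values))
               for j in range(len(values[0]))]
        outputs.append(out)
    return outputs

attended = scaled_dot_product_attention(
    queries=[[0.0, 0.0]],
    keys=[[1.0, 0.0], [0.0, 1.0]],
    values=[[1.0, 0.0], [0.0, 1.0]],
)
```

With an all-zero query, both keys score equally, so the output is the plain average of the two value vectors.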

Download the Google Colab notebook from the following link: Download.

Part 2: Text generation and decoding (50 points)

In this part, you will apply your Transformer model to sequence generation. The task is text summarization: you will use an off-the-shelf language model, explore evaluation methods, and experiment with and evaluate different decoding strategies for text generation.
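To make "decoding strategies" concrete, the sketch below contrasts greedy selection with top-k sampling for a single decoding step. The toy probability table and the helper names are invented for illustration; the notebook may cover other strategies and will operate on real model outputs.

```python
import random

def greedy_pick(probs):
    """Return the token with the highest probability."""
    return max(probs, key=probs.get)

def top_k_sample(probs, k, rng):
    """Sample one token from the k most probable tokens."""
    top = sorted(probs.items(), key=lambda kv: kv[1], reverse=True)[:k]
    tokens = [t for t, _ in top]
    weights = [p for _, p in top]
    # random.choices accepts unnormalized weights.
    return rng.choices(tokens, weights=weights, k=1)[0]

# Hypothetical next-token distribution for one decoding step.
toy_probs = {"the": 0.5, "a": 0.3, "cat": 0.15, "zebra": 0.05}

rng = random.Random(0)  # seeded so runs are repeatable
greedy_token = greedy_pick(toy_probs)
sampled_token = top_k_sample(toy_probs, k=2, rng=rng)
```

Greedy decoding always returns the same token for a given distribution, while top-k sampling can return any of the k most likely tokens, which trades determinism for diversity.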

Download the Google Colab notebook from the following link: Download.

Homework 1

Part 1: Hands-on exercise - Implementing sparse representations (50 points)

In this homework, your goal is to implement sparse word and document representations. Sparse representations are crucial in many natural language processing tasks, particularly when working with high-dimensional data like text.
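For intuition, a sparse document representation stores only the nonzero entries of a vocabulary-sized count vector. The following is a minimal bag-of-words sketch under simple assumptions (whitespace tokenization, lowercasing, hypothetical function names); it is not the notebook's required interface.

```python
from collections import Counter

def build_vocab(docs):
    """Map each distinct token to an integer index, in order of first appearance."""
    vocab = {}
    for doc in docs:
        for tok in doc.lower().split():
            vocab.setdefault(tok, len(vocab))
    return vocab

def sparse_bow(doc, vocab):
    """Return {vocab index: count}, keeping only tokens that occur in the document."""
    counts = Counter(tok for tok in doc.lower().split() if tok in vocab)
    return {vocab[tok]: c for tok, c in counts.items()}

docs = ["the cat sat", "the dog sat on the mat"]
vocab = build_vocab(docs)
vectors = [sparse_bow(d, vocab) for d in docs]
```

Each document is stored as a small dictionary rather than a full vector of length `len(vocab)`, which is what keeps the representation practical when the vocabulary is large.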

You will need to log in to Colab with your Yale account. You can then copy the notebook to your own account and start working on it.

You will complete the parts marked with

    #
    # % -- Your Implementation -- %
    #

or with # TODO: Implement.

Then submit the completed notebook with all outputs included.

Please see the instructions and the notebook here: Colab-1.

Part 2: Handout (20 points)

Download the homework handout from the following link: Download.

For this part, complete the homework in LaTeX and submit your solution as a PDF. Further instructions are provided in the handout.

Part 3: Hands-on exercise - Implementing the Word2Vec model (50 points)

In the third part, you will implement a Word2Vec Skip-gram model from scratch. A Colab notebook guides you through implementing the model and training it on a toy data sample.
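As background, a Skip-gram model is trained on (center word, context word) pairs drawn from a sliding window over the text. The sketch below shows one way to generate such pairs; the function name and the symmetric fixed-size window are assumptions for illustration, and the notebook may define the data pipeline differently.

```python
def skipgram_pairs(tokens, window):
    """Return (center, context) pairs for every token within `window` positions."""
    pairs = []
    for i, center in enumerate(tokens):
        lo = max(0, i - window)
        hi = min(len(tokens), i + window + 1)
        for j in range(lo, hi):
            if j != i:  # a word is not its own context
                pairs.append((center, tokens[j]))
    return pairs

pairs = skipgram_pairs(["the", "cat", "sat"], window=1)
```

For the three-token example with a window of 1, the middle word is paired with both neighbors, while the first and last words each contribute a single pair.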

Then submit the completed notebook with all outputs included.

Please see the instructions and the notebook here: Colab-2.