PA 4: Multitrack Music Generation

PAT 464/564: Generative AI for Music and Audio Creation (Winter 2026)

Due at 11:59pm ET on Apr 13


Instructions


Multitrack Music Generation

In this assignment, you will train a Multitrack Music Transformer model that can generate multi-instrument music. We will use the Lakh MIDI Dataset (LMD) (Raffel, 2016), a collection of 176,581 unique MIDI files, 45,129 of which have been matched and aligned to entries in the Million Song Dataset. Specifically, we will use a cleaner subset (Dong et al., 2018) consisting of 21,425 files. We will base our model on the Multitrack Music Transformer framework (Dong et al., 2023). To simplify the task, we will group MIDI instruments into five tracks: piano, guitar, bass, strings, and brass.
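The five-track grouping can be sketched as a mapping from General MIDI program numbers to track names. The ranges below are an assumption based on the General MIDI instrument families, not the assignment's official mapping:

```python
from typing import Optional

# Hypothetical grouping of General MIDI program numbers (0-127) into the
# five tracks used in this assignment. The exact ranges are an assumption
# based on the General MIDI instrument families, not the official mapping.
TRACK_RANGES = {
    "piano":   range(0, 8),    # Piano family
    "guitar":  range(24, 32),  # Guitar family
    "bass":    range(32, 40),  # Bass family
    "strings": range(40, 56),  # Strings and ensemble families
    "brass":   range(56, 64),  # Brass family
}

def program_to_track(program: int) -> Optional[str]:
    """Map a General MIDI program number to one of the five tracks.

    Returns None for programs outside these families, which would simply
    be dropped during preprocessing.
    """
    for track, programs in TRACK_RANGES.items():
        if program in programs:
            return track
    return None
```

For example, `program_to_track(0)` (Acoustic Grand Piano) maps to `"piano"`, `program_to_track(33)` (Electric Bass) maps to `"bass"`, and a program like 120 (Guitar Fret Noise) returns `None` and would be discarded.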

Multitrack Music Transformer

Here’s the representation we are using in this assignment:

Representation
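As a rough sketch of this representation: in the MMT framework (Dong et al., 2023), each event is a six-field tuple of (type, beat, position, pitch, duration, instrument). The field names and type codes below follow the paper but should be treated as illustrative, since the assignment's exact vocabulary may differ:

```python
from typing import NamedTuple

# A minimal sketch of the MMT-style event tuple, assuming the six fields
# from Dong et al. (2023). The assignment's actual vocabulary and value
# ranges may differ.
class Event(NamedTuple):
    type: int        # e.g., 0: start-of-song, 1: instrument,
                     # 2: start-of-notes, 3: note, 4: end-of-song
    beat: int        # beat index within the song
    position: int    # sub-beat position (e.g., in 1/12-beat ticks)
    pitch: int       # MIDI pitch number (0-127)
    duration: int    # note duration in the same sub-beat units
    instrument: int  # track index (e.g., 0: piano, ..., 4: brass)

# A middle-C quarter note on the piano track at the very start of the song:
note = Event(type=3, beat=0, position=0, pitch=60, duration=12, instrument=0)
```

A whole song is then a sequence of such tuples, and the transformer predicts all six fields of the next event at each step.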


Pretrained Model Checkpoint

In case you lose your trained model, here is a checkpoint to experiment with. It was saved at epoch 35, where it reached the lowest validation loss during a 50-epoch training run.
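This checkpoint corresponds to standard best-validation-loss selection: during training, the model is saved whenever validation loss improves. A minimal sketch of that bookkeeping (the loss values below are made-up numbers, and the actual `torch.save` call is omitted):

```python
def best_epoch(val_losses: list) -> int:
    """Return the 1-indexed epoch with the lowest validation loss.

    In a real training loop you would checkpoint the model (e.g., with
    torch.save) whenever the loss improves, rather than scanning a list
    after the fact.
    """
    best = min(range(len(val_losses)), key=lambda i: val_losses[i])
    return best + 1

# Made-up validation losses for a 10-epoch run; the provided checkpoint
# came from a 50-epoch session whose loss bottomed out at epoch 35.
losses = [1.9, 1.5, 1.3, 1.2, 1.15, 1.1, 1.12, 1.18, 1.25, 1.3]
```

Here `best_epoch(losses)` returns 6, mirroring how the provided checkpoint was chosen at epoch 35 of its 50-epoch session.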


Open In Colab

Jupyter notebook


References

- Raffel, C. (2016). Learning-Based Methods for Comparing Sequences, with Applications to Audio-to-MIDI Alignment and Matching. PhD thesis, Columbia University.
- Dong, H.-W., Hsiao, W.-Y., Yang, L.-C., & Yang, Y.-H. (2018). MuseGAN: Multi-track Sequential Generative Adversarial Networks for Symbolic Music Generation and Accompaniment. AAAI.
- Dong, H.-W., Chen, K., Dubnov, S., McAuley, J., & Berg-Kirkpatrick, T. (2023). Multitrack Music Transformer. ICASSP.

