Hao-Wen Dong

I am an Assistant Professor in the Department of Performing Arts Technology at the University of Michigan.

My research aims to augment human creativity with machine learning. I develop human-centered generative AI technology that can be integrated into professional creative workflows, with a focus on music, audio and video content creation. My long-term goal is to lower the barrier to entry for content creation and democratize professional content creation for everyone.

My research on Human-Centered Generative AI for Content Generation rests on three main pillars:

I develop novel generative models for new domains, aiming to enable deep learning in underexplored application fields. Topics include multitrack music generation (AAAI 2018, ISMIR 2018, ISMIR 2020, ICASSP 2023, ISMIR 2024), controllable music generation (AIMG 2024, arXiv 2024) and documentary teaser generation (arXiv 2024).
I build AI-assisted content creation tools that aim to augment human creativity within creators' own workflows. Topics include expressive violin performance synthesis (ICASSP 2022, arXiv 2024), music instrumentation (ISMIR 2021), music arrangement (AAAI 2018) and music harmonization (JNMR 2020).
I develop multimodal generative models for content creation that can process, understand and generate data in multiple modalities at the same time. Topics include text-queried sound separation (ICLR 2023), text-to-audio synthesis (WSS 2023, WASPAA 2023), text-to-music generation (ISMIR LBD 2024, arXiv 2024) and documentary teaser generation (arXiv 2024).

Researchers create knowledge.
Teachers organize knowledge.
Engineers apply knowledge.

News

Our paper “Nested Music Transformer: Sequentially Decoding Compound Tokens in Symbolic Music and Audio Generation” has been accepted to ISMIR 2024 (paper, demo)!
I was awarded the Doctoral Award for Excellence in Research by UCSD CSE Department!
I was selected as one of the Rising Stars in AI by KAUST AI Initiative!
I was selected as one of the Rising Stars in Data Science by UChicago DSI and UCSD HDSI!

Team

Weihan Xu
MS student, Duke University
Seeking PhD positions!

Erfun Ackley
MS student, UMich
Seeking PhD positions!

Prospective students: please read this if you’re interested in working with me.

Selected Honors & Awards

Doctoral Award for Excellence in Research, UCSD CSE Department, 2024
Rising Stars in AI, KAUST AI Initiative, 2024
Rising Stars in Data Science, UChicago DSI and UCSD HDSI, 2023
IEEE SPS Scholarship, IEEE Signal Processing Society, 2023
Friends of International Center Fellowship, UCSD GEPA, 2023
Rising Stars in Signal Processing, ICASSP, 2023
Interdisciplinary Research Award, UCSD GPSA, 2023
Government Scholarship to Study Abroad, Taiwan Ministry of Education, 2022
J. Yang Scholarship, J. Yang & Family Foundation, 2021
ECE Department Fellowship, UCSD ECE Department, 2019

Selected Publications

See the full list of publications here (Google Scholar).

CLIPSonic: Text-to-Audio Synthesis with Unlabeled Videos and Pretrained Language-Vision Models
Hao-Wen Dong, Xiaoyu Liu, Jordi Pons, Gautam Bhattacharya, Santiago Pascual, Joan Serrà, Taylor Berg-Kirkpatrick, and Julian McAuley
IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), 2023
paper demo video slides reviews
Multitrack Music Transformer
Hao-Wen Dong, Ke Chen, Shlomo Dubnov, Julian McAuley, and Taylor Berg-Kirkpatrick
IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2023
paper demo video slides code reviews
CLIPSep: Learning Text-queried Sound Separation with Noisy Unlabeled Videos
Hao-Wen Dong, Naoya Takahashi, Yuki Mitsufuji, Julian McAuley, and Taylor Berg-Kirkpatrick
International Conference on Learning Representations (ICLR), 2023
paper demo video slides poster code reviews
Deep Performer: Score-to-Audio Music Performance Synthesis
Hao-Wen Dong, Cong Zhou, Taylor Berg-Kirkpatrick, and Julian McAuley
IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2022
paper demo video slides poster reviews
Towards Automatic Instrumentation by Learning to Separate Parts in Symbolic Multitrack Music
Hao-Wen Dong, Chris Donahue, Taylor Berg-Kirkpatrick, and Julian McAuley
International Society for Music Information Retrieval Conference (ISMIR), 2021
paper demo video slides code reviews
MusPy: A Toolkit for Symbolic Music Generation
Hao-Wen Dong, Ke Chen, Julian McAuley, and Taylor Berg-Kirkpatrick
International Society for Music Information Retrieval Conference (ISMIR), 2020
paper video slides poster code documentation reviews
Convolutional Generative Adversarial Networks with Binary Neurons for Polyphonic Music Generation
Hao-Wen Dong and Yi-Hsuan Yang
International Society for Music Information Retrieval Conference (ISMIR), 2018
paper demo video slides poster code reviews
MuseGAN: Multi-track Sequential Generative Adversarial Networks for Symbolic Music Generation and Accompaniment
Hao-Wen Dong,* Wen-Yi Hsiao,* Li-Chia Yang, and Yi-Hsuan Yang (*equal contribution)
AAAI Conference on Artificial Intelligence (AAAI), 2018
paper demo slides code

Education

University of California San Diego
Ph.D. in Computer Science
Advisors: Julian McAuley and Taylor Berg-Kirkpatrick
Dissertation: Generative AI for Music and Audio
Sep 2019 – Jun 2024

University of California San Diego
M.S. in Computer Science
Sep 2019 – Jun 2021

National Taiwan Normal University
Digital Video and Audio Arts Program
Sep 2019 – Jun 2017

National Taiwan University
B.S. in Electrical Engineering
Sep 2013 – Jun 2017

Professional Experience

NVIDIA
Research Intern
Deep Imagination Research Group, NVIDIA Research
Advisors: Siddharth Gururani and Ming-Yu Liu
Topic: Controllable audio generation
Sep 2023 – Dec 2023

Adobe
Research Scientist/Engineer Intern
Audio Research Group, Adobe Research
Advisors: Justin Salamon and Oriol Nieto
Topic: Text-to-audio retrieval
May 2023 – Sep 2023

Dolby
Speech/Audio Deep Learning Intern
Applied AI Team, Advanced Technology
Advisor: Xiaoyu Liu
Topic: Text-to-audio synthesis
Jan 2023 – Apr 2023

Amazon
Applied Scientist Intern
Natural Understanding Team, Alexa AI
Advisors: Wenbo Zhao and Gunnar Sigurdsson
Topic: Text-to-audio synthesis
Sep 2022 – Jan 2023

Sony
Student Intern
Tokyo Laboratory 30, R&D Center
Advisor: Naoya Takahashi
Topic: Universal sound separation
May 2022 – Sep 2022

Dolby
Deep Learning Audio Intern
Applied AI Team, Advanced Technology Group
Advisor: Cong Zhou
Topic: Music performance synthesis
Jun 2021 – Sep 2021

Yamaha
Research Intern
AI Group, Research and Development Division
Advisor: Keijiro Saino
Topic: Deep learning based synthesizer
May 2019 – Aug 2019

Academia Sinica
Research Assistant
Music and AI Lab, Research Center for IT Innovation
Advisor: Yi-Hsuan Yang
Topics: Music generation and deep generative models
Jul 2017 – Apr 2019

For more information, please see my CV.