
I am an Assistant Professor in the Department of Performing Arts Technology at the University of Michigan, with an affiliated faculty appointment in Computer Science and Engineering.
My research aims to augment human creativity with machine learning. I develop human-centered generative AI technology that can be integrated into professional creative workflows, with a focus on music, audio, and video creation. My long-term goal is to make professional content creation accessible to everyone.
My research interests include music generation, music technology, audio synthesis, video editing, and multimodal AI. Here are the three main pillars of my research:
- I develop generative models for music creation, pioneering the adoption of deep neural networks for multi-instrument music generation. Topics include multitrack music generation (AAAI 2018, ISMIR 2018, ISMIR 2020, ICASSP 2023, ISMIR 2024), text-to-music generation (ISMIR 2025), video-to-music generation (ISMIR 2025), and symbolic music processing tools (ISMIR LBD 2019, ISMIR 2020).
- I build AI-assisted music creation tools that augment human creativity within professional creative workflows. Topics include expressive violin performance synthesis (ICASSP 2022, ICASSP 2025), music instrumentation (ISMIR 2021), music arrangement (AAAI 2018), and music harmonization (JNMR 2020).
- I develop multimodal generative models for content creation that can process, understand, and generate data across multiple modalities simultaneously. Topics include long-to-short video editing (ICLR 2025, NeurIPS 2025), text-queried sound separation (ICLR 2023), and text-to-audio synthesis (WASPAA 2023).
Currently, I am most interested in multimodal generative AI and human-AI co-creative tools for music, audio, and video creation.
Researchers create knowledge.
Teachers organize knowledge.
Engineers apply knowledge.
Artists challenge knowledge.
Prospective Students
Please read this if you’re interested in working with me.
Currently Teaching
Recent Talks
News
- 📜 Our paper “REGen: Multimodal Retrieval-Embedded Generation for Long-to-Short Video Editing” led by Weihan Xu has been accepted to NeurIPS 2025 (paper, demo)!
- 💡 We are organizing the AI for Music Workshop at NeurIPS 2025!
- 📜 Our paper “Generating Symbolic Music from Natural Language Prompts using an LLM-Enhanced Dataset” led by Weihan Xu has been accepted to ISMIR 2025 (paper, demo)!
- 📜 Our paper “Video-Guided Text-to-Music Generation Using Public Domain Movie Collections” led by Haven Kim has been accepted to ISMIR 2025 (paper, demo)!
- 📜 Our paper “TeaserGen: Generating Teasers for Long Documentaries” led by Weihan Xu has been accepted to ICLR 2025 (paper, demo)!
- 📜 Our paper “ViolinDiff: Enhancing Expressive Violin Synthesis with Pitch Bend Conditioning” led by Daewoong Kim has been accepted to ICASSP 2025 (paper, demo)!
- 📜 Our paper “FUTGA-MIR: Enhancing Fine-grained and Temporally-aware Music Understanding with Music Information Retrieval” led by Junda Wu has been accepted to ICASSP 2025 (paper, demo)!
- 🏅 I was awarded the Doctoral Award for Excellence in Research by the UCSD CSE Department!
Selected Honors & Awards
- Friends of International Center Fellowship, UCSD GEPA, 2023
- Government Scholarship to Study Abroad, Taiwan Ministry of Education, 2022
- ECE Department Fellowship, UCSD ECE Department, 2019
Selected Publications
See the full list of publications here (Google Scholar).
- REGen: Multimodal Retrieval-Embedded Generation for Long-to-Short Video Editing
  Weihan Xu, Yimeng Ma, Jingyue Huang, Yang Li, Wenye Ma, Taylor Berg-Kirkpatrick, Julian McAuley, Paul Pu Liang, and Hao-Wen Dong
  Advances in Neural Information Processing Systems (NeurIPS), 2025
  paper · demo · code · reviews
- Generating Symbolic Music from Natural Language Prompts using an LLM-Enhanced Dataset
  Weihan Xu, Julian McAuley, Taylor Berg-Kirkpatrick, Shlomo Dubnov, and Hao-Wen Dong
  International Society for Music Information Retrieval Conference (ISMIR), 2025
  paper · demo · poster · code · reviews
- Video-Guided Text-to-Music Generation Using Public Domain Movie Collections
  Haven Kim, Zachary Novack, Weihan Xu, Julian McAuley, and Hao-Wen Dong
  International Society for Music Information Retrieval Conference (ISMIR), 2025
  paper · demo · poster · code · reviews
- TeaserGen: Generating Teasers for Long Documentaries
  Weihan Xu, Paul Pu Liang, Haven Kim, Julian McAuley, Taylor Berg-Kirkpatrick, and Hao-Wen Dong
  International Conference on Learning Representations (ICLR), 2025
  paper · demo · poster · code · reviews
- CLIPSonic: Text-to-Audio Synthesis with Unlabeled Videos and Pretrained Language-Vision Models
  Hao-Wen Dong, Xiaoyu Liu, Jordi Pons, Gautam Bhattacharya, Santiago Pascual, Joan Serrà, Taylor Berg-Kirkpatrick, and Julian McAuley
  IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), 2023
  paper · demo · video · slides · reviews
- Multitrack Music Transformer
  Hao-Wen Dong, Ke Chen, Shlomo Dubnov, Julian McAuley, and Taylor Berg-Kirkpatrick
  IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2023
  paper · demo · video · slides · code · reviews
- CLIPSep: Learning Text-queried Sound Separation with Noisy Unlabeled Videos
  Hao-Wen Dong, Naoya Takahashi, Yuki Mitsufuji, Julian McAuley, and Taylor Berg-Kirkpatrick
  International Conference on Learning Representations (ICLR), 2023
  paper · demo · video · slides · poster · code · reviews
- Deep Performer: Score-to-Audio Music Performance Synthesis
  Hao-Wen Dong, Cong Zhou, Taylor Berg-Kirkpatrick, and Julian McAuley
  IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2022
  paper · demo · video · slides · poster · reviews
- Towards Automatic Instrumentation by Learning to Separate Parts in Symbolic Multitrack Music
  Hao-Wen Dong, Chris Donahue, Taylor Berg-Kirkpatrick, and Julian McAuley
  International Society for Music Information Retrieval Conference (ISMIR), 2021
  paper · demo · video · slides · code · reviews
- MusPy: A Toolkit for Symbolic Music Generation
  Hao-Wen Dong, Ke Chen, Julian McAuley, and Taylor Berg-Kirkpatrick
  International Society for Music Information Retrieval Conference (ISMIR), 2020
  paper · video · slides · poster · code · documentation · reviews
- Convolutional Generative Adversarial Networks with Binary Neurons for Polyphonic Music Generation
  Hao-Wen Dong and Yi-Hsuan Yang
  International Society for Music Information Retrieval Conference (ISMIR), 2018
  paper · demo · video · slides · poster · code · reviews
- MuseGAN: Multi-track Sequential Generative Adversarial Networks for Symbolic Music Generation and Accompaniment
  Hao-Wen Dong,* Wen-Yi Hsiao,* Li-Chia Yang, and Yi-Hsuan Yang (*equal contribution)
  AAAI Conference on Artificial Intelligence (AAAI), 2018
  paper · demo · slides · code
Education
- University of California San Diego
  Ph.D. in Computer Science
  Sep 2019 – Jun 2024
- University of California San Diego
  M.S. in Computer Science
  Sep 2019 – Jun 2021
- National Taiwan Normal University
  Digital Video and Audio Arts Program
  Sep 2013 – Jun 2017
- National Taiwan University
  B.S. in Electrical Engineering
  Sep 2013 – Jun 2017
Professional Experience
- NVIDIA
  Research Intern, Deep Imagination Research Group, NVIDIA Research
  Advisors: Siddharth Gururani and Ming-Yu Liu
  Topic: Controllable audio generation
  Sep 2023 – Dec 2023
- Adobe
  Research Scientist/Engineer Intern, Audio Research Group, Adobe Research
  Advisors: Justin Salamon and Oriol Nieto
  Topic: Text-to-audio retrieval
  May 2023 – Sep 2023
- Dolby
  Speech/Audio Deep Learning Intern, Applied AI Team, Advanced Technology
  Advisor: Xiaoyu Liu
  Topic: Text-to-audio synthesis
  Jan 2023 – Apr 2023
- Amazon
  Applied Scientist Intern, Natural Understanding Team, Alexa AI
  Advisors: Wenbo Zhao and Gunnar Sigurdsson
  Topic: Text-to-audio synthesis
  Sep 2022 – Jan 2023
- Sony
  Student Intern, Tokyo Laboratory 30, R&D Center
  Advisor: Naoya Takahashi
  Topic: Universal sound separation
  May 2022 – Sep 2022
- Dolby
  Deep Learning Audio Intern, Applied AI Team, Advanced Technology Group
  Advisor: Cong Zhou
  Topic: Music performance synthesis
  Jun 2021 – Sep 2021
- Yamaha
  Research Intern, AI Group, Research and Development Division
  Advisor: Keijiro Saino
  Topic: Deep learning based synthesizer
  May 2019 – Aug 2019
- Academia Sinica
  Research Assistant, Music and AI Lab, Research Center for IT Innovation
  Advisor: Yi-Hsuan Yang
  Topics: Music generation and deep generative models
  Jul 2017 – Apr 2019
 
For more information, please see my CV.