Arash Vahdat

I am a principal research scientist at NVIDIA Research, specializing in machine learning and computer vision. Prior to NVIDIA, I was a senior research scientist at D-Wave Systems Inc., where I worked on deep generative learning with discrete latent variables and weakly supervised learning from noisy labels. Before joining D-Wave, I was a research faculty member at Simon Fraser University (SFU), where I led research on deep video analysis and taught graduate-level courses on big data analysis. I obtained my Ph.D. and MSc from SFU under Greg Mori’s supervision, working on latent variable frameworks for visual analysis. I grew up in Iran, where I completed my undergraduate degree in computer engineering at Sharif University of Technology.

At NVIDIA, I work on a broad range of problems including deep generative learning, neural architecture search, and representation learning. I built a state-of-the-art variational autoencoder (VAE) called NVAE that, for the first time, enabled VAEs to generate large high-quality images. I led the extension of these models to hybrid energy-based and variational autoencoders in NCP-VAE and VAEBM. I also co-introduced a framework called LSGM that allows training score-based generative models (a.k.a. denoising diffusion models) in latent space. We followed this with further work on diffusion models, including denoising diffusion GANs and score-based generative modeling with critically-damped Langevin diffusion. I have also proposed UNAS, a framework for neural architecture search (NAS) that unifies differentiable and reinforcement learning-based NAS. This work initiated our research collaboration with Microsoft Research on NAS, which produced LANA, a framework for network acceleration with operation optimization. Finally, at NVIDIA, we introduced a contrastive learning framework for weakly supervised phrase grounding.

At D-Wave, I was the sole developer of a robust image labeling framework that was presented at the main NeurIPS conference in 2017. A variant of this framework later won the CATARACTS surgical video analysis challenge in collaboration with Siemens Healthineers. This framework later became one of the founding services for D-Wave’s machine learning business unit. I also led the extension of this framework to object detection and semantic object segmentation.

At D-Wave, I also developed new frameworks for training deep generative models with discrete latent variables. These frameworks, known as discrete variational autoencoders (DVAEs), have pushed the state of the art in this area (DVAE++, DVAE#, DVAE##) and have expanded our understanding of the existing frameworks (arXiv’18).

Google Scholar

Research Interests

  • Deep generative learning: diffusion models, variational autoencoders, energy-based models, hybrid models
  • Applications: image/graph/text/3D synthesis, controllable generation, weakly supervised learning
  • Efficient neural networks: neural architecture search, network acceleration techniques
  • Representation learning

Students and Interns

Throughout my career, I have been fortunate to mentor bright students and interns, including:

* I served as a co-mentor.

Recent Invited Talks

  • Tackling the Generative Learning Trilemma with Accelerated Diffusion Models, hosted by Ying Nian Wu at the Center for Vision, Cognition, Learning, and Autonomy, University of California, Los Angeles, Feb 2022.
  • Tackling the Generative Learning Trilemma with Accelerated Diffusion Models, Computer Vision Group at the University of Bern, Feb 2022.
  • Tackling the Generative Learning Trilemma with Accelerated Diffusion Models, hosted by Rosanne Liu at ML Collective, Feb 2022.
  • New Frontiers in Deep Generative Learning, Open Data Science Conference, Nov 2021.
  • Hybrid Hierarchical Generative Models for Image Synthesis, hosted by Mohammad Norouzi at Google Brain Toronto, Dec 2020.
  • Deep Hierarchical Variational Autoencoder for Image Synthesis, hosted by Amir Khash Ahmadi at Autodesk AI Lab, Nov 2020.
  • Deep Hierarchical Variational Autoencoder for Image Synthesis, hosted by Juan Felipe Carrasquilla at the Vector Institute, Oct 2020.
  • NVAE: A Deep Hierarchical Variational Autoencoder, hosted by Danilo Rezende at DeepMind, Sept 2020.
  • On Continuous Relaxation of Discrete Latent Variables, hosted by Stefano Ermon, Stanford University, Nov 2019.


  • Area Chair:
    • NeurIPS (2021, 2022)
    • ICLR (2021, 2022)
  • Reviewer:
    • NeurIPS (2017, 2019, 2020)
    • ICML (2018, 2020)
    • CVPR (2015, 2018, 2019, 2021, 2022)
    • ICCV (2015)
    • ECCV (2014)
    • PAMI (2011, 2013, 2015)
    • SIGGRAPH (2022)
    • Pattern Recognition (2015)