Arash Vahdat

I am a senior research scientist at NVIDIA Research specializing in machine learning and computer vision. Prior to NVIDIA, I was a senior research scientist at D-Wave Systems Inc., where I worked on deep generative learning with discrete latent variables and weakly supervised learning from noisy labels. Before joining D-Wave, I was a research faculty member at Simon Fraser University (SFU), where I led research on deep video analysis and taught graduate-level courses on big data analysis. I obtained my Ph.D. and M.Sc. from SFU under Greg Mori’s supervision, working on latent variable frameworks for visual analysis. I grew up in Iran, where I completed my undergraduate degree in computer engineering at Sharif University of Technology.

At NVIDIA, I am working on a broad range of problems including deep generative learning, neural architecture search and representation learning. I have built a state-of-the-art variational autoencoder (VAE) called NVAE that, for the first time, enables VAEs to generate large high-quality images. I led the extension of these models to hybrid energy-based and variational autoencoders in NCP-VAE and VAEBM. I also (co-)introduced a framework called LSGM that allows training score-based generative models (a.k.a. denoising diffusion models) in latent space. We followed this up with further work on diffusion models, including denoising diffusion GANs and score-based generative modeling with critically-damped Langevin diffusion. I have also proposed UNAS, a framework for neural architecture search (NAS) that unifies differentiable and reinforcement learning-based NAS. This work initiated our research collaboration with Microsoft Research on NAS, which produced LANA, a framework for network acceleration with operation optimization. Finally, at NVIDIA, we introduced a contrastive learning framework for weakly-supervised phrase grounding.

At D-Wave, I solely developed a robust image labeling framework that was presented at the main NeurIPS conference in 2017. A variant of this framework later won the CATARACTS surgical video analysis challenge in collaboration with Siemens Healthineers. This framework later became one of the founding services for D-Wave’s machine learning business unit. I also led the extension of this framework to object detection and semantic object segmentation.

At D-Wave, I also developed new frameworks for training deep generative models with discrete latent variables. These frameworks, known as discrete variational autoencoders (DVAEs), have pushed the state-of-the-art in this area (DVAE++, DVAE#, DVAE##), and have expanded our understanding of the existing frameworks (arXiv’18).

Google Scholar

Research Interests

  • Probabilistic deep learning, generative learning, variational inference, energy-based models
  • Representation learning
  • Neural architecture search
  • Semi-supervised learning, structured latent variable models
  • Weakly-supervised learning with noisy labels

Students and Interns

Throughout my career, I have been fortunate to mentor bright students and interns, including:

* I served as a co-mentor.

Recent Invited Talks

  • New Frontiers in Deep Generative Learning, Open Data Science Conference, Nov 2021.
  • Hybrid Hierarchical Generative Models for Image Synthesis, hosted by Mohammad Norouzi at Google Brain Toronto, Dec 2020.
  • Deep Hierarchical Variational Autoencoder for Image Synthesis, hosted by Amir Khash Ahmadi at Autodesk AI Lab, Nov 2020.
  • Deep Hierarchical Variational Autoencoder for Image Synthesis, hosted by Juan Felipe Carrasquilla at the Vector Institute, Oct 2020.
  • NVAE: A Deep Hierarchical Variational Autoencoder, hosted by Danilo Rezende at DeepMind, Sept 2020.
  • On Continuous Relaxation of Discrete Latent Variables, hosted by Stefano Ermon, Stanford University, Nov 2019.


  • Area Chair:
    • NeurIPS (2021)
    • ICLR (2021, 2022)
  • Reviewer:
    • NeurIPS (2017, 2019, 2020)
    • ICML (2018, 2020)
    • CVPR (2015, 2018, 2019, 2021, 2022)
    • ICCV (2015)
    • ECCV (2014)
    • PAMI (2011, 2013, 2015)
    • Pattern Recognition (2015)