Publications:

BlobGEN-3D: Compositional 3D-Consistent Freeview Image Generation with 3D Blobs.
Chao Liu, Weili Nie, Sifei Liu, Abhishek Badki, Hang Su, Morteza Mardani, Benjamin Eckart, Arash Vahdat
ACM SIGGRAPH Asia, 2024.
[arXiv]
DiffUHaul: A Training-Free Method for Object Dragging in Images.
Omri Avrahami, Rinon Gal, Gal Chechik, Ohad Fried, Dani Lischinski, Arash Vahdat, Weili Nie
ACM SIGGRAPH Asia, 2024.
[arXiv] [project page]
Compositional Text-to-Image Generation with Dense Blob Representations.
Weili Nie, Sifei Liu, Morteza Mardani, Chao Liu, Benjamin Eckart, Arash Vahdat
International Conference on Machine Learning (ICML), 2024.
[arXiv] [project page]
DisCo-Diff: Enhancing Continuous Diffusion Models with Discrete Latents.
Yilun Xu, Gabriele Corso, Tommi Jaakkola, Arash Vahdat, Karsten Kreis
International Conference on Machine Learning (ICML), 2024.
[arXiv]
A Variational Perspective on Solving Inverse Problems with Diffusion Models.
Morteza Mardani, Jiaming Song, Jan Kautz, Arash Vahdat
International Conference on Learning Representations (ICLR), 2024.
[arXiv] [code]
State-Specific Protein-Ligand Complex Structure Prediction with a Multi-Scale Deep Generative Model.
Zhuoran Qiao, Weili Nie, Arash Vahdat, Thomas F. Miller III, Anima Anandkumar
Nature Machine Intelligence, 2024.
[arXiv]
Differentially Private Diffusion Models.
Tim Dockhorn, Tianshi Cao, Arash Vahdat, Karsten Kreis.
Transactions on Machine Learning Research (TMLR), 2023.
[arXiv] [code] [project page]
Loss-Guided Diffusion Models for Plug-and-Play Controllable Generation.
Jiaming Song, Qinsheng Zhang, Hongxu Yin, Morteza Mardani, Ming-Yu Liu, Jan Kautz, Yongxin Chen, Arash Vahdat.
International Conference on Machine Learning (ICML), 2023.
[arXiv] [bibtex]
Fast Sampling of Diffusion Models via Operator Learning.
Hongkai Zheng, Weili Nie, Arash Vahdat, Kamyar Azizzadenesheli, Anima Anandkumar
International Conference on Machine Learning (ICML), 2023.
[arXiv] [bibtex]
I2SB: Image-to-Image Schrödinger Bridge.
Guan-Horng Liu, Arash Vahdat, De-An Huang, Evangelos Theodorou, Weili Nie, and Anima Anandkumar
International Conference on Machine Learning (ICML), 2023.
[arXiv] [code] [project page] [bibtex]
Dr-Fairness: Dynamic Data Ratio Adjustment for Fair Training on Real and Generated Data.
Yuji Roh, Weili Nie, De-An Huang, Steven Euijong Whang, Arash Vahdat, and Anima Anandkumar.
Transactions on Machine Learning Research (TMLR), 2023.
[OpenReview] [code] [bibtex]
Open-Vocabulary Panoptic Segmentation with Text-to-Image Diffusion Models.
Jiarui Xu, Sifei Liu, Arash Vahdat, Wonmin Byeon, Xiaolong Wang, and Shalini De Mello.
Computer Vision and Pattern Recognition (CVPR), 2023, Highlight (top 2.5%).
[arXiv] [code] [project page] [bibtex]
Recurrence without Recurrence: Stable Video Landmark Detection with Deep Equilibrium Models.
Paul Micaelli, Pavlo Molchanov, Arash Vahdat, Hongxu Yin, and Jan Kautz.
Computer Vision and Pattern Recognition (CVPR), 2023.
[arXiv] [bibtex]
Pseudoinverse-Guided Diffusion Models for Inverse Problems.
Jiaming Song, Arash Vahdat, Morteza Mardani, and Jan Kautz.
International Conference on Learning Representations (ICLR), 2023.
[OpenReview] [bibtex]
GenSLMs: Genome-Scale Language Models Reveal SARS-CoV-2 Evolutionary Dynamics.
Maxim Zvyagin, Alexander Brace, Kyle Hippe, Yuntian Deng, et al.
The International Journal of High Performance Computing Applications, 2023, ACM Gordon Bell Prize in HPC-Based COVID-19 Research.
[bioRxiv] [press release]
GENIE: Higher-Order Denoising Diffusion Solvers.
Tim Dockhorn, Arash Vahdat, and Karsten Kreis.
Neural Information Processing Systems (NeurIPS), 2022.
[arXiv] [bibtex] [code] [project page]
LION: Latent Point Diffusion Models for 3D Shape Generation.
Xiaohui Zeng, Arash Vahdat, Francis Williams, Zan Gojcic, Or Litany, Sanja Fidler, and Karsten Kreis.
Neural Information Processing Systems (NeurIPS), 2022.
[arXiv] [bibtex] [code] [project page]
LANA: Latency Aware Network Acceleration.
Pavlo Molchanov, Jimmy Hall, Hongxu Yin, Jan Kautz, Nicolo Fusi, and Arash Vahdat.
European Conference on Computer Vision (ECCV), 2022.
[arXiv] [bibtex]
Diffusion Models for Adversarial Purification.
Weili Nie, Brandon Guo, Yujia Huang, Chaowei Xiao, Arash Vahdat, Anima Anandkumar.
International Conference on Machine Learning (ICML), 2022.
[arXiv] [bibtex] [code] [project page]
A-ViT: Adaptive Tokens for Efficient Vision Transformer.
Hongxu Yin, Arash Vahdat, Jose Alvarez, Arun Mallya, Jan Kautz, and Pavlo Molchanov.
Computer Vision and Pattern Recognition (CVPR), 2022, Oral (top 4.2%).
[arXiv] [bibtex] [project page]
Tackling the Generative Learning Trilemma with Denoising Diffusion GANs.
Zhisheng Xiao, Karsten Kreis, and Arash Vahdat.
International Conference on Learning Representations (ICLR), 2022, Spotlight (top 2.6%).
[arXiv] [bibtex] [code] [project page]
Score-Based Generative Modeling with Critically-Damped Langevin Diffusion.
Tim Dockhorn, Arash Vahdat, and Karsten Kreis.
International Conference on Learning Representations (ICLR), 2022, Spotlight (top 0.4%).
[arXiv] [bibtex] [code] [project page]
Score-based Generative Modeling in Latent Space.
Arash Vahdat, Karsten Kreis, and Jan Kautz.
Neural Information Processing Systems (NeurIPS), 2021.
[arXiv] [bibtex] [code] [project page]
Controllable and Compositional Generation with Latent-Space Energy-Based Models.
Weili Nie, Arash Vahdat, and Anima Anandkumar.
Neural Information Processing Systems (NeurIPS), 2021.
[arXiv] [bibtex] [code] [project page]
A Contrastive Learning Approach for Training Variational Autoencoder Priors.
Jyoti Aneja, Alexander Schwing, Jan Kautz, and Arash Vahdat.
Neural Information Processing Systems (NeurIPS), 2021.
[arXiv] [bibtex]
Don’t Generate Me: Training Differentially Private Generative Models with Sinkhorn Divergence.
Tianshi Cao, Alex Bie, Arash Vahdat, Sanja Fidler, and Karsten Kreis.
Neural Information Processing Systems (NeurIPS), 2021.
[arXiv] [bibtex]
See through Gradients: Image Batch Recovery via GradInversion.
Hongxu Yin, Arun Mallya, Arash Vahdat, Jose M. Alvarez, Jan Kautz, and Pavlo Molchanov.
Computer Vision and Pattern Recognition (CVPR), 2021.
[arXiv] [bibtex]
VAEBM: A Symbiosis between Variational Autoencoders and Energy-based Models.
Zhisheng Xiao, Karsten Kreis, Jan Kautz, and Arash Vahdat.
International Conference on Learning Representations (ICLR), 2021, Spotlight (top 5.5%).
[arXiv] [bibtex] [code] [talk (9 min)]
NVAE: A Deep Hierarchical Variational Autoencoder.
Arash Vahdat and Jan Kautz.
Neural Information Processing Systems (NeurIPS), 2020, Spotlight (top 4.1%).
[arXiv] [bibtex] [code] [talk (8 min)]
On the Distance between Two Neural Networks and the Stability of Learning.
Jeremy Bernstein, Arash Vahdat, Yisong Yue, and Ming-Yu Liu.
Neural Information Processing Systems (NeurIPS), 2020.
[arXiv] [bibtex] [code] [blog] [talk (3 min)]
Contrastive Learning for Weakly Supervised Phrase Grounding.
Tanmay Gupta, Arash Vahdat, Gal Chechik, Xiaodong Yang, Jan Kautz, and Derek Hoiem.
European Conference on Computer Vision (ECCV), 2020, Spotlight (top 5%).
[arXiv] [bibtex] [code] [project page] [talk (1 min)] [talk (8 min)]
Undirected Graphical Models as Approximate Posteriors.
Arash Vahdat, Evgeny Andriyash, and William G. Macready.
International Conference on Machine Learning (ICML), 2020.
[arXiv] [bibtex] [code] [talk (15 min)]
UNAS: Differentiable Architecture Search Meets Reinforcement Learning.
Arash Vahdat, Arun Mallya, Ming-Yu Liu, and Jan Kautz.
Computer Vision and Pattern Recognition (CVPR), 2020, Oral (top 5.7%).
[arXiv] [bibtex] [code] [talk (5 min)]
Semi-Supervised Semantic Image Segmentation with Self-correcting Networks.
Mostafa S. Ibrahim, Arash Vahdat, Mani Ranjbar, and William G. Macready.
Computer Vision and Pattern Recognition (CVPR), 2020.
[arXiv] [bibtex] [talk (1 min)] [talk (15 min)]
A Robust Learning Approach to Domain Adaptive Object Detection.
Mehran Khodabandeh, Arash Vahdat, Mani Ranjbar, and William G. Macready.
International Conference on Computer Vision (ICCV), 2019.
[arXiv] [bibtex]
CATARACTS: Challenge on Automatic Tool Annotation for cataRACT Surgery.
Hassan Al Hajj, Mathieu Lamard, Pierre-Henri Conze et al.
Medical Image Analysis (MedIA), 2018.
[pdf] [bibtex]
Improved Gradient-Based Optimization over Discrete Distributions.
Evgeny Andriyash, Arash Vahdat, and William G. Macready.
Technical Report, 2018.
[arXiv] [bibtex]
DVAE#: Discrete Variational Autoencoders with Relaxed Boltzmann Priors.
Arash Vahdat, Evgeny Andriyash, and William G. Macready.
Neural Information Processing Systems (NIPS), 2018.
[arXiv] [bibtex] [poster] [code]
DVAE++: Discrete Variational Autoencoders with Overlapping Transformations.
Arash Vahdat, William Macready, Zhengbing Bian, Amir Khoshaman, and Evgeny Andriyash.
International Conference on Machine Learning (ICML), 2018.
[arXiv] [bibtex]
Toward Robustness against Label Noise in Training Deep Discriminative Neural Networks.
Arash Vahdat.
Neural Information Processing Systems (NIPS), 2017.
[arXiv] [bibtex]
Structure Inference Machines: Recurrent Neural Networks for Analyzing Relations in Group Activity Recognition.
Zhiwei Deng, Arash Vahdat, Hexiang Hu, and Greg Mori.
Computer Vision and Pattern Recognition (CVPR), 2016.
[pdf] [bibtex]
A Hierarchical Deep Temporal Model for Group Activity Recognition.
Mostafa S. Ibrahim, Srikanth Muralidharan, Zhiwei Deng, Arash Vahdat, and Greg Mori.
Computer Vision and Pattern Recognition (CVPR), 2016.
[pdf] [bibtex]
Unsupervised Learning of Supervoxel Embeddings for Video Segmentation.
Mehran Khodabandeh, Srikanth Muralidharan, Arash Vahdat, Nazanin Mehrasa, Eduardo M. Pereira, Shin’ichi Satoh, and Greg Mori.
IAPR International Conference on Pattern Recognition (ICPR), 2016.
[pdf] [bibtex]
Visual Recognition by Counting Instances: A Multi-Instance Cardinality Potential Kernel.
Hossein Hajimirsadeghi, Wang Yan, Arash Vahdat, and Greg Mori.
Computer Vision and Pattern Recognition (CVPR), 2015.
[pdf] [bibtex]
Discriminative Key-Component Models for Interaction Detection and Recognition.
Yasaman S. Sefidgar, Arash Vahdat, Stephen Se, and Greg Mori.
Computer Vision and Image Understanding (CVIU), 2015.
[pdf] [bibtex]
Discovering Video Clusters from Visual Features and Noisy Tags.
Arash Vahdat, Guang-Tong Zhou, and Greg Mori.
European Conference on Computer Vision (ECCV), 2014.
[pdf] [supplementary material] [bibtex] [Note on the annotation]
Handling Uncertain Tags in Visual Recognition.
Arash Vahdat and Greg Mori.
International Conference on Computer Vision (ICCV), 2013.
[pdf] [bibtex] [annotation] [features]
Compositional Models for Video Event Detection: A Multiple Kernel Learning Latent Variable Approach.
Arash Vahdat, Kevin Cannons, Greg Mori, Sangmin Oh and Ilseo Kim.
International Conference on Computer Vision (ICCV), 2013.
[pdf] [bibtex]
Latent Maximum Margin Clustering.
Guang-Tong Zhou, Tian Lan, Arash Vahdat and Greg Mori.
Neural Information Processing Systems (NIPS), 2013.
[pdf] [bibtex]
Segmental Multi-way Local Pooling for Video Recognition.
Ilseo Kim, Sangmin Oh, Arash Vahdat, Kevin Cannons, Amitha Perera, and Greg Mori.
ACM Multimedia Conference (ACM MM), 2013.
[pdf] [bibtex]
Multimedia Event Detection with Multimodal Feature Fusion and Temporal Concept Localization.
Sangmin Oh, Scott McCloskey, Ilseo Kim, Arash Vahdat, Kevin Cannons, Hossein Hajimirsadeghi, Greg Mori, Amitha Perera, Megha Pandey, and Jason Corso.
Machine Vision and Applications (MVA), Special Issue on Multimedia Event Detection, 2013.
[link]
TRECVID 2012 GENIE: Multimedia Event Detection and Recounting.
Amitha Perera et al.
TRECVID Workshop, 2012.
[pdf]
Kernel Latent SVM for Visual Recognition.
Weilong Yang, Yang Wang, Arash Vahdat, and Greg Mori.
Neural Information Processing Systems (NIPS), 2012.
[pdf] [bibtex]
Similarity Constrained Latent Support Vector Machine: An Application to Weakly Supervised Action Classification.
Nataliya Shapovalova, Arash Vahdat, Kevin Cannons, Tian Lan, Greg Mori.
European Conference on Computer Vision (ECCV), 2012.
[pdf] [bibtex]
Complex Loss Optimization via Dual Decomposition.
Mani Ranjbar, Arash Vahdat, and Greg Mori.
Computer Vision and Pattern Recognition (CVPR), 2012.
[pdf] [bibtex]
GENIE TRECVID 2011 Multimedia Event Detection: Late-Fusion Approaches to Combine Multiple Audio-Visual Features.
Amitha Perera et al.
TRECVID Workshop, 2011.
[pdf]
A Discriminative Key Pose Sequence Model for Recognizing Human Interactions.
Arash Vahdat, Bo Gao, Mani Ranjbar, and Greg Mori.
Eleventh IEEE International Workshop on Visual Surveillance, 2011.
[pdf] [bibtex]
Colour From Grey by Optimized Colour Ordering.
Arash Vahdat and Mark Drew.
Color & Image Conference (CIC18), San Antonio, Nov. 2010.
[pdf] [pptx] [Talk]
Generalized Sparse Classifiers for Decoding Cognitive States in fMRI.
Bernard Ng, Arash Vahdat, Rafeef Abugharbieh, and Ghassan Hamarneh.
Medical Image Computing and Computer-Assisted Intervention Workshop on Machine Learning in Medical Imaging (MICCAI MLMI), 2010.
[pdf] [ppt]