research

A thematic breakdown of my research along with highlighted papers.

Foundational Image Gen

Architecting efficient, high-resolution diffusion models for edge and large-scale deployment.

2026

  1. arXiv
    SnapGen++: Unleashing Diffusion Transformers for Efficient High-Fidelity Image Generation on Edge Devices
    D. Hu, A. Gupta, M. Gabidolla, A. Sahni, H. Coskun, Y. Li, Y. Idelbayev, A. Mahmood, A. Lebedev, D. Lahiri, A. Goyal, J. Hu, M. Gong, S. Tulyakov, and Anil Kag
    arXiv preprint, 2026
  2. ICLR
    SPRINT: Sparse-Dense Residual Fusion for Efficient Diffusion Transformers
    D. Park, M. Haji-Ali, Y. Li, W. Menapace, S. Tulyakov, H. Kim, A. Siarohin, and Anil Kag
    In International Conference on Learning Representations, 2026
  3. CVPR
    Omni-Attribute: Open-vocabulary Attribute Encoder for Visual Concept Personalization
    T. Chen, A. Siarohin, G. G. Qian, K. C. J. Wang, E. Nemchinov, M. Haji-Ali, R. A. Guler, W. Menapace, I. Skorokhodov, Anil Kag, J. Zhu, and S. Tulyakov
    In IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2026

2025

  1. CVPR
    SnapGen: Taming High-Resolution Text-to-Image Models for Mobile Devices with Efficient Architectures
    J. Chen, D. Hu, X. Huang, H. Coskun, A. Sahni, A. Gupta, A. Goyal, D. Lahiri, R. Singh, Y. Idelbayev, J. Cao, Y. Li, K. Cheng, S. Chan, M. Gong, S. Tulyakov, Y. Xu, J. Ren, and Anil Kag
    In IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2025

2024

  1. NeurIPS
    BitsFusion: 1.99 bits Weight Quantization of Diffusion Model
    Y. Sui, Y. Li, Anil Kag, Y. Idelbayev, J. Cao, J. Hu, D. Sagar, B. Yuan, S. Tulyakov, and J. Ren
    In Neural Information Processing Systems, 2024
  2. NeurIPS
    AsCAN: Asymmetric Convolution-Attention Networks for Efficient Recognition and Generation
    Anil Kag, H. Coskun, J. Chen, J. Cao, W. Menapace, A. Siarohin, S. Tulyakov, and J. Ren
    In Neural Information Processing Systems, 2024

Foundational Video Gen

Pioneering high-fidelity, spatiotemporal transformers for mobile and server-side synthesis.

2026

  1. CVPR
    S2DiT: Sandwich Diffusion Transformer for Mobile Streaming Video Generation
    L. Zhao, Y. Wu, A. Lebedev, D. Lahiri, M. Dong, A. Sahni, M. Vasilkovsky, H. Chen, J. Hu, A. Siarohin, S. Tulyakov, Y. Wang, Anil Kag, and Y. Li
    In IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2026
  2. CVPR
    ELITE: One Model, Many Budgets: Elastic Latent Interfaces for Diffusion Transformers
    M. Haji-Ali, W. Menapace, I. Skorokhodov, D. Park, Anil Kag, M. Vasilkovsky, S. Tulyakov, V. Ordonez, and A. Siarohin
    In IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2026

2025

  1. CVPR
    SnapGen-V: Generating a Five-Second Video within Five Seconds on a Mobile Device
    Y. Wu, Z. Zhang, Y. Li, Y. Xu, Anil Kag, Y. Sui, H. Coskun, K. Ma, A. Lebedev, J. Hu, D. Metaxas, Y. Wang, S. Tulyakov, and J. Ren
    In IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2025
  2. NeurIPS
    PointVid: Towards Physical Understanding in Video Generation: A 3D Point Regularization Approach
    Y. Chen, J. Cao, V. Goel, S. Korolev, C. Jiang, J. Ren, S. Tulyakov, and Anil Kag
    In Neural Information Processing Systems, 2025
  3. arXiv
    H3AE: High Compression, High Speed, and High Quality AutoEncoder for Video Diffusion Models
    Y. Wu, Y. Li, I. Skorokhodov, Anil Kag, W. Menapace, S. Girish, A. Siarohin, Y. Wang, and S. Tulyakov
    arXiv preprint arXiv:2504.10567, 2025
  4. arXiv
    SnapGen-V2: Taming Diffusion Transformer for Real-Time Mobile Video Generation
    Y. Wu, Y. Li, Anil Kag, I. Skorokhodov, W. Menapace, K. Ma, A. Sahni, J. Hu, A. Siarohin, D. Sagar, Y. Wang, and S. Tulyakov
    arXiv preprint arXiv:2507.13343, 2025

2024

  1. CVPR
    Snap Video: Scaled Spatiotemporal Transformers for Text-to-Video Synthesis
    W. Menapace, A. Siarohin, I. Skorokhodov, E. Deyneka, T. S. Chen, Anil Kag, Y. Fang, A. Stoliar, E. Ricci, J. Ren, and S. Tulyakov
    In IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024
  2. NeurIPS
    SF-V: Single Forward Video Generation Model
    Z. Zhang, Y. Li, Y. Wu, Y. Xu, Anil Kag, I. Skorokhodov, W. Menapace, A. Siarohin, J. Cao, D. Metaxas, S. Tulyakov, and J. Ren
    In Neural Information Processing Systems, 2024

RLHF & Alignment

Advancing preference optimization and reward flow frameworks for generative model fine-tuning.

2026

  1. arXiv
    Diffusion-DRF: Free, Rich, and Differentiable Reward for Video Diffusion Fine-Tuning
    Y. Wang, Y. Li, G. G. Qian, S. Tulyakov, Y. Fu, and Anil Kag
    arXiv preprint, 2026

2025

  1. NeurIPS
    DenseDPO: Fine-Grained Temporal Preference Optimization for Video Diffusion Models
    Z. Wu, Anil Kag, I. Skorokhodov, W. Menapace, A. Mirzaei, I. Gilitschenski, S. Tulyakov, and A. Siarohin
    In Neural Information Processing Systems, 2025
  2. ICCV
    RankDPO: Scalable Ranked Preference Optimization for Text-to-Image Generation
    S. Karthik, H. Coskun, Z. Akata, S. Tulyakov, J. Ren, and Anil Kag
    In International Conference on Computer Vision, 2025

2024

  1. CVPR
    TextCraftor: Your Text Encoder Can be Image Quality Controller
    Y. Li, X. Liu, Anil Kag, J. Hu, Y. Idelbayev, D. Sagar, Y. Wang, S. Tulyakov, and J. Ren
    In IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

Adaptive & Hardness-Aware

Developing intelligent systems that adapt model complexity based on input difficulty.

2023

  1. ICLR
    Scaffolding a Student to Instill Knowledge
    Anil Kag, Durmus Alp Emre Acar, Aditya Gangrade, and Venkatesh Saligrama
    In International Conference on Learning Representations, 2023
  2. ICLR
    Efficient Edge Inference by Selective Query
    Anil Kag, Igor Fedorov, Aditya Gangrade, Paul Whatmough, and Venkatesh Saligrama
    In International Conference on Learning Representations, 2023

2022

  1. ICML Workshop
    Achieving High TinyML Accuracy through Selective Cloud Interaction
    Anil Kag, Igor Fedorov, Aditya Gangrade, Paul Whatmough, and Venkatesh Saligrama
    In Workshop on Dynamic Neural Networks at International Conference on Machine Learning, 2022

2021

  1. NeurIPS
    Online Selective Classification with Limited Feedback
    Aditya Gangrade, Anil Kag, Ashok Cutkosky, and Venkatesh Saligrama
    In Neural Information Processing Systems, 2021
  2. AISTATS
    Selective Classification via One-Sided Prediction
    Aditya Gangrade, Anil Kag, and Venkatesh Saligrama
    In Artificial Intelligence and Statistics, 2021

Efficient RNN/CNN

Fundamental breakthroughs in low-complexity architectural design and optimization.

2022

  1. CVPR
    Condensing CNNs with Partial Differential Equations
    Anil Kag and Venkatesh Saligrama
    In IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022

2021

  1. ICML
    Training Recurrent Neural Networks via Forward Propagation Through Time
    Anil Kag and Venkatesh Saligrama
    In International Conference on Machine Learning, 2021
  2. CVPR
    Time Adaptive Recurrent Neural Network
    Anil Kag and Venkatesh Saligrama
    In IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2021

2020

  1. ICLR
    RNNs Incrementally Evolving on an Equilibrium Manifold: A Panacea for Vanishing and Exploding Gradients?
    Anil Kag, Ziming Zhang, and Venkatesh Saligrama
    In International Conference on Learning Representations, 2020

2018

  1. NIPS Workshop
    Learning Compact Networks via Adaptive Network Regularization
    Sivaramakrishnan Sankarapandian, Anil Kag, Rachel Manzelli, and Brian Kulis
    In NIPS Workshop on Compact Deep Neural Networks with Industrial Applications, 2018

Extreme Classification

Architecting industrial-scale recommendation systems for millions of labels.

2019

  1. NSDI
    BLAS-on-Flash: An Efficient Alternative for Large Scale ML Training and Inference?
    Suhas Jayaram Subramanya, Harsha Vardhan Simhadri, Srajan Garg, Anil Kag, and Venkatesh Balasubramanian
    In Proceedings of the 16th USENIX Conference on Networked Systems Design and Implementation, 2019

2018

  1. WWW
    Parabel: Partitioned Label Trees for Extreme Classification with Application to Dynamic Search Advertising
    Yashoteja Prabhu, Anil Kag, Shrutendra Harsola, Rahul Agrawal, and Manik Varma
    In Proceedings of the 2018 World Wide Web Conference, 2018
  2. WSDM
    Extreme Multi-label Learning with Label Features for Warm-start Tagging, Ranking & Recommendation
    Yashoteja Prabhu, Anil Kag, Shilpa Gopinath, Kunal Dahiya, Shrutendra Harsola, Rahul Agrawal, and Manik Varma
    In Proceedings of the 11th ACM International Conference on Web Search and Data Mining, 2018