Dong Chen

陈栋

Principal Research Manager

Visual Computing Group, Microsoft Research Asia

About Me

I joined the Visual Computing Group at Microsoft Research Asia in July 2015. Prior to that, I received my B.S. and Ph.D. degrees from the University of Science and Technology of China in 2010 and 2015, respectively. I had internships at Microsoft Research Asia from 2010 to 2015 and was honored with the Microsoft Research Asia Fellowship Award in 2013.

My research focuses on computer vision and machine learning, with an emphasis on visual understanding and generation.

Denoising Diffusion Probabilistic Models (DDPM), Generative Adversarial Networks (GAN), Self-/Semi-/Unsupervised Learning, Face Avatar, Computer Vision, Deep Learning

Recent News

  • 2025: 4 papers accepted by CVPR'25 (Structured 3D Latents, DesignDiffusion, SmartEraser, ART)
  • 2025: 2 papers accepted by ICCV'25 (Gaussian Variation Field Diffusion, Improved Noise Schedule)
  • 2025: SinDiffusion published in IEEE TPAMI; Phi-4-Mini Technical Report released
  • 2024: 8 papers accepted by top-tier conferences (CVPR'24, ICCV'24, ECCV'24, ICML'24)
  • 2024: Co-authored Phi-3 Technical Report on highly capable language models
  • 2024: Released InstructDiffusion: A generalist modeling interface for vision tasks
  • 2023: 5 papers accepted by CVPR'23
  • 2023: Served as Area Chair for CVPR'23
  • 2022: 2 papers accepted by ECCV'22

Selected Publications

2025

Structured 3D Latents for Scalable and Versatile 3D Generation
Jianfeng Xiang, Zelong Lv, Sicheng Xu, Yu Deng, Ruicheng Wang, Bowen Zhang, Dong Chen, Xin Tong, Jiaolong Yang
Computer Vision and Pattern Recognition (CVPR), 2025
DesignDiffusion: High-Quality Text-to-Design Image Generation with Diffusion Models
Zhendong Wang, Jianmin Bao, Shuyang Gu, Dong Chen, Wengang Zhou, Houqiang Li
Computer Vision and Pattern Recognition (CVPR), 2025
SmartEraser: Remove Anything from Images using Masked-Region Guidance
Longtao Jiang, Zhendong Wang, Jianmin Bao, Wengang Zhou, Dongdong Chen, Lei Shi, Dong Chen, Houqiang Li
Computer Vision and Pattern Recognition (CVPR), 2025
ART: Anonymous Region Transformer for Variable Multi-Layer Transparent Image Generation
Yifan Pu, Yiming Zhao, Zhicong Tang, Ruihong Yin, Haoxing Ye, Yuhui Yuan, Dong Chen, Jianmin Bao, Sirui Zhang, et al.
Computer Vision and Pattern Recognition (CVPR), 2025
Gaussian Variation Field Diffusion for High-Fidelity Video-to-4D Synthesis
Bowen Zhang, Sicheng Xu, Chuxin Wang, Jiaolong Yang, Feng Zhao, Dong Chen, Baining Guo
International Conference on Computer Vision (ICCV), 2025
Improved Noise Schedule for Diffusion Training
Tiankai Hang, Shuyang Gu, Jianmin Bao, Fangyun Wei, Dong Chen, Xin Geng, Baining Guo
International Conference on Computer Vision (ICCV), 2025
SinDiffusion: Learning a Diffusion Model from a Single Natural Image
Weilun Wang, Jianmin Bao, Wengang Zhou, Dongdong Chen, Dong Chen, Lu Yuan, Houqiang Li
IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2025
HairShifter: Consistent and High-Fidelity Video Hair Transfer via Anchor-Guided Animation
Wangzheng Shi, Yinglin Zheng, Yuxin Lin, Jianmin Bao, Ming Zeng, Dong Chen
ACM International Conference on Multimedia (ACM MM), 2025
Phi-4-Mini Technical Report: Compact yet Powerful Multimodal Language Models via Mixture-of-LoRAs
Microsoft Research Team, Dong Chen, et al.
arXiv preprint, 2025
Diffusion Models without Classifier-Free Guidance
Zhicong Tang, Jianmin Bao, Dong Chen, Baining Guo
arXiv preprint, 2025
Fast Autoregressive Models for Continuous Latent Generation
Tiankai Hang, Jianmin Bao, Fangyun Wei, Dong Chen
arXiv preprint, 2025
VolumeDiffusion: Flexible Text-to-3D Generation with Efficient Volumetric Encoder
Zhicong Tang, Shuyang Gu, Chunyu Wang, Ting Zhang, Jianmin Bao, Dong Chen, Baining Guo
Graphical Models, 2025
CCA: Collaborative Competitive Agents for Image Editing
Tiankai Hang, Shuyang Gu, Dong Chen, Xin Geng, Baining Guo
Frontiers of Computer Science, 2025
High-Quality 3D Creation from A Single Image Using Subject-Specific Knowledge Prior
Nan Huang, Ting Zhang, Yuhui Yuan, Dong Chen, Shanghang Zhang
IEEE International Conference on Robotics and Automation (ICRA), 2025

2024

InstructDiffusion: A Generalist Modeling Interface for Vision Tasks
Zigang Geng, Binxin Yang, Tiankai Hang, Chen Li, Shuyang Gu, Ting Zhang, Jianmin Bao, Zheng Zhang, Han Hu, Dong Chen, Baining Guo
Computer Vision and Pattern Recognition (CVPR), 2024
Phi-3 Technical Report: A Highly Capable Language Model Locally on Your Phone
Microsoft Research Team, Dong Chen, et al.
arXiv preprint, 2024
GaussianCube: Structuring Gaussian Splatting using Optimal Transport for 3D Generative Modeling
Bowen Zhang, Yiji Cheng, Jiaolong Yang, Chunyu Wang, Feng Zhao, Yansong Tang, Dong Chen, Baining Guo
Advances in Neural Information Processing Systems (NeurIPS), 2024
Scaling the Codebook Size of VQGAN to 100,000 with a Utilization Rate of 99%
Lei Zhu, Fangyun Wei, Yanye Lu, Dong Chen
Advances in Neural Information Processing Systems (NeurIPS), 2024
RodinHD: High-Fidelity 3D Avatar Generation with Diffusion Models
Bowen Zhang, Yiji Cheng, Chunyu Wang, Ting Zhang, Jiaolong Yang, Yansong Tang, Dong Chen, Baining Guo
European Conference on Computer Vision (ECCV), 2024
IRGen: Generative Modeling for Image Retrieval
Yidan Zhang, Ting Zhang, Dong Wang, Dong Chen, Fang Wen
European Conference on Computer Vision (ECCV), 2024

2023

MaskCLIP: Masked Self-Distillation Advances Contrastive Language-Image Pretraining
Xiaoyi Dong, Jianmin Bao, Yinglin Zheng, Ting Zhang, Dongdong Chen, Hao Yang, Ming Zeng, Weiming Zhang, Lu Yuan, Dong Chen, Fang Wen, Nenghai Yu
Computer Vision and Pattern Recognition (CVPR), 2023
RODIN: A Generative Model for Sculpting 3D Digital Avatars Using Diffusion
Tengfei Wang, Bo Zhang, Ting Zhang, Shuyang Gu, Jianmin Bao, Tadas Baltrusaitis, Jingjing Shen, Dong Chen, Fang Wen, Qifeng Chen, Baining Guo
Computer Vision and Pattern Recognition (CVPR), 2023
MetaPortrait: Identity-Preserving Talking Head Generation with Fast Personalized Adaptation
Bowen Zhang, Chenyang Qi, Pan Zhang, Bo Zhang, HsiangTao Wu, Dong Chen, Qifeng Chen, Yong Wang, Fang Wen
Computer Vision and Pattern Recognition (CVPR), 2023
Paint by Example: Exemplar-based Image Editing with Diffusion Models
Binxin Yang, Shuyang Gu, Bo Zhang, Ting Zhang, Xuejin Chen, Dong Chen, Fang Wen
Computer Vision and Pattern Recognition (CVPR), 2023
CiCo: Domain-Aware Sign Language Retrieval via Cross-Lingual Contrastive Learning
Yiting Cheng, Fangyun Wei, Jianmin Bao, Dong Chen, Wenqiang Zhang
Computer Vision and Pattern Recognition (CVPR), 2023

2022

CSWin Transformer: A General Vision Transformer Backbone with Cross-Shaped Windows
Xiaoyi Dong, Jianmin Bao, Dongdong Chen, Weiming Zhang, Nenghai Yu, Lu Yuan, Dong Chen, Baining Guo
Computer Vision and Pattern Recognition (CVPR), 2022
StyleSwin: Transformer-based GAN for High-resolution Image Generation
Bowen Zhang, Shuyang Gu, Bo Zhang, Jianmin Bao, Dong Chen, Fang Wen, Yong Wang, Baining Guo
Computer Vision and Pattern Recognition (CVPR), 2022
Vector Quantized Diffusion Model for Text-to-Image Synthesis
Shuyang Gu, Dong Chen, Jianmin Bao, Fang Wen, Bo Zhang, Dongdong Chen, Lu Yuan, Baining Guo
Computer Vision and Pattern Recognition (CVPR), 2022

Selected Earlier Work

Cross-domain Correspondence Learning for Exemplar-based Image Translation
Pan Zhang, Bo Zhang, Dong Chen, Lu Yuan, Fang Wen
Computer Vision and Pattern Recognition (CVPR Oral), 2020
Bringing Old Photos Back to Life
Ziyu Wan, Bo Zhang, Dongdong Chen, Pan Zhang, Dong Chen, Jing Liao, Fang Wen
Computer Vision and Pattern Recognition (CVPR Oral), 2020
An Efficient Joint Formulation for Bayesian Face Verification
Dong Chen, Xudong Cao, David Wipf, Fang Wen, Jian Sun
IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2016
Bayesian Face Revisited: A Joint Formulation
Dong Chen, Xudong Cao, Liwei Wang, Fang Wen, Jian Sun
European Conference on Computer Vision (ECCV), 2012

Awards & Honors

Microsoft Research Asia Fellowship, 2013