Yuzhi Zhao

Research Engineer, Huawei Hong Kong Research Center

Pak Shek Kok, Hong Kong SAR, China

Short Bio

I am a research engineer at Huawei Hong Kong Research Center. I received the Ph.D. degree from the Department of Electronic Engineering, City University of Hong Kong, in February 2023 and the B.Eng. degree from the School of Electronic and Information Engineering (Qiming College), Huazhong University of Science and Technology, in June 2018. I have broad interests in AI applications, including low-level vision, computational photography, and generative models (e.g., GANs and diffusion models). Recently, I have focused on applications of Multimodal Large Language Models (MLLMs), e.g., AI agents.

I have interned at the AI Imaging Group, SenseTime, working on computational photography research and projects, and at Lightspeed and Quantum Studios, Tencent IEG, working on AIGC projects (e.g., Stable Diffusion). I am now a research engineer at Huawei Hong Kong Research Center, working on MLLM projects.

We are hiring! Please email me if you are interested in an internship or research engineer position (MLLM research or projects; based in Dongguan, Beijing, Shanghai, or Hong Kong).

Selected Publications

*: corresponding author

  • LLM, MLLM, AI Agent

Mengyang Wu, Yuzhi Zhao*, Jialun Cao, Mingjie Xu, Zhongming Jiang, Xuehui Wang, Qinbin Li, Guangneng Hu, Shengchao Qin, Chi-Wing Fu. ICM-Assistant: Instruction-tuning Multimodal Large Language Models for Rule-based Explainable Image Content Moderation. AAAI, 2025 (Code) (URL)

Mingjie Xu, Mengyang Wu, Yuzhi Zhao*, Jason Chun Lok Li, Weifeng Ou. LLaVA-SpaceSGG: Visual Instruct Tuning for Open-vocabulary Scene Graph Generation with Enhanced Spatial Relations. WACV, 2025 (Code) (URL)

  • Low-level vision and computational photography

Yuzhi Zhao*, Lai-Man Po, Xin Ye, Qiong Yan, Yongzhe Xu. Modeling Dual-Exposure Quad-Bayer Patterns for Joint Denoising and Deblurring. IEEE Transactions on Image Processing, 2024 (Code) (URL)

Yuzhi Zhao*, Lai-Man Po, Kangcheng Liu, Xuehui Wang, Wing-Yin Yu. SVCNet: Real-time Scribble-based Video Colorization with Pyramid Networks. IEEE Transactions on Image Processing, 2023 (PDF) (Code) (URL)

Yuzhi Zhao*, Lai-Man Po, Tingyu Lin, Qiong Yan, Wei Liu, Pengfei Xian. HSGAN: Hyperspectral Reconstruction from RGB Images with Generative Adversarial Network. IEEE Transactions on Neural Networks and Learning Systems, 2023 (PDF) (Code) (URL)

Yuzhi Zhao*, Yongzhe Xu, Qiong Yan, Dingdong Yang, Xuehui Wang, Lai-Man Po. D2HNet: Joint Denoising and Deblurring with Hierarchical Network for Robust Night Image Restoration. ECCV, 2022 (PDF) (Code/Dataset) (URL)

Yuzhi Zhao*, Lai-Man Po, Xuehui Wang, Qiong Yan, Wei Shen, et al. ChildPredictor: A Child Face Prediction Framework with Disentangled Learning. IEEE Transactions on Multimedia, 2022 (PDF) (Code/Dataset) (URL)

Yuzhi Zhao*, Lai-Man Po, Wing-Yin Yu, YAU Rehman, Mengyang Liu, Yujia Zhang, Weifeng Ou. VCGAN: Video Colorization with Hybrid Generative Adversarial Network. IEEE Transactions on Multimedia, 2022 (PDF) (Code) (URL)

Yuzhi Zhao*, Lai-Man Po, Kwok-Wai Cheung, Wing-Yin Yu, YAU Rehman. SCGAN: Saliency Map-guided Colorization with Generative Adversarial Network. IEEE Transactions on Circuits and Systems for Video Technology, 2020 (PDF) (Code) (URL)

  • Generative models

Wing-Yin Yu, Lai-Man Po, Ray C.C. Cheung, Yuzhi Zhao, Yu Xue, Kun Li. Bidirectionally Deformable Motion Modulation For Video-based Human Pose Transfer. ICCV, 2023 (PDF) (Code) (URL)

Wing-Yin Yu, Lai-Man Po, Jingjing Xiong, Yuzhi Zhao, Pengfei Xian. ShaTure: Shape and Texture Deformation for Human Pose and Attribute Transfer. IEEE Transactions on Image Processing, 2022 (PDF) (URL)

  • Representation learning

Kangcheng Liu, Yuzhi Zhao, Qiang Nie, Zhi Gao, Ben M. Chen. Weakly Supervised 3D Scene Segmentation with Region-Level Boundary Awareness and Instance Discrimination. ECCV, 2022 (PDF) (Code) (URL)

Yujia Zhang, Lai-Man Po, Xuyuan Xu, Mengyang Liu, Yexin Wang, Weifeng Ou, Yuzhi Zhao, Wing-Yin Yu. Contrastive Spatio-Temporal Pretext Learning for Self-supervised Video Representation. AAAI, 2022 (PDF) (Code) (URL)