Yuxuan Hu
I am currently pursuing a Ph.D. in the AML Lab at the School of Data Science, City University of Hong Kong, under the supervision of Prof. Xiangyu Zhao. Previously, I obtained my Master’s degree from Sun Yat-sen University, where I was advised by Prof. Chengming Li and co-supervised by Min Yang and Prof. Xiaodan Liang. I received my Bachelor’s degree from Sun Yat-sen University as well.
huyx555@gmail.com
huyx55@mail2.sysu.edu.cn
Education
City University of Hong Kong, Ph.D. in Data Science (2025 - Present)
Sun Yat-sen University, Master of Engineering (2022 - 2025)
Sun Yat-sen University, Bachelor of Engineering (2018 - 2022)
Research Interests
Large Language Models, Information Retrieval, Dialogue Systems, Multi-modal Learning.
Experience
Algorithm Intern
July 2025 - September 2025 | Baidu
- Improved the realism of audiobook performances for live-action novels by using large language models to extract information about each character and to identify the speaker of every line of text.
- Built a new training dataset and designed few-shot prompts covering 11 identity types. Fine-tuned Llama 3 with LoRA using Llama-Factory to extract character identities and specific attributes from each novel, building character profiles that support the selection of suitable voice actors.
Awards
- First-class Scholarship, Sun Yat-sen University, 2023.
- National Project on Undergraduate Innovation and Entrepreneurship, Sun Yat-sen University, 2021.
- Second Prize, National College Students Mathematical Modeling Contest, Sun Yat-sen University, 2020.
Selected Publications
Emotion and Intention Guided Multi-Modal Learning for Sticker Response Selection
Hu Y, Chen J, Wang Y, et al. (2025). Emotion and Intention Guided Multi-Modal Learning for Sticker Response Selection. arXiv:2511.17587. (AAAI 2026).
MGHFT: Multi-Granularity Hierarchical Fusion Transformer for Cross-Modal Sticker Emotion Recognition
Chen J, Hu Y, Lu H, et al. (2025). MGHFT: Multi-Granularity Hierarchical Fusion Transformer for Cross-Modal Sticker Emotion Recognition. Proceedings of the 33rd ACM International Conference on Multimedia, 5794-5803.
Learning to Generalize Unseen Domains via Multi-source Meta Learning for Text Classification
Hu Y, Zhang C, Yang M, et al. (2024). Learning to Generalize Unseen Domains via Multi-source Meta Learning for Text Classification. International Conference on Pattern Recognition, 412-428.
Aptness: Incorporating Appraisal Theory and Emotion Support Strategies for Empathetic Response Generation
Hu Y, Tan M, Zhang C, et al. (2024). Aptness: Incorporating Appraisal Theory and Emotion Support Strategies for Empathetic Response Generation. Proceedings of the 33rd ACM International Conference on Information and Knowledge Management, 900-909.
Skeletal Spatial-Temporal Semantics Guided Homogeneous-Heterogeneous Multimodal Network for Action Recognition
Zhang C, Hu Y, Yang M, et al. (2023). Skeletal Spatial-Temporal Semantics Guided Homogeneous-Heterogeneous Multimodal Network for Action Recognition. Proceedings of the 31st ACM International Conference on Multimedia, 3657-3666.
