Xiaomeng Hu, Ph.D. student
I am a second-year Ph.D. student in the Department of Computer Science and Engineering at The Chinese University of Hong Kong, supervised by Prof. Tsung-Yi Ho and co-supervised by Dr. Pin-Yu Chen of IBM Research. Previously, I had the good fortune to collaborate with researchers from Microsoft Research Redmond, Tsinghua University, and Northeastern University. I received my B.Eng. from Northeastern University in July 2023.
Large Language Models
AI Safety
Ph.D. Computer Science and Engineering, The Chinese University of Hong Kong, Aug. 2023 - Present
B.Eng. Artificial Intelligence, Northeastern University, Sep. 2019 - Jul. 2023
[C4] Xiaomeng Hu, Pin-Yu Chen, Tsung-Yi Ho. Token Highlighter: Inspecting and Mitigating Jailbreak Prompts for Large Language Models. AAAI 2025 (Oral). (paper) (demo)
[C3] Xiaomeng Hu, Pin-Yu Chen, Tsung-Yi Ho. Gradient Cuff: Detecting Jailbreak Attacks on Large Language Models by Exploring Refusal Loss Landscapes. NeurIPS 2024. (paper) (demo) (code) (IBM Blog)
[C2] Xiaomeng Hu, Pin-Yu Chen, Tsung-Yi Ho. RADAR: Robust AI-Text Detection via Adversarial Learning. NeurIPS 2023. (paper) (demo) (code) (IBM Blog)
[C1] Xiaomeng Hu*, Shi Yu*, Chenyan Xiong, Zhenghao Liu, Zhiyuan Liu, Ge Yu. P3 Ranker: Mitigating the Gaps between Pre-training and Ranking Fine-tuning with Prompt-based Learning and Pre-finetuning. SIGIR 2022. (paper) (code)
International Business Machines, New York, USA, Aug. 2023 - Present
Research Intern, IBM Research AI
Topic: LLM Jailbreak
Projects: Gradient Cuff (NeurIPS 2024), Token Highlighter (AAAI 2025, Oral)
The Chinese University of Hong Kong, Hong Kong SAR, Feb. 2023 - Jun. 2023
Research Intern, IDEA Lab
Topic: AI-Text detection for LLMs
Project: RADAR (NeurIPS 2023)
Tsinghua University, Beijing, P.R. China, Aug. 2021 - Jun. 2022
Research Intern, THU-NLP Lab
Topic: Information Retrieval and Pretrained Language Models
Project: P3 Ranker (SIGIR 2022)
2023 Fall: CSCI3130 Formal Languages and Automata Theory
2024 Spring: ENGG1110E Problem Solving By Programming (C language)
2024 Fall: CSCI3130 Formal Languages and Automata Theory
2025 Spring: CSCI3320 Fundamentals of Machine Learning
NeurIPS 2023 (AdvML Workshop).
ICLR 2025.
IJCAI 2025.
ACL ARR 2025 February.