Postdoctoral Researcher Xiang Zheng from City University of Hong Kong Delivers an Academic Report

2025-06-03

On May 28, 2025, at the invitation of Jianyu Niu, Research Assistant Professor at the Research Institute of Trustworthy Autonomous Systems (RITAS) at Southern University of Science and Technology, postdoctoral researcher Xiang Zheng from City University of Hong Kong delivered an academic report titled "Reinforcement Learning-Based Adversarial Evaluation and Defense Enhancement for Large Language Models (LLMs)" in Room 443B of the South Tower of the College of Engineering.

Figure 1 Xiang Zheng presents an academic report

LLMs are now widely applied in customer service, law, healthcare, and other fields, owing to their capabilities in understanding, reasoning, programming, planning, and decision-making. As their performance improves, ensuring their security has become a critical concern. Against this background, Xiang Zheng focused on security assessment and introduced the technical frameworks, lightweight tools, and defense solutions for LLM security assessment to the faculty and students in attendance.

Reinforcement learning-based adversarial evaluation and defense enhancement for LLMs involves three steps: probing the fault-tolerance boundary of an LLM under malicious input through adversarial testing (for example, modifying prompts or injecting noise), optimizing attack strategies with reinforcement learning algorithms to discover vulnerabilities, and designing defense mechanisms that enhance model robustness. In the report, Xiang Zheng introduced a series of recent related works, including a curiosity-driven black-box auditing framework for LLMs (CALM), a black-box defense mechanism that protects vision-language models (VLMs) against jailbreak attacks (BlueSuffix), and a multi-dimensional, systematic security evaluation that is closer to real-world scenarios (ROSE).

Figure 2 Technical Architecture of BlueSuffix

After the report, faculty and students discussed its core topics in relation to their own research directions. Xiang Zheng answered their questions in turn, covering technical implementation, experimental verification, and industry application. The report concluded with a lively academic exchange.

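Related Reading

Faculty Members Sifakis, Ke Tang, and Yu Zhang of Our Department Attend the 2nd World Laureates Forum

[Expert Interview] A Conversation with Turing Award Winner Joseph Sifakis: How Will "Autonomous Systems" Change the Future?

SUSTech Lecture | Turing Award Laureate and Academician Joseph Sifakis Discusses "Bringing AI into Autonomous Systems"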