Rongwu Xu

Detecting LLM-Generated Spam Reviews by Integrating Language Model Embeddings and Graph Neural Network

Oct 02, 2025

Difficulty-Based Preference Data Selection by DPO Implicit Reward Gap

Aug 06, 2025

The Singapore Consensus on Global AI Safety Research Priorities

Jun 25, 2025

Does Chain-of-Thought Reasoning Really Reduce Harmfulness from Jailbreaking?

May 23, 2025

LIFEBench: Evaluating Length Instruction Following in Large Language Models

May 22, 2025

AI Awareness

Apr 25, 2025

"Nuclear Deployed!": Analyzing Catastrophic Risks in Decision-making of Autonomous LLM Agents

Feb 17, 2025

Long$^2$RAG: Evaluating Long-Context & Long-Form Retrieval-Augmented Generation with Key Point Recall

Oct 31, 2024

Sing it, Narrate it: Quality Musical Lyrics Translation

Oct 29, 2024

On the Role of Attention Heads in Large Language Model Safety

Oct 17, 2024