Jiawei Kong

CHILLGuard: Towards Fine-Grained Chinese LLM Safety Guardrail with Scalable Data Construction and Model-aware Preference Alignment

Jun 13, 2026

Bypassing Copyright Protection in Diffusion-based Customization via Two-Stage Latent Feature Optimization

Jun 06, 2026

Reasoning Matters: Mitigate Hallucination in Multimodal Large Reasoning Models via Reasoning-Conditioned Preference Optimization

May 27, 2026

Seeing Through the Chain: Mitigate Hallucination in Multimodal Reasoning Models via CoT Compression and Contrastive Preference Optimization

Feb 03, 2026

Towards Distillation-Resistant Large Language Models: An Information-Theoretic Perspective

Feb 03, 2026

Grounding Language with Vision: A Conditional Mutual Information Calibrated Decoding Strategy for Reducing Hallucinations in LVLMs

May 26, 2025

Wolf Hidden in Sheep's Conversations: Toward Harmless Data-Based Backdoor Attacks for Jailbreaking Large Language Models

May 23, 2025

Your Language Model Can Secretly Write Like Humans: Contrastive Paraphrase Attacks on LLM-Generated Text Detectors

May 21, 2025

Neural Antidote: Class-Wise Prompt Tuning for Purifying Backdoors in Pre-trained Vision-Language Models

Feb 26, 2025

CLIP-Guided Networks for Transferable Targeted Attacks

Jul 14, 2024