Chenliang Xu

TDMM-LM: Bridging Facial Understanding and Animation via Language Models

Mar 14, 2026

Training Large Reasoning Models Efficiently via Progressive Thought Encoding

Feb 18, 2026

SEMA: Simple yet Effective Learning for Multi-Turn Jailbreak Attacks

Feb 06, 2026

Omni-Judge: Can Omni-LLMs Serve as Human-Aligned Judges for Text-Conditioned Audio-Video Generation?

Feb 02, 2026

Statistical Estimation of Adversarial Risk in Large Language Models under Best-of-N Sampling

Jan 30, 2026

Semantic visually-guided acoustic highlighting with large vision-language models

Jan 12, 2026

When to Think and When to Look: Uncertainty-Guided Lookback

Nov 19, 2025

PromptReverb: Multimodal Room Impulse Response Generation Through Latent Rectified Flow Matching

Oct 25, 2025

Diagnosing Visual Reasoning: Challenges, Insights, and a Path Forward

Oct 23, 2025

Video-LMM Post-Training: A Deep Dive into Video Reasoning with Large Multimodal Models

Oct 06, 2025