Yongdong Zhang

In-Token Rationality Optimization: Towards Accurate and Concise LLM Reasoning via Self-Feedback

Nov 13, 2025

SparseRM: A Lightweight Preference Modeling with Sparse Autoencoder

Nov 11, 2025

UpSafe$^\circ$C: Upcycling for Controllable Safety in Large Language Models

Oct 02, 2025

Video-LevelGauge: Investigating Contextual Positional Bias in Large Video Language Models

Aug 28, 2025

Leveraging Importance Sampling to Detach Alignment Modules from Large Language Models

May 26, 2025

Training LLM-Based Agents with Synthetic Self-Reflected Trajectories and Partial Masking

May 26, 2025

Leveraging Robust Optimization for LLM Alignment under Distribution Shifts

Apr 08, 2025

HOIGen-1M: A Large-scale Dataset for Human-Object Interaction Video Generation

Mar 31, 2025

Mask$^2$DiT: Dual Mask-based Diffusion Transformer for Multi-Scene Long Video Generation

Mar 25, 2025

SpaceVLLM: Endowing Multimodal Large Language Model with Spatio-Temporal Video Grounding Capability

Mar 18, 2025