Yongdong Zhang

In-Token Rationality Optimization: Towards Accurate and Concise LLM Reasoning via Self-Feedback

Nov 13, 2025
Via arXiv

SparseRM: A Lightweight Preference Modeling with Sparse Autoencoder

Nov 11, 2025
Via arXiv

UpSafe$^\circ$C: Upcycling for Controllable Safety in Large Language Models

Oct 02, 2025
Via arXiv

Video-LevelGauge: Investigating Contextual Positional Bias in Large Video Language Models

Aug 28, 2025
Via arXiv

Training LLM-Based Agents with Synthetic Self-Reflected Trajectories and Partial Masking

May 26, 2025
Via arXiv

Leveraging Importance Sampling to Detach Alignment Modules from Large Language Models

May 26, 2025
Via arXiv

Leveraging Robust Optimization for LLM Alignment under Distribution Shifts

Apr 08, 2025
Via arXiv

HOIGen-1M: A Large-scale Dataset for Human-Object Interaction Video Generation

Mar 31, 2025
Via arXiv

Mask$^2$DiT: Dual Mask-based Diffusion Transformer for Multi-Scene Long Video Generation

Mar 25, 2025
Via arXiv

SpaceVLLM: Endowing Multimodal Large Language Model with Spatio-Temporal Video Grounding Capability

Mar 18, 2025
Via arXiv