Zhe Yang

Victor

FC-MIR: A Mobile Screen Awareness Framework for Intent-Aware Recommendation based on Frame-Compressed Multimodal Trajectory Reasoning

Dec 22, 2025
Via arXiv

GroundingME: Exposing the Visual Grounding Gap in MLLMs through Multi-Dimensional Evaluation

Dec 19, 2025
Via arXiv

C$^3$TG: Conflict-aware, Composite, and Collaborative Controlled Text Generation

Nov 16, 2025
Via arXiv

Adaptive Redundancy Regulation for Balanced Multimodal Information Refinement

Nov 14, 2025
Via arXiv

Synthesizing Sheet Music Problems for Evaluation and Reinforcement Learning

Sep 04, 2025
Via arXiv

UMRE: A Unified Monotonic Transformation for Ranking Ensemble in Recommender Systems

Aug 11, 2025
Via arXiv

Entropy-Memorization Law: Evaluating Memorization Difficulty of Data in LLMs

Jul 08, 2025
Via arXiv

ForensicHub: A Unified Benchmark & Codebase for All-Domain Fake Image Detection and Localization

May 16, 2025
Via arXiv

Palette of Language Models: A Solver for Controlled Text Generation

Mar 14, 2025
Via arXiv

LLaVA-Mini: Efficient Image and Video Large Multimodal Models with One Vision Token

Jan 07, 2025
Via arXiv