Tianyi Zhou

C3PO: Critical-Layer, Core-Expert, Collaborative Pathway Optimization for Test-Time Expert Re-Mixing

Apr 10, 2025

ColorBench: Can VLMs See and Understand the Colorful World? A Comprehensive Benchmark for Color Perception, Reasoning, and Robustness

Apr 10, 2025

Missing Premise Exacerbates Overthinking: Are Reasoning Models Losing Critical Thinking Skill?

Apr 09, 2025

FedMerge: Federated Personalization via Model Merging

Apr 09, 2025

Efficient Reinforcement Finetuning via Adaptive Curriculum Learning

Apr 07, 2025

Towards Visual Text Grounding of Multimodal Large Language Models

Apr 07, 2025

CoSTA$\ast$: Cost-Sensitive Toolpath Agent for Multi-turn Image Editing

Mar 13, 2025

R1-Zero's "Aha Moment" in Visual Reasoning on a 2B Non-SFT Model

Mar 07, 2025

Quantifying and Modeling Driving Styles in Trajectory Forecasting

Mar 06, 2025

ATLaS: Agent Tuning via Learning Critical Steps

Mar 04, 2025