Song Dai

AndroTMem: From Interaction Trajectories to Anchored Memory in Long-Horizon GUI Agents

Mar 19, 2026
Via arXiv

Temporal Gains, Spatial Costs: Revisiting Video Fine-Tuning in Multimodal Large Language Models

Mar 18, 2026
Via arXiv

LatentGeo: Learnable Auxiliary Constructions in Latent Space for Multimodal Geometric Reasoning

Mar 12, 2026
Via arXiv

EffiReason-Bench: A Unified Benchmark for Evaluating and Advancing Efficient Reasoning in Large Language Models

Nov 13, 2025
Via arXiv

GM-PRM: A Generative Multimodal Process Reward Model for Multimodal Mathematical Reasoning

Aug 06, 2025
Via arXiv

Unveiling Instruction-Specific Neurons & Experts: An Analytical Framework for LLM's Instruction-Following Capabilities

May 27, 2025
Via arXiv

PhysicsArena: The First Multimodal Physics Reasoning Benchmark Exploring Variable, Process, and Solution Dimensions

May 21, 2025
Via arXiv