Fan Zhou

JavisGPT: A Unified Multi-modal LLM for Sounding-Video Comprehension and Generation

Dec 28, 2025
Via arXiv

From Shallow Humor to Metaphor: Towards Label-Free Harmful Meme Detection via LMM Agent Self-Improvement

Dec 25, 2025
Via arXiv

TAMEing Long Contexts in Personalization: Towards Training-Free and State-Aware MLLM Personalized Assistant

Dec 25, 2025
Via arXiv

Causally-Grounded Dual-Path Attention Intervention for Object Hallucination Mitigation in LVLMs

Nov 12, 2025
Via arXiv

DORAEMON: A Unified Library for Visual Object Modeling and Representation Learning at Scale

Nov 06, 2025
Via arXiv

The Tool Decathlon: Benchmarking Language Agents for Diverse, Realistic, and Long-Horizon Task Execution

Oct 29, 2025
Via arXiv

StatEval: A Comprehensive Benchmark for Large Language Models in Statistics

Oct 10, 2025
Via arXiv

Compose Yourself: Average-Velocity Flow Matching for One-Step Speech Enhancement

Sep 19, 2025
Via arXiv

Deep Reinforcement Learning for Ranking Utility Tuning in the Ad Recommender System at Pinterest

Sep 05, 2025
Via arXiv

Revisiting Reinforcement Learning for LLM Reasoning from A Cross-Domain Perspective

Jun 17, 2025
Via arXiv