
Fuxiao Liu

MM-Zero: Self-Evolving Multi-Model Vision Language Models From Zero Data

Mar 10, 2026

Towards Multimodal Lifelong Understanding: A Dataset and Agentic Baseline

Mar 05, 2026

First Frame Is the Place to Go for Video Content Customization

Nov 19, 2025

NVIDIA Nemotron Nano V2 VL

Nov 07, 2025

Self-Rewarding Vision-Language Model via Reasoning Decomposition

Aug 27, 2025

Nemotron-H: A Family of Accurate and Efficient Hybrid Mamba-Transformer Models

Apr 10, 2025

ColorBench: Can VLMs See and Understand the Colorful World? A Comprehensive Benchmark for Color Perception, Reasoning, and Robustness

Apr 10, 2025

AIDE: Agentically Improve Visual Language Model with Domain Experts

Feb 13, 2025

DAVE: Diverse Atomic Visual Elements Dataset with High Representation of Vulnerable Road Users in Complex and Unpredictable Environments

Dec 28, 2024

DeepFM-Crispr: Prediction of CRISPR On-Target Effects via Deep Learning

Sep 09, 2024