Ruichuan An

DataFlow: An LLM-Driven Framework for Unified Data Preparation and Workflow Automation in the Era of Data-Centric AI

Dec 18, 2025

GRAN-TED: Generating Robust, Aligned, and Nuanced Text Embedding for Diffusion Models

Dec 17, 2025

Are Video Models Ready as Zero-Shot Reasoners? An Empirical Study with the MME-CoF Benchmark

Oct 30, 2025

Jarvis: Towards Personalized AI Assistant via Personal KV-Cache Retrieval

Oct 26, 2025

MorphoBench: A Benchmark with Difficulty Adaptive to Model Reasoning

Oct 16, 2025

WoW: Towards a World omniscient World model Through Embodied Interaction

Sep 26, 2025

Perceive Anything: Recognize, Explain, Caption, and Segment Anything in Images and Videos

Jun 05, 2025

Adaptive Classifier-Free Guidance via Dynamic Low-Confidence Masking

May 26, 2025

SpikeGen: Generative Framework for Visual Spike Stream Processing

May 23, 2025

LoVR: A Benchmark for Long Video Retrieval in Multimodal Contexts

May 20, 2025