Yu Li

Victor

Learning Speaker-Invariant Visual Features for Lipreading

Jun 09, 2025

Domain-RAG: Retrieval-Guided Compositional Image Generation for Cross-Domain Few-Shot Object Detection

Jun 06, 2025

AgentSwift: Efficient LLM Agent Design via Value-guided Hierarchical Search

Jun 06, 2025

Uni-MuMER: Unified Multi-Task Fine-Tuning of Vision-Language Model for Handwritten Mathematical Expression Recognition

May 29, 2025

VidText: Towards Comprehensive Evaluation for Video Text Understanding

May 28, 2025

Beyond Chemical QA: Evaluating LLM's Chemical Reasoning with Modular Chemical Operations

May 27, 2025

Rethinking Text-based Protein Understanding: Retrieval or LLM?

May 26, 2025

One Model Transfer to All: On Robust Jailbreak Prompts Generation against LLMs

May 23, 2025

SafeMVDrive: Multi-view Safety-Critical Driving Video Synthesis in the Real World Domain

May 23, 2025

IDEAL: Data Equilibrium Adaptation for Multi-Capability Language Model Alignment

May 19, 2025