Xu Li

Britton Chance Center for Biomedical Photonics, Wuhan National Laboratory for Optoelectronics-Huazhong University of Science and Technology, China

Text2Move: Text-to-moving sound generation via trajectory prediction and temporal alignment

Sep 26, 2025

Training-Free Pyramid Token Pruning for Efficient Large Vision-Language Models via Region, Token, and Instruction-Guided Importance

Sep 19, 2025

HERO: Rethinking Visual Token Early Dropping in High-Resolution Large Vision-Language Models

Sep 16, 2025

Joint decoding method for controllable contextual speech recognition based on Speech LLM

Aug 12, 2025

Large Language Models Enhanced by Plug and Play Syntactic Knowledge for Aspect-based Sentiment Analysis

Jun 15, 2025

Low-Resource Domain Adaptation for Speech LLMs via Text-Only Fine-Tuning

Jun 06, 2025

Fewer Hallucinations, More Verification: A Three-Stage LLM-Based Framework for ASR Error Correction

May 30, 2025

MM-Prompt: Cross-Modal Prompt Tuning for Continual Visual Question Answering

May 26, 2025

AutoGEEval: A Multimodal and Automated Framework for Geospatial Code Generation on GEE with Large Language Models

May 19, 2025

YuE: Scaling Open Foundation Models for Long-Form Music Generation

Mar 11, 2025