Bo Li

Beijing Key Laboratory of Digital Media, School of Computer Science and Engineering, Beihang University, Beijing, China

Step-Audio-AQAA: a Fully End-to-End Expressive Large Audio Language Model

Jun 10, 2025
Via arXiv

CDR-Agent: Intelligent Selection and Execution of Clinical Decision Rules Using Large Language Model Agents

May 29, 2025
Via arXiv

HyperMotion: DiT-Based Pose-Guided Human Image Animation of Complex Motions

May 29, 2025
Via arXiv

MagicTryOn: Harnessing Diffusion Transformer for Garment-Preserving Video Virtual Try-on

May 28, 2025
Via arXiv

Photography Perspective Composition: Towards Aesthetic Perspective Recommendation

May 27, 2025
Via arXiv

Adversarial Attacks against Closed-Source MLLMs via Feature Optimal Alignment

May 27, 2025
Via arXiv

SOSBENCH: Benchmarking Safety Alignment on Scientific Knowledge

May 27, 2025
Via arXiv

Any-to-Bokeh: One-Step Video Bokeh via Multi-Plane Image Guided Diffusion

May 27, 2025
Via arXiv

Can Compressed LLMs Truly Act? An Empirical Evaluation of Agentic Capabilities in LLM Compression

May 26, 2025
Via arXiv

MPL: Multiple Programming Languages with Large Language Models for Information Extraction

May 22, 2025
Via arXiv