Yangyang Shi

VLM3: Vision Language Models Are Native 3D Learners

May 28, 2026

M$^2$E-UAV: A Benchmark and Analysis for Onboard Motion-on-Motion Event-Based Tiny UAV Detection

May 11, 2026

Exploring Audio Hallucination in Egocentric Video Understanding

Apr 26, 2026

RPRA: Predicting an LLM-Judge for Efficient but Performant Inference

Apr 14, 2026

Neural Computers

Apr 07, 2026

EgoAVU: Egocentric Audio-Visual Understanding

Feb 05, 2026

SLAP: Scalable Language-Audio Pretraining with Variable-Duration Audio and Multi-Objective Training

Jan 18, 2026

TFKAN: Time-Frequency KAN for Long-Term Time Series Forecasting

Jun 15, 2025

ParetoQ: Scaling Laws in Extremely Low-bit LLM Quantization

Feb 04, 2025

MASV: Speaker Verification with Global and Local Context Mamba

Dec 14, 2024