Chong Zhang

Tony

Plug-and-Play Co-Occurring Face Attention for Robust Audio-Visual Speaker Extraction

May 27, 2025
Via arXiv

Invisible Prompts, Visible Threats: Malicious Font Injection in External Resources for Large Language Models

May 22, 2025
Via arXiv

100 Days After DeepSeek-R1: A Survey on Replication Studies and More Directions for Reasoning Language Models

May 01, 2025
Via arXiv

SegOTA: Accelerating Over-the-Air Federated Learning with Segmented Transmission

Apr 13, 2025
Via arXiv

ORION: A Holistic End-to-End Autonomous Driving Framework by Vision-Language Instructed Action Generation

Mar 25, 2025
Via arXiv

UniCodec: Unified Audio Codec with Single Domain-Adaptive Codebook

Feb 27, 2025
Via arXiv

MMRC: A Large-Scale Benchmark for Understanding Multimodal Large Language Model in Real-World Conversation

Feb 17, 2025
Via arXiv

Improving Wireless Federated Learning via Joint Downlink-Uplink Beamforming over Analog Transmission

Feb 04, 2025
Via arXiv

HiFi-SR: A Unified Generative Transformer-Convolutional Adversarial Network for High-Fidelity Speech Super-Resolution

Jan 17, 2025
Via arXiv

Conditional Latent Diffusion-Based Speech Enhancement Via Dual Context Learning

Jan 17, 2025
Via arXiv