Chao Zhang

refer to the report for detailed contributions

QualiSpeech: A Speech Quality Assessment Dataset with Natural Language Reasoning and Descriptions

Mar 26, 2025

IAP: Improving Continual Learning of Vision-Language Models via Instance-Aware Prompting

Mar 26, 2025

ACVUBench: Audio-Centric Video Understanding Benchmark

Mar 25, 2025

Language Model Uncertainty Quantification with Attention Chain

Mar 24, 2025

Improving LLM Video Understanding with 16 Frames Per Second

Mar 18, 2025

YuE: Scaling Open Foundation Models for Long-Form Music Generation

Mar 11, 2025

EvolvingGS: High-Fidelity Streamable Volumetric Video via Evolving 3D Gaussian Representation

Mar 07, 2025

Every FLOP Counts: Scaling a 300B Mixture-of-Experts LING LLM without Premium GPUs

Mar 07, 2025

Large-Scale AI in Telecom: Charting the Roadmap for Innovation, Scalability, and Enhanced Digital Experiences

Mar 06, 2025

EAGLE-3: Scaling up Inference Acceleration of Large Language Models via Training-Time Test

Mar 03, 2025