Wen Wang

Say More with Less: Variable-Frame-Rate Speech Tokenization via Adaptive Clustering and Implicit Duration Coding

Sep 04, 2025

Solving the Min-Max Multiple Traveling Salesmen Problem via Learning-Based Path Generation and Optimal Splitting

Aug 23, 2025

Time Is a Feature: Exploiting Temporal Dynamics in Diffusion Language Models

Aug 12, 2025

SpeakerLM: End-to-End Versatile Speaker Diarization and Recognition with Multimodal Large Language Models

Aug 08, 2025

Token Communication in the Era of Large Models: An Information Bottleneck-Based Approach

Jul 02, 2025

ThinkSound: Chain-of-Thought Reasoning in Multimodal Large Language Models for Audio Generation and Editing

Jun 26, 2025

OmniDRCA: Parallel Speech-Text Foundation Model via Dual-Resolution Speech Representations and Contrastive Alignment

Jun 11, 2025

Speech Token Prediction via Compressed-to-fine Language Modeling for Speech Generation

May 30, 2025

Towards Minimizing Feature Drift in Model Merging: Layer-wise Task Vector Fusion for Adaptive Knowledge Integration

May 29, 2025

Omni-R1: Reinforcement Learning for Omnimodal Reasoning via Two-System Collaboration

May 26, 2025