Shuang Chen

BcQLM: Efficient Vision-Language Understanding with Distilled Q-Gated Cross-Modal Fusion

Sep 10, 2025

Interleaving Reasoning for Better Text-to-Image Generation

Sep 09, 2025

Advancing Multimodal Reasoning: From Optimized Cold Start to Staged Reinforcement Learning

Jun 04, 2025

Hunyuan-TurboS: Advancing Large Language Models through Mamba-Transformer Synergy and Adaptive Chain-of-Thought

May 21, 2025

A comprehensive review of remote sensing in wetland classification and mapping

Apr 15, 2025

Benchmarking Multimodal CoT Reward Model Stepwise by Visual Program

Apr 09, 2025

Speculative MoE: Communication Efficient Parallel MoE Inference with Speculative Token and Expert Pre-scheduling

Mar 07, 2025

Deep Learning-Enhanced Visual Monitoring in Hazardous Underwater Environments with a Swarm of Micro-Robots

Mar 04, 2025

ThinkBench: Dynamic Out-of-Distribution Evaluation for Robust LLM Reasoning

Feb 22, 2025

VidSketch: Hand-drawn Sketch-Driven Video Generation with Diffusion Control

Feb 03, 2025