
Mingze Li

Group Sequence Policy Optimization

Jul 24, 2025, via arXiv

STAR-R1: Spatial TrAnsformation Reasoning by Reinforcing Multimodal LLMs

May 26, 2025, via arXiv

STAR-R1: Spatial TrAnsformation Reasoning by Reinforcing Multimodal LLMs

May 21, 2025, via arXiv

Qwen3 Technical Report

May 14, 2025, via arXiv

A MEMS-based terahertz broadband beam steering technique

Sep 06, 2024, via arXiv

MSceneSpeech: A Multi-Scene Speech Dataset For Expressive Speech Synthesis

Jul 19, 2024, via arXiv

AudioGPT: Understanding and Generating Speech, Music, Sound, and Talking Head

Apr 25, 2023, via arXiv

Robust Table Structure Recognition with Dynamic Queries Enhanced Detection Transformer

Mar 21, 2023, via arXiv

Make-An-Audio: Text-To-Audio Generation with Prompt-Enhanced Diffusion Models

Jan 30, 2023, via arXiv

TSRFormer: Table Structure Recognition with Transformers

Aug 09, 2022, via arXiv