Jiansheng Wei

VideoLLM Knows When to Speak: Enhancing Time-Sensitive Video Comprehension with Video-Text Duet Interaction Format

Nov 27, 2024
Via arXiv

Visually Guided Generative Text-Layout Pre-training for Document Intelligence

Mar 27, 2024
Via arXiv

PanGu-Σ: Towards Trillion Parameter Language Model with Sparse Heterogeneous Computing

Mar 20, 2023
Via arXiv

Wukong-Reader: Multi-modal Pre-training for Fine-grained Visual Document Understanding

Dec 19, 2022
Via arXiv

PanGu-Coder: Program Synthesis with Function-Level Language Modeling

Jul 22, 2022
Via arXiv

Diffusion-Based Voice Conversion with Fast Maximum Likelihood Sampling Scheme

Sep 28, 2021
Via arXiv