Dong Zhang

On the Temperature of Machine Learning Systems

Apr 19, 2024
Via arXiv

SpeechAlign: Aligning Speech Generation to Human Preferences

Apr 08, 2024
Via arXiv

Genetic Quantization-Aware Approximation for Non-Linear Operations in Transformers

Mar 29, 2024
Via arXiv

Unleashing Network Potentials for Semantic Scene Completion

Mar 14, 2024
Via arXiv

Location-guided Head Pose Estimation for Fisheye Image

Feb 28, 2024
Via arXiv

AnyGPT: Unified Multimodal LLM with Discrete Sequence Modeling

Feb 26, 2024
Via arXiv

Comment-aided Video-Language Alignment via Contrastive Pre-training for Short-form Video Humor Detection

Feb 14, 2024
Via arXiv

GenTranslate: Large Language Models are Generative Multilingual Speech and Machine Translators

Feb 10, 2024
Via arXiv

GroundingGPT: Language Enhanced Multi-modal Grounding Model

Jan 30, 2024
Via arXiv

SpeechGPT-Gen: Scaling Chain-of-Information Speech Generation

Jan 25, 2024
Via arXiv