Zeyu Jin

VDGD: Mitigating LVLM Hallucinations in Cognitive Prompts by Bridging the Visual Perception Gap

May 24, 2024
Via arXiv

A Closer Look at the Limitations of Instruction Tuning

Feb 03, 2024
Via arXiv

Efficient Spoken Language Recognition via Multilabel Classification

Jun 02, 2023
Via arXiv

Audio Similarity is Unreliable as a Proxy for Audio Quality

Jun 27, 2022
Via arXiv

Music Enhancement via Image Translation and Vocoding

Apr 28, 2022
Via arXiv

HEAR 2021: Holistic Evaluation of Audio Representations

Mar 26, 2022
Via arXiv

Neural Pitch-Shifting and Time-Stretching with Controllable LPCNet

Oct 05, 2021
Via arXiv

Controllable deep melody generation via hierarchical music structure representation

Sep 02, 2021
Via arXiv

Context-Aware Prosody Correction for Text-Based Speech Editing

Feb 16, 2021
Via arXiv

CDPAM: Contrastive learning for perceptual audio similarity

Feb 09, 2021
Via arXiv