Changhe Song

Towards Dynamic Theory of Mind: Evaluating LLM Adaptation to Temporal Evolution of Human States

May 23, 2025

Towards a Client-Centered Assessment of LLM Therapists by Client Simulation

Jun 18, 2024

3D-Speaker-Toolkit: An Open Source Toolkit for Multi-modal Speaker Verification and Diarization

Mar 29, 2024

SememeASR: Boosting Performance of End-to-End Speech Recognition against Domain and Long-Tailed Data Shift with Sememe Semantic Knowledge

Sep 04, 2023

Text-Only Domain Adaptation for End-to-End Speech Recognition through Down-Sampling Acoustic Representation

Sep 04, 2023

Improving Mandarin Prosodic Structure Prediction with Multi-level Contextual Information

Aug 31, 2023

Towards Cross-speaker Reading Style Transfer on Audiobook Dataset

Aug 19, 2022

Content-Dependent Fine-Grained Speaker Embedding for Zero-Shot Speaker Adaptation in Text-to-Speech Synthesis

Apr 03, 2022

An End-to-end Chinese Text Normalization Model based on Rule-guided Flat-Lattice Transformer

Mar 31, 2022

A Character-level Span-based Model for Mandarin Prosodic Structure Prediction

Mar 31, 2022