Yiling Huang

PNeSM: Arbitrary 3D Scene Stylization via Prompt-Based Neural Style Mapping

Mar 13, 2024
Jiafu Chen, Wei Xing, Jiakai Sun, Tianyi Chu, Yiling Huang, Boyan Ji, Lei Zhao, Huaizhong Lin, Haibo Chen, Zhizhong Wang

DiarizationLM: Speaker Diarization Post-Processing with Large Language Models

Jan 16, 2024
Quan Wang, Yiling Huang, Guanlong Zhao, Evan Clark, Wei Xia, Hank Liao

ArtBank: Artistic Style Transfer with Pre-trained Diffusion Model and Implicit Style Prompt Bank

Dec 11, 2023
Zhanjie Zhang, Quanwei Zhang, Guangyuan Li, Wei Xing, Lei Zhao, Jiakai Sun, Zehua Lan, Junsheng Luan, Yiling Huang, Huaizhong Lin

Towards Word-Level End-to-End Neural Speaker Diarization with Auxiliary Network

Sep 15, 2023
Yiling Huang, Weiran Wang, Guanlong Zhao, Hank Liao, Wei Xia, Quan Wang

USM-SCD: Multilingual Speaker Change Detection Based on Large Pretrained Foundation Models

Sep 14, 2023
Guanlong Zhao, Yongqiang Wang, Jason Pelecanos, Yu Zhang, Hank Liao, Yiling Huang, Han Lu, Quan Wang

Selective inference using randomized group lasso estimators for general models

Jun 24, 2023
Yiling Huang, Sarah Pirenne, Snigdha Panigrahi, Gerda Claeskens

Augmenting Transformer-Transducer Based Speaker Change Detection With Token-Level Training Loss

Nov 11, 2022
Guanlong Zhao, Quan Wang, Han Lu, Yiling Huang, Ignacio Lopez Moreno

Highly Efficient Real-Time Streaming and Fully On-Device Speaker Diarization with Multi-Stage Clustering

Oct 25, 2022
Quan Wang, Yiling Huang, Han Lu, Guanlong Zhao, Ignacio Lopez Moreno

Attentive Temporal Pooling for Conformer-based Streaming Language Identification in Long-form Speech

Mar 21, 2022
Quan Wang, Yang Yu, Jason Pelecanos, Yiling Huang, Ignacio Lopez Moreno

Parameter-Free Attentive Scoring for Speaker Verification

Mar 10, 2022
Jason Pelecanos, Quan Wang, Yiling Huang, Ignacio Lopez Moreno
