Roshan Sharma

On the Evaluation of Speech Foundation Models for Spoken Language Understanding

Jun 14, 2024

AugSumm: towards generalizable speech summarization using synthetic labels from large language model

Jan 10, 2024

UniverSLU: Universal Spoken Language Understanding for Diverse Classification and Sequence Generation Tasks with a Single Network

Oct 04, 2023

LoFT: Local Proxy Fine-tuning For Improving Transferability Of Adversarial Attacks Against Large Language Model

Oct 02, 2023

Reproducing Whisper-Style Training Using an Open-Source Toolkit and Publicly Available Data

Oct 02, 2023

Evaluating Speech Synthesis by Training Recognizers on Synthetic Speech

Oct 01, 2023

Exploring Speech Recognition, Translation, and Understanding with Discrete Speech Units: A Comparative Study

Sep 27, 2023

Dynamic-SUPERB: Towards A Dynamic, Collaborative, and Comprehensive Instruction-Tuning Benchmark for Speech

Sep 18, 2023

Augmenting text for spoken language understanding with Large Language Models

Sep 17, 2023

BASS: Block-wise Adaptation for Speech Summarization

Jul 17, 2023