Boris Ginsburg

Nemotron-4 340B Technical Report

Jun 17, 2024

Fast Context-Biasing for CTC and Transducer ASR models with CTC-based Word Spotter

Jun 11, 2024

Label-Looping: Highly Efficient Decoding for Transducers

Jun 10, 2024

Spectral Codecs: Spectrogram-Based Audio Codecs for High Quality Speech Synthesis

Jun 07, 2024

Flexible Multichannel Speech Enhancement for Noise-Robust Frontend

Jun 06, 2024

RULER: What's the Real Context Size of Your Long-Context Language Models?

Apr 11, 2024

Transducers with Pronunciation-aware Embeddings for Automatic Speech Recognition

Apr 04, 2024

Stateful Conformer with Cache-based Inference for Streaming Automatic Speech Recognition

Jan 11, 2024

The CHiME-7 Challenge: System Description and Performance of NeMo Team's DASR System

Oct 18, 2023

Property-Aware Multi-Speaker Data Simulation: A Probabilistic Modelling Technique for Synthetic Data Generation

Oct 18, 2023