James Qin

Gemini 1.5: Unlocking multimodal understanding across millions of tokens of context

Mar 08, 2024
Via arXiv

Multilingual and Fully Non-Autoregressive ASR with Large Language Model Fusion: A Comprehensive Study

Jan 23, 2024
Via arXiv

Gemini: A Family of Highly Capable Multimodal Models

Dec 19, 2023
Via arXiv

Massive End-to-end Models for Short Search Queries

Sep 22, 2023
Via arXiv

AudioPaLM: A Large Language Model That Can Speak and Listen

Jun 22, 2023
Via arXiv

Efficient Adapters for Giant Speech Models

Jun 13, 2023
Via arXiv

Google USM: Scaling Automatic Speech Recognition Beyond 100 Languages

Mar 03, 2023
Via arXiv

LaMDA: Language Models for Dialog Applications

Feb 10, 2022
Via arXiv

Self-supervised Learning with Random-projection Quantizer for Speech Recognition

Feb 03, 2022
Via arXiv

Vector-quantized Image Modeling with Improved VQGAN

Oct 09, 2021
Via arXiv