Wei Han

Britton Chance Center for Biomedical Photonics, Wuhan National Laboratory for Optoelectronics-Huazhong University of Science and Technology, China

Retrieval Augmented End-to-End Spoken Dialog Models

Feb 02, 2024
Via arXiv

Localization and Discrete Beamforming with a Large Reconfigurable Intelligent Surface

Dec 19, 2023
Via arXiv

Extending Context Window of Large Language Models via Semantic Compression

Dec 15, 2023
Via arXiv

RoboVQA: Multimodal Long-Horizon Reasoning for Robotics

Nov 01, 2023
Via arXiv

SLM: Bridge the thin gap between speech and text foundation models

Sep 30, 2023
Via arXiv

High Perceptual Quality Wireless Image Delivery with Denoising Diffusion Models

Sep 27, 2023
Via arXiv

Multimodal Modeling For Spoken Language Identification

Sep 19, 2023
Via arXiv

SAS Video-QA: Self-Adaptive Sampling for Efficient Video Question-Answering

Aug 01, 2023
Via arXiv

AudioPaLM: A Large Language Model That Can Speak and Listen

Jun 22, 2023
Via arXiv

Speech-to-Text Adapter and Speech-to-Entity Retriever Augmented LLMs for Speech Understanding

Jun 08, 2023
Via arXiv