Haizhou Li

Apollo: A Lightweight Multilingual Medical LLM towards Democratizing Medical AI to 6B People

Mar 09, 2024

Fine-Grained Quantitative Emotion Editing for Speech Generation

Mar 04, 2024

Event-Driven Learning for Spiking Neural Networks

Mar 01, 2024

Text-guided HuBERT: Self-Supervised Speech Pre-training via Generative Adversarial Networks

Feb 28, 2024

Computation and Parameter Efficient Multi-Modal Fusion Transformer for Cued Speech Recognition

Feb 08, 2024

LitE-SNN: Designing Lightweight and Efficient Spiking Neural Network through Spatial-Temporal Compressive Network Search and Joint Optimization

Jan 26, 2024

CoAVT: A Cognition-Inspired Unified Audio-Visual-Text Pre-Training Model for Multimodal Processing

Jan 22, 2024

An Empirical Study on the Impact of Positional Encoding in Transformer-based Monaural Speech Enhancement

Jan 18, 2024

Bridging Research and Readers: A Multi-Modal Automated Academic Papers Interpretation System

Jan 17, 2024

Gradient weighting for speaker verification in extremely low Signal-to-Noise Ratio

Jan 05, 2024