Kang Yin

MM-Sonate: Multimodal Controllable Audio-Video Generation with Zero-Shot Voice Cloning

Jan 08, 2026

Cross-Modal Consistency-Guided Active Learning for Affective BCI Systems

Nov 19, 2025

NeuroLex: A Lightweight Domain Language Model for EEG Report Understanding and Generation

Nov 17, 2025

Semantic Prioritization in Visual Counterfactual Explanations with Weighted Segmentation and Auto-Adaptive Region Selection

Nov 17, 2025

Towards Fine-Grained Interpretability: Counterfactual Explanations for Misclassification with Saliency Partition

Nov 11, 2025

Reconstructing Unseen Sentences from Speech-related Biosignals for Open-vocabulary Neural Communication

Oct 31, 2025

Kling-Foley: Multimodal Diffusion Transformer for High-Quality Video-to-Audio Generation

Jun 24, 2025

EEG-based Multimodal Representation Learning for Emotion Recognition

Oct 29, 2024

Sparse Multitask Learning for Efficient Neural Representation of Motor Imagery and Execution

Dec 10, 2023

SHARK: A Lightweight Model Compression Approach for Large-scale Recommender Systems

Aug 18, 2023