Jian Xue

Parallax to Align Them All: An OmniParallax Attention Mechanism for Distributed Multi-View Image Compression

Mar 04, 2026

AUHead: Realistic Emotional Talking Head Generation via Action Units Control

Feb 10, 2026

Generative AI for Analysts

Dec 12, 2025

PHRASED: Phrase Dictionary Biasing for Speech Translation

Jun 10, 2025

Streaming Speaker Change Detection and Gender Classification for Transducer-Based Multi-Talker Speech Translation

Feb 04, 2025

MambaDETR: Query-based Temporal Modeling using State Space Model for Multi-View 3D Object Detection

Nov 20, 2024

Isochrony-Controlled Speech-to-Text Translation: A study on translating from Sino-Tibetan to Indo-European Languages

Nov 11, 2024

Failing Forward: Improving Generative Error Correction for ASR with Synthetic Data and Retrieval Augmentation

Oct 17, 2024

Towards Unified Facial Action Unit Recognition Framework by Large Language Models

Sep 13, 2024

MVLLaVA: An Intelligent Agent for Unified and Flexible Novel View Synthesis

Sep 11, 2024