Deep Learning Based Audio-Visual Multi-Speaker DOA Estimation Using Permutation-Free Loss Function


Oct 26, 2022
Qing Wang, Hang Chen, Ya Jiang, Zhe Wang, Yuyang Wang, Jun Du, Chin-Hui Lee

* 5 pages, 3 figures, accepted by ISCSLP 2022 


Convergence Theory of Generalized Distributed Subgradient Method with Random Quantization


Jul 22, 2022
Zhaoyue Xia, Jun Du, Yong Ren



A study on joint modeling and data augmentation of multi-modalities for audio-visual scene classification


Mar 31, 2022
Qing Wang, Jun Du, Siyuan Zheng, Yunqing Li, Yajian Wang, Yuzhong Wu, Hu Hu, Chao-Han Huck Yang, Sabato Marco Siniscalchi, Yannan Wang, Chin-Hui Lee

* 5 pages, 1 figure, submitted to INTERSPEECH 2022 


Multimodal Pre-training Based on Graph Attention Network for Document Understanding


Mar 25, 2022
Zhenrong Zhang, Jiefeng Ma, Jun Du, Licheng Wang, Jianshu Zhang



A Study of Designing Compact Audio-Visual Wake Word Spotting System Based on Iterative Fine-Tuning in Neural Network Pruning


Feb 17, 2022
Hengshun Zhou, Jun Du, Chao-Han Huck Yang, Shifu Xiong, Chin-Hui Lee

* Accepted to ICASSP 2022. H. Zhou et al 


The USTC-Ximalaya system for the ICASSP 2022 multi-channel multi-party meeting transcription (M2MeT) challenge


Feb 10, 2022
Maokui He, Xiang Lv, Weilin Zhou, JingJing Yin, Xiaoqi Zhang, Yuxuan Wang, Shutong Niu, Yuhang Cao, Heng Lu, Jun Du, Chin-Hui Lee



Information Fusion in Attention Networks Using Adaptive and Multi-level Factorized Bilinear Pooling for Audio-visual Emotion Recognition


Nov 17, 2021
Hengshun Zhou, Jun Du, Yuanyuan Zhang, Qing Wang, Qing-Feng Liu, Chin-Hui Lee



Target-speaker Voice Activity Detection with Improved I-Vector Estimation for Unknown Number of Speaker


Aug 07, 2021
Maokui He, Desh Raj, Zili Huang, Jun Du, Zhuo Chen, Shinji Watanabe



Split, embed and merge: An accurate table structure recognizer


Jul 20, 2021
Zhenrong Zhang, Jianshu Zhang, Jun Du

