End-to-End Neural Audio Coding for Real-Time Communications

Jan 25, 2022
Xue Jiang, Xiulian Peng, Chengyu Zheng, Huaying Xue, Yuan Zhang, Yan Lu

Deep-learning-based methods have shown advantages over traditional ones in audio coding, but limited attention has been paid to real-time communications (RTC). This paper proposes TFNet, an end-to-end neural audio codec with low latency for RTC. It adopts an encoder-temporal filtering-decoder paradigm that has seldom been investigated in audio coding. An interleaved structure is proposed for temporal filtering to capture both short-term and long-term temporal dependencies. Furthermore, with end-to-end optimization, TFNet is jointly optimized with speech enhancement and packet loss concealment, yielding a one-for-all network for the three tasks. Both subjective and objective results demonstrate the efficiency of the proposed TFNet.

* ICASSP 2022 (Accepted)  