Gang Qiao

FLASepformer: Efficient Speech Separation with Gated Focused Linear Attention Transformer

Aug 27, 2025

Haoxu Wang, Yiheng Jiang, Gang Qiao, Pengteng Shi, Biao Tian

Abstract: Speech separation always faces the challenge of handling long time sequences. Past methods reduce the sequence length and use a Transformer to capture global information. However, because the attention module has quadratic time complexity, memory usage and inference time still increase significantly with longer segments. To tackle this, we introduce Focused Linear Attention and build FLASepformer, which has linear complexity, for efficient speech separation. Inspired by SepReformer and TF-Locoformer, we propose two variants: FLA-SepReformer and FLA-TFLocoformer. We also add a new Gated module to further improve performance. Experimental results on various datasets show that FLASepformer matches state-of-the-art performance with less memory consumption and faster inference. FLA-SepReformer-T/B/L increases speed by 2.29x, 1.91x, and 1.49x, with 15.8%, 20.9%, and 31.9% GPU memory usage, respectively, demonstrating our model's effectiveness.

* Accepted by Interspeech 2025
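
The linear-complexity claim in the abstract above can be illustrated with a minimal kernelized-attention sketch. This is not the authors' implementation: the feature map below (ReLU, element-wise power `p`, rescaling to the original norm) is an assumption made in the spirit of focused linear attention, the function names are hypothetical, and the Gated module is omitted. The point is only that summing keys and values once avoids building the T x T weight matrix.

```python
import math

def feature_map(x, p=3.0, eps=1e-6):
    """Assumed focused-style map: ReLU, element-wise power p, rescale to the input norm."""
    relu = [max(v, 0.0) + eps for v in x]  # eps keeps every entry strictly positive
    powered = [v ** p for v in relu]
    norm_in = math.sqrt(sum(v * v for v in relu))
    norm_out = math.sqrt(sum(v * v for v in powered))
    return [v * norm_in / norm_out for v in powered]

def linear_attention(q, k, v):
    """q, k: T rows of length d; v: T rows of length d_v. Work grows linearly in T."""
    phi_q = [feature_map(row) for row in q]
    phi_k = [feature_map(row) for row in k]
    d, d_v = len(phi_k[0]), len(v[0])
    # Key/value summaries shared by all queries:
    # kv[i][j] = sum_t phi_k[t][i] * v[t][j],  z[i] = sum_t phi_k[t][i]
    kv = [[0.0] * d_v for _ in range(d)]
    z = [0.0] * d
    for pk, vt in zip(phi_k, v):
        for i in range(d):
            z[i] += pk[i]
            for j in range(d_v):
                kv[i][j] += pk[i] * vt[j]
    out = []
    for pq in phi_q:
        den = sum(pq[i] * z[i] for i in range(d))
        out.append([sum(pq[i] * kv[i][j] for i in range(d)) / den for j in range(d_v)])
    return out

def quadratic_attention(q, k, v):
    """Reference that forms the explicit T x T weights; used only to check the result."""
    phi_q = [feature_map(row) for row in q]
    phi_k = [feature_map(row) for row in k]
    out = []
    for pq in phi_q:
        w = [sum(a * b for a, b in zip(pq, pk)) for pk in phi_k]
        total = sum(w)
        out.append([sum(wt * vt[j] for wt, vt in zip(w, v)) / total for j in range(len(v[0]))])
    return out
```

Both functions compute the same normalized weighted average of the values; only the order of the multiplications differs, which is where the memory and time savings for long segments come from.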

Exploring Efficient Directional and Distance Cues for Regional Speech Separation

Aug 11, 2025

Yiheng Jiang, Haoxu Wang, Yafeng Chen, Gang Qiao, Biao Tian

Abstract: In this paper, we introduce a neural network-based method for regional speech separation using a microphone array. This approach leverages novel spatial cues to extract the sound source not only from a specified direction but also within a defined distance. Specifically, our method employs an improved delay-and-sum technique to obtain directional cues, substantially enhancing the signal from the target direction. We further enhance separation by incorporating the direct-to-reverberant ratio into the input features, enabling the model to better discriminate between sources within and beyond a specified distance. Experimental results demonstrate that our proposed method leads to substantial gains across multiple objective metrics. Furthermore, our method achieves state-of-the-art performance on the CHiME-8 MMCSG dataset, which was recorded in real-world conversational scenarios, underscoring its effectiveness for speech separation in practical applications.

* This paper has been accepted by Interspeech 2025
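
For readers unfamiliar with the delay-and-sum technique mentioned in the abstract above, here is a minimal sketch of the classical version, not the improved variant the paper proposes. It assumes a far-field source, a linear array with microphone positions given in metres along one axis, and delays rounded to whole samples; the function names and parameters are hypothetical.

```python
import math

def steering_delays(mic_positions_m, angle_deg, sample_rate_hz, speed_of_sound=343.0):
    """Per-microphone delays in samples for a plane wave arriving at angle_deg from the array axis."""
    theta = math.radians(angle_deg)
    return [round(x * math.cos(theta) / speed_of_sound * sample_rate_hz)
            for x in mic_positions_m]

def delay_and_sum(channels, delays):
    """Delay each channel by its steering delay, then average across microphones."""
    n = len(channels[0])
    out = [0.0] * n
    for signal, delay in zip(channels, delays):
        for t in range(n):
            src = t - delay  # a microphone that hears the wavefront earlier is delayed more
            if 0 <= src < n:
                out[t] += signal[src]
    return [value / len(channels) for value in out]
```

When the steering delays match the true arrival-time differences, the target adds coherently while sound from other directions is averaged down, which is the directional cue the abstract refers to.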

An Asymptotically Optimal Algorithm for the One-Dimensional Convex Hull Feasibility Problem

Feb 03, 2023

Gang Qiao, Ambuj Tewari

Abstract: This work studies the pure-exploration setting for the convex hull feasibility (CHF) problem, where one aims to efficiently and accurately determine whether a given point lies in the convex hull of the means of a finite set of distributions. We give a complete characterization of the sample complexity of the CHF problem in the one-dimensional setting. We present the first asymptotically optimal algorithm, called Thompson-CHF, whose modular design consists of a stopping rule and a sampling rule. In addition, we provide an extension of the algorithm that generalizes several important problems in the multi-armed bandit literature. Finally, we further investigate the Gaussian bandit case with unknown variances and address how the Thompson-CHF algorithm can be adjusted to be asymptotically optimal in this setting.
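
To make the one-dimensional problem statement concrete: in one dimension the convex hull of the means is the interval from the smallest mean to the largest, so feasibility reduces to an interval check. The sketch below pairs that check with a naive fixed-budget plug-in estimate. It is not Thompson-CHF and carries none of its optimality guarantees; `sample_arm` and the other names are hypothetical.

```python
def in_hull_1d(means, x):
    """In 1-D the convex hull of the means is the interval [min(means), max(means)]."""
    return min(means) <= x <= max(means)

def plug_in_decision(sample_arm, n_arms, x, pulls_per_arm):
    """Naive baseline: pull every arm equally often and test x against the empirical means."""
    estimates = []
    for arm in range(n_arms):
        draws = [sample_arm(arm) for _ in range(pulls_per_arm)]
        estimates.append(sum(draws) / len(draws))
    return in_hull_1d(estimates, x)
```

A pure-exploration algorithm replaces the uniform pulls with an adaptive sampling rule and the fixed budget with a stopping rule, which is the modular structure the abstract describes.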

Vector Approximate Message Passing based Channel Estimation for MIMO-OFDM Underwater Acoustic Communications

Nov 22, 2022

Wenxuan Chen, Jun Tao, Lu Ma, Gang Qiao

Abstract: Accurate channel estimation is critical to the performance of orthogonal frequency-division multiplexing (OFDM) underwater acoustic (UWA) communications, especially in multiple-input multiple-output (MIMO) scenarios. In this paper, we explore Vector Approximate Message Passing (VAMP) coupled with Expectation Maximization (EM) to obtain channel estimation (CE) for MIMO-OFDM UWA communications. The EM-VAMP-CE scheme is developed by employing a Bernoulli-Gaussian (BG) prior distribution for the channel impulse response, and the hyperparameters of the BG prior distribution are learned via the EM algorithm. The performance of EM-VAMP-CE is evaluated on both synthesized data and real data collected in two at-sea UWA communication experiments. It is shown that EM-VAMP-CE achieves a better performance-complexity tradeoff than existing channel estimation methods.

* Journal: IEEE Journal of Oceanic Engineering (Date of Submission: 2022-06-25)

An Information-Theoretic Approach for Estimating Scenario Generalization in Crowd Motion Prediction

Nov 02, 2022

Gang Qiao, Kaidong Hu, Seonghyeon Moon, Samuel S. Sohn, Sejong Yoon, Mubbasir Kapadia, Vladimir Pavlovic

Abstract: Learning-based approaches to modeling crowd motion have become increasingly successful but require training and evaluation on large datasets, coupled with complex model selection and parameter tuning. To circumvent this tremendously time-consuming process, we propose a novel scoring method that characterizes the generalization of models trained on source crowd scenarios and applied to target crowd scenarios using a training-free, model-agnostic Interaction + Diversity Quantification score, ISDQ. The Interaction component aims to characterize the difficulty of scenario domains, while the diversity of a scenario domain is captured in the Diversity score. Both scores can be computed in a computationally tractable manner. Our experimental results validate the efficacy of the proposed method on several simulated and real-world (source, target) generalization tasks, demonstrating its potential to select optimal domain pairs before training and testing a model.

Oneshot Differentially Private Top-k Selection

May 18, 2021

Gang Qiao, Weijie J. Su, Li Zhang

Abstract: Being able to efficiently and accurately select the top-k elements without privacy leakage is an integral component of various data analysis tasks and has gained significant attention. In this paper, we introduce the oneshot mechanism, a fast, low-distortion, and differentially private primitive for the top-k problem. Compared with existing approaches in the literature, our algorithm adds Laplace noise to the counts and releases the top-k noisy counts and their estimates in a oneshot fashion, thereby substantially reducing the computational cost while maintaining satisfactory utility. Our proof of privacy for this mechanism relies on a novel coupling technique that is of independent theoretical interest. Finally, we apply the oneshot mechanism to multiple hypothesis testing and ranking from pairwise comparisons and thus obtain their differentially private counterparts.

* Accepted to ICML 2021
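
The mechanism described in the abstract above (add Laplace noise to every count once, then release the k largest noisy counts) can be sketched as follows. This is a minimal illustration under stated assumptions: the function name is hypothetical, and `noise_scale` is left as a free parameter because the abstract does not give the calibration to the privacy budget. The code therefore makes no privacy guarantee by itself.

```python
import random

def oneshot_top_k(counts, k, noise_scale, rng=None):
    """Perturb every count once with Laplace noise and return the k largest (index, noisy count) pairs."""
    rng = rng or random.Random()
    rate = 1.0 / noise_scale
    # The difference of two i.i.d. exponentials with rate 1/b is Laplace(0, b).
    noisy = [c + rng.expovariate(rate) - rng.expovariate(rate) for c in counts]
    order = sorted(range(len(counts)), key=lambda i: noisy[i], reverse=True)[:k]
    return [(i, noisy[i]) for i in order]
```

Noise is drawn a single time for all counts rather than once per selected element, which is what makes the release "oneshot" and keeps the cost to one pass plus a sort.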