Abstract: The coexistence of heterogeneous cellular standards (2G-5G) in shared spectrum demands sophisticated RF source separation techniques, yet no public dataset exists for data-driven research on this problem. We present RFSS (RF Signal Source Separation), an open-source dataset of 100,000 multi-source RF signal samples generated with full 3GPP standards compliance. The dataset covers GSM (TS 45.004), UMTS (TS 25.211), LTE (TS 36.211), and 5G NR (TS 38.211), with 2-4 simultaneous sources per sample plus 4,000 single-source reference samples, at a 30.72 MHz sample rate. Each source passes through an independent 3GPP TDL multipath fading channel and realistic hardware impairments: carrier frequency offset, I/Q imbalance, phase noise, DC offset, and PA nonlinearity (Rapp model). Two mixing modes are provided: co-channel (all sources at baseband) and adjacent-channel (each source frequency-shifted to its standard-specific carrier). The dataset totals 103 GB in HDF5 format with a 70/15/15 train/validation/test split. We benchmark five methods (FastICA, Frobenius-norm NMF, Conv-TasNet, DPRNN, and a CNN-LSTM baseline) using permutation-invariant SI-SINR (PI-SI-SINR). Conv-TasNet achieves -21.18 dB PI-SI-SINR on 2-source mixtures versus -34.91 dB for ICA, a 13.7 dB improvement. On co-channel mixtures, Conv-TasNet reaches -12.34 dB versus -28.04 dB for ICA and -16.19 dB for NMF. The dataset and evaluation code are publicly released at submission time.
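The abstract does not define PI-SI-SINR, so the following is only a minimal sketch of how such a metric is commonly computed. It assumes the usual scale-invariant projection (as in SI-SDR) applied to complex baseband signals, with the best source-to-estimate assignment chosen by exhaustive search over permutations. The function names `si_sinr_db` and `pi_si_sinr_db` are hypothetical and are not taken from the released evaluation code.

```python
import itertools
import numpy as np

def si_sinr_db(estimate, reference, eps=1e-12):
    """Scale-invariant ratio in dB for one estimate/reference pair (assumed definition)."""
    # Project the estimate onto the reference; np.vdot conjugates its first argument,
    # so this also handles complex I/Q samples.
    scale = np.vdot(reference, estimate) / (np.vdot(reference, reference) + eps)
    target = scale * reference
    residual = estimate - target  # interference, noise, and distortion
    target_power = np.sum(np.abs(target) ** 2)
    residual_power = np.sum(np.abs(residual) ** 2)
    return 10.0 * np.log10((target_power + eps) / (residual_power + eps))

def pi_si_sinr_db(estimates, references):
    """Best mean SI-SINR over all assignments of estimates to references."""
    n = len(references)
    best = -np.inf
    for perm in itertools.permutations(range(n)):
        score = np.mean([si_sinr_db(estimates[p], references[i])
                         for i, p in enumerate(perm)])
        best = max(best, score)
    return best
```

With 2-4 sources per sample there are at most 24 permutations, so exhaustive search is inexpensive here; an assignment solver would only become worthwhile for larger source counts.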




Abstract: In today's data-driven landscape spanning the finance, government, and healthcare sectors, the exponential growth of information necessitates robust solutions for secure storage, efficient dissemination, and fine-grained access control. Convolutional dictionary learning has emerged as a powerful approach for extracting meaningful representations from complex data. This paper presents a novel weakly supervised convolutional dictionary learning framework that incorporates both shared and discriminative components for classification tasks. Our approach leverages limited label information to learn dictionaries that capture patterns common to all classes while simultaneously highlighting class-specific features. By decomposing the learned representations into shared and discriminative parts, we improve both feature interpretability and classification performance. Extensive experiments across multiple datasets demonstrate that our method outperforms state-of-the-art approaches, particularly in scenarios with limited labeled data. The proposed framework offers a promising solution for applications requiring both effective feature extraction and accurate classification in weakly supervised settings.
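The abstract does not state the signal model, so the sketch below only illustrates the general idea of splitting a convolutional dictionary representation into a shared part and a discriminative part. It assumes the standard 1-D convolutional sparse coding model, in which a signal is approximated by a sum of atoms convolved with sparse activation maps, and it assumes that the indices of the shared atoms are known. The names `reconstruct`, `split_reconstruction`, and `shared_idx` are hypothetical, and no learning step is shown.

```python
import numpy as np

def reconstruct(atoms, codes, length):
    """Sum of 1-D convolutions: x_hat = sum_k atoms[k] * codes[k]."""
    x_hat = np.zeros(length)
    for atom, code in zip(atoms, codes):
        x_hat += np.convolve(code, atom, mode="full")[:length]
    return x_hat

def split_reconstruction(atoms, codes, shared_idx, length):
    """Return (shared_part, discriminative_part) of the reconstruction."""
    shared = set(shared_idx)
    all_idx = range(len(atoms))
    # Atoms listed in shared_idx model patterns common to every class (assumption).
    shared_part = reconstruct([atoms[k] for k in all_idx if k in shared],
                              [codes[k] for k in all_idx if k in shared], length)
    # The remaining atoms are treated as class-specific (assumption).
    discriminative_part = reconstruct([atoms[k] for k in all_idx if k not in shared],
                                      [codes[k] for k in all_idx if k not in shared], length)
    return shared_part, discriminative_part
```

Because convolution is linear, the two parts add up to the full reconstruction, which is what allows the shared and class-specific contributions to be inspected separately.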