In this correspondence, we propose a movable antenna (MA)-aided multi-user hybrid beamforming scheme with a sub-connected structure, where multiple movable sub-arrays can independently change their positions within different local regions. To maximize the system sum rate, we jointly optimize the digital beamformer, analog beamformer, and positions of the sub-arrays, under the constraints of unit modulus, finite movable regions, and power budget. Due to the non-concave objective function and non-convex constraints, as well as the highly coupled variables, the formulated problem is challenging to solve. By employing fractional programming, we develop an alternating optimization framework to solve the problem via a combination of Lagrange multipliers, penalty method, and gradient descent. Numerical results reveal that the proposed MA-aided hybrid beamforming scheme significantly improves the sum rate compared to its fixed-position antenna (FPA) counterpart. Moreover, with sufficiently large movable regions, the proposed scheme with sub-connected MA arrays even outperforms the fully-connected FPA array.
The use of fluorescent molecules to create long sequences of low-density, diffraction-limited images enables highly precise molecule localization. However, this methodology requires lengthy imaging times, which limits the ability to view dynamic interactions of live cells on short time scales. Many techniques have been developed to reduce the number of frames needed for localization, from classic iterative optimization to deep neural networks. In particular, deep algorithm unrolling utilizes both the structure of iterative sparse recovery algorithms and the performance gains of supervised deep learning. However, the robustness of this approach is highly dependent on having sufficient training data. In this paper, we introduce deep unrolled self-supervised learning, which alleviates the need for such data by training a sequence-specific, model-based autoencoder that learns only from given measurements. Our proposed method exceeds the performance of its supervised counterparts, thus allowing for robust, dynamic imaging well below the diffraction limit without any labeled training samples. Furthermore, the suggested model-based autoencoder scheme can be utilized to enhance generalization in any sparse recovery framework, without the need for external training data.
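As a rough illustration of the kind of iterative sparse recovery algorithm that unrolling builds on, the sketch below runs plain ISTA on a tiny LASSO problem. It is not the method of the abstract above: the function names, the toy matrix `A`, and the step size are assumptions chosen only for illustration, and an unrolled network would replace the fixed `step` and `lam` with learned per-layer parameters.

```python
def soft_threshold(v, t):
    """Element-wise soft-thresholding, the proximal operator of t * |.|."""
    return [max(abs(x) - t, 0.0) * (1.0 if x > 0 else -1.0) for x in v]


def matvec(A, x):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(a * b for a, b in zip(row, x)) for row in A]


def objective(A, y, lam, x):
    """LASSO cost: 0.5 * ||y - A x||^2 + lam * ||x||_1."""
    r = [yi - pi for yi, pi in zip(y, matvec(A, x))]
    return 0.5 * sum(ri * ri for ri in r) + lam * sum(abs(xi) for xi in x)


def ista(A, y, lam, step, n_iter):
    """Minimize the LASSO cost with ISTA, starting from x = 0."""
    At = [list(col) for col in zip(*A)]  # transpose of A
    x = [0.0] * len(A[0])
    for _ in range(n_iter):
        residual = [yi - pi for yi, pi in zip(y, matvec(A, x))]
        # Gradient step on the data term, then shrinkage for the l1 term.
        grad_step = [xi + step * gi for xi, gi in zip(x, matvec(At, residual))]
        x = soft_threshold(grad_step, step * lam)
    return x


# Toy problem (hypothetical values). step = 0.4 is 1 / ||A||_F^2, which is
# no larger than 1 / L for the Lipschitz constant L of the data-term gradient.
A = [[1.0, 0.5, 0.0], [0.0, 1.0, 0.5]]
y = [1.0, 0.2]
lam = 0.1
x_hat = ista(A, y, lam, step=0.4, n_iter=200)
print(x_hat, objective(A, y, lam, x_hat))
```

Each pass through the loop has the same structure (linear map, then shrinkage), which is what makes it natural to treat a fixed number of iterations as network layers.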
This study considers the Block-Toeplitz structural properties inherent in traditional multichannel forward model matrices, using Full Matrix Capture (FMC) in ultrasonic testing as a case study. We propose an analytical convolutional forward model that transforms reflectivity maps into FMC data. Our findings demonstrate that the convolutional model outperforms its matrix-based counterpart in terms of computational efficiency and storage requirements. This accelerated forward modeling approach holds significant potential for various inverse problems, notably enhancing Sparse Signal Recovery (SSR) within the context of LASSO regression, which facilitates efficient Convolutional Sparse Coding (CSC) algorithms. Additionally, we explore the integration of Convolutional Neural Networks (CNNs) for the forward model, employing deep unfolding to implement the Learned Block Convolutional ISTA (BC-LISTA).
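The link between Toeplitz structure and convolution can be seen in a single-channel toy case. The sketch below is only an assumption-laden illustration, not the multichannel Block-Toeplitz FMC model of the abstract: the pulse `h` and reflectivity `x` are hypothetical values. It shows that multiplying by a Toeplitz matrix built from `h` gives the same output as convolving with `h`, while the convolutional form only needs to store the kernel.

```python
def convolve_full(h, x):
    """Direct 'full' discrete convolution of kernel h with signal x."""
    out = [0.0] * (len(h) + len(x) - 1)
    for i, hi in enumerate(h):
        for j, xj in enumerate(x):
            out[i + j] += hi * xj
    return out


def toeplitz_from_kernel(h, n):
    """Build the (len(h) + n - 1) x n Toeplitz matrix T with T[r][c] = h[r - c]."""
    rows = len(h) + n - 1
    return [[h[r - c] if 0 <= r - c < len(h) else 0.0 for c in range(n)]
            for r in range(rows)]


def matvec(T, x):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(t * v for t, v in zip(row, x)) for row in T]


h = [1.0, -2.0, 0.5]             # hypothetical transmitted pulse
x = [0.0, 3.0, 0.0, 0.0, -1.0]   # hypothetical sparse reflectivity
T = toeplitz_from_kernel(h, len(x))

y_matrix = matvec(T, x)          # matrix-based forward model
y_conv = convolve_full(h, x)     # convolutional forward model

stored_matrix_entries = len(T) * len(T[0])  # grows with signal length
stored_kernel_entries = len(h)              # independent of signal length
print(y_conv, stored_matrix_entries, stored_kernel_entries)
```

In this toy case the matrix stores 35 entries against 3 for the kernel, and the gap widens as the signal gets longer.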
Navigating a nonholonomic robot in a cluttered environment requires extremely accurate perception and locomotion for collision avoidance. This paper presents NeuPAN: a real-time, highly accurate, map-free, robot-agnostic, and environment-invariant robot navigation solution. Leveraging a tightly coupled perception-locomotion framework, NeuPAN has two key innovations compared to existing approaches: 1) it directly maps raw points to a learned multi-frame distance space, avoiding error propagation from perception to control; 2) it is interpretable from an end-to-end model-based learning perspective, enabling provable convergence. The crux of NeuPAN is to solve a high-dimensional end-to-end mathematical model with various point-level constraints using the plug-and-play (PnP) proximal alternating-minimization network (PAN) with neurons in the loop. This allows NeuPAN to generate real-time, end-to-end, physically interpretable motions directly from point clouds, seamlessly integrating data and knowledge engines, with its network parameters adjusted via backpropagation. We evaluate NeuPAN on a car-like robot, a wheel-legged robot, and a passenger autonomous vehicle, in both simulated and real-world environments. Experiments demonstrate that NeuPAN outperforms various benchmarks in terms of accuracy, efficiency, robustness, and generalization capability across various environments, including the cluttered sandbox, office, corridor, and parking lot. We show that NeuPAN works well in unstructured environments with arbitrary-shape undetectable objects, making impassable ways passable.
This paper proposes a framework for designing robust precoders for a multi-input single-output (MISO) system that performs integrated sensing and communication (ISAC) across multiple cells and users. We use the Cramér-Rao bound (CRB) to measure the sensing performance and derive its expressions for two multi-cell scenarios, namely coordinated beamforming (CBF) and coordinated multi-point (CoMP). In the CBF scheme, a BS shares channel state information (CSI) and estimates target parameters using monostatic sensing. In contrast, a BS in the CoMP scheme shares the CSI and data, allowing bistatic sensing through inter-cell reflection. We consider both block-level (BL) and symbol-level (SL) precoding schemes for both multi-cell scenarios that are robust to channel state estimation errors. The formulated optimization problems to minimize the CRB in estimating the parameters of a target and maximize the minimum communication signal-to-interference-plus-noise ratio (SINR) while satisfying a given total transmit power budget are non-convex. We tackle the non-convexity using a combination of semidefinite relaxation (SDR) and alternating optimization (AO) techniques. Simulations suggest that neglecting the inter-cell reflection and communication links degrades the performance of an ISAC system. The CoMP scenario employing SL precoding performs the best, whereas the BL precoding applied in the CBF scenario produces relatively high estimation error for a given minimum SINR value.
Federated learning (FL) is a machine learning paradigm that targets model training without gathering the local data dispersed over various data sources. Standard FL, which employs a single server, can only support a limited number of users, leading to degraded learning capability. In this work, we consider a multi-server FL framework, referred to as Confederated Learning (CFL), in order to accommodate a larger number of users. A CFL system is composed of multiple networked edge servers, with each server connected to an individual set of users. Decentralized collaboration among servers is leveraged to harness all users' data for model training. Due to the potentially massive number of users involved, it is crucial to reduce the communication overhead of the CFL system. We propose a stochastic gradient method for distributed learning in the CFL framework. The proposed method incorporates a conditionally-triggered user selection (CTUS) mechanism as the central component to effectively reduce communication overhead. Relying on a delicately designed triggering condition, the CTUS mechanism allows each server to select only a small number of users to upload their gradients, without significantly jeopardizing the convergence performance of the algorithm. Our theoretical analysis reveals that the proposed algorithm enjoys a linear convergence rate. Simulation results show that it achieves substantial improvement over state-of-the-art algorithms in terms of communication efficiency.
In recent years, algorithm unrolling has emerged as a powerful technique for designing interpretable neural networks based on iterative algorithms. Imaging inverse problems have particularly benefited from unrolling-based deep network design since many traditional model-based approaches rely on iterative optimization. Despite exciting progress, typical unrolling approaches heuristically design layer-specific convolution weights to improve performance. Crucially, convergence properties of the underlying iterative algorithm are lost once layer-specific parameters are learned from training data. We propose an unrolling technique that breaks the trade-off between retaining algorithm properties and enhancing performance. We focus on image deblurring and unrolling the widely applied Half-Quadratic Splitting (HQS) algorithm. We develop a new parametrization scheme which forces layer-specific parameters to asymptotically approach certain fixed points. Through extensive experimental studies, we verify that our approach achieves competitive performance with state-of-the-art unrolled layer-specific learning and significantly improves over the traditional HQS algorithm. We further establish convergence of the proposed unrolled network as the number of layers approaches infinity, and characterize its convergence rate. Our experimental verification involves simulations that validate the analytical results as well as comparison with state-of-the-art non-blind deblurring techniques on benchmark datasets. The merits of the proposed convergent unrolled network are established over competing alternatives, especially in the regime of limited training.
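To make the structure of HQS concrete, here is a minimal sketch on a toy problem, under strong assumptions that are not taken from the abstract above: the degradation operator is an element-wise gain `h` (so the quadratic sub-problem has a closed form), the prior is an l1 penalty, and the penalty weights `betas` follow a hand-picked doubling schedule. In an unrolled network, each entry of `betas` would play the role of one layer's parameter; the parametrization proposed in the abstract is not reproduced here.

```python
def soft_threshold(v, t):
    """Element-wise soft-thresholding, the proximal operator of t * |.|."""
    return [max(abs(x) - t, 0.0) * (1.0 if x > 0 else -1.0) for x in v]


def hqs_diagonal(y, h, lam, betas):
    """HQS for min_x 0.5 * ||y - h * x||^2 + lam * ||x||_1 (element-wise h).

    The splitting introduces z with penalty (beta / 2) * ||x - z||^2 and
    alternates two closed-form updates, one beta per iteration.
    """
    z = [0.0] * len(y)
    x = list(z)
    for beta in betas:
        # x-update: minimizer of the quadratic data term plus coupling term.
        x = [(hi * yi + beta * zi) / (hi * hi + beta)
             for hi, yi, zi in zip(h, y, z)]
        # z-update: proximal step on the l1 prior.
        z = soft_threshold(x, lam / beta)
    return x, z


# Hypothetical toy data: observations y and element-wise gains h.
y = [2.0, 0.0, -1.5]
h = [1.0, 0.8, 0.5]
lam = 0.1
betas = [0.5 * 2 ** k for k in range(12)]  # hand-picked increasing schedule

x_hat, z_hat = hqs_diagonal(y, h, lam, betas)
gap = max(abs(a - b) for a, b in zip(x_hat, z_hat))
print(x_hat, gap)
```

Because the last z-update shrinks `x_hat` by at most `lam / betas[-1]`, the two variables agree more closely as the penalty weight grows, which is the mechanism the increasing schedule relies on.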
Ultrasound and radar signals are highly beneficial for medical imaging as they are non-invasive and non-ionizing. Traditional imaging techniques have limitations in terms of contrast and physical interpretation. Quantitative medical imaging can display various physical properties such as speed of sound, density, conductivity, and relative permittivity. This makes it useful for a wider range of applications, including improving cancer detection, diagnosing fatty liver, and fast stroke imaging. However, current quantitative imaging techniques that estimate physical properties from received signals, such as Full Waveform Inversion, are time-consuming and tend to converge to local minima, making them unsuitable for medical imaging. To address these challenges, we propose a neural network based on the physical model of wave propagation, which defines the relationship between the received signals and physical properties. Our network can reconstruct multiple physical properties in less than one second for complex and realistic scenarios, using data from only eight elements. We demonstrate the effectiveness of our approach for both radar and ultrasound signals.
In task-based quantization, a multivariate analog signal is transformed into a digital signal using a limited number of low-resolution analog-to-digital converters (ADCs). This process aims to minimize a fidelity criterion, which is assessed against an unobserved task variable that is correlated with the analog signal. The scenario models various applications of interest such as channel estimation, medical imaging, and object localization. This work explores the integration of analog processing components -- such as analog delay elements, polynomial operators, and envelope detectors -- prior to ADC quantization. Specifically, four scenarios involving different collections of analog processing operators are considered: (i) arbitrary polynomial operators with analog delay elements, (ii) limited-degree polynomial operators, excluding delay elements, (iii) sequences of envelope detectors, and (iv) a combination of analog delay elements and linear combiners. For each scenario, the minimum achievable distortion is quantified through derivation of computable expressions in various statistical settings. It is shown that analog processing can significantly reduce the distortion in task reconstruction. Numerical simulations in a Gaussian example are provided to give further insights into the aforementioned analog processing gains.
Collaborative perception allows each agent to enhance its perceptual abilities by exchanging messages with others. It inherently results in a trade-off between perception ability and communication costs. Previous works transmit complete full-frame high-dimensional feature maps among agents, resulting in substantial communication costs. To promote communication efficiency, we propose transmitting only the information needed for the collaborator's downstream task. This pragmatic communication strategy focuses on three key aspects: i) pragmatic message selection, which selects task-critical parts from the complete data, resulting in spatially and temporally sparse feature vectors; ii) pragmatic message representation, which achieves pragmatic approximation of high-dimensional feature vectors with a task-adaptive dictionary, enabling communication with integer indices; iii) pragmatic collaborator selection, which identifies beneficial collaborators, pruning unnecessary communication links. Following this strategy, we first formulate a mathematical optimization framework for the perception-communication trade-off and then propose PragComm, a multi-agent collaborative perception system with two key components: i) single-agent detection and tracking and ii) pragmatic collaboration. The proposed PragComm promotes pragmatic communication and adapts to a wide range of communication conditions. We evaluate PragComm on collaborative 3D object detection and tracking tasks using both a real-world dataset (V2V4Real) and simulation datasets (OPV2V and V2X-SIM2.0). PragComm consistently outperforms previous methods with more than 32.7K times lower communication volume on OPV2V. Code is available at github.com/PhyllisH/PragComm.
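The idea of communicating integer indices instead of feature vectors can be sketched with a plain nearest-neighbor codebook lookup. This is not PragComm's implementation: the codebook below is a fixed, hypothetical stand-in for a learned task-adaptive dictionary, the function names are invented for illustration, and the 32-bit float size is an assumption used only to compare message sizes.

```python
def nearest_code_index(vector, codebook):
    """Return the index of the codebook entry closest to vector (squared L2)."""
    def sq_dist(a, b):
        return sum((ai - bi) ** 2 for ai, bi in zip(a, b))
    return min(range(len(codebook)), key=lambda k: sq_dist(vector, codebook[k]))


def encode(features, codebook):
    """Sender side: replace each feature vector with one integer index."""
    return [nearest_code_index(f, codebook) for f in features]


def decode(indices, codebook):
    """Receiver side: look the indices up in the shared codebook."""
    return [codebook[k] for k in indices]


# Hypothetical shared dictionary with 4 entries of dimension 4.
codebook = [
    [0.0, 0.0, 0.0, 0.0],
    [1.0, 0.0, 0.0, 0.0],
    [0.0, 1.0, 0.0, 0.0],
    [1.0, 1.0, 1.0, 1.0],
]
features = [[0.9, 0.1, 0.0, 0.0], [0.8, 0.9, 1.1, 1.0]]

indices = encode(features, codebook)
approx = decode(indices, codebook)

bits_per_index = (len(codebook) - 1).bit_length()  # 2 bits for 4 entries
bits_per_raw_vector = len(features[0]) * 32        # assuming 32-bit floats
print(indices, bits_per_raw_vector // bits_per_index)
```

Both agents must hold the same codebook for the indices to be meaningful, and the reconstruction is only an approximation of the original features.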