By deploying a large number of antennas with sub-half-wavelength spacing in a compact space, dense array systems (DASs) can fully unleash the multiplexing-and-diversity gains of limited apertures. To realize these gains, accurate channel state information is necessary but challenging to acquire due to the large number of antennas. To overcome this obstacle, this paper reveals that exploiting the high spatial correlation of DAS channels is crucial when designing the observation matrix for optimal or near-optimal channel estimation. First, we prove that the observation matrix design is equivalent to a time-domain duality of multiple-input multiple-output precoding, which can be ideally addressed by the water-filling principle. For practical realizations, a novel ice-filling algorithm is proposed to design amplitude-and-phase controllable observation matrices, and a majorization-minimization algorithm is proposed to address the phase-only controllable case. In particular, we prove that the ice-filling algorithm can be viewed as a ``quantized'' water-filling algorithm. To support the sub-optimality of the proposed designs, we provide comprehensive analyses of the achievable mean square errors and their asymptotic expressions. Finally, numerical simulations verify that our proposed channel estimation designs achieve near-optimal performance and significantly outperform existing approaches.
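For readers unfamiliar with the water-filling principle mentioned above, the following is a minimal sketch of the classical version for parallel channels. It is not the ice-filling algorithm or the observation-matrix design of the abstract; the function name `water_filling`, the `gains` input, and the log-sum objective are illustrative assumptions.

```python
import numpy as np

def water_filling(gains, total_power):
    """Classical water-filling (illustrative, not the paper's algorithm).

    Maximizes sum(log(1 + g_i * p_i)) subject to sum(p_i) = total_power
    and p_i >= 0. Assumes all gains and total_power are strictly positive.
    """
    gains = np.asarray(gains, dtype=float)
    floors = 1.0 / gains                 # "floor height" of each channel
    floors_sorted = np.sort(floors)      # best channels (lowest floors) first
    n = len(floors_sorted)
    # Find the largest set of best channels that all lie below the water level.
    for k in range(n, 0, -1):
        level = (total_power + floors_sorted[:k].sum()) / k
        if level > floors_sorted[k - 1]:
            break
    # Channels whose floor is above the water level receive zero power.
    power = np.maximum(level - floors, 0.0)
    return power, level

power, level = water_filling([4.0, 1.0, 0.25], total_power=1.0)
print(power, level)
```

Strong channels receive more power and sufficiently weak ones receive none, which is the qualitative behavior the water-filling principle refers to.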
One key task in environmental science is to map environmental variables continuously in space or even in space and time. Machine learning algorithms are frequently used to learn from local field observations to make spatial predictions by estimating the value of the variable of interest in places where it has not been measured. However, the application of machine learning strategies for spatial mapping involves additional challenges compared to "non-spatial" prediction tasks; these often originate from spatial autocorrelation and from training data that are not independent and identically distributed. In the past few years, we developed a number of methods to support the application of machine learning to spatial data, including suitable cross-validation strategies for performance assessment and model selection, spatial feature selection, and methods to assess the area of applicability of the trained models. The intention of the CAST package is to support the application of machine learning strategies for predictive mapping by implementing such methods and making them available for easy integration into modelling workflows. Here we introduce the CAST package and its core functionalities. Using the case study of mapping plant species richness, we go through the different steps of the modelling workflow and show how CAST can be used to support more reliable spatial predictions.
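The idea behind spatially aware cross-validation can be illustrated with a small sketch that assigns whole spatial blocks, rather than individual samples, to folds. This is a Python illustration of the general concept only, not the CAST API (CAST is an R package); the function `spatial_block_folds` and its parameters are hypothetical.

```python
import numpy as np

def spatial_block_folds(x, y, block_size, n_folds, seed=0):
    """Assign each sample to a CV fold according to its spatial block.

    Samples in the same square block of side `block_size` always share a
    fold, so nearby (autocorrelated) points are never split between the
    training and test sets. Hypothetical helper, not part of CAST.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    # Integer grid coordinates of the block each sample falls into.
    blocks = np.stack([np.floor(x / block_size), np.floor(y / block_size)], axis=1)
    uniq, block_id = np.unique(blocks.astype(int), axis=0, return_inverse=True)
    block_id = block_id.ravel()
    # Randomly distribute the blocks over the folds.
    rng = np.random.default_rng(seed)
    fold_of_block = rng.permutation(len(uniq)) % n_folds
    return fold_of_block[block_id]

folds = spatial_block_folds(
    x=[0.1, 0.2, 5.1, 5.3, 9.9],
    y=[0.1, 0.3, 5.2, 5.4, 0.2],
    block_size=1.0,
    n_folds=3,
)
print(folds)
```

Holding out whole blocks gives a performance estimate that is closer to what can be expected when predicting at unsampled locations than a random split would.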
The increasing demand for virtual reality applications has highlighted the significance of crafting immersive 3D assets. We present a text-to-3D 360$^{\circ}$ scene generation pipeline that facilitates the creation of comprehensive 360$^{\circ}$ scenes for in-the-wild environments in a matter of minutes. Our approach utilizes the generative power of a 2D diffusion model and prompt self-refinement to create a high-quality and globally coherent panoramic image. This image acts as a preliminary "flat" (2D) scene representation. Subsequently, it is lifted into 3D Gaussians, employing splatting techniques to enable real-time exploration. To produce consistent 3D geometry, our pipeline constructs a spatially coherent structure by aligning the 2D monocular depth into a globally optimized point cloud. This point cloud serves as the initial state for the centroids of 3D Gaussians. To address the unseen regions inherent in single-view inputs, we impose semantic and geometric constraints on both synthesized and input camera views as regularizations. These guide the optimization of Gaussians, aiding in the reconstruction of unseen regions. In summary, our method offers a globally consistent 3D scene within a 360$^{\circ}$ perspective, providing an enhanced immersive experience over existing techniques. Project website at: http://dreamscene360.github.io/
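To make the step from a panoramic depth map to a point cloud concrete, the sketch below back-projects an equirectangular depth image into 3D points. It covers only the geometric lifting, not the global depth alignment or the Gaussian optimization described above. The axis convention, the reading of depth as radial distance, and the function name `panorama_depth_to_points` are assumptions.

```python
import numpy as np

def panorama_depth_to_points(depth):
    """Back-project an equirectangular depth map (H, W) to points (H*W, 3).

    Assumptions (not from the source): depth is the radial distance along
    each viewing ray, y points up, and the image center looks along +z.
    """
    h, w = depth.shape
    # Pixel centers mapped to longitude in [-pi, pi) and latitude in (-pi/2, pi/2).
    lon = (np.arange(w) + 0.5) / w * 2.0 * np.pi - np.pi
    lat = np.pi / 2.0 - (np.arange(h) + 0.5) / h * np.pi
    lon, lat = np.meshgrid(lon, lat)
    # Unit viewing direction for every pixel.
    dirs = np.stack(
        [np.cos(lat) * np.sin(lon), np.sin(lat), np.cos(lat) * np.cos(lon)],
        axis=-1,
    )
    # Scale each direction by its depth to obtain the 3D point.
    return (dirs * depth[..., None]).reshape(-1, 3)

points = panorama_depth_to_points(np.full((4, 8), 2.0))
print(points.shape)
```

Points produced this way could serve as initial centroids in the sense of the abstract, once the depth has been made globally consistent.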
Objects manipulated by the hand (i.e., manipulanda) are particularly challenging to reconstruct from in-the-wild RGB images or videos. Not only does the hand occlude much of the object, but also the object is often only visible in a small number of image pixels. At the same time, two strong anchors emerge in this setting: (1) estimated 3D hands help disambiguate the location and scale of the object, and (2) the set of manipulanda is small relative to all possible objects. With these insights in mind, we present a scalable paradigm for handheld object reconstruction that builds on recent breakthroughs in large language/vision models and 3D object datasets. Our model, MCC-Hand-Object (MCC-HO), jointly reconstructs hand and object geometry given a single RGB image and inferred 3D hand as inputs. Subsequently, we use GPT-4(V) to retrieve a 3D object model that matches the object in the image and rigidly align the model to the network-inferred geometry; we call this alignment Retrieval-Augmented Reconstruction (RAR). Experiments demonstrate that MCC-HO achieves state-of-the-art performance on lab and Internet datasets, and we show how RAR can be used to automatically obtain 3D labels for in-the-wild images of hand-object interactions.
Multispectral photoacoustic tomography (PAT) is an imaging modality that utilizes the photoacoustic effect to achieve non-invasive and high-contrast imaging of internal tissues. However, the hardware cost and computational demand of a multispectral PAT system consisting of up to thousands of detectors are huge. To address this challenge, we propose an ultra-sparse spiral sampling strategy for multispectral PAT, which we name U3S-PAT. Our strategy employs a sparse ring-shaped transducer that simultaneously rotates and translates when switching excitation wavelengths. This creates a spiral scanning pattern with multispectral angle-interlaced sampling. To solve the highly ill-conditioned image reconstruction problem, we propose a self-supervised learning method that is able to introduce structural information shared during spiral scanning. We simulate the proposed U3S-PAT method on a commercial PAT system and conduct in vivo animal experiments to verify its performance. The results show that even with a sparse sampling rate as low as 1/30, our U3S-PAT strategy achieves reconstruction and spectral unmixing accuracy similar to that of non-spiral dense sampling. Given its ability to dramatically reduce the time required for three-dimensional multispectral scanning, our U3S-PAT strategy has the potential to perform volumetric molecular imaging of dynamic biological activities.
High antenna directivity allows for high-throughput transmission but also increases the end-users' exposure to electromagnetic fields (EMF). Health regulations impose limits on the incident power density, which negatively impact network performance. In this work, we focus on the slot-by-slot operations of a cellular Medium Access Control (MAC) scheduler to constrain the short-term EMF exposure during real-time resource allocation while minimizing the impact on network performance. We assume that the long-term EMF exposure is controlled by a proper outer-loop technique, which is outside the scope of this paper. Due to the minimal computational complexity allowed in MAC scheduling, existing solutions that allow practical implementation are few and focus on sub-optimal approaches that curb radio resource allocation. Our contribution is the derivation of a computationally efficient water-filling solution to allocate power and then resources, with a feasible integration of the necessary algorithms into the operations of a 5G MAC scheduler. Finally, we evaluate our proposal against prior-art approaches using system-level simulations with realistic modeling of physical- and MAC-level cellular procedures. We conclude that our proposal can control EMF exposure with considerably less impact on network performance, making it a strong candidate for 5G and future 6G MAC scheduler implementations.
We propose a regularization-based image restoration scheme for 2D images recorded over time (2D+t). We design an infimal convolution-based regularization function, which we call spatio-temporal Adaptive Infimal Convolution (STAIC) regularization. We formulate the infimal convolution as an additive decomposition of the 2D+t image such that the extent of spatial and temporal smoothing is controlled in a spatially and temporally varying manner. This makes the regularization adaptable to the local characteristics of the motion, leading to an improved ability to handle noise. We also develop a minimization method for image reconstruction using the proposed form of regularization. We demonstrate the effectiveness of the proposed regularization using TIRF images recorded over time and compare it with selected existing regularization methods.
The unit selection problem aims to find objects, called units, that optimize a causal objective function which describes the objects' behavior in a causal context (e.g., selecting customers who are about to churn but would most likely change their mind if encouraged). While early studies focused mainly on bounding a specific class of counterfactual objective functions using data, more recent work allows one to find optimal units exactly by reducing the causal objective to a classical objective on a meta-model, and then applying a variant of the classical Variable Elimination (VE) algorithm to the meta-model -- assuming a fully specified causal model is available. In practice, however, finding optimal units using this approach can be very expensive because the running time of the VE algorithm used is exponential in the constrained treewidth of the meta-model, which is larger and denser than the original model. We address this computational challenge by introducing a new approach for unit selection that is not necessarily limited by the constrained treewidth. This is done by compiling the meta-model into a special class of tractable arithmetic circuits that allows the computation of optimal units in time linear in the circuit size. Finally, we present empirical results on random causal models that show order-of-magnitude speedups with the proposed method for solving unit selection.
In-context learning is a powerful emergent ability in transformer models. Prior work in mechanistic interpretability has identified a circuit element that may be critical for in-context learning -- the induction head (IH), which performs a match-and-copy operation. During training of large transformers on natural language data, IHs emerge around the same time as a notable phase change in the loss. Despite the robust evidence for IHs and this interesting coincidence with the phase change, relatively little is known about the diversity and emergence dynamics of IHs. Why is there more than one IH, and how are they dependent on each other? Why do IHs appear all of a sudden, and what are the subcircuits that enable them to emerge? We answer these questions by studying IH emergence dynamics in a controlled setting by training on synthetic data. In doing so, we develop and share a novel optogenetics-inspired causal framework for modifying activations throughout training. Using this framework, we delineate the diverse and additive nature of IHs. By clamping subsets of activations throughout training, we then identify three underlying subcircuits that interact to drive IH formation, yielding the phase change. Furthermore, these subcircuits shed light on data-dependent properties of formation, such as phase change timing, already showing the promise of this more in-depth understanding of subcircuits that need to "go right" for an induction head.
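The match-and-copy operation attributed to induction heads can be stated as a simple behavioral rule, sketched below. It describes what the operation computes, not how attention layers implement it or how it emerges during training; the function name `induction_predict` is hypothetical.

```python
def induction_predict(tokens):
    """Predict the next token using a match-and-copy rule.

    Match: find the most recent earlier occurrence of the current (last) token.
    Copy: return the token that followed that earlier occurrence.
    Returns None when the current token has not appeared before.
    """
    if not tokens:
        return None
    current = tokens[-1]
    # Scan backwards over earlier positions, skipping the current one.
    for i in range(len(tokens) - 2, -1, -1):
        if tokens[i] == current:
            return tokens[i + 1]
    return None

print(induction_predict(["A", "B", "C", "A"]))  # prints B
```

In the example, the second "A" matches the first "A", and the token that followed it, "B", is copied as the prediction.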
With the increasing presence of autonomous vehicles (AVs) on public roads, developing robust control strategies to navigate the uncertainty of human-driven vehicles (HVs) is crucial. This paper introduces an advanced method for modeling HV behavior, combining a first-principles model with Gaussian process (GP) learning to enhance velocity prediction accuracy and provide a measurable uncertainty. We validated this HV model using real-world data from field experiments and applied it to develop a GP-enhanced model predictive control (GP-MPC) strategy. This strategy aims to improve safety in mixed vehicle platoons by integrating uncertainty assessment into distance constraints. Comparative simulation studies with a conventional model predictive control (MPC) approach demonstrated that our GP-MPC strategy ensures more reliable safe distancing and fosters efficient vehicular dynamics, achieving notably higher speeds within the platoon. By incorporating a sparse GP technique in HV modeling and adopting a dynamic GP prediction within the MPC framework, we significantly reduced the computation time of GP-MPC, making it only 4.6% higher than that of the conventional MPC. This represents a substantial improvement, making the process about 100 times faster than our preliminary work without these approximations. Our findings underscore the effectiveness of learning-based HV modeling in enhancing both safety and operational efficiency in mixed-traffic environments, paving the way for more harmonious AV-HV interactions.
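The idea of feeding a GP's predictive uncertainty into a distance constraint can be sketched as follows. This is a minimal illustration under stated assumptions: a plain (non-sparse) GP with an RBF kernel, scalar inputs, and a constraint of the form `base_gap + k * sigma`. None of these specifics come from the abstract, and all function names are hypothetical.

```python
import numpy as np

def rbf(a, b, length=1.0, var=1.0):
    """Squared-exponential kernel between two 1D input arrays."""
    d = a[:, None] - b[None, :]
    return var * np.exp(-0.5 * (d / length) ** 2)

def gp_posterior(x_train, y_train, x_test, noise=1e-2):
    """Posterior mean and standard deviation of a zero-mean GP."""
    K = rbf(x_train, x_train) + noise * np.eye(len(x_train))
    Ks = rbf(x_train, x_test)
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y_train))
    mean = Ks.T @ alpha
    v = np.linalg.solve(L, Ks)
    var = np.diag(rbf(x_test, x_test)) - np.sum(v ** 2, axis=0)
    return mean, np.sqrt(np.clip(var, 0.0, None))

def tightened_min_gap(base_gap, sigma, k=2.0):
    """Assumed constraint form: widen the required gap as uncertainty grows."""
    return base_gap + k * sigma

# Toy residuals between a nominal model and observed velocities (made-up numbers).
x_train = np.array([0.0, 1.0, 2.0])
y_train = np.array([0.1, -0.2, 0.05])
mean, sigma = gp_posterior(x_train, y_train, np.array([1.0, 10.0]))
print(tightened_min_gap(5.0, sigma))
```

The required gap is larger at the query far from the training data, where the GP is less certain, which is the qualitative effect of integrating uncertainty into the distance constraint.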