Considering that frequency diverse array multiple-input multiple-output (FDA-MIMO) possesses extra range information to enhance sensing performance, this paper explores the FDA-MIMO-based integrated sensing and communication (ISAC) system. To reinforce the system communication capability, we propose the frequency offset permutation index modulation (FOPIM) scheme, which conveys extra information bits by selecting and permutating frequency offsets from a frequency offsets pool. For the system communication sub-functionality, considering the fact that the traditional maximum likelihood detection method suffers from high complexity and bit error rate (BER), the maximum likelihood-based two-stage detection (MLTSD) approach is presented to overcome this issue. For the system sensing sub-function, we employ the two-step maximum likelihood estimator (TSMLE) to stepwise estimate the angle and range of the interested target. Furthermore, we derive the closed-form expressions for the tight upper bound on the communication BER, along with the sensing Cram\'er-Rao bound (CRB). The simulation results validate the theoretical analysis, demonstrating that the proposed system exhibits lower BER and superior range resolution than independent MIMO communication and MIMO sensing modules.
This article reviews how empirical research of exploratory search is conducted. We investigated aspects of interdisciplinarity, study settings and evaluation methodologies from a systematically selected sample of 231 publications from 2010-2021, including a total of 172 articles with empirical studies. Our results show that exploratory search is highly interdisciplinary, with the most frequently occurring publication venues including high impact venues in information science, information systems and human-computer interaction. However, taken in aggregate, the breadth of study settings investigated was limited. We found that a majority of studies (77%) focused on evaluating novel retrieval systems as opposed to investigating users' search processes. Furthermore, a disproportionate number of studies were based on scientific literature search (20.7%), a majority of which only considered searching for Computer Science articles. Study participants were generally from convenience samples, with 75% of studies composed exclusively of students and other academics. The methodologies used for evaluation were mostly quantitative, but lacked consistency between studies and validated questionnaires were rarely used. In discussion, we offer a critical analysis of our findings and suggest potential improvements for future exploratory search studies.
How easy is it to uniquely identify a person based on their web browsing behavior? Here we show that when people navigate the Web, their online traces produce fingerprints that identify them. By merely knowing their most visited web domains, four data points are enough to identify 95% of the individuals. These digital fingerprints are stable and render high re-identifiability. We demonstrate that we can re-identify 90% of the individuals in separate time slices of data. Such a privacy threat persists even with limited information about individuals' browsing behavior, reinforcing existing concerns around online privacy.
In today's urban landscape, traffic congestion poses a critical challenge, especially during outlier scenarios. These outliers can indicate abrupt traffic peaks, drops, or irregular trends, often arising from factors such as accidents, events, or roadwork. Moreover, Given the dynamic nature of traffic, the need for real-time traffic modeling also becomes crucial to ensure accurate and up-to-date traffic predictions. To address these challenges, we introduce the Outlier Weighted Autoencoder Modeling (OWAM) framework. OWAM employs autoencoders for local outlier detection and generates correlation scores to assess neighboring traffic's influence. These scores serve as a weighted factor for neighboring sensors, before fusing them into the model. This information enhances the traffic model's performance and supports effective real-time updates, a crucial aspect for capturing dynamic traffic patterns. OWAM demonstrates a favorable trade-off between accuracy and efficiency, rendering it highly suitable for real-world applications. The research findings contribute significantly to the development of more efficient and adaptive traffic prediction models, advancing the field of transportation management for the future. The code and datasets of our framework is publicly available under https://github.com/himanshudce/OWAM.
BERT-based models have shown a remarkable ability in the Chinese Spelling Check (CSC) task recently. However, traditional BERT-based methods still suffer from two limitations. First, although previous works have identified that explicit prior knowledge like Part-Of-Speech (POS) tagging can benefit in the CSC task, they neglected the fact that spelling errors inherent in CSC data can lead to incorrect tags and therefore mislead models. Additionally, they ignored the correlation between the implicit hierarchical information encoded by BERT's intermediate layers and different linguistic phenomena. This results in sub-optimal accuracy. To alleviate the above two issues, we design a heterogeneous knowledge-infused framework to strengthen BERT-based CSC models. To incorporate explicit POS knowledge, we utilize an auxiliary task strategy driven by Gaussian mixture model. Meanwhile, to incorporate implicit hierarchical linguistic knowledge within the encoder, we propose a novel form of n-gram-based layerwise self-attention to generate a multilayer representation. Experimental results show that our proposed framework yields a stable performance boost over four strong baseline models and outperforms the previous state-of-the-art methods on two datasets.
In this study, we explore an innovative approach for neural network optimization, focusing on the application of gradient sampling techniques, similar to those in StochGradAdam, during the pruning process. Our primary objective is to maintain high accuracy levels in pruned models, a critical challenge in resource-limited scenarios. Our extensive experiments reveal that models optimized with gradient sampling techniques are more effective at preserving accuracy during pruning compared to those using traditional optimization methods. This finding underscores the significance of gradient sampling in facilitating robust learning and enabling networks to retain crucial information even after substantial reduction in their complexity. We validate our approach across various datasets and neural architectures, demonstrating its broad applicability and effectiveness. The paper also delves into the theoretical aspects, explaining how gradient sampling techniques contribute to the robustness of models during pruning. Our results suggest a promising direction for creating efficient neural networks that do not compromise on accuracy, even in environments with constrained computational resources.
This study aims to improve the generation of 3D gestures by utilizing multimodal information from human speech. Previous studies have focused on incorporating additional modalities to enhance the quality of generated gestures. However, these methods perform poorly when certain modalities are missing during inference. To address this problem, we suggest using speech-derived multimodal priors to improve gesture generation. We introduce a novel method that separates priors from speech and employs multimodal priors as constraints for generating gestures. Our approach utilizes a chain-like modeling method to generate facial blendshapes, body movements, and hand gestures sequentially. Specifically, we incorporate rhythm cues derived from facial deformation and stylization prior based on speech emotions, into the process of generating gestures. By incorporating multimodal priors, our method improves the quality of generated gestures and eliminate the need for expensive setup preparation during inference. Extensive experiments and user studies confirm that our proposed approach achieves state-of-the-art performance.
Speech signals are inherently complex as they encompass both global acoustic characteristics and local semantic information. However, in the task of target speech extraction, certain elements of global and local semantic information in the reference speech, which are irrelevant to speaker identity, can lead to speaker confusion within the speech extraction network. To overcome this challenge, we propose a self-supervised disentangled representation learning method. Our approach tackles this issue through a two-phase process, utilizing a reference speech encoding network and a global information disentanglement network to gradually disentangle the speaker identity information from other irrelevant factors. We exclusively employ the disentangled speaker identity information to guide the speech extraction network. Moreover, we introduce the adaptive modulation Transformer to ensure that the acoustic representation of the mixed signal remains undisturbed by the speaker embeddings. This component incorporates speaker embeddings as conditional information, facilitating natural and efficient guidance for the speech extraction network. Experimental results substantiate the effectiveness of our meticulously crafted approach, showcasing a substantial reduction in the likelihood of speaker confusion.
This paper introduces a complete method for the automatic detection, identification and localization of lighting elements in buildings, leveraging the available building information modeling (BIM) data of a building and feeding the BIM model with the new collected information, which is key for energy-saving strategies. The detection system is heavily improved from our previous work, with the following two main contributions: (i) a new refinement algorithm to provide a better detection rate and identification performance with comparable computational resources and (ii) a new plane estimation, filtering and projection step to leverage the BIM information earlier for lamps that are both hanging and embedded. The two modifications are thoroughly tested in five different case studies, yielding better results in terms of detection, identification and localization.
Learning tasks play an increasingly prominent role in quantum information and computation. They range from fundamental problems such as state discrimination and metrology over the framework of quantum probably approximately correct (PAC) learning, to the recently proposed shadow variants of state tomography. However, the many directions of quantum learning theory have so far evolved separately. We propose a general mathematical formalism for describing quantum learning by training on classical-quantum data and then testing how well the learned hypothesis generalizes to new data. In this framework, we prove bounds on the expected generalization error of a quantum learner in terms of classical and quantum information-theoretic quantities measuring how strongly the learner's hypothesis depends on the specific data seen during training. To achieve this, we use tools from quantum optimal transport and quantum concentration inequalities to establish non-commutative versions of decoupling lemmas that underlie recent information-theoretic generalization bounds for classical machine learning. Our framework encompasses and gives intuitively accessible generalization bounds for a variety of quantum learning scenarios such as quantum state discrimination, PAC learning quantum states, quantum parameter estimation, and quantumly PAC learning classical functions. Thereby, our work lays a foundation for a unifying quantum information-theoretic perspective on quantum learning.