With the boom of e-commerce, Multimodal Review Helpfulness Prediction (MRHP), which aims to sort product reviews according to their predicted helpfulness scores, has become a research hotspot. Previous work on this task focuses on attention-based modality fusion, information integration, and relation modeling, but exposes two main drawbacks: 1) the model may fail to capture the truly essential information due to its indiscriminate attention formulation; 2) it lacks appropriate modeling methods that take full advantage of the correlations among the provided data. In this paper, we propose SANCL: Selective Attention and Natural Contrastive Learning for MRHP. SANCL adopts a probe-based strategy to enforce high attention weights on regions of greater significance. It also constructs a contrastive learning framework based on natural matching properties in the dataset. Experimental results on two benchmark datasets spanning three categories show that SANCL achieves state-of-the-art performance with lower memory consumption.
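A contrastive objective built on natural matching pairs can be illustrated with a standard InfoNCE-style loss. The following is a generic sketch, not SANCL's exact formulation; the embedding dimension, batch size, and temperature are hypothetical:

```python
import numpy as np

def info_nce_loss(anchors, positives, temperature=0.1):
    """InfoNCE loss: each anchor's naturally matched positive is the
    target; all other positives in the batch serve as negatives."""
    # L2-normalize embeddings so the dot product is a cosine similarity
    a = anchors / np.linalg.norm(anchors, axis=1, keepdims=True)
    p = positives / np.linalg.norm(positives, axis=1, keepdims=True)
    logits = a @ p.T / temperature               # (B, B) similarity matrix
    logits -= logits.max(axis=1, keepdims=True)  # numerical stability
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    # Diagonal entries correspond to the naturally matched pairs
    return -np.mean(np.diag(log_probs))

rng = np.random.default_rng(0)
z = rng.normal(size=(8, 16))
loss_matched = info_nce_loss(z, z)                        # identical pairs
loss_random = info_nce_loss(z, rng.normal(size=(8, 16)))  # unrelated pairs
```

As expected, the loss is near zero when anchors and positives coincide and large when they are unrelated, which is the gradient signal that pulls matched pairs together.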
Benefiting from its search efficiency, differentiable neural architecture search (NAS) has evolved into the dominant approach to automatically designing competitive deep neural networks (DNNs). We note that, in real-world scenarios, DNNs must execute under strict hard performance constraints, for example, the runtime latency on autonomous vehicles. However, to obtain an architecture that meets a given performance constraint, previous hardware-aware differentiable NAS methods have to repeat a plethora of search runs to manually tune the hyper-parameters by trial and error, so the total design cost grows proportionally. To resolve this, we introduce a lightweight hardware-aware differentiable NAS framework dubbed LightNAS, which strives to find an architecture that satisfies various performance constraints through a one-time search (i.e., \underline{\textit{you only search once}}). Extensive experiments demonstrate the superiority of LightNAS over previous state-of-the-art methods.
Networks in 5G and beyond utilize millimeter wave (mmWave) radio signals, large bandwidths, and large antenna arrays, which bring opportunities to jointly localize the user equipment and map the propagation environment, termed simultaneous localization and mapping (SLAM). Existing approaches mainly rely on delays and angles and ignore the Doppler, although it also contains geometric information. In this paper, we study the benefits of exploiting Doppler in SLAM by deriving the posterior Cram\'er-Rao bounds (PCRBs) and formulating an extended Kalman-Poisson multi-Bernoulli sequential filtering solution with Doppler as one of the measurements. Both theoretical PCRB analysis and simulation results demonstrate the efficacy of utilizing Doppler.
We address the localization of a reconfigurable intelligent surface (RIS) for a single-input single-output multi-carrier system using bi-static sensing between a fixed transmitter and a fixed receiver. Because deployed RISs have large dimensions, near-field (NF) scenarios are likely to occur, especially in indoor applications, and are the focus of this work. We first derive the Cram\'er-Rao bounds (CRBs) on the estimation error of the RIS position and orientation and of the time of arrival (TOA) for the transmitter-RIS-receiver path. We then propose a multi-stage low-complexity estimator for RIS localization. In this estimator, we first perform a line search to estimate the TOA. Then, we use the far-field approximation of the NF signal model to implicitly estimate the angle of arrival and the angle of departure at the RIS center. Finally, the RIS position and orientation estimates are refined via a quasi-Newton method. Simulation results reveal that the proposed estimator can attain the CRBs. We also investigate the effects of several influential factors on the accuracy of the proposed estimator, such as the RIS size, transmitted power, system bandwidth, and RIS position and orientation.
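The coarse-then-refine structure of such a multi-stage estimator can be sketched generically: a line search over a delay grid supplies an initial estimate, and a Newton-type iteration with finite-difference derivatives stands in for the quasi-Newton refinement. The objective below is a hypothetical toy surrogate, not the actual TOA likelihood of the paper:

```python
import numpy as np

# Hypothetical toy surrogate for the negative log-likelihood in the
# delay parameter; the "true" TOA of 3.7 is for illustration only.
TRUE_TOA = 3.7

def objective(tau):
    return (tau - TRUE_TOA) ** 2 + 0.1 * np.sin(10.0 * tau)

# Stage 1: coarse line search over a delay grid
grid = np.linspace(0.0, 10.0, 201)
tau = grid[np.argmin(objective(grid))]

# Stage 2: Newton-type refinement with finite-difference derivatives
h = 1e-4
for _ in range(20):
    g = (objective(tau + h) - objective(tau - h)) / (2 * h)      # gradient
    H = (objective(tau + h) - 2 * objective(tau)
         + objective(tau - h)) / h ** 2                          # curvature
    tau -= g / H

tau_hat = tau
```

The grid step only needs to land inside the basin of the global minimum; the refinement then drives the estimate to the stationary point, mirroring how the coarse TOA estimate seeds the quasi-Newton position/orientation refinement.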
Radio localization is a key enabler for joint communication and sensing in fifth/sixth generation (5G/6G) communication systems. With the help of multipath components (MPCs), localization and mapping tasks can be completed with a single base station (BS) and a single unsynchronized user equipment (UE) if both are equipped with an antenna array. However, an antenna array at the UE side increases the hardware and computational cost, hindering the adoption of localization functionality. In this work, we show that with Doppler estimation and MPCs, localization and mapping tasks can be performed even with a single-antenna mobile UE. Furthermore, we show that the localization and mapping performance improves and then saturates at a certain level as the UE speed increases. Both theoretical Cram\'er-Rao bound analysis and simulation results show the potential of localization under mobility and the effectiveness of the proposed localization algorithm.
Radio localization is applied in high-frequency (e.g., mmWave and THz) systems to support communication and to provide location-based services without extra infrastructure. For solving localization problems, a simplified, stationary, narrowband far-field channel model is widely used due to its compact formulation. However, with the increased array size in extra-large MIMO systems and the increased bandwidth at upper mmWave bands, the effects of channel spatial non-stationarity (SNS), the spherical wave model (SWM), and the beam squint effect (BSE) cannot be ignored. In this case, localization performance degrades when an inaccurate channel model that deviates from the true model is adopted. In this work, we employ the misspecified Cram\'er-Rao bound (MCRB) to lower-bound the localization error when a simplified mismatched model is used while the observed data are governed by a more complex true model. The simulation results show that among all the model impairments, the SNS contributes the least, the SWM dominates when the distance is small compared to the array size, and the BSE has a more significant effect when the distance is much larger than the array size.
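The MCRB machinery takes a standard sandwich form in the model-misspecification literature. With $f(\mathbf{x};\boldsymbol{\eta})$ the simplified (mismatched) likelihood and $p(\mathbf{x})$ the true data distribution, the usual construction is:

```latex
% Pseudo-true parameter: the mismatched model closest to the true one in KL divergence
\boldsymbol{\eta}_0 = \arg\min_{\boldsymbol{\eta}} D_{\mathrm{KL}}\big(p(\mathbf{x}) \,\|\, f(\mathbf{x};\boldsymbol{\eta})\big)

% Sandwich matrices, with expectations taken under the true distribution p
\mathbf{A}_{\boldsymbol{\eta}_0}
  = \mathbb{E}_p\big[\nabla_{\boldsymbol{\eta}}\nabla_{\boldsymbol{\eta}}^{\mathsf{T}}
    \ln f(\mathbf{x};\boldsymbol{\eta})\big]\big|_{\boldsymbol{\eta}=\boldsymbol{\eta}_0},
\qquad
\mathbf{B}_{\boldsymbol{\eta}_0}
  = \mathbb{E}_p\big[\nabla_{\boldsymbol{\eta}} \ln f(\mathbf{x};\boldsymbol{\eta}_0)\,
    \nabla_{\boldsymbol{\eta}}^{\mathsf{T}} \ln f(\mathbf{x};\boldsymbol{\eta}_0)\big]

% MCRB in sandwich form
\mathrm{MCRB}(\boldsymbol{\eta}_0)
  = \mathbf{A}_{\boldsymbol{\eta}_0}^{-1}\,\mathbf{B}_{\boldsymbol{\eta}_0}\,
    \mathbf{A}_{\boldsymbol{\eta}_0}^{-1}

% Lower bound on the error w.r.t. the true parameter: MCRB plus the squared bias
\mathrm{LB}(\bar{\boldsymbol{\eta}})
  = \mathrm{MCRB}(\boldsymbol{\eta}_0)
  + (\bar{\boldsymbol{\eta}} - \boldsymbol{\eta}_0)
    (\bar{\boldsymbol{\eta}} - \boldsymbol{\eta}_0)^{\mathsf{T}}
```

The bias term $(\bar{\boldsymbol{\eta}} - \boldsymbol{\eta}_0)$ captures the irreducible error floor induced by the model mismatch, which is what the comparison of SNS, SWM, and BSE impairments quantifies.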
In e-commerce, salient commonsense knowledge (CSK) is beneficial for widespread applications such as product search and recommendation. For example, when users search for "running" in e-commerce, they would like to find items highly related to running, such as "running shoes" rather than "shoes". However, many existing CSK collections rank statements solely by confidence scores and carry no information about which ones are salient from a human perspective. In this work, we define the task of supervised salience evaluation: given a CSK triple, a model must learn whether the triple is salient or not. In addition to formulating the new task, we also release a new Benchmark dataset of Salience Evaluation in E-commerce (BSEE), hoping to promote research on commonsense knowledge salience evaluation. We conduct experiments on the dataset with several representative baseline models. The experimental results show that salience evaluation is a hard task, with models performing poorly on our evaluation set. We further propose a simple but effective approach, PMI-tuning, which shows promise for solving this novel problem.
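The statistic underlying PMI-tuning is standard pointwise mutual information. The co-occurrence counts below are hypothetical, purely to illustrate how PMI separates a salient pairing ("running shoes") from a non-salient one:

```python
import math
from collections import Counter

# Hypothetical co-occurrence counts (for illustration only)
pairs = ([("running", "shoes")] * 8 + [("running", "hat")] * 1
         + [("walking", "shoes")] * 3 + [("walking", "hat")] * 4)

pair_counts = Counter(pairs)
left = Counter(a for a, _ in pairs)    # marginal counts of first items
right = Counter(b for _, b in pairs)   # marginal counts of second items
n = len(pairs)

def pmi(a, b):
    """Pointwise mutual information: log p(a, b) / (p(a) p(b))."""
    return (math.log(pair_counts[(a, b)] / n)
            - math.log(left[a] / n)
            - math.log(right[b] / n))
```

A positive PMI indicates the pair co-occurs more often than its marginals predict, which is the kind of signal a salience model can exploit beyond raw confidence scores.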
Direction-of-arrival (DOA) information is vital for multiple-input multiple-output (MIMO) systems to complete localization and beamforming tasks. Switched antenna arrays have recently emerged as an effective solution to reduce the cost and power consumption of MIMO systems. Switch-based array architectures connect a limited number of radio-frequency chains to a subset of the antenna elements, forming a subarray. This paper addresses the problem of antenna selection to optimize DOA estimation performance. We first perform a subarray layout alignment process to remove subarrays with identical beampatterns and create a unique subarray set. Using this set, and based on a DOA threshold-region performance approximation, we propose two antenna selection algorithms: a greedy algorithm and a deep-learning-based algorithm. The performance of the proposed algorithms is evaluated numerically. The results show a significant improvement over selected benchmark approaches in terms of DOA estimation accuracy in the threshold region and computational complexity.
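The greedy variant can be sketched with a simplified surrogate: for a single source on a linear array, the CRB on the DOA shrinks as the spread of the selected element positions grows, so a greedy pass adds at each step the antenna that most increases that spread. This is an illustrative toy metric, not the paper's threshold-region approximation:

```python
import numpy as np

def doa_fisher_metric(positions):
    """Surrogate for the single-source DOA Fisher information of a
    linear subarray: the spread of the selected element positions."""
    p = np.asarray(positions, dtype=float)
    return np.sum((p - p.mean()) ** 2)

def greedy_select(all_positions, k):
    """Greedily pick k elements, each step adding the antenna that
    most increases the surrogate DOA metric."""
    selected, remaining = [], list(range(len(all_positions)))
    for _ in range(k):
        best = max(remaining,
                   key=lambda i: doa_fisher_metric(
                       [all_positions[j] for j in selected + [i]]))
        selected.append(best)
        remaining.remove(best)
    return sorted(selected)

positions = np.arange(10) * 0.5   # 10-element half-wavelength ULA
subset = greedy_select(positions, 4)
```

Under this surrogate the greedy pass grabs the outermost elements first, since a wide aperture dominates the DOA accuracy; the paper's actual metric additionally accounts for threshold-region ambiguities, which pure aperture maximization ignores.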
6G will be characterized by extreme use cases, not only for communication but also for localization and sensing. These use cases can be directly mapped to requirements in terms of standard key performance indicators (KPIs), such as data rate, latency, or localization accuracy. The goal of this paper is to go one step further and map these standard KPIs to requirements on signals, hardware architectures, and deployments. On this basis, system solutions can be identified that support several use cases simultaneously. Since there are several ways to meet the KPIs, there is no unique solution; preferable configurations are therefore discussed.
Entity matching (EM) is the most critical step in entity resolution (ER). While current deep learning-based methods achieve very impressive performance on standard EM benchmarks, their real-world application performance is far less satisfactory. In this paper, we highlight that this gap between ideality and reality stems from an unreasonable benchmark construction process, which is inconsistent with the nature of entity matching and therefore leads to biased evaluations of current EM approaches. To this end, we build a new EM corpus and re-construct EM benchmarks to challenge critical assumptions implicit in the previous benchmark construction process, step by step changing the restricted entities, balanced labels, and single-modal records in previous benchmarks into open entities, imbalanced labels, and multimodal records in an open environment. Experimental results demonstrate that the assumptions made in the previous benchmark construction process do not hold in the open environment; they conceal the main challenges of the task and therefore significantly overestimate the current progress of entity matching. The constructed benchmarks and code are publicly released.