Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

"Information": models, code, and papers

Invertible Fourier Neural Operators for Tackling Both Forward and Inverse Problems

Feb 18, 2024
Da Long, Shandian Zhe

Fourier Neural Operator (FNO) is a popular operator learning method, which has demonstrated state-of-the-art performance across many tasks. However, FNO is mainly used in forward prediction, yet a large family of applications rely on solving inverse problems. In this paper, we propose an invertible Fourier Neural Operator (iFNO) that tackles both the forward and inverse problems. We designed a series of invertible Fourier blocks in the latent channel space to share the model parameters, efficiently exchange the information, and mutually regularize the learning for the bi-directional tasks. We integrated a variational auto-encoder to capture the intrinsic structures within the input space and to enable posterior inference so as to overcome challenges of illposedness, data shortage, noises, etc. We developed a three-step process for pre-training and fine tuning for efficient training. The evaluations on five benchmark problems have demonstrated the effectiveness of our approach.

Via

Access Paper or Ask Questions

An evaluation of Deep Learning based stereo dense matching dataset shift from aerial images and a large scale stereo dataset

Feb 19, 2024
Teng Wu, Bruno Vallet, Marc Pierrot-Deseilligny, Ewelina Rupnik

Dense matching is crucial for 3D scene reconstruction since it enables the recovery of scene 3D geometry from image acquisition. Deep Learning (DL)-based methods have shown effectiveness in the special case of epipolar stereo disparity estimation in the computer vision community. DL-based methods depend heavily on the quality and quantity of training datasets. However, generating ground-truth disparity maps for real scenes remains a challenging task in the photogrammetry community. To address this challenge, we propose a method for generating ground-truth disparity maps directly from Light Detection and Ranging (LiDAR) and images to produce a large and diverse dataset for six aerial datasets across four different areas and two areas with different resolution images. We also introduce a LiDAR-to-image co-registration refinement to the framework that takes special precautions regarding occlusions and refrains from disparity interpolation to avoid precision loss. Evaluating 11 dense matching methods across datasets with diverse scene types, image resolutions, and geometric configurations, which are deeply investigated in dataset shift, GANet performs best with identical training and testing data, and PSMNet shows robustness across different datasets, and we proposed the best strategy for training with a limit dataset. We will also provide the dataset and training models; more information can be found at https://github.com/whuwuteng/Aerial_Stereo_Dataset.

* International Journal of Applied Earth Observation and Geoinformation, 128(2024)

Via

Access Paper or Ask Questions

Design and evaluation of a multi-finger skin-stretch tactile interface for hand rehabilitation robots

Feb 19, 2024
Alexandre L. Ratschat, Rubén Martín-Rodríguez, Yasemin Vardar, Gerard M. Ribbers, Laura Marchal-Crespo

Object properties perceived through the tactile sense, such as weight, friction, and slip, greatly influence motor control during manipulation tasks. However, the provision of tactile information during robotic training in neurorehabilitation has not been well explored. Therefore, we designed and evaluated a tactile interface based on a two-degrees-of-freedom moving platform mounted on a hand rehabilitation robot that provides skin stretch at four fingertips, from the index through the little finger. To accurately control the rendered forces, we included a custom magnetic-based force sensor to control the tactile interface in a closed loop. The technical evaluation showed that our custom force sensor achieved measurable shear forces of +-8N with accuracies of 95.2-98.4% influenced by hysteresis, viscoelastic creep, and torsional deformation. The tactile interface accurately rendered forces with a step response steady-state accuracy of 97.5-99.4% and a frequency response in the range of most activities of daily living. Our sensor showed the highest measurement-range-to-size ratio and comparable accuracy to sensors of its kind. These characteristics enabled the closed-loop force control of the tactile interface for precise rendering of multi-finger two-dimensional skin stretch. The proposed system is a first step towards more realistic and rich haptic feedback during robotic sensorimotor rehabilitation, potentially improving therapy outcomes.

* 6 pages, 3 figures, submitted to BioRob 2024

Via

Access Paper or Ask Questions

Incipient Slip Detection by Vibration Injection into Soft Sensor

Feb 19, 2024
Naoto Komeno, Takamitsu Matsubara

In robotic manipulation, preventing objects from slipping and establishing a secure grip on them is critical. Successful manipulation requires tactile sensors that detect the microscopic incipient slip phenomenon at the contact surface. Unfortunately, the tiny signals generated by incipient slip are quickly buried by environmental noise, and precise stress-distribution measurement requires an extensive optical system and integrated circuits. In this study, we focus on the macroscopic deformation of the entire fingertip's soft structure instead of directly observing the contact surface and its role as a vibration medium for sensing. The proposed method compresses the stick ratio's information into a one-dimensional pressure signal using the change in the propagation characteristics by vibration injection into the soft structure, which magnifies the microscopic incipient slip phenomena into the entire deformation. This mechanism allows a tactile sensor to use just a single vibration sensor. In the implemented system, a biomimetic tactile sensor is vibrated using a white signal from a PZT motor and utilizes frequency spectrum change of the propagated vibration as features. We investigated the proposed method's effectiveness on stick-ratio estimation and \red{stick-ratio stabilization} control during incipient slip. Our estimation error and the control performance results significantly outperformed the conventional methods.

* 8 pages, Accepted by Robotics and Automation Letters

Via

Access Paper or Ask Questions

The Paradox of Motion: Evidence for Spurious Correlations in Skeleton-based Gait Recognition Models

Feb 13, 2024
Andy Cătrună, Adrian Cosma, Emilian Rădoi

Gait, an unobtrusive biometric, is valued for its capability to identify individuals at a distance, across external outfits and environmental conditions. This study challenges the prevailing assumption that vision-based gait recognition, in particular skeleton-based gait recognition, relies primarily on motion patterns, revealing a significant role of the implicit anthropometric information encoded in the walking sequence. We show through a comparative analysis that removing height information leads to notable performance degradation across three models and two benchmarks (CASIA-B and GREW). Furthermore, we propose a spatial transformer model processing individual poses, disregarding any temporal information, which achieves unreasonably good accuracy, emphasizing the bias towards appearance information and indicating spurious correlations in existing benchmarks. These findings underscore the need for a nuanced understanding of the interplay between motion and appearance in vision-based gait recognition, prompting a reevaluation of the methodological assumptions in this field. Our experiments indicate that "in-the-wild" datasets are less prone to spurious correlations, prompting the need for more diverse and large scale datasets for advancing the field.

Via

Access Paper or Ask Questions

Fine Tuning Named Entity Extraction Models for the Fantasy Domain

Feb 16, 2024
Aravinth Sivaganeshan, Nisansa de Silva

Named Entity Recognition (NER) is a sequence classification Natural Language Processing task where entities are identified in the text and classified into predefined categories. It acts as a foundation for most information extraction systems. Dungeons and Dragons (D&D) is an open-ended tabletop fantasy game with its own diverse lore. DnD entities are domain-specific and are thus unrecognizable by even the state-of-the-art off-the-shelf NER systems as the NER systems are trained on general data for pre-defined categories such as: person (PERS), location (LOC), organization (ORG), and miscellaneous (MISC). For meaningful extraction of information from fantasy text, the entities need to be classified into domain-specific entity categories as well as the models be fine-tuned on a domain-relevant corpus. This work uses available lore of monsters in the D&D domain to fine-tune Trankit, which is a prolific NER framework that uses a pre-trained model for NER. Upon this training, the system acquires the ability to extract monster names from relevant domain documents under a novel NER tag. This work compares the accuracy of the monster name identification against; the zero-shot Trankit model and two FLAIR models. The fine-tuned Trankit model achieves an 87.86% F1 score surpassing all the other considered models.

Via

Access Paper or Ask Questions

Operational Collective Intelligence of Humans and Machines

Feb 16, 2024
Nikolos Gurney, Fred Morstatter, David V. Pynadath, Adam Russell, Gleb Satyukov

We explore the use of aggregative crowdsourced forecasting (ACF) as a mechanism to help operationalize ``collective intelligence'' of human-machine teams for coordinated actions. We adopt the definition for Collective Intelligence as: ``A property of groups that emerges from synergies among data-information-knowledge, software-hardware, and individuals (those with new insights as well as recognized authorities) that enables just-in-time knowledge for better decisions than these three elements acting alone.'' Collective Intelligence emerges from new ways of connecting humans and AI to enable decision-advantage, in part by creating and leveraging additional sources of information that might otherwise not be included. Aggregative crowdsourced forecasting (ACF) is a recent key advancement towards Collective Intelligence wherein predictions (X\% probability that Y will happen) and rationales (why I believe it is this probability that X will happen) are elicited independently from a diverse crowd, aggregated, and then used to inform higher-level decision-making. This research asks whether ACF, as a key way to enable Operational Collective Intelligence, could be brought to bear on operational scenarios (i.e., sequences of events with defined agents, components, and interactions) and decision-making, and considers whether such a capability could provide novel operational capabilities to enable new forms of decision-advantage.

Via

Access Paper or Ask Questions

Region Feature Descriptor Adapted to High Affine Transformations

Feb 15, 2024
Shaojie Zhang, Yinghui Wang, Peixuan Liu, Jinlong Yang, Tao Yan, Liangyi Huang, Mingfeng Wang

To address the issue of feature descriptors being ineffective in representing grayscale feature information when images undergo high affine transformations, leading to a rapid decline in feature matching accuracy, this paper proposes a region feature descriptor based on simulating affine transformations using classification. The proposed method initially categorizes images with different affine degrees to simulate affine transformations and generate a new set of images. Subsequently, it calculates neighborhood information for feature points on this new image set. Finally, the descriptor is generated by combining the grayscale histogram of the maximum stable extremal region to which the feature point belongs and the normalized position relative to the grayscale centroid of the feature point's region. Experimental results, comparing feature matching metrics under affine transformation scenarios, demonstrate that the proposed descriptor exhibits higher precision and robustness compared to existing classical descriptors. Additionally, it shows robustness when integrated with other descriptors.

Via

Access Paper or Ask Questions

Static vs. Dynamic Databases for Indoor Localization based on Wi-Fi Fingerprinting: A Discussion from a Data Perspective

Feb 20, 2024
Zhe Tang, Ruocheng Gu, Sihao Li, Kyeong Soo Kim, Jeremy S. Smith

Wi-Fi fingerprinting has emerged as the most popular approach to indoor localization. The use of ML algorithms has greatly improved the localization performance of Wi-Fi fingerprinting, but its success depends on the availability of fingerprint databases composed of a large number of RSSIs, the MAC addresses of access points, and the other measurement information. However, most fingerprint databases do not reflect well the time varying nature of electromagnetic interferences in complicated modern indoor environment. This could result in significant changes in statistical characteristics of training/validation and testing datasets, which are often constructed at different times, and even the characteristics of the testing datasets could be different from those of the data submitted by users during the operation of localization systems after their deployment. In this paper, we consider the implications of time-varying Wi-Fi fingerprints on indoor localization from a data-centric point of view and discuss the differences between static and dynamic databases. As a case study, we have constructed a dynamic database covering three floors of the IR building of XJTLU based on RSSI measurements, over 44 days, and investigated the differences between static and dynamic databases in terms of statistical characteristics and localization performance. The analyses based on variance calculations and Isolation Forest show the temporal shifts in RSSIs, which result in a noticeable trend of the increase in the localization error of a Gaussian process regression model with the maximum error of 6.65 m after 14 days of training without model adjustments. The results of the case study with the XJTLU dynamic database clearly demonstrate the limitations of static databases and the importance of the creation and adoption of dynamic databases for future indoor localization research and real-world deployment.

* 10 pages, 5 figures, Invited paper with Excellent Paper Award to be presented at ICAIIC 2024, Osaka, Japan, Feb. 19--22, 2023

Via

Access Paper or Ask Questions

Thompson Sampling in Partially Observable Contextual Bandits

Feb 15, 2024
Hongju Park, Mohamad Kazem Shirani Faradonbeh

Contextual bandits constitute a classical framework for decision-making under uncertainty. In this setting, the goal is to learn the arms of highest reward subject to contextual information, while the unknown reward parameters of each arm need to be learned by experimenting that specific arm. Accordingly, a fundamental problem is that of balancing exploration (i.e., pulling different arms to learn their parameters), versus exploitation (i.e., pulling the best arms to gain reward). To study this problem, the existing literature mostly considers perfectly observed contexts. However, the setting of partial context observations remains unexplored to date, despite being theoretically more general and practically more versatile. We study bandit policies for learning to select optimal arms based on the data of observations, which are noisy linear functions of the unobserved context vectors. Our theoretical analysis shows that the Thompson sampling policy successfully balances exploration and exploitation. Specifically, we establish the followings: (i) regret bounds that grow poly-logarithmically with time, (ii) square-root consistency of parameter estimation, and (iii) scaling of the regret with other quantities including dimensions and number of arms. Extensive numerical experiments with both real and synthetic data are presented as well, corroborating the efficacy of Thompson sampling. To establish the results, we introduce novel martingale techniques and concentration inequalities to address partially observed dependent random variables generated from unspecified distributions, and also leverage problem-dependent information to sharpen probabilistic bounds for time-varying suboptimality gaps. These techniques pave the road towards studying other decision-making problems with contextual information as well as partial observations.

* 43 pages

Via

Access Paper or Ask Questions