"Time": models, code, and papers

Towards 6DoF Bilateral Teleoperation of an Omnidirectional Aerial Vehicle for Aerial Physical Interaction

Mar 07, 2022
Mike Allenspach, Nicholas Lawrance, Marco Tognon, Roland Siegwart

Bilateral teleoperation offers an intriguing solution towards shared autonomy with aerial vehicles in contact-based inspection and manipulation tasks. Omnidirectional aerial robots allow for full pose operations, making them particularly attractive in such tasks. Naturally, the question arises whether standard bilateral teleoperation methodologies are suitable for use with these vehicles. In this work, a fully decoupled 6DoF bilateral teleoperation framework for aerial physical interaction is designed and tested for the first time. The method is based on the well-established rate control, recentering, and interaction force feedback policy. However, practical experiments show how difficult it is to perform decoupled motions along a single axis only. As such, this work shows that the trivial extension of standard methods is insufficient for omnidirectional teleoperation, due to the operator's physical inability to properly decouple all input DoFs. This suggests that further studies on enhanced haptic feedback are necessary.
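To make the rate control, recentering, and force feedback policy named in the abstract more concrete, here is a minimal per-axis sketch. It is not the authors' implementation: the gain names, their values, the deadband, and the `teleop_step` function are all hypothetical, and each of the six axes is treated independently as a simplification.

```python
from dataclasses import dataclass


@dataclass
class RateControlGains:
    # All values below are illustrative assumptions, not taken from the paper.
    rate_gain: float = 2.0    # commanded velocity per unit of device deflection
    spring_gain: float = 5.0  # stiffness of the recentering spring
    force_gain: float = 0.1   # scaling of the measured interaction wrench
    deadband: float = 0.01    # deflections at or below this command no motion


def teleop_step(deflection, interaction_wrench, gains=RateControlGains()):
    """One control step for a 6DoF haptic device.

    deflection: six offsets of the device from its center (3 linear, 3 angular).
    interaction_wrench: six force/torque values measured at the vehicle.
    Returns (velocity_cmd, haptic_feedback), each a list of six values.
    """
    velocity_cmd = []
    haptic_feedback = []
    for d, w in zip(deflection, interaction_wrench):
        # Rate control: deflection maps to a velocity command, not a position.
        active = d if abs(d) > gains.deadband else 0.0
        velocity_cmd.append(gains.rate_gain * active)
        # Recentering spring pulls the handle back; interaction force is added.
        haptic_feedback.append(-gains.spring_gain * d + gains.force_gain * w)
    return velocity_cmd, haptic_feedback


# The operator pushes along x but unintentionally deflects y by a small amount.
v, f = teleop_step([0.10, 0.02, 0.0, 0.0, 0.0, 0.0], [0.0] * 6)
print(v)  # the y axis also receives a nonzero velocity command
```

In this sketch any deflection above the deadband on a secondary axis produces motion on that axis, which illustrates the kind of input coupling the abstract reports as a practical difficulty.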

Decontextualized I3D ConvNet for ultra-distance runners performance analysis at a glance

Mar 25, 2022
David Freire-Obregón, Javier Lorenzo-Navarro, Modesto Castrillón-Santana

In May 2021, the site runnersworld.com reported that participation in ultra-distance races has increased by 1,676% in the last 23 years. Moreover, nearly 41% of those runners participate in more than one race per year. The development of wearable devices has undoubtedly contributed to motivating participants by providing performance measures in real time. However, we believe there is room for improvement, particularly from the organizers' point of view. This work aims to determine how the runners' performance can be quantified and predicted using a non-invasive technique focused on the ultra-running scenario. Participants are recorded as they pass through a set of locations placed along the race track. In our work, each clip is fed to an I3D ConvNet to extract the participant's running gait. Weather and illumination conditions, as well as occlusions caused by race staff and other runners, may affect the footage. To address this challenging task, we have tracked and codified the participant's running gait at some RPs and removed the context to ensure a proper evaluation of the runner of interest. The evaluation suggests that the features extracted by an I3D ConvNet provide enough information to estimate the participant's performance along the different race tracks.

* Accepted at ICIAP 2021

A Continuous-Time Approach for 3D Radar-to-Camera Extrinsic Calibration

Mar 12, 2021
Emmett Wise, Juraj Peršić, Christopher Grebe, Ivan Petrović, Jonathan Kelly

Reliable operation in inclement weather is essential to the deployment of safe autonomous vehicles (AVs). Robustness and reliability can be achieved by fusing data from the standard AV sensor suite (i.e., lidars, cameras) with weather-robust sensors, such as millimetre-wavelength radar. Critically, accurate sensor data fusion requires knowledge of the rigid-body transform between sensor pairs, which can be determined through the process of extrinsic calibration. A number of extrinsic calibration algorithms have been designed for 2D (planar) radar sensors; however, recently developed, low-cost 3D millimetre-wavelength radars are set to displace their 2D counterparts in many applications. In this paper, we present a continuous-time 3D radar-to-camera extrinsic calibration algorithm that utilizes radar velocity measurements and, unlike the majority of existing techniques, does not require specialized radar retroreflectors to be present in the environment. We derive the observability properties of our formulation and demonstrate the efficacy of our algorithm through synthetic and real-world experiments.

* 7 pages, 5 figures, Accepted to the International Conference on Robotics and Automation (ICRA'21), Xi'an, China, May 30 - June 5, 2021

A Survey on Scalable LoRaWAN for Massive IoT: Recent Advances, Potentials, and Challenges

Feb 22, 2022
Mohammed Jouhari, El Mehdi Amhoud, Nasir Saeed, Mohamed-Slim Alouini

Long Range (LoRa) is the most widely used technology for enabling Low Power Wide Area Networks (LPWANs) on unlicensed frequency bands. Despite its modest Data Rates (DRs), it provides extensive coverage for low-power devices, making it an ideal communication system for many Internet of Things (IoT) applications. In general, the LoRa radio is considered the physical layer, whereas Long Range Wide Area Network (LoRaWAN) is the MAC layer of the LoRa stack, which adopts a star topology to enable communication between multiple End Devices (EDs) and the network Gateway (GW). The Chirp Spread Spectrum (CSS) modulation deals with LoRa signal interference and ensures long-range communication. At the same time, the Adaptive Data Rate (ADR) mechanism allows EDs to dynamically alter some LoRa features, such as the Spreading Factor (SF), Code Rate (CR), and carrier frequency, to address the time variance of communication conditions in dense networks. Despite the high demand for LoRa connectivity, LoRa signal interference and concurrent transmission collisions are major limitations. Therefore, to enhance LoRaWAN capacity, the LoRa Alliance released many LoRaWAN versions, and the research community provided numerous solutions to develop scalable LoRaWAN technology. Hence, we thoroughly examined LoRaWAN scalability challenges and the state-of-the-art solutions in both the PHY and MAC layers. Most of these solutions rely on SF, logical, and frequency channel assignment, while others propose new network topologies or implement signal processing schemes to cancel the interference and allow LoRaWAN to connect more EDs efficiently. A summary of the existing solutions in the literature is provided at the end of the paper by describing the advantages and drawbacks of each solution and suggesting possible enhancements as future research directions.
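The trade-off that ADR navigates when it changes the SF and CR can be illustrated with the commonly cited LoRa modulation relations: symbol time is 2^SF / BW, and the nominal bit rate is SF x (4 / (4 + CR)) / symbol time. The sketch below assumes those relations, a 125 kHz bandwidth, and a CR index of 1 (coding rate 4/5); the function names are hypothetical and nothing here is taken from the survey itself.

```python
def lora_symbol_time(sf, bandwidth_hz):
    """Duration of one chirp symbol in seconds: 2^SF / BW."""
    return (2 ** sf) / bandwidth_hz


def lora_bit_rate(sf, bandwidth_hz, cr):
    """Nominal bit rate in bit/s.

    cr is the coding rate index 1..4, meaning a coding rate of 4/(4+cr).
    """
    return sf * (4 / (4 + cr)) / lora_symbol_time(sf, bandwidth_hz)


# Assumed configuration: 125 kHz bandwidth, coding rate 4/5.
for sf in range(7, 13):
    t_ms = 1000 * lora_symbol_time(sf, 125_000)
    rate = lora_bit_rate(sf, 125_000, 1)
    print(f"SF{sf}: symbol time {t_ms:.3f} ms, bit rate {rate:.1f} bit/s")
```

Each step up in SF doubles the symbol time, so a higher SF trades data rate and airtime for range, which is one reason SF assignment matters for collisions in dense networks.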

Automatic Recognition and Digital Documentation of Cultural Heritage Hemispherical Domes using Images

Jan 25, 2022
Reza Maalek, Shahrokh Maalek

Advancements in optical metrology have enabled documentation of dense 3D point clouds of cultural heritage sites. For large-scale and continuous digital documentation, processing of dense 3D point clouds becomes computationally cumbersome and often requires additional hardware for data management, increasing the time, cost, and complexity of projects. To this end, this manuscript presents an original approach to generate fast and reliable semantic digital models of heritage hemispherical domes using only two images. New closed formulations were derived to establish the relationships between spheres and their projected ellipses onto images, which fostered the development of a new automatic framework for as-built generation of spheres. The effectiveness of the proposed method was evaluated on both laboratory and real-world datasets. The results revealed that the proposed method achieved as-built modeling accuracy of around 6 mm, while improving the computation time by a factor of 7 compared to established point cloud processing methods.

Speech watermarking: an approach for the forensic analysis of digital telephonic recordings

Mar 12, 2022
Marcos Faundez-Zanuy, Jose Juan Lucena-Molina, Martin Hagmueller

In this article, the authors discuss the problem of forensic authentication of digital audio recordings. Although forensic audio has been addressed in several articles, the existing approaches are focused on analog magnetic recordings, which are less prevalent because of the large number of digital recorders available on the market (optical, solid state, hard disks, etc.). An approach based on digital signal processing that consists of spread spectrum techniques for speech watermarking is presented. This approach has the advantage that the authentication is based on the signal itself rather than the recording format. Thus, it is valid for the usual recording devices in police-controlled telephone intercepts. In addition, our proposal allows for the introduction of relevant information, such as the recording date and time and other relevant data (this is not always possible with classical systems). Our experimental results reveal that the speech watermarking procedure does not interfere significantly with subsequent forensic speaker identification.

* J Forensic Sci. 2010 Jul;55(4):1080-7
* 15 pages
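The spread spectrum idea mentioned in the abstract can be sketched with a generic direct-sequence scheme: each payload bit adds a low-amplitude, key-dependent pseudo-random sequence to the signal, and the detector recovers the bit from the sign of a correlation. This is a textbook illustration and not the authors' system; the function names, the Gaussian noise used as a stand-in for speech, and the values of `chips_per_bit` and `alpha` are assumptions. A practical system would also need synchronization and perceptual shaping of the watermark.

```python
import random


def pn_sequence(length, key):
    """Key-dependent pseudo-random sequence of +1/-1 chips."""
    rng = random.Random(key)
    return [rng.choice((-1.0, 1.0)) for _ in range(length)]


def embed(host, bits, key, chips_per_bit, alpha):
    """Add alpha * (+pn or -pn) to consecutive segments, one segment per bit."""
    if len(host) < len(bits) * chips_per_bit:
        raise ValueError("host signal too short for the payload")
    pn = pn_sequence(chips_per_bit, key)
    marked = list(host)
    for i, bit in enumerate(bits):
        sign = 1.0 if bit else -1.0
        start = i * chips_per_bit
        for j in range(chips_per_bit):
            marked[start + j] += alpha * sign * pn[j]
    return marked


def detect(signal, n_bits, key, chips_per_bit):
    """Recover bits from the sign of the correlation with the same sequence."""
    pn = pn_sequence(chips_per_bit, key)
    bits = []
    for i in range(n_bits):
        start = i * chips_per_bit
        corr = sum(signal[start + j] * pn[j] for j in range(chips_per_bit))
        bits.append(1 if corr > 0 else 0)
    return bits


# Stand-in for a speech signal: unit-variance Gaussian noise (an assumption).
noise = random.Random(0)
payload = [1, 0, 1, 1, 0, 0, 1, 0]
chips_per_bit, alpha = 2048, 0.2
host = [noise.gauss(0.0, 1.0) for _ in range(len(payload) * chips_per_bit)]

marked = embed(host, payload, key=1234, chips_per_bit=chips_per_bit, alpha=alpha)
recovered = detect(marked, len(payload), key=1234, chips_per_bit=chips_per_bit)
print(recovered == payload)
```

Because the payload lives in the samples rather than in file metadata, it travels with the signal regardless of the recording format, which is the property the abstract highlights.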

Speech-enhanced and Noise-aware Networks for Robust Speech Recognition

Mar 25, 2022
Hung-Shin Lee, Pin-Yuan Chen, Yu Tsao, Hsin-Min Wang

Compensation for channel mismatch and noise interference is essential for robust automatic speech recognition. Enhanced speech has been introduced into the multi-condition training of acoustic models to improve their generalization ability. In this paper, a noise-aware training framework based on two cascaded neural structures is proposed to jointly optimize speech enhancement and speech recognition. The feature enhancement module is composed of a multi-task autoencoder, where noisy speech is decomposed into clean speech and noise. By concatenating its enhanced, noise-aware, and noisy features for each frame, the acoustic-modeling module maps each feature-augmented frame into a triphone state by optimizing the lattice-free maximum mutual information and cross entropy between the predicted and actual state sequences. On top of the factorized time delay neural network (TDNN-F) and its convolutional variant (CNN-TDNNF), both with SpecAug, the two proposed systems achieve word error rate (WER) of 3.90% and 3.55%, respectively, on the Aurora-4 task. Compared with the best existing systems that use bigram and trigram language models for decoding, the proposed CNN-TDNNF-based system achieves a relative WER reduction of 15.20% and 33.53%, respectively. In addition, the proposed CNN-TDNNF-based system also outperforms the baseline CNN-TDNNF system on the AMI task.

* submitted to Interspeech 2022

GCF: Generalized Causal Forest for Heterogeneous Treatment Effect Estimation in Online Marketplace

Mar 21, 2022
Shu Wan, Chen Zheng, Zhonggen Sun, Mengfan Xu, Xiaoqing Yang, Hongtu Zhu, Jiecheng Guo

Uplift modeling is a rapidly growing approach that utilizes machine learning and causal inference methods to estimate the heterogeneous treatment effects. It has been widely adopted and applied to online marketplaces to assist large-scale decision-making in recent years. The existing popular methods, like forest-based modeling, either work only for discrete treatments or make partially linear or parametric assumptions that may suffer from model misspecification. To alleviate these problems, we extend causal forest (CF) with non-parametric dose-response functions (DRFs) that can be estimated locally using a kernel-based doubly robust estimator. Moreover, we propose a distance-based splitting criterion in the functional space of conditional DRFs to capture the heterogeneity for the continuous treatments. We call the proposed algorithm generalized causal forest (GCF) as it generalizes the use case of CF to a much broader setup. We show the effectiveness of GCF by comparing it to popular uplift modeling models on both synthetic and real-world datasets. We implement GCF in Spark and successfully deploy it into DiDi's real-time pricing system. Online A/B testing results further validate the superiority of GCF.

Video based real-time positional tracker

Oct 02, 2020
David Albarracín, Jesús Hormigo

We propose a system that uses video as the input to track the position of objects relative to their surrounding environment in real time. The neural network employed is trained on a 100% synthetic dataset coming from our own automated generator. The positional tracker relies on 1 to n video cameras placed around an arena of choice. The system returns the positions of the tracked objects relative to the broader world by understanding the overlapping matrices formed by the cameras, so that the positions can be extrapolated into real-world coordinates. In most cases, we achieve a higher update rate and positioning precision than any of the existing GPS-based systems, in particular for indoor objects or those occluded from clear sky.

* 8 pages, 5 figures

Fast Automatic Feature Selection for Multi-Period Sliding Window Aggregate in Time Series

Dec 02, 2020
Rui An, Xingtian Shi, Baohan Xu

As one of the most well-known artificial feature samplers, the sliding window is widely used in scenarios where spatial and temporal information exists, such as computer vision, natural language processing, data streams, and time series. Among these, time series are common in many scenarios, such as credit card payments, user behavior, and sensors. General feature selection for features extracted by sliding window aggregates calls for time-consuming iteration to generate features, after which traditional feature selection methods are employed to rank them. The choice of the key parameter, i.e., the period of the sliding windows, depends on domain knowledge and calls for trial and error. Currently, there is no automatic method for selecting sliding window aggregate features. Because generating features for different periods and sliding windows is very time-consuming, it is hard to enumerate them all and then select among them. In this paper, we propose a general framework using Markov Chain to solve this problem. This framework is very efficient and has high accuracy, such that it is able to perform feature selection on a variety of features and period options. We show the details using 2 common sliding windows and 3 types of aggregation operators. It is also easy to extend the framework to more sliding windows and aggregation operators by employing existing theory about Markov Chains.

* ICDM 2020
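To show why enumerating sliding window aggregate features is costly, the sketch below generates them by brute force for every combination of period and aggregation operator. This is the baseline the abstract describes as expensive, not the Markov Chain framework proposed in the paper. The three operators (sum, max, mean) and all function names are assumptions, since the abstract does not say which windows and operators were used.

```python
def sliding_aggregate(series, period, op):
    """Apply op to every full window of the given period, stepping by one."""
    if period < 1 or period > len(series):
        raise ValueError("period must be between 1 and len(series)")
    return [op(series[i - period + 1 : i + 1])
            for i in range(period - 1, len(series))]


def mean(window):
    return sum(window) / len(window)


# Hypothetical operator set chosen for illustration.
AGGREGATORS = {"sum": sum, "max": max, "mean": mean}


def enumerate_features(series, periods):
    """Brute-force generation: one feature column per (operator, period) pair."""
    features = {}
    for period in periods:
        for name, op in AGGREGATORS.items():
            features[(name, period)] = sliding_aggregate(series, period, op)
    return features


payments = [12, 7, 30, 5, 18, 22, 9]
features = enumerate_features(payments, periods=[2, 3, 5])
print(len(features))          # number of candidate feature columns
print(features[("max", 3)])   # rolling 3-step maximum
```

The number of candidate columns grows as the product of the number of periods and operators, and each column costs a pass over the series, so exhaustive generation followed by ranking scales poorly as the period options grow.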
