Several sensing techniques have been proposed for silent speech recognition (SSR); however, many of these methods require invasive processes or sensor attachment to the skin using adhesive tape or glue, rendering them unsuitable for frequent use in daily life. By contrast, impulse radio ultra-wideband (IR-UWB) radar can operate without physical contact with users' articulators and related body parts, offering several advantages for SSR. These advantages include high range resolution, high penetrability, low power consumption, robustness to external light or sound interference, and the ability to be embedded in space-constrained handheld devices. This study demonstrates IR-UWB radar-based contactless SSR using four types of speech stimuli (vowels, consonants, words, and phrases). To achieve this, we propose a novel speech feature extraction algorithm specifically designed for IR-UWB radar-based SSR. Each speech stimulus is recognized by applying a classification algorithm to the extracted speech features. Two algorithms, multidimensional dynamic time warping (MD-DTW) and deep neural network-hidden Markov model (DNN-HMM), are compared for the classification task. Additionally, the more favorable of two radar antenna positions, in front of the user's lips or below the user's chin, is determined in terms of recognition accuracy. Experimental results demonstrate the efficacy of the proposed speech feature extraction algorithm combined with DNN-HMM for classifying vowels, consonants, words, and phrases. Notably, this study represents the first demonstration of phoneme-level SSR using contactless radar.
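As an illustration of the MD-DTW baseline named above, the following is a minimal sketch of template matching with multidimensional dynamic time warping. It is not the study's implementation: the Euclidean frame-to-frame cost, the unconstrained step pattern, the one-template-per-class setup, and the names `md_dtw` and `classify` are all assumptions made for this example.

```python
import math


def md_dtw(a, b):
    """Multidimensional DTW distance between two feature sequences.

    a, b: lists of feature vectors (lists of floats) of equal dimension.
    The local cost is the Euclidean distance between two frames
    (an assumption; other local costs are possible).
    """
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] = minimal accumulated cost of aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = math.dist(a[i - 1], b[j - 1])
            cost[i][j] = d + min(
                cost[i - 1][j],      # frame of a repeated against b
                cost[i][j - 1],      # frame of b repeated against a
                cost[i - 1][j - 1],  # both sequences advance
            )
    return cost[n][m]


def classify(query, templates):
    """Return the label whose template is nearest to query under MD-DTW.

    templates: dict mapping a label to one reference feature sequence
    (hypothetical layout, one template per class).
    """
    return min(templates, key=lambda label: md_dtw(query, templates[label]))
```

A query that is a time-stretched copy of a template aligns to it with zero cost, which is the property that makes DTW suitable for utterances spoken at different speeds.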
Task-oriented dialogue research has mainly focused on a few popular languages like English and Chinese, due to the high dataset creation cost for a new language. To reduce the cost, we apply manual editing to automatically translated data. We create a new multilingual benchmark, X-RiSAWOZ, by translating the Chinese RiSAWOZ to 4 languages: English, French, Hindi, Korean; and a code-mixed English-Hindi language. X-RiSAWOZ has more than 18,000 human-verified dialogue utterances for each language, and unlike most multilingual prior work, is an end-to-end dataset for building fully-functioning agents. The many difficulties we encountered in creating X-RiSAWOZ led us to develop a toolset to accelerate the post-editing of a new language dataset after translation. This toolset improves machine translation with a hybrid entity alignment technique that combines neural with dictionary-based methods, along with many automated and semi-automated validation checks. We establish strong baselines for X-RiSAWOZ by training dialogue agents in the zero- and few-shot settings where limited gold data is available in the target language. Our results suggest that our translation and post-editing methodology and toolset can be used to create new high-quality multilingual dialogue agents cost-effectively. Our dataset, code, and toolkit are released open-source.
In regions where global navigation satellite systems (GNSS) signals are unavailable, such as underground areas and tunnels, GNSS simulators can be deployed for transmitting simulated GNSS signals. Then, a GNSS receiver in the simulator coverage outputs the position based on the received GNSS signals (e.g., Global Positioning System (GPS) L1 signals in this study) transmitted by the corresponding simulator. This approach provides periodic position updates to GNSS users while deploying a small number of simulators without modifying the hardware and software of user receivers. However, the simulator clock should be synchronized to the GNSS satellite clock to generate almost identical signals to the live-sky GNSS signals, which is necessary for seamless indoor and outdoor positioning handover. The conventional clock synchronization method based on the wired connection between each simulator and an outdoor GNSS antenna causes practical difficulty and increases the cost of deploying the simulators. This study proposes a wireless clock synchronization method based on a private time server and time delay calibration. Additionally, we derived the constraints for determining the optimal simulator coverage and separation between adjacent simulators. The positioning performance of the proposed GPS simulator-based indoor positioning system was demonstrated in the underground testbed for a driving vehicle with a GPS receiver and a pedestrian with a smartphone. The average position errors were 3.7 m for the vehicle and 9.6 m for the pedestrian during the field tests with successful indoor and outdoor positioning handovers. Since those errors are within the coverage of each deployed simulator, it is confirmed that the proposed system with wireless clock synchronization can effectively provide periodic position updates to users where live-sky GNSS signals are unavailable.
In urban areas, dense buildings frequently block and reflect global positioning system (GPS) signals, leaving only a few visible satellites and many multipath signals. This is a significant problem that results in unreliable positioning in urban areas. If the signal reception condition of each satellite can be detected, the positioning performance can be improved by excluding or de-weighting the multipath-contaminated satellite signals. Thus, we developed a machine-learning-based method of classifying GPS signal reception conditions using a dual-polarized antenna. We employed a decision tree algorithm for classification using three features, one of which can be obtained only from a dual-polarized antenna. A machine-learning model was trained using GPS signals collected from various locations. When the features extracted from the GPS raw signal are input, the generated machine-learning model outputs one of the three signal reception conditions: non-line-of-sight (NLOS) only, line-of-sight (LOS) only, or LOS+NLOS. Multiple testing datasets were used to analyze the classification accuracy, which was then compared with an existing method using dual single-polarized antennas. When the testing dataset was collected at different locations from the training dataset, a classification accuracy of 64.47% was obtained, which was slightly higher than the accuracy of the existing method using dual single-polarized antennas. Therefore, the dual-polarized antenna solution is more beneficial than the dual single-polarized antenna solution because it has a more compact form factor while achieving similar performance.
The maximum likelihood (ML) estimator can be applied to localize a target mobile device using the received signal strength (RSS) and time of arrival (TOA). However, the ML estimator for the RSS-TOA-based target localization problem is nonconvex and nonlinear, having no analytical solution. Therefore, the ML estimator should be solved numerically, unless it is relaxed into a convex or linear form. This study investigates the target localization performance and computational complexity of numerical methods for solving an ML estimator. Three widely used numerical methods are compared: grid search, gradient descent, and particle swarm optimization. In the experimental evaluation, the grid search yielded the lowest target localization root-mean-squared error; however, the 95th percentile error of the grid search was larger than those of the other two algorithms. The average computation time of the grid search was far larger than those of the other two algorithms, and gradient descent exhibited the lowest computation time.
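To make the grid-search option concrete, the sketch below minimizes a joint RSS-TOA negative log-likelihood over a 2D grid. The measurement model is an assumption for illustration only: a log-distance path-loss model for RSS, a line-of-sight range model for TOA, and independent Gaussian errors. The function names and the default values of `p0`, `n`, `sigma_rss`, and `sigma_toa` are hypothetical and are not taken from the study.

```python
import math

C = 299_792_458.0  # speed of light in m/s


def predict(pos, anchor, p0=-40.0, n=3.0):
    """Noise-free (RSS in dBm, TOA in s) at one anchor for a target at pos.

    Assumed model: RSS = p0 - 10 n log10(d), TOA = d / C.
    """
    d = max(math.hypot(pos[0] - anchor[0], pos[1] - anchor[1]), 1e-3)
    return p0 - 10.0 * n * math.log10(d), d / C


def neg_log_likelihood(pos, anchors, rss, toa, sigma_rss=4.0, sigma_toa=1e-8):
    """Negative log-likelihood (up to a constant) under Gaussian errors."""
    total = 0.0
    for anchor, r, t in zip(anchors, rss, toa):
        r_pred, t_pred = predict(pos, anchor)
        total += (r - r_pred) ** 2 / (2.0 * sigma_rss ** 2)
        total += (t - t_pred) ** 2 / (2.0 * sigma_toa ** 2)
    return total


def grid_search(anchors, rss, toa, x_range, y_range, step):
    """Evaluate the cost at every grid point and return the best (x, y)."""
    nx = int(round((x_range[1] - x_range[0]) / step)) + 1
    ny = int(round((y_range[1] - y_range[0]) / step)) + 1
    best_pos, best_cost = None, float("inf")
    for i in range(nx):
        x = x_range[0] + i * step
        for j in range(ny):
            y = y_range[0] + j * step
            cost = neg_log_likelihood((x, y), anchors, rss, toa)
            if cost < best_cost:
                best_pos, best_cost = (x, y), cost
    return best_pos
```

The number of cost evaluations grows with the inverse square of `step`, which is consistent with the large computation time reported for grid search; gradient descent and particle swarm optimization evaluate the same cost at far fewer points.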
Neural networks have complex structures, and thus it is hard to understand their inner workings and ensure correctness. To understand and debug convolutional neural networks (CNNs), we propose techniques for testing the channels of CNNs. We design FtGAN, an extension of GAN, that can generate test data while varying the intensity (i.e., the sum of the neurons) of a channel of a target CNN. We also propose a channel selection algorithm to find representative channels for testing. To efficiently inspect the target CNN's inference computations, we define an unexpectedness score, which estimates how similar the inference computation of the test data is to that of the training data. We evaluated FtGAN with five public datasets and showed that our techniques successfully identify defective channels in five different CNN models.
This study investigates unmanned aerial vehicle (UAV) trajectory planning strategies for localizing a target mobile device in emergency situations. The global navigation satellite system (GNSS)-based accurate position information of a target mobile device in an emergency may not always be available to first responders. For example, 1) GNSS positioning accuracy may be degraded in harsh signal environments and 2) in countries where emergency positioning service is not mandatory, some mobile devices may not report their locations. In such cases, one way to find the target mobile device is to use UAVs. Dispatched UAVs may search for the target directly on the emergency site by measuring the strength of the signal (e.g., LTE wireless communication signal) from the target mobile device. To accurately localize the target mobile device in the shortest time possible, UAVs should fly in the most efficient way possible. The two popular trajectory optimization strategies of UAVs are greedy and predictive approaches. However, the localization performance of the two approaches has been evaluated only under favorable settings (i.e., under good UAV geometries and small received signal strength (RSS) errors); more realistic scenarios remain unexplored. In this study, we compare the localization performance of the greedy and predictive approaches under realistic RSS errors (i.e., up to 6 dB according to the ITU-R channel model).
Target localization is essential for emergency dispatching situations. Maximum likelihood estimation (MLE) methods are widely used to estimate the target position based on the received signal strength measurements. However, the performance of MLE solvers is significantly affected by the initialization (i.e., initial guess of the solution or solution search space). To address this, a previous study proposed the semidefinite programming (SDP)-based MLE initialization. However, the performance of the SDP-based initialization technique is largely affected by the shadowing variance and geometric diversity between the target and receivers. In this study, a radio frequency (RF) fingerprinting-based MLE initialization is proposed. Further, a maximum likelihood problem for target localization combining RF fingerprinting is formulated. In the three test environments of open space, urban, and indoor, the proposed RF fingerprinting-aided target localization method showed a performance improvement of up to 63.31% and an average of 39.13%, compared to the MLE algorithm initialized with SDP. Furthermore, unlike the SDP-MLE method, the proposed method was not significantly affected by the poor geometry between the target and receivers in our experiments.
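One simple way to obtain an initial guess from RF fingerprints is a k-nearest-neighbor lookup in RSS space, sketched below. This only illustrates the general idea of fingerprinting-based initialization; the study's actual matching rule is not described in the abstract. The database layout, the Euclidean RSS distance, the centroid rule, and the name `fingerprint_initial_guess` are assumptions.

```python
import math


def fingerprint_initial_guess(measured_rss, database, k=3):
    """Return an initial (x, y) for an MLE solver from RF fingerprints.

    measured_rss: RSS values in dBm, one per receiver.
    database: list of ((x, y), rss_vector) reference fingerprints
              (hypothetical layout, same receiver order as measured_rss).
    The guess is the centroid of the k fingerprints closest in RSS space.
    """
    ranked = sorted(database, key=lambda entry: math.dist(measured_rss, entry[1]))
    nearest = ranked[:k]
    x = sum(pos[0] for pos, _ in nearest) / len(nearest)
    y = sum(pos[1] for pos, _ in nearest) / len(nearest)
    return (x, y)
```

The returned point would then seed a numerical MLE solver in place of an SDP-based starting point.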
Predicting the safety of urban roads for navigation via global navigation satellite systems (GNSS) signals is considered. To ensure safe driving of automated vehicles, the vehicle must plan its trajectory to avoid navigating on unsafe roads (e.g., icy conditions, construction zones, narrow streets, etc.). Such information can be derived from the roads' physical properties, vehicle's capabilities, and weather conditions. From a GNSS-based navigation perspective, the reliability of GNSS signals in different locales, which is heavily dependent on the road layout within the surrounding environment, is crucial to ensure safe automated driving. An urban road environment surrounded by tall objects can significantly degrade the accuracy and availability of GNSS signals. This article proposes an approach to predict the reliability of GNSS-based navigation to ensure safe urban navigation. Satellite navigation reliability at a given location and time on a road is determined based on the probabilistic position error bound of the vehicle-mounted GNSS receiver. A metric for GNSS reliability for ground vehicles is suggested, and a method to predict the conservative probabilistic error bound of the GNSS navigation solution is proposed. A satellite navigation reliability map is generated for various navigation applications. As a case study, the reliability map is used in the proposed optimization problem formulation for automated ground vehicle safety-constrained path planning.
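As a rough illustration of safety-constrained path planning on a reliability map, the sketch below runs Dijkstra's algorithm on a road graph while skipping road segments whose predicted position error bound exceeds a limit. This is an assumed simplification, not the article's optimization formulation: the graph layout, the per-edge `bounds` dictionary, the hard threshold `max_bound`, and the function name are hypothetical.

```python
import heapq


def safety_constrained_path(graph, bounds, start, goal, max_bound):
    """Shortest path that uses only road segments with an acceptable bound.

    graph: {node: [(neighbor, length_m), ...]} directed road graph.
    bounds: {(u, v): predicted position error bound in meters}.
    Edges with a missing bound are treated as unsafe (assumption).
    Returns (path, total_length) or None if no safe path exists.
    """
    inf = float("inf")
    dist = {start: 0.0}
    prev = {}
    heap = [(0.0, start)]
    while heap:
        d, u = heapq.heappop(heap)
        if u == goal:
            break
        if d > dist.get(u, inf):
            continue  # stale heap entry
        for v, length in graph.get(u, []):
            if bounds.get((u, v), inf) > max_bound:
                continue  # segment violates the safety constraint
            nd = d + length
            if nd < dist.get(v, inf):
                dist[v] = nd
                prev[v] = u
                heapq.heappush(heap, (nd, v))
    if goal not in dist:
        return None
    path = [goal]
    while path[-1] != start:
        path.append(prev[path[-1]])
    return path[::-1], dist[goal]
```

Tightening `max_bound` trades path length for navigation reliability: the planner takes a longer detour when the direct route passes through segments with large predicted error bounds.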
Ground-based Augmentation System (GBAS) augments Global Navigation Satellite Systems (GNSS) to support the precision approach and landing of aircraft. To guarantee integrity, existing single-frequency GBAS utilizes position-domain geometry screening to eliminate potentially unsafe satellite geometries by inflating one or more broadcast GBAS parameters. However, GBAS availability can be drastically impacted in low-latitude regions where severe ionospheric conditions have been observed. Thus, we developed a novel geometry-screening algorithm in this study to improve GBAS availability in low-latitude regions. Simulations demonstrate that the proposed method can provide 5-8 percentage point availability enhancement of GBAS at Galeão airport near Rio de Janeiro, Brazil, compared to existing methods.