There is an increasing need to model fine-grained opinion shifts of social media users as concerns about their potentially polarizing social effects grow. However, the lack of publicly available datasets suitable for the task presents a major challenge. In this paper, we introduce a novel annotated dataset for modeling subtle opinion fluctuations and detecting fine-grained stances. The dataset provides ample stance polarity and intensity labels per user over time and within entire conversational threads, making subtle opinion fluctuations detectable in both the short term and the long term. All posts are annotated by non-experts, and a significant portion of the data is also annotated by experts. We provide a strategy for recruiting suitable non-experts. Our analysis of inter-annotator agreement shows that the annotations obtained from the majority vote of the non-experts are of comparable quality to those of the experts. We provide analyses of stance evolution at the short-term and long-term levels, a comparison of language use between users with vacillating and resolute attitudes, and fine-grained stance detection baselines.
As the scene changes over time, map descriptors become outdated, degrading VPS localization accuracy. In this work, we propose an approach to detect structural and textural scene changes, to be followed by a map update. In our method, the map includes 3D points with descriptors generated via either LiDAR or SfM. Common approaches suffer from shortcomings: 1) direct comparison of the two point clouds for change detection is slow, because a new point cloud must be built for every comparison; 2) image-based comparison requires keeping the map images, adding substantial storage overhead. To circumvent these problems, we propose an approach based on comparing point-cloud descriptors: 1) based on VPS poses, select close query and map image pairs; 2) register query images to map image descriptors; 3) use segmentation to filter out dynamic or short-term temporal changes; 4) compare the descriptors between corresponding segments.
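The final step, comparing descriptors between corresponding segments, could be sketched as follows. This is an illustrative sketch under assumed data layouts (per-point descriptor matrices with an integer segment label per point), not the authors' implementation; the mean-descriptor cosine distance used here is a stand-in for whatever comparison the method actually employs.

```python
import numpy as np

def segment_change_scores(query_desc, map_desc, query_seg, map_seg, static_labels):
    """Per-segment change scores between a query view and the map.

    query_desc, map_desc: (n, d) per-point descriptor arrays (assumed layout).
    query_seg, map_seg:   (n,) integer segment label per point.
    static_labels:        segment classes kept after filtering dynamic
                          or short-term changes (step 3).
    Returns {segment: score}, where a higher score indicates more change.
    """
    scores = {}
    for seg in static_labels:
        q = query_desc[query_seg == seg]
        m = map_desc[map_seg == seg]
        if len(q) == 0 or len(m) == 0:
            continue  # segment not visible in both views
        qm, mm = q.mean(axis=0), m.mean(axis=0)
        cos = qm @ mm / (np.linalg.norm(qm) * np.linalg.norm(mm) + 1e-12)
        scores[seg] = 1.0 - cos  # cosine distance between mean descriptors
    return scores
```

Segments whose score exceeds a threshold would then be flagged for a map update.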
We present ProtoTEx, a novel white-box NLP classification architecture based on prototype networks. ProtoTEx faithfully explains model decisions based on prototype tensors that encode latent clusters of training examples. At inference time, classification decisions are based on the distances between the input text and the prototype tensors, explained via the training examples most similar to the most influential prototypes. We also describe a novel interleaved training algorithm that effectively handles classes characterized by the absence of indicative features. On a propaganda detection task, ProtoTEx accuracy matches BART-large and exceeds BERT-large with the added benefit of providing faithful explanations. A user study also shows that prototype-based explanations help non-experts to better recognize propaganda in online news.
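The distance-based decision rule at the heart of prototype networks can be sketched as below. This is a hypothetical illustration of the inference step, not the paper's exact architecture: names and shapes are assumptions, and ProtoTEx additionally learns how prototype activations map to class scores.

```python
import numpy as np

def prototype_logits(encoding, prototypes):
    """Distance-based scoring in the style of prototype networks (sketch).

    encoding:   (d,) vector encoding the input text
    prototypes: (p, d) matrix, one row per learned prototype tensor

    Each logit is the negative distance to a prototype, so a closer
    prototype yields a higher score. Because every score is tied to a
    specific prototype, a decision can be explained by showing the
    training examples nearest to the most influential prototypes.
    """
    dists = np.linalg.norm(prototypes - encoding, axis=1)
    return -dists
```

The index of the highest logit identifies the winning prototype, which serves as the handle for retrieving similar training examples as an explanation.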
Text generation with beam search has proven successful in a wide range of applications. The commonly-used implementation of beam decoding follows a first-come, first-served heuristic: it keeps a set of already completed sequences over time steps and stops when the size of this set reaches the beam size. We introduce a patience factor, a simple modification to this decoding algorithm that generalizes the stopping criterion and provides flexibility in the depth of search. Extensive empirical results demonstrate that the patience factor improves the decoding performance of strong pretrained models on news text summarization and machine translation over diverse language pairs, with a negligible inference slowdown. Our approach only modifies one line of code and can thus be readily incorporated into any implementation.
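The stopping-criterion change can be illustrated with a toy beam search. This is a minimal sketch, not the authors' code: `step_fn` is an assumed interface returning (token, log-probability) continuations, and the only departure from vanilla beam decoding is the single commented line.

```python
def beam_search(step_fn, bos, eos, beam_size=4, patience=1.0, max_len=20):
    """Toy beam search with a patience factor (illustrative sketch).

    Vanilla beam decoding stops once the number of finished hypotheses
    reaches `beam_size`; the patience factor generalizes this threshold
    to `patience * beam_size`, so patience > 1 searches deeper and
    patience < 1 stops earlier.
    """
    beams = [(0.0, [bos])]
    finished = []
    for _ in range(max_len):
        candidates = []
        for score, seq in beams:
            for tok, logp in step_fn(seq):
                candidates.append((score + logp, seq + [tok]))
        candidates.sort(key=lambda c: -c[0])
        beams = []
        for score, seq in candidates:
            if seq[-1] == eos:
                finished.append((score, seq))
            elif len(beams) < beam_size:
                beams.append((score, seq))
        # The one-line change: generalized stopping criterion.
        if len(finished) >= patience * beam_size or not beams:
            break
    pool = finished if finished else beams
    return max(pool, key=lambda c: c[0])
```

With patience = 1.0 the routine reduces to the standard first-come, first-served stopping rule; larger values let the search continue expanding hypotheses after the usual cutoff.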
Multi-agent reinforcement learning methods have shown remarkable potential in solving complex multi-agent problems but mostly lack theoretical guarantees. Recently, mean field control and mean field games have been established as a tractable solution for large-scale multi-agent problems with many agents. In this work, driven by a motivating scheduling problem, we consider a discrete-time mean field control model with common environment states. We rigorously establish approximate optimality as the number of agents grows in the finite agent case and find that a dynamic programming principle holds, resulting in the existence of an optimal stationary policy. As exact solutions are difficult in general due to the resulting continuous action space of the limiting mean field Markov decision process, we apply established deep reinforcement learning methods to solve the associated mean field control problem. The performance of the learned mean field control policy is compared to typical multi-agent reinforcement learning approaches and is found to converge to the mean field performance for sufficiently many agents, verifying the obtained theoretical results and reaching competitive solutions.
The actual progression of pitting on ball screw drive spindles is not well known, since previous studies have relied only on the investigation of indirect wear effects (e.g. temperature, motor current, structure-borne noise). Using images from a camera system for ball screw drives, this paper elaborates on the visual analysis of pitting itself. Owing to its direct, condition-based assessment of the wear state, an image-based approach offers several advantages, such as good interpretability, low influence of environmental conditions, and high spatial resolution. The study presented in this paper is based on a dataset containing the entire wear progression, from original condition to component failure, of ten ball screw drive spindles. The dataset is analyzed with respect to the following parameters: the axial length, tangential length, and surface area of each pit, the total number of pits, and the time of initial visual appearance of each pit. The results provide evidence that wear development can be quantified based on visual wear characteristics. In addition, using the dedicated camera system, the actual growth curve of individual pits can be captured during machine operation. Based on the findings of the analysis, the authors propose a formula for standards-based wear quantification using geometric wear characteristics.
Cine cardiac magnetic resonance (CMR) imaging is considered the gold standard for cardiac function evaluation. However, cine CMR acquisition is inherently slow and in recent decades considerable effort has been put into accelerating scan times without compromising image quality or the accuracy of derived results. In this paper, we present a fully-automated, quality-controlled integrated framework for reconstruction, segmentation and downstream analysis of undersampled cine CMR data. The framework enables active acquisition of radial k-space data, in which acquisition can be stopped as soon as acquired data are sufficient to produce high quality reconstructions and segmentations. This results in reduced scan times and automated analysis, enabling robust and accurate estimation of functional biomarkers. To demonstrate the feasibility of the proposed approach, we perform realistic simulations of radial k-space acquisitions on a dataset of subjects from the UK Biobank and present results on in-vivo cine CMR k-space data collected from healthy subjects. The results demonstrate that our method can produce quality-controlled images in a mean scan time reduced from 12 to 4 seconds per slice, and that image quality is sufficient to allow clinically relevant parameters to be automatically estimated to within 5% mean absolute difference.
Kinematics decoding from brain activity helps in developing rehabilitation and power-augmenting brain-computer interface (BCI) devices. Low-frequency signals recorded with non-invasive electroencephalography (EEG) carry neural correlates of motor activity that can be utilised for motor trajectory decoding (MTD). In this communication, the ability to decode motor kinematics trajectories from pre-movement delta-band (0.5-3 Hz) EEG is investigated for healthy participants. In particular, two deep learning-based neural decoders, called PreMovNet-I and PreMovNet-II, are proposed that make use of the motor-related neural information present in pre-movement EEG data. EEG segments with time lags of 150 ms, 200 ms, 250 ms, 300 ms, and 350 ms before movement onset are utilised for this purpose. MTD is presented for the grasp-and-lift task (WAY-EEG-GAL dataset), with EEG at the various lags taken as input to the neural decoders. The performance of the proposed decoders is compared with that of the state-of-the-art multi-variable linear regression (mLR) model, using the Pearson correlation coefficient and the decoded hand trajectories as performance metrics. The results demonstrate the viability of decoding 3D hand kinematics from pre-movement EEG data, enabling better control of BCI-based external devices such as exoskeletons/exosuits.
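Extracting a time-lagged pre-movement segment could look like the sketch below. This is an illustrative assumption, not the paper's code: the window length and sampling rate are hypothetical parameters, and the decoders (PreMovNet-I/II) would consume such windows as input.

```python
import numpy as np

def premovement_window(eeg, onset_idx, lag_ms, win_ms, fs=500):
    """Extract a pre-movement EEG window (illustrative sketch).

    eeg:       (channels, samples) array of (delta-band filtered) EEG
    onset_idx: sample index of movement onset
    lag_ms:    gap between the window end and onset (e.g. 150-350 ms)
    win_ms:    window length in ms (assumed parameter)
    fs:        sampling rate in Hz (assumed parameter)

    Returns the window of `win_ms` ending `lag_ms` before onset.
    """
    lag = int(lag_ms * fs / 1000)
    win = int(win_ms * fs / 1000)
    end = onset_idx - lag
    start = end - win
    if start < 0:
        raise ValueError("not enough pre-movement samples")
    return eeg[:, start:end]
```

Sweeping `lag_ms` over 150-350 ms then yields the five input variants compared in the study.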
Mixed reality applications often require virtual objects that are partly occluded by real objects. However, previous research and commercial products have limitations in terms of performance and efficiency. To address these challenges, we propose a novel depth contour occlusion (DCO) algorithm. The proposed method is based on the sensitivity of contour occlusion and a binocular stereoscopic vision device. In this method, a sparse depth map, obtained from a two-stage adaptive filter area stereo matching algorithm, is combined with the depth contour information of the objects, extracted by a digital image stabilisation optical flow method. We also propose a quadratic optimisation model with three constraints to generate an accurate dense depth contour map for high-quality real-virtual occlusion. The whole process is accelerated on the GPU. To evaluate the effectiveness of the algorithm, we present a statistical analysis of the time consumption of each stage of the DCO algorithm. To verify the reliability of the real-virtual occlusion effect, we conduct an experimental analysis of single-sided, enclosed, and complex occlusions, and compare the results with those of the occlusion method without quadratic optimisation. With our GPU implementation for real-time DCO, the evaluation indicates that the presented DCO algorithm enhances both the real-time performance and the visual quality of real-virtual occlusion.
As a widely studied task, video restoration aims to enhance the quality of videos affected by multiple potential degradations, such as noise, blur, and compression artifacts. Among video restoration tasks, compressed video quality enhancement and video super-resolution are two of the main tracks, with significant value in practical scenarios. Recently, recurrent neural networks and transformers have attracted increasing research interest in this field due to their impressive capability in sequence-to-sequence modeling. However, training these models is not only costly but also relatively hard to converge, suffering from gradient exploding and vanishing problems. To cope with these problems, we propose a two-stage framework comprising a multi-frame recurrent network and a single-frame transformer. In addition, multiple training strategies, such as transfer learning and progressive training, are developed to shorten the training time and improve model performance. Benefiting from the above technical contributions, our solution won two first places and one runner-up prize in the NTIRE 2022 super-resolution and quality enhancement of compressed video challenges.