Language evolves over time, and word meaning changes accordingly. This is especially true in social media, since its dynamic nature leads to faster semantic shifts, making it challenging for NLP models to deal with new content and trends. However, the number of datasets and models that specifically address the dynamic nature of these social platforms is scarce. To bridge this gap, we present TempoWiC, a new benchmark especially aimed at accelerating research in social media-based meaning shift. Our results show that TempoWiC is a challenging benchmark, even for recently-released language models specialized in social media.
Single view-based reconstruction of hand-object interaction is challenging due to the severe observation missing caused by occlusions. This paper proposes a physics-based method to better solve the ambiguities in the reconstruction. It first proposes a force-based dynamic model of the in-hand object, which not only recovers the unobserved contacts but also solves for plausible contact forces. Next, a confidence-based slide prevention scheme is proposed, which combines both the kinematic confidences and the contact forces to jointly model static and sliding contact motion. Qualitative and quantitative experiments show that the proposed technique reconstructs both physically plausible and more accurate hand-object interaction and estimates plausible contact forces in real-time with a single RGBD sensor.
Conceptually, Neural Ordinary Differential Equations (NeuralODEs) pose an attractive way to extract dynamical laws from time series data, as they are natural extensions of the traditional differential equation-based modeling paradigm of the physical sciences. In practice, NeuralODEs display long training times and suboptimal results, especially for longer duration data where they may fail to fit the data altogether. While methods have been proposed to stabilize NeuralODE training, many of these involve placing a strong constraint on the functional form the trained NeuralODE can take that the actual underlying governing equation does not guarantee satisfaction. In this work, we present a novel NeuralODE training algorithm that leverages tools from the chaos and mathematical optimization communities - synchronization and homotopy optimization - for a breakthrough in tackling the NeuralODE training obstacle. We demonstrate architectural changes are unnecessary for effective NeuralODE training. Compared to the conventional training methods, our algorithm achieves drastically lower loss values without any changes to the model architectures. Experiments on both simulated and real systems with complex temporal behaviors demonstrate NeuralODEs trained with our algorithm are able to accurately capture true long term behaviors and correctly extrapolate into the future.
Humans can generate reasonable answers to novel queries (Schulz, 2012): if I asked you what kind of food you want to eat for lunch, you would respond with a food, not a time. The thought that one would respond "After 4pm" to "What would you like to eat" is either a joke or a mistake, and seriously entertaining it as a lunch option would likely never happen in the first place. While understanding how people come up with new ideas, thoughts, explanations, and hypotheses that obey the basic constraints of a novel search space is of central importance to cognitive science, there is no agreed-on formal model for this kind of reasoning. We propose that a core component of any such reasoning system is a type theory: a formal imposition of structure on the kinds of computations an agent can perform, and how they're performed. We motivate this proposal with three empirical observations: adaptive constraints on learning and inference (i.e. generating reasonable hypotheses), how people draw distinctions between improbability and impossibility, and people's ability to reason about things at varying levels of abstraction.
In this paper we propose a novel algorithm, Wi-Closure, to improve computational efficiency and robustness of loop closure detection in multi-robot SLAM. Our approach decreases the computational overhead of classical approaches by pruning the search space of potential loop closures, prior to evaluation by a typical multi-robot SLAM pipeline. Wi-Closure achieves this by identifying candidates that are spatially close to each other by using sensing over the wireless communication signal between robots, even when they are operating in non-line-of-sight or in remote areas of the environment from one another. We demonstrate the validity of our approach in simulation and hardware experiments. Our results show that using Wi-closure greatly reduces computation time, by 54% in simulation and by 77% in hardware compared, with a multi-robot SLAM baseline. Importantly, this is achieved without sacrificing accuracy. Using Wi-Closure reduces absolute trajectory estimation error by 99% in simulation and 89.2% in hardware experiments. This improvement is due in part to Wi-Closure's ability to avoid catastrophic optimization failure that typically occurs with classical approaches in challenging repetitive environments.
In many complex sequential decision making tasks, online planning is crucial for high-performance. For efficient online planning, Monte Carlo Tree Search (MCTS) employs a principled mechanism for trading off between exploration and exploitation. MCTS outperforms comparison methods in various discrete decision making domains such as Go, Chess, and Shogi. Following, extensions of MCTS to continuous domains have been proposed. However, the inherent high branching factor and the resulting explosion of search tree size is limiting existing methods. To solve this problem, this paper proposes Continuous Monte Carlo Graph Search (CMCGS), a novel extension of MCTS to online planning in environments with continuous state and action spaces. CMCGS takes advantage of the insight that, during planning, sharing the same action policy between several states can yield high performance. To implement this idea, at each time step CMCGS clusters similar states into a limited number of stochastic action bandit nodes, which produce a layered graph instead of an MCTS search tree. Experimental evaluation with limited sample budgets shows that CMCGS outperforms comparison methods in several complex continuous DeepMind Control Suite benchmarks and a 2D navigation task.
This is the technical report of the 9th place in the final result of PARSE2022 Challenge. We solve the segmentation problem of the pulmonary artery by using a two-stage method based on a 3D CNN network. The coarse model is used to locate the ROI, and the fine model is used to refine the segmentation result. In addition, in order to improve the segmentation performance, we adopt multi-view and multi-window level method, at the same time we employ a fine-tune strategy to mitigate the impact of inconsistent labeling.
In many applications, signal denoising is often the first pre-processing step before any subsequent analysis or learning task. In this paper, we propose to apply a deep learning denoising model inspired by a signal processing, a learnable version of wavelet packet transform. The proposed algorithm has signficant learning capabilities with few interpretable parameters and has an intuitive initialisation. We propose a post-learning modification of the parameters to adapt the denoising to different noise levels. We evaluate the performance of the proposed methodology on two case studies and compare it to other state of the art approaches, including wavelet schrinkage denoising, convolutional neural network, autoencoder and U-net deep models. The first case study is based on designed functions that have typically been used to study denoising properties of the algorithms. The second case study is an audio background removal task. We demonstrate how the proposed algorithm relates to the universality of signal processing methods and the learning capabilities of deep learning approaches. In particular, we evaluate the obtained denoising performances on structured noisy signals inside and outside the classes used for training. In addition to having good performance in denoising signals inside and outside to the training class, our method shows to be particularly robust when different noise levels, noise types and artifacts are added.
Reinforcement Learning (RL) has witnessed great strides for quadruped locomotion, with continued progress in the reliable sim-to-real transfer of policies. However, it remains a challenge to reuse a policy on another robot, which could save time for retraining. In this work, we present a framework for zero-shot policy retargeting wherein diverse motor skills can be transferred between robots of different shapes and sizes. The new framework centers on a planning-and-control pipeline that systematically integrates RL and Model Predictive Control (MPC). The planning stage employs RL to generate a dynamically plausible trajectory as well as the contact schedule, avoiding the combinatorial complexity of contact sequence optimization. This information is then used to seed the MPC to stabilize and robustify the policy roll-out via a new Hybrid Kinodynamic (HKD) model that implicitly optimizes the foothold locations. Hardware results show an ability to transfer policies from both the A1 and Laikago robots to the MIT Mini Cheetah robot without requiring any policy re-tuning.
Cardiac magnetic resonance (CMR) sequences visualise the cardiac function voxel-wise over time. Simultaneously, deep learning-based deformable image registration is able to estimate discrete vector fields which warp one time step of a CMR sequence to the following in a self-supervised manner. However, despite the rich source of information included in these 3D+t vector fields, a standardised interpretation is challenging and the clinical applications remain limited so far. In this work, we show how to efficiently use a deformable vector field to describe the underlying dynamic process of a cardiac cycle in form of a derived 1D motion descriptor. Additionally, based on the expected cardiovascular physiological properties of a contracting or relaxing ventricle, we define a set of rules that enables the identification of five cardiovascular phases including the end-systole (ES) and end-diastole (ED) without the usage of labels. We evaluate the plausibility of the motion descriptor on two challenging multi-disease, -center, -scanner short-axis CMR datasets. First, by reporting quantitative measures such as the periodic frame difference for the extracted phases. Second, by comparing qualitatively the general pattern when we temporally resample and align the motion descriptors of all instances across both datasets. The average periodic frame difference for the ED, ES key phases of our approach is $0.80\pm{0.85}$, $0.69\pm{0.79}$ which is slightly better than the inter-observer variability ($1.07\pm{0.86}$, $0.91\pm{1.6}$) and the supervised baseline method ($1.18\pm{1.91}$, $1.21\pm{1.78}$). Code and labels will be made available on our GitHub repository. https://github.com/Cardio-AI/cmr-phase-detection