Large language models (LLMs) have been widely studied for their ability to store and utilize positive knowledge. However, negative knowledge, such as "lions don't live in the ocean", is also ubiquitous in the world but rarely mentioned explicitly in text. What do LLMs know about negative knowledge? This work examines the ability of LLMs to handle negative commonsense knowledge. We design a constrained keywords-to-sentence generation task (CG) and a Boolean question-answering task (QA) to probe LLMs. Our experiments reveal that LLMs frequently fail to generate valid sentences grounded in negative commonsense knowledge, yet they can correctly answer polar yes-or-no questions. We term this phenomenon the belief conflict of LLMs. Our further analysis shows that statistical shortcuts and negation reporting bias from language modeling pre-training cause this conflict.
Entity relation extraction consists of two sub-tasks: entity recognition and relation extraction. Existing methods either tackle these two tasks separately or unify them with word-by-word interactions. In this paper, we propose HIORE, a new method for unified entity relation extraction. The key insight is to leverage high-order interactions, i.e., the complex association among word pairs, which contains richer information than the first-order word-by-word interactions. For this purpose, we first devise a W-shape DNN (WNet) to capture coarse-level high-order connections. Then, we build a heuristic high-order graph and further calibrate the representations with a graph neural network (GNN). Experiments on three benchmarks (ACE04, ACE05, SciERC) show that HIORE achieves state-of-the-art performance on relation extraction and an improvement of 1.1 to 1.8 F1 points over the prior best unified model.
Large language models (LLMs) have demonstrated remarkable potential in handling multilingual machine translation (MMT). In this paper, we systematically investigate the advantages and challenges of LLMs for MMT by answering two questions: 1) How well do LLMs perform in translating a massive number of languages? 2) Which factors affect LLMs' performance in translation? We evaluate popular LLMs, including XGLM, OPT, BLOOMZ, and ChatGPT, on 102 languages. Our empirical results show that even the best model, ChatGPT, still lags behind the supervised baseline NLLB in 83.33% of translation directions. Through further analysis, we discover that LLMs exhibit new working patterns when used for MMT. First, prompt semantics can surprisingly be ignored when in-context exemplars are given: LLMs still show strong performance even with unreasonable prompts. Second, cross-lingual exemplars can provide better task instruction for low-resource translation than exemplars in the same language pairs. Third, we observe overestimated performance of BLOOMZ on the Flores-101 dataset, indicating a potential risk when using public datasets for evaluation.
Designing protein sequences with desired biological function is crucial in biology and chemistry. Recent machine learning methods use a surrogate sequence-function model to replace the expensive wet-lab validation. How can we efficiently generate diverse and novel protein sequences with high fitness? In this paper, we propose IsEM-Pro, an approach to generate protein sequences towards a given fitness criterion. At its core, IsEM-Pro is a latent generative model, augmented by combinatorial structure features from separately learned Markov random fields (MRFs). We develop a Monte Carlo Expectation-Maximization method (MCEM) to learn the model. During inference, sampling from its latent space enhances diversity while its MRF features guide the exploration in high-fitness regions. Experiments on eight protein sequence design tasks show that IsEM-Pro outperforms the previous best methods by at least 55% on average fitness score and generates more diverse and novel protein sequences.
The interplay between structural and electrical changes in the heart after myocardial infarction (MI) plays a key role in the initiation and maintenance of arrhythmia. The anatomical and electrophysiological properties of scar, border zone, and normal myocardium modify the electrocardiographic morphology, which is routinely analysed in clinical settings. However, the influence of various MI properties on the QRS is not intuitively predictable. In this work, we have systematically investigated the effects of 17 post-MI scenarios, varying the location, size, transmural extent, and conductive level of scarring and border zone area, on the forward-calculated QRS. Additionally, we have compared the contributions of different QRS score criteria for quantifying post-MI pathophysiology. The propagation of electrical activity in the ventricles is simulated via an Eikonal model on a unified coordinate system. The analysis has been performed on 49 subjects, and the results imply that the QRS is capable of identifying MI, suggesting the feasibility of inversely reconstructing infarct regions from QRS. The sensitivity of the different QRS criteria varies across the 17 MI scenarios, which is informative for solving the inverse problem.
Pre-trained Language Models (PLMs), as parametric-based eager learners, have become the de-facto choice for current paradigms of Natural Language Processing (NLP). In contrast, k-Nearest-Neighbor (k-NN) classifiers, as the lazy learning paradigm, tend to mitigate over-fitting and isolated noise. In this paper, we revisit k-NN classifiers for augmenting PLM-based classifiers. At the methodological level, we propose to adopt k-NN with textual representations of PLMs in two steps: (1) Utilize k-NN as prior knowledge to calibrate the training process. (2) Linearly interpolate the probability distribution predicted by k-NN with that of the PLMs' classifier. At the heart of our approach is the implementation of k-NN-calibrated training, which treats predicted results as indicators for easy versus hard examples during the training process. To cover a diverse range of application scenarios, we conduct extensive experiments on the fine-tuning and prompt-tuning paradigms and in zero-shot, few-shot and fully-supervised settings, across eight diverse end-tasks. We hope our exploration will encourage the community to revisit the power of classical methods for efficient NLP\footnote{Code and datasets are available at https://github.com/zjunlp/Revisit-KNN.}.
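Step (2) of the abstract above, linear interpolation of the k-NN distribution with the classifier distribution, can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the function names `knn_probs` and `interpolate`, the use of Euclidean distance with unweighted neighbor voting, and the mixing weight `lam`.

```python
import math
from collections import Counter


def knn_probs(query, memory, num_classes, k=3):
    """Class distribution from the k stored representations nearest to `query`.

    `memory` is a list of (vector, label) pairs. Unweighted voting over
    Euclidean neighbors is an illustrative assumption, not the paper's choice.
    """
    neighbors = sorted(memory, key=lambda item: math.dist(query, item[0]))[:k]
    counts = Counter(label for _, label in neighbors)
    return [counts.get(c, 0) / len(neighbors) for c in range(num_classes)]


def interpolate(p_knn, p_plm, lam=0.5):
    """Linear interpolation: lam * p_knn + (1 - lam) * p_plm, per class."""
    return [lam * a + (1 - lam) * b for a, b in zip(p_knn, p_plm)]


# Toy usage: four stored 2-D "representations" with labels 0 and 1.
memory = [([0.0, 0.0], 0), ([0.0, 1.0], 0), ([5.0, 5.0], 1), ([5.0, 6.0], 1)]
p_knn = knn_probs([0.1, 0.2], memory, num_classes=2, k=3)
p_final = interpolate(p_knn, p_plm=[0.4, 0.6], lam=0.5)
print(p_final)
```

Because both inputs are probability distributions and `lam` lies in [0, 1], the interpolated output also sums to one, so it can be used directly for prediction.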
Numerous diseases and aging can cause degeneration of people's balance ability, resulting in limited mobility and even a high risk of falls. Robotic technologies can provide more intensive rehabilitation exercises or be used as assistive devices to compensate for balance ability. However, with the new healthcare paradigm shifting from hospital care to home care, there is a gap in robotic systems that can provide care at home. This paper introduces Mobile Robotic Balance Assistant (MRBA), a compact and cost-effective balance assistive robot that can provide both rehabilitation training and activities of daily living (ADLs) assistance at home. A three degrees of freedom (3-DoF) robotic arm was designed to mimic the therapist arm function to provide balance assistance to the user. To minimize the interference with users' natural pelvis movements and gait patterns, the robot must have a Human-Robot Interface (HRI) that can detect user intention accurately and follow the user's movement smoothly and in a timely manner. Thus, a graceful user-following control rule was proposed. The overall control architecture consists of two parts: an observer for human input estimation and an LQR-based controller with disturbance rejection. The proposed controller is validated in high-fidelity simulation with actual human trajectories, and the results show the effectiveness of the method in different walking modes.
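The LQR component mentioned in the abstract above can be illustrated in its simplest form. The sketch below solves a scalar, discrete-time, infinite-horizon LQR problem by iterating the Riccati recursion. It is not the MRBA controller: the scalar model x[k+1] = a*x[k] + b*u[k], the cost weights `q` and `r`, and the function name `scalar_lqr` are all assumptions for illustration, and the observer and disturbance-rejection parts are omitted.

```python
def scalar_lqr(a, b, q, r, tol=1e-12, max_iter=10_000):
    """Return (K, P) for x[k+1] = a*x[k] + b*u[k] with cost sum(q*x^2 + r*u^2).

    Iterates the discrete-time Riccati recursion until P stops changing.
    The optimal feedback law is u[k] = -K * x[k].
    """
    p = q
    for _ in range(max_iter):
        p_next = q + a * a * p - (a * b * p) ** 2 / (r + b * b * p)
        if abs(p_next - p) < tol:
            p = p_next
            break
        p = p_next
    k_gain = a * b * p / (r + b * b * p)
    return k_gain, p


# Toy usage: drive a tracking error x toward zero.
K, P = scalar_lqr(a=1.0, b=1.0, q=1.0, r=1.0)
x = 1.0
for _ in range(20):
    x = 1.0 * x + 1.0 * (-K * x)  # closed loop: x[k+1] = (a - b*K) * x[k]
print(K, x)
```

Raising `q` relative to `r` penalizes tracking error more heavily and yields a larger gain, which is the basic trade-off an LQR design exposes.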
Transfer learning is fundamental for addressing problems in settings with little training data. While several transfer learning approaches have been proposed in 3D, these solutions typically operate at the level of an entire 3D object or even a whole scene and thus, as we show, fail to generalize to new classes, such as deformable organic shapes. In addition, there is currently a lack of understanding of what makes pre-trained features transferable across significantly different 3D shape categories. In this paper, we take a step toward addressing these challenges. First, we analyze the link between feature locality and transferability in tasks involving deformable 3D objects, while also comparing different backbones and losses for local feature pre-training. We observe that with proper training, learned features can be useful in such tasks, but, crucially, only with an appropriate choice of the receptive field size. We then propose a differentiable method for optimizing the receptive field within 3D transfer learning. Jointly, this leads to the first learnable features that can successfully generalize to unseen classes of 3D shapes such as humans and animals. Our extensive experiments show that this approach leads to state-of-the-art results on several downstream tasks such as segmentation, shape correspondence, and classification. Our code is available at \url{https://github.com/pvnieo/vader}.