Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

"Information": models, code, and papers

Deep-Learning Estimation of Weight Distribution Using Joint Kinematics for Lower-Limb Exoskeleton Control

Feb 06, 2024
Clément Lhoste, Emek Barış Küçüktabak, Lorenzo Vianello, Lorenzo Amato, Matthew R. Short, Kevin Lynch, Jose L. Pons

In the control of lower-limb exoskeletons with feet, the phase in the gait cycle can be identified by monitoring the weight distribution at the feet. This phase information can be used in the exoskeleton's controller to compensate the dynamics of the exoskeleton and to assign impedance parameters. Typically the weight distribution is calculated using data from sensors such as treadmill force plates or insole force sensors. However, these solutions increase both the setup complexity and cost. For this reason, we propose a deep-learning approach that uses a short time window of joint kinematics to predict the weight distribution of an exoskeleton in real time. The model was trained on treadmill walking data from six users wearing a four-degree-of-freedom exoskeleton and tested in real time on three different users wearing the same device. This test set includes two users not present in the training set to demonstrate the model's ability to generalize across individuals. Results show that the proposed method is able to fit the actual weight distribution with R2=0.9 and is suitable for real-time control with prediction times less than 1 ms. Experiments in closed-loop exoskeleton control show that deep-learning-based weight distribution estimation can be used to replace force sensors in overground and treadmill walking.

Via

Access Paper or Ask Questions

LF Tracy: A Unified Single-Pipeline Approach for Salient Object Detection in Light Field Cameras

Jan 30, 2024
Fei Teng, Jiaming Zhang, Jiawei Liu, Kunyu Peng, Xina Cheng, Zhiyong Li, Kailun Yang

Leveraging the rich information extracted from light field (LF) cameras is instrumental for dense prediction tasks. However, adapting light field data to enhance Salient Object Detection (SOD) still follows the traditional RGB methods and remains under-explored in the community. Previous approaches predominantly employ a custom two-stream design to discover the implicit angular feature within light field cameras, leading to significant information isolation between different LF representations. In this study, we propose an efficient paradigm (LF Tracy) to address this limitation. We eschew the conventional specialized fusion and decoder architecture for a dual-stream backbone in favor of a unified, single-pipeline approach. This comprises firstly a simple yet effective data augmentation strategy called MixLD to bridge the connection of spatial, depth, and implicit angular information under different LF representations. A highly efficient information aggregation (IA) module is then introduced to boost asymmetric feature-wise information fusion. Owing to this innovative approach, our model surpasses the existing state-of-the-art methods, particularly demonstrating a 23% improvement over previous results on the latest large-scale PKU dataset. By utilizing only 28.9M parameters, the model achieves a 10% increase in accuracy with 3M additional parameters compared to its backbone using RGB images and an 86% rise to its backbone using LF images. The source code will be made publicly available at https://github.com/FeiBryantkit/LF-Tracy.

* The source code will be made publicly available at https://github.com/FeiBryantkit/LF-Tracy

Via

Access Paper or Ask Questions

Neural Models and Algorithms for Sensorimotor Control of an Octopus Arm

Feb 02, 2024
Tixian Wang, Udit Halder, Ekaterina Gribkova, Rhanor Gillette, Mattia Gazzola, Prashant G. Mehta

In this article, a biophysically realistic model of a soft octopus arm with internal musculature is presented. The modeling is motivated by experimental observations of sensorimotor control where an arm localizes and reaches a target. Major contributions of this article are: (i) development of models to capture the mechanical properties of arm musculature, the electrical properties of the arm peripheral nervous system (PNS), and the coupling of PNS with muscular contractions; (ii) modeling the arm sensory system, including chemosensing and proprioception; and (iii) algorithms for sensorimotor control, which include a novel feedback neural motor control law for mimicking target-oriented arm reaching motions, and a novel consensus algorithm for solving sensing problems such as locating a food source from local chemical sensory information (exogenous) and arm deformation information (endogenous). Several analytical results, including rest-state characterization and stability properties of the proposed sensing and motor control algorithms, are provided. Numerical simulations demonstrate the efficacy of our approach. Qualitative comparisons against observed arm rest shapes and target-oriented reaching motions are also reported.

Via

Access Paper or Ask Questions

PFDM: Parser-Free Virtual Try-on via Diffusion Model

Feb 05, 2024
Yunfang Niu, Dong Yi, Lingxiang Wu, Zhiwei Liu, Pengxiang Cai, Jinqiao Wang

Virtual try-on can significantly improve the garment shopping experiences in both online and in-store scenarios, attracting broad interest in computer vision. However, to achieve high-fidelity try-on performance, most state-of-the-art methods still rely on accurate segmentation masks, which are often produced by near-perfect parsers or manual labeling. To overcome the bottleneck, we propose a parser-free virtual try-on method based on the diffusion model (PFDM). Given two images, PFDM can "wear" garments on the target person seamlessly by implicitly warping without any other information. To learn the model effectively, we synthesize many pseudo-images and construct sample pairs by wearing various garments on persons. Supervised by the large-scale expanded dataset, we fuse the person and garment features using a proposed Garment Fusion Attention (GFA) mechanism. Experiments demonstrate that our proposed PFDM can successfully handle complex cases, synthesize high-fidelity images, and outperform both state-of-the-art parser-free and parser-based models.

* Accepted by IEEE ICASSP 2024

Via

Access Paper or Ask Questions

Learning from Teaching Regularization: Generalizable Correlations Should be Easy to Imitate

Feb 05, 2024
Can Jin, Tong Che, Hongwu Peng, Yiyuan Li, Marco Pavone

Generalization remains a central challenge in machine learning. In this work, we propose Learning from Teaching (LoT), a novel regularization technique for deep neural networks to enhance generalization. Inspired by the human ability to capture concise and abstract patterns, we hypothesize that generalizable correlations are expected to be easier to teach. LoT operationalizes this concept to improve the generalization of the main model with auxiliary student learners. The student learners are trained by the main model and improve the main model to capture more generalizable and teachable correlations by providing feedback. Our experimental results across several domains, including Computer Vision, Natural Language Processing, and Reinforcement Learning, demonstrate that the introduction of LoT brings significant benefits compared to merely training models on the original training data. It suggests the effectiveness of LoT in identifying generalizable information without falling into the swamp of complex patterns in data, making LoT a valuable addition to the current machine learning frameworks.

Via

Access Paper or Ask Questions

Vision-Language Models Provide Promptable Representations for Reinforcement Learning

Feb 05, 2024
William Chen, Oier Mees, Aviral Kumar, Sergey Levine

Humans can quickly learn new behaviors by leveraging background world knowledge. In contrast, agents trained with reinforcement learning (RL) typically learn behaviors from scratch. We thus propose a novel approach that uses the vast amounts of general and indexable world knowledge encoded in vision-language models (VLMs) pre-trained on Internet-scale data for embodied RL. We initialize policies with VLMs by using them as promptable representations: embeddings that are grounded in visual observations and encode semantic features based on the VLM's internal knowledge, as elicited through prompts that provide task context and auxiliary information. We evaluate our approach on visually-complex, long horizon RL tasks in Minecraft and robot navigation in Habitat. We find that our policies trained on embeddings extracted from general-purpose VLMs outperform equivalent policies trained on generic, non-promptable image embeddings. We also find our approach outperforms instruction-following methods and performs comparably to domain-specific embeddings.

Via

Access Paper or Ask Questions

Zero-Shot Machine Unlearning at Scale via Lipschitz Regularization

Feb 05, 2024
Jack Foster, Kyle Fogarty, Stefan Schoepf, Cengiz Öztireli, Alexandra Brintrup

To comply with AI and data regulations, the need to forget private or copyrighted information from trained machine learning models is increasingly important. The key challenge in unlearning is forgetting the necessary data in a timely manner, while preserving model performance. In this work, we address the zero-shot unlearning scenario, whereby an unlearning algorithm must be able to remove data given only a trained model and the data to be forgotten. Under such a definition, existing state-of-the-art methods are insufficient. Building on the concepts of Lipschitz continuity, we present a method that induces smoothing of the forget sample's output, with respect to perturbations of that sample. We show this smoothing successfully results in forgetting while preserving general model performance. We perform extensive empirical evaluation of our method over a range of contemporary benchmarks, verifying that our method achieves state-of-the-art performance under the strict constraints of zero-shot unlearning.

Via

Access Paper or Ask Questions

HICH Image/Text (HICH-IT): Comprehensive Text and Image Datasets for Hypertensive Intracerebral Hemorrhage Research

Feb 05, 2024
Jie Li, Yulong Xia, Tongxin Yang, Fenglin Cai, Miao Wei, Zhiwei Zhang, Li Jiang

In this paper, we introduce a new dataset in the medical field of hypertensive intracerebral hemorrhage (HICH), called HICH-IT, which includes both electronic medical records (EMRs) and head CT images. This dataset is designed to enhance the accuracy of artificial intelligence in the diagnosis and treatment of HICH. This dataset, built upon the foundation of standard text and image data, incorporates specific annotations within the EMRs, extracting key content from the text information, and categorizes the annotation content of imaging data into four types: brain midline, hematoma, left and right cerebral ventricle. HICH-IT aims to be a foundational dataset for feature learning in image segmentation tasks and named entity recognition. To further understand the dataset, we have trained deep learning algorithms to observe the performance. The pretrained models have been released at both www.daip.club and github.com/Deep-AI-Application-DAIP. The dataset has been uploaded to https://github.com/CYBUS123456/HICH-IT-Datasets. Index Terms-HICH, Deep learning, Intraparenchymal hemorrhage, named entity recognition, novel dataset

Via

Access Paper or Ask Questions

Enhancing Textbook Question Answering Task with Large Language Models and Retrieval Augmented Generation

Feb 05, 2024
Hessa Abdulrahman Alawwad, Areej Alhothali, Usman Naseem, Ali Alkhathlan, Amani Jamal

Textbook question answering (TQA) is a challenging task in artificial intelligence due to the complex nature of context and multimodal data. Although previous research has significantly improved the task, there are still some limitations including the models' weak reasoning and inability to capture contextual information in the lengthy context. The introduction of large language models (LLMs) has revolutionized the field of AI, however, directly applying LLMs often leads to inaccurate answers. This paper proposes a methodology that handle the out-of-domain scenario in TQA where concepts are spread across different lessons by incorporating the retrieval augmented generation (RAG) technique and utilize transfer learning to handle the long context and enhance reasoning abilities. Through supervised fine-tuning of the LLM model Llama-2 and the incorporation of RAG, our architecture outperforms the baseline, achieving a 4.12% accuracy improvement on validation set and 9.84% on test set for non-diagram multiple-choice questions.

Via

Access Paper or Ask Questions

SDRDPy: An application to graphically visualize the knowledge obtained with supervised descriptive rule algorithms

Jan 31, 2024
M. A. Padilla-Rascon, P. Gonzalez, C. J. Carmona

SDRDPy is a desktop application that allows experts an intuitive graphic and tabular representation of the knowledge extracted by any supervised descriptive rule discovery algorithm. The application is able to provide an analysis of the data showing the relevant information of the data set and the relationship between the rules, data and the quality measures associated for each rule regardless of the tool where algorithm has been executed. All of the information is presented in a user-friendly application in order to facilitate expert analysis and also the exportation of reports in different formats.

Via

Access Paper or Ask Questions