Previous STRIPS domain model acquisition approaches that learn from state traces start with the names and parameters of the actions to be learned. Therefore their only task is to deduce the preconditions and effects of the given actions. In this work, we explore learning in situations when the parameters of learned actions are not provided. We define two levels of trace quality based on which information is provided and present an algorithm for each. In one level (L1), the states in the traces are labeled with action names, so we can deduce the number and names of the actions, but we still need to work out the number and types of parameters. In the other level (L2), the states are additionally labeled with objects that constitute the parameters of the corresponding grounded actions. Here we still need to deduce the types of the parameters in the learned actions. We experimentally evaluate the proposed algorithms and compare them with the state-of-the-art learning tool FAMA on a large collection of IPC benchmarks. The evaluation shows that our new algorithms are faster, can handle larger inputs and provide better results in terms of learning action models more similar to reference models.
Social networks influence behaviors, preferences, and relationships and play a crucial role in the dissemination of information and norms within human societies. As large language models (LLMs) increasingly integrate into social and professional environments, understanding their behavior within the context of social networks and interactions becomes essential. Our study analyzes the behaviors of standard network structures and real-world networks to determine whether the dynamics of multiple LLMs align with human social dynamics. We explore various social network principles, including micro-level concepts such as preferential attachment, triadic closure, and homophily, as well as macro-level concepts like community structure and the small-world phenomenon. Our findings suggest that LLMs demonstrate all these principles when they are provided with network structures and asked about their preferences regarding network formation. Furthermore, we investigate LLMs' decision-making based on real-world networks to compare the strengths of these principles. Our results reveal that triadic closure and homophily have a stronger influence than preferential attachment and that LLMs substantially exceed random guessing in the task of network formation predictions. Overall, our study contributes to the development of socially aware LLMs by shedding light on LLMs' network formation behaviors and exploring their impacts on social dynamics and norms.
Modern techniques like contrastive learning have been effectively used in many areas, including computer vision, natural language processing, and graph-structured data. Creating positive examples that assist the model in learning robust and discriminative representations is a crucial stage in contrastive learning approaches. Usually, preset human intuition directs the selection of relevant data augmentations. Due to patterns that are easily recognized by humans, this rule of thumb works well in the vision and language domains. However, it is impractical to visually inspect the temporal structures in time series. The diversity of time series augmentations at both the dataset and instance levels makes it difficult to choose meaningful augmentations on the fly. In this study, we address this gap by analyzing time series data augmentation using information theory and summarizing the most commonly adopted augmentations in a unified format. We then propose a contrastive learning framework with parametric augmentation, AutoTCL, which can be adaptively employed to support time series representation learning. The proposed approach is encoder-agnostic, allowing it to be seamlessly integrated with different backbone encoders. Experiments on univariate forecasting tasks demonstrate the highly competitive results of our method, with an average 6.5\% reduction in MSE and 4.7\% in MAE over the leading baselines. In classification tasks, AutoTCL achieves a $1.2\%$ increase in average accuracy.
System logs are some of the most important information for the maintenance of software systems, which have become larger and more complex in recent years. The goal of log-based anomaly detection is to automatically detect system anomalies by analyzing the large number of logs generated in a short period of time, which is a critical challenge in the real world. Previous studies have used a log parser to extract templates from unstructured log data and detect anomalies on the basis of patterns of the template occurrences. These methods have limitations for logs with unknown templates. Furthermore, since most log anomalies are known to be point anomalies rather than contextual anomalies, detection methods based on occurrence patterns can cause unnecessary delays in detection. In this paper, we propose LogELECTRA, a new log anomaly detection model that analyzes a single line of log messages more deeply on the basis of self-supervised anomaly detection. LogELECTRA specializes in detecting log anomalies as point anomalies by applying ELECTRA, a natural language processing model, to analyze the semantics of a single line of log messages. LogELECTRA outperformed existing state-of-the-art methods in experiments on the public benchmark log datasets BGL, Sprit, and Thunderbird.
Multi-span answer extraction, also known as the task of multi-span question answering (MSQA), is critical for real-world applications, as it requires extracting multiple pieces of information from a text to answer complex questions. Despite the active studies and rapid progress in English MSQA research, there is a notable lack of publicly available MSQA benchmark in Chinese. Previous efforts for constructing MSQA datasets predominantly emphasized entity-centric contextualization, resulting in a bias towards collecting factoid questions and potentially overlooking questions requiring more detailed descriptive responses. To overcome these limitations, we present CLEAN, a comprehensive Chinese multi-span question answering dataset that involves a wide range of open-domain subjects with a substantial number of instances requiring descriptive answers. Additionally, we provide established models from relevant literature as baselines for CLEAN. Experimental results and analysis show the characteristics and challenge of the newly proposed CLEAN dataset for the community. Our dataset, CLEAN, will be publicly released at zhiyiluo.site/misc/clean_v1.0_ sample.json.
We present Universal Manipulation Interface (UMI) -- a data collection and policy learning framework that allows direct skill transfer from in-the-wild human demonstrations to deployable robot policies. UMI employs hand-held grippers coupled with careful interface design to enable portable, low-cost, and information-rich data collection for challenging bimanual and dynamic manipulation demonstrations. To facilitate deployable policy learning, UMI incorporates a carefully designed policy interface with inference-time latency matching and a relative-trajectory action representation. The resulting learned policies are hardware-agnostic and deployable across multiple robot platforms. Equipped with these features, UMI framework unlocks new robot manipulation capabilities, allowing zero-shot generalizable dynamic, bimanual, precise, and long-horizon behaviors, by only changing the training data for each task. We demonstrate UMI's versatility and efficacy with comprehensive real-world experiments, where policies learned via UMI zero-shot generalize to novel environments and objects when trained on diverse human demonstrations. UMI's hardware and software system is open-sourced at https://umi-gripper.github.io.
Effective diabetes management is crucial for maintaining health in diabetic patients. Large Language Models (LLMs) have opened new avenues for diabetes management, facilitating their efficacy. However, current LLM-based approaches are limited by their dependence on general sources and lack of integration with domain-specific knowledge, leading to inaccurate responses. In this paper, we propose a knowledge-infused LLM-powered conversational health agent (CHA) for diabetic patients. We customize and leverage the open-source openCHA framework, enhancing our CHA with external knowledge and analytical capabilities. This integration involves two key components: 1) incorporating the American Diabetes Association dietary guidelines and the Nutritionix information and 2) deploying analytical tools that enable nutritional intake calculation and comparison with the guidelines. We compare the proposed CHA with GPT4. Our evaluation includes 100 diabetes-related questions on daily meal choices and assessing the potential risks associated with the suggested diet. Our findings show that the proposed agent demonstrates superior performance in generating responses to manage essential nutrients.
Network telemetry based on data models is expected to become the standard mechanism for collecting operational data from network devices efficiently. But the wide variety of standard and proprietary data models along with the different implementations of telemetry protocols offered by network vendors, become a barrier when monitoring heterogeneous network infrastructures. To facilitate the integration and sharing of context information related to model-driven telemetry, this work proposes a semantic network inventory that integrates new information models specifically developed to capture context information in a vendor-agnostic fashion using current standards defined for context management. To automate the integration of this context information within the network inventory, a reference architecture is designed. Finally, a prototype of the solution is implemented and validated through a case study that illustrates how the network inventory can ease the operation of model-driven telemetry in multi-vendor networks.
The use of hyperspectral imaging for medical applications is becoming more common in recent years. One of the main obstacles that researchers find when developing hyperspectral algorithms for medical applications is the lack of specific, publicly available, and hyperspectral medical data. The work described in this paper was developed within the framework of the European project HELICoiD (HypErspectraL Imaging Cancer Detection), which had as a main goal the application of hyperspectral imaging to the delineation of brain tumors in real-time during neurosurgical operations. In this paper, the methodology followed to generate the first hyperspectral database of in-vivo human brain tissues is presented. Data was acquired employing a customized hyperspectral acquisition system capable of capturing information in the Visual and Near InfraRed (VNIR) range from 400 to 1000 nm. Repeatability was assessed for the cases where two images of the same scene were captured consecutively. The analysis reveals that the system works more efficiently in the spectral range between 450 and 900 nm. A total of 36 hyperspectral images from 22 different patients were obtained. From these data, more than 300 000 spectral signatures were labeled employing a semi-automatic methodology based on the spectral angle mapper algorithm. Four different classes were defined: normal tissue, tumor tissue, blood vessel, and background elements. All the hyperspectral data has been made available in a public repository.
In recent years, deep neural networks (DNNs) have gained widespread adoption for continuous mobile object detection (OD) tasks, particularly in autonomous systems. However, a prevalent issue in their deployment is the one-size-fits-all approach, where a single DNN is used, resulting in inefficient utilization of computational resources. This inefficiency is particularly detrimental in energy-constrained systems, as it degrades overall system efficiency. We identify that, the contextual information embedded in the input data stream (e.g. the frames in the camera feed that the OD models are run on) could be exploited to allow a more efficient multi-model-based OD process. In this paper, we propose SHIFT which continuously selects from a variety of DNN-based OD models depending on the dynamically changing contextual information and computational constraints. During this selection, SHIFT uniquely considers multi-accelerator execution to better optimize the energy-efficiency while satisfying the latency constraints. Our proposed methodology results in improvements of up to 7.5x in energy usage and 2.8x in latency compared to state-of-the-art GPU-based single model OD approaches.