This paper investigates the complexities of integrating Large Language Models (LLMs) into software products, with a focus on the challenges encountered for determining their readiness for release. Our systematic review of grey literature identifies common challenges in deploying LLMs, ranging from pre-training and fine-tuning to user experience considerations. The study introduces a comprehensive checklist designed to guide practitioners in evaluating key release readiness aspects such as performance, monitoring, and deployment strategies, aiming to enhance the reliability and effectiveness of LLM-based applications in real-world settings.
This article addresses the pump-scheduling optimization problem to enhance real-time control of real-world water distribution networks (WDNs). Our primary objectives are to adhere to physical operational constraints while reducing energy consumption and operational costs. Traditional optimization techniques, such as evolution-based and genetic algorithms, often fall short due to their lack of convergence guarantees. Conversely, reinforcement learning (RL) stands out for its adaptability to uncertainties and reduced inference time, enabling real-time responsiveness. However, the effective implementation of RL is contingent on building accurate simulation models for WDNs, and prior applications have been limited by errors in simulation training data. These errors can potentially cause the RL agent to learn misleading patterns and actions and recommend suboptimal operational strategies. To overcome these challenges, we present an improved "hybrid RL" methodology. This method integrates the benefits of RL while anchoring it in historical data, which serves as a baseline to incrementally introduce optimal control recommendations. By leveraging operational data as a foundation for the agent's actions, we enhance the explainability of the agent's actions, foster more robust recommendations, and minimize error. Our findings demonstrate that the hybrid RL agent can significantly improve sustainability, operational efficiency, and dynamically adapt to emerging scenarios in real-world WDNs.
With the rise of deep learning, large datasets and complex models have become common, requiring significant computing power. To address this, data distillation has emerged as a technique to quickly train models with lower memory and time requirements. However, data distillation on text-based datasets hasn't been explored much because of the challenges rising due to its discrete nature. Additionally, existing dataset distillation methods often struggle to generalize to new architectures. In the paper, we propose several data distillation techniques for multilingual text classification datasets using language-model-based learning methods. We conduct experiments to analyze their performance in terms of classification strength, and cross-architecture generalization. Furthermore, we investigate the language-specific fairness of the data summaries generated by these methods. Our approach builds upon existing techniques, enhancing cross-architecture generalization in the text data distillation domain.
Visualizations such as bar charts, scatter plots, and objects on geographical maps often convey critical information, including exact and relative numeric values, using shapes. The choice of shape and method of encoding information is often arbitrarily, or based on convention. However, past studies have shown that the human eye can be fooled by visual representations. The Ebbinghaus illusion demonstrates that the perceived relative sizes of shapes depends on their configuration, which in turn can affect judgements, especially in visualizations like proportional symbol maps. In this study we evaluate the effects of varying the type of shapes and metrics for encoding data in visual representations on a spatio-temporal map interface. We find that some combinations of shape and metric are more conducive to accurate human judgements than others, and provide recommendations for applying these findings in future visualization designs.
With the growing use of deep learning methods, particularly graph neural networks, which encode intricate interconnectedness information, for a variety of real tasks, there is a necessity for explainability in such settings. In this paper, we demonstrate the applicability of popular explainability approaches on Graph Attention Networks (GAT) for a graph-based super-pixel image classification task. We assess the qualitative and quantitative performance of these techniques on three different datasets and describe our findings. The results shed a fresh light on the notion of explainability in GNNs, particularly GATs.
Mid-Air Helicopter Delivery (MAHD) is a new Entry, Descent and Landing (EDL) architecture to enable in situ mobility for Mars science at lower cost than previous missions. It uses a jetpack to slow down a Mars Science Helicopter (MSH) after separation from the backshell, and reach aerodynamic conditions suitable for helicopter take-off in mid air. For given aeroshell dimensions, only MAHD's lander-free approach leaves enough room in the aeroshell to accommodate the largest rotor option for MSH. This drastically improves flight performance, notably allowing +150\% increased science payload mass. Compared to heritage EDL approaches, the simpler MAHD architecture is also likely to reduce cost, and enables access to more hazardous and higher-elevation terrains on Mars. This paper introduces a design for the MAHD system architecture and operations. We present a mechanical configuration that fits both MSH and the jetpack within the 2.65-m Mars heritage aeroshell, and a jetpack control architecture which fully leverages the available helicopter avionics. We discuss preliminary numerical models of the flow dynamics resulting from the interaction between the jets, the rotors and the side winds. We define a force-torque sensing architecture capable of handling the wind and trimming the rotors to prepare for safe take-off. Finally, we analyze the dynamic environment and closed-loop control simulation results to demonstrate the preliminary feasibility of MAHD.
Image Inpainting is one of the very popular tasks in the field of image processing with broad applications in computer vision. In various practical applications, images are often deteriorated by noise due to the presence of corrupted, lost, or undesirable information. There have been various restoration techniques used in the past with both classical and deep learning approaches for handling such issues. Some traditional methods include image restoration by filling gap pixels using the nearby known pixels or using the moving average over the same. The aim of this paper is to perform image inpainting using robust deep learning methods that use partial convolution layers.
Recent advancements in language models based on recurrent neural networks and transformers architecture have achieved state-of-the-art results on a wide range of natural language processing tasks such as pos tagging, named entity recognition, and text classification. However, most of these language models are pre-trained in high resource languages like English, German, Spanish. Multi-lingual language models include Indian languages like Hindi, Telugu, Bengali in their training corpus, but they often fail to represent the linguistic features of these languages as they are not the primary language of the study. We introduce HinFlair, which is a language representation model (contextual string embeddings) pre-trained on a large monolingual Hindi corpus. Experiments were conducted on 6 text classification datasets and a Hindi dependency treebank to analyze the performance of these contextualized string embeddings for the Hindi language. Results show that HinFlair outperforms previous state-of-the-art publicly available pre-trained embeddings for downstream tasks like text classification and pos tagging. Also, HinFlair when combined with FastText embeddings outperforms many transformers-based language models trained particularly for the Hindi language.
Motivation: The proliferation of Biomedical research articles has made the task of information retrieval more important than ever. Scientists and Researchers are having difficulty in finding articles that contain information relevant to them. Proper extraction of biomedical entities like Disease, Drug/chem, Species, Gene/protein, can considerably improve the filtering of articles resulting in better extraction of relevant information. Performance on BioNer benchmarks has progressively improved because of progression in transformers-based models like BERT, XLNet, OpenAI, GPT2, etc. These models give excellent results; however, they are computationally expensive and we can achieve better scores for domain-specific tasks using other contextual string-based models and LSTM-CRF based sequence tagger. Results: We introduce BioNerFlair, a method to train models for biomedical named entity recognition using Flair plus GloVe embeddings and Bidirectional LSTM-CRF based sequence tagger. With almost the same generic architecture widely used for named entity recognition, BioNerFlair outperforms previous state-of-the-art models. I performed experiments on 8 benchmarks datasets for biomedical named entity recognition. Compared to current state-of-the-art models, BioNerFlair achieves the best F1-score of 90.17 beyond 84.72 on the BioCreative II gene mention (BC2GM) corpus, best F1-score of 94.03 beyond 92.36 on the BioCreative IV chemical and drug (BC4CHEMD) corpus, best F1-score of 88.73 beyond 78.58 on the JNLPBA corpus, best F1-score of 91.1 beyond 89.71 on the NCBI disease corpus, best F1-score of 85.48 beyond 78.98 on the Species-800 corpus, while near best results was observed on BC5CDR-chem, BC3CDR-disease, and LINNAEUS corpus.