The objective of this paper is to explore the opportunities for human information behaviour research to inform and influence the field of machine learning and the resulting machine information behaviour. Using the development of foundation models in machine learning as an example, the paper illustrates how human information behaviour research can bring to machine learning a more nuanced view of information and informing, a better understanding of information need and how that affects the communication among people and systems, guidance on the nature of context and how to operationalize that in models and systems, and insights into bias, misinformation, and marginalization. Despite their clear differences, the fields of information behaviour and machine learning share many common objectives, paradigms, and key research questions. The example of foundation models illustrates that human information behaviour research has much to offer in addressing some of the challenges emerging in the nascent area of machine information behaviour.
Natural language is one of the ways in which information is encoded, and it represents information in a highly abstracted and conceptualized form. This paper disassembles the information represented by natural language, analyzes the classification coding system of attribute information and the abstraction relation between attribute information and entities in the real world, constructs a storage model of information, simulates the processing of attribute information in one of the attribute spaces, interprets how the relations represented by "Be", "Of", "Have", and so on are embodied in the information storage data structures and the corresponding data reading modes, and reclassifies sentence types from the perspective of task types and data reading modes. It then simulates the understanding process (the information processing process) on a dialogue example. Finally, the author summarizes the basic conditions of understanding and gives a definition of understanding from a personal point of view. The study provides a practical, theoretical basis and research methods for natural language understanding (NLU). It can also be applied to large-scale, multi-type information processing in artificial intelligence (AI).
This paper presents an online informative path planning approach for active information gathering on three-dimensional surfaces using aerial robots. Most existing works on surface inspection focus on planning a path offline that provides full coverage of the surface, which implicitly assumes that the surface information is uniformly distributed and therefore ignores potential spatial correlations of the information field. In this paper, we utilize manifold Gaussian processes (mGPs) with geodesic kernel functions for mapping surface information fields and plan informative paths online in a receding horizon manner. Our approach actively plans information-gathering paths based on recent observations, respecting the dynamic constraints of the vehicle and a total flight time budget. We provide planning results for simulated temperature modeling for simple and complex 3D surface geometries (a cylinder and an aircraft model). We demonstrate that our informative planning method outperforms traditional approaches such as 3D coverage planning and random exploration, both in reconstruction error and information-theoretic metrics. We also show that incorporating spatial correlations of the information field into planning using mGPs significantly improves the information gathering efficiency.
An image caption should fluently present the essential information in a given image, including informative, fine-grained entity mentions and the manner in which these entities interact. However, current captioning models are usually trained to generate captions that only contain common object names, thus falling short on an important "informativeness" dimension. We present a mechanism for integrating image information together with fine-grained labels (assumed to be generated by some upstream models) into a caption that describes the image in a fluent and informative manner. We introduce a multimodal, multi-encoder model based on Transformer that ingests both image features and multiple sources of entity labels. We demonstrate that we can learn to control the appearance of these entity labels in the output, resulting in captions that are both fluent and informative.
In pattern recognition, learning, and data mining one obtains information from information-carrying objects. This involves an objective definition of the information in a single object, the information to go from one object to another object in a pair of objects, the information to go from one object to any other object in a multiple of objects, and the shared information between objects. This is called "information distance." We survey a selection of new developments in information distance.
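As a rough illustration of how an information distance between two objects can be approximated in practice, the sketch below computes the normalized compression distance (NCD), a well-known compression-based approximation. The abstract does not name this measure; the choice of NCD, the use of `zlib` as the compressor, and the sample byte strings are all assumptions made for illustration only.

```python
import zlib


def compressed_size(data: bytes) -> int:
    """Length in bytes of the zlib-compressed data (stand-in for description length)."""
    return len(zlib.compress(data, 9))


def normalized_compression_distance(x: bytes, y: bytes) -> float:
    """NCD(x, y) = (C(xy) - min(C(x), C(y))) / max(C(x), C(y))."""
    cx, cy = compressed_size(x), compressed_size(y)
    cxy = compressed_size(x + y)
    return (cxy - min(cx, cy)) / max(cx, cy)


# Hypothetical sample objects: two similar texts and one unrelated byte pattern.
text_a = b"the quick brown fox jumps over the lazy dog " * 20
text_b = b"the quick brown fox leaps over the lazy cat " * 20
text_c = bytes(range(256)) * 4

d_similar = normalized_compression_distance(text_a, text_b)
d_different = normalized_compression_distance(text_a, text_c)
print(d_similar, d_different)
```

Objects that share structure compress well together, so the similar pair should receive a smaller distance than the unrelated pair. The exact values depend on the compressor and are not meaningful on their own.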
Reshef et al. recently proposed a new statistical measure, the "maximal information coefficient" (MIC), for quantifying arbitrary dependencies between pairs of stochastic quantities. MIC is based on mutual information, a fundamental quantity in information theory that is widely understood to serve this need. MIC, however, is not an estimate of mutual information. Indeed, it was claimed that MIC possesses a desirable mathematical property called "equitability" that mutual information lacks. This was not proven; instead it was argued solely through the analysis of simulated data. Here we show that this claim is, in fact, incorrect. First we offer a mathematical proof that no (non-trivial) dependence measure satisfies the definition of equitability proposed by Reshef et al. We then propose a self-consistent and more general definition of equitability that follows naturally from the Data Processing Inequality. Mutual information satisfies this new definition of equitability while MIC does not. Finally, we show that the simulation evidence offered by Reshef et al. was artifactual. We conclude that estimating mutual information is not only practical for many real-world applications, but also provides a natural solution to the problem of quantifying associations in large data sets.
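The Data Processing Inequality referred to above states that applying a deterministic function f to X cannot increase its mutual information with Y, that is, I(f(X); Y) <= I(X; Y). The following minimal sketch checks this on a toy discrete distribution. The distribution, the functions applied to X, and the helper names are hypothetical; the sketch does not reproduce the paper's definition of equitability or any estimator used by the authors.

```python
import math
from collections import defaultdict


def mutual_information(joint):
    """Mutual information in bits of a joint distribution given as {(x, y): p}."""
    px = defaultdict(float)
    py = defaultdict(float)
    for (x, y), p in joint.items():
        px[x] += p
        py[y] += p
    return sum(
        p * math.log2(p / (px[x] * py[y]))
        for (x, y), p in joint.items()
        if p > 0
    )


def push_forward(joint, f):
    """Joint distribution of (f(X), Y) obtained by applying f to X."""
    out = defaultdict(float)
    for (x, y), p in joint.items():
        out[(f(x), y)] += p
    return dict(out)


# Hypothetical toy case: X is uniform on {0, 1, 2, 3} and Y is the parity of X.
joint = {(x, x % 2): 0.25 for x in range(4)}

i_xy = mutual_information(joint)
# f keeps the parity of X, so nothing relevant to Y is lost.
i_keep = mutual_information(push_forward(joint, lambda x: x % 2))
# f discards the parity of X, so all information about Y is lost.
i_drop = mutual_information(push_forward(joint, lambda x: x // 2))
print(i_xy, i_keep, i_drop)
```

In this toy case the first two quantities are equal (1 bit) and the third is 0, so the inequality holds with equality exactly when the transformation of X preserves everything X says about Y.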
Exploration tasks are embedded in many robotics applications, such as search and rescue and space exploration. Information-based exploration algorithms aim to find the most informative trajectories by maximizing an information-theoretic metric, such as the mutual information between the map and potential future measurements. Unfortunately, most existing information-based exploration algorithms are plagued by the computational difficulty of evaluating the Shannon mutual information metric. In this paper, we consider the fundamental problem of evaluating Shannon mutual information between the map and a range measurement. First, we consider 2D environments. We propose a novel algorithm, called the Fast Shannon Mutual Information (FSMI). The key insight behind the algorithm is that a certain integral can be computed analytically, leading to substantial computational savings. Second, we consider 3D environments, represented by efficient data structures, e.g., an OctoMap, such that the measurements are compressed by Run-Length Encoding (RLE). We propose a novel algorithm, called FSMI-RLE, that efficiently evaluates the Shannon mutual information when the measurements are compressed using RLE. For both the FSMI and the FSMI-RLE, we also propose variants that make different assumptions on the sensor noise distribution for the purpose of further computational savings. We evaluate the proposed algorithms in extensive experiments. In particular, we show that the proposed algorithms outperform existing algorithms that compute Shannon mutual information as well as other algorithms that compute the Cauchy-Schwarz Quadratic mutual information (CSQMI). In addition, we demonstrate the computation of Shannon mutual information on a 3D map for the first time.
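To make the compression step concrete, the sketch below run-length encodes the occupancy values met along a single sensor beam, so that consecutive cells with the same value collapse into one (value, count) pair. This only illustrates Run-Length Encoding itself; it is not the FSMI-RLE algorithm, and the beam values and function names are hypothetical.

```python
from itertools import groupby


def run_length_encode(values):
    """Collapse consecutive equal values into (value, run_length) pairs."""
    return [(value, sum(1 for _ in group)) for value, group in groupby(values)]


def run_length_decode(runs):
    """Expand (value, run_length) pairs back into the original sequence."""
    return [value for value, count in runs for _ in range(count)]


# Hypothetical occupancy probabilities of the cells crossed by one beam:
# unknown space, then free space, then a likely obstacle.
beam = [0.5, 0.5, 0.5, 0.1, 0.1, 0.9]
runs = run_length_encode(beam)
print(runs)  # [(0.5, 3), (0.1, 2), (0.9, 1)]
```

An algorithm that operates on the runs rather than on individual cells can, in principle, do work proportional to the number of runs instead of the number of cells, which is the kind of saving such a compressed representation makes possible.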
A common goal in the areas of secure information flow and privacy is to build effective defenses against unwanted leakage of information. To this end, one must be able to reason about potential attacks and their interplay with possible defenses. In this paper we propose a game-theoretic framework to formalize strategies of attacker and defender in the context of information leakage, and provide a basis for developing optimal defense methods. A crucial novelty of our games is that their utility is given by information leakage, which in some cases may behave in a non-linear way. This causes a significant deviation from classic game theory, in which utility functions are linear with respect to players' strategies. Hence, a key contribution of this paper is the establishment of the foundations of information leakage games. We consider two main categories of games, depending on the particular notion of information leakage being captured. The first category, which we call QIF-games, is tailored for the theory of quantitative information flow (QIF). The second one, which we call DP-games, corresponds to differential privacy (DP).
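The non-linearity mentioned above can be illustrated with one standard QIF quantity, the posterior Bayes vulnerability: the probability that an adversary guesses the secret in one try after observing the output of a channel. The abstract does not say which leakage measure the games use, so the measure, the two toy channels, and the helper names below are assumptions chosen for illustration.

```python
def posterior_bayes_vulnerability(prior, channel):
    """Sum over outputs y of max over secrets x of prior[x] * channel[x][y]."""
    n_outputs = len(channel[0])
    return sum(
        max(prior[x] * channel[x][y] for x in range(len(prior)))
        for y in range(n_outputs)
    )


def mix(c1, c2, p):
    """Channel that acts like c1 with probability p and like c2 otherwise,
    when the observer cannot tell which of the two was used."""
    return [
        [p * a + (1 - p) * b for a, b in zip(row1, row2)]
        for row1, row2 in zip(c1, c2)
    ]


prior = [0.5, 0.5]                    # uniform prior over a one-bit secret
identity = [[1.0, 0.0], [0.0, 1.0]]   # output reveals the secret
flip = [[0.0, 1.0], [1.0, 0.0]]       # output reveals the negated secret

v_identity = posterior_bayes_vulnerability(prior, identity)
v_flip = posterior_bayes_vulnerability(prior, flip)
v_mixed = posterior_bayes_vulnerability(prior, mix(identity, flip, 0.5))
print(v_identity, v_flip, v_mixed)  # 1.0 1.0 0.5
```

Each pure channel lets the adversary guess the secret with certainty, yet an even, hidden mixture of the two reveals nothing beyond the prior. A utility that was linear in the mixing probability would give 1.0 for the mixture, so this toy case shows the kind of deviation from classic game theory that the abstract describes.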
Feature extraction and description is an important topic in computer vision, as it is the starting point of a number of tasks such as image reconstruction, stitching, registration, and recognition, among many others. In this paper, two new image features are proposed: the Information Ratio (IR) and the Mutual Information Ratio (MIR). The IR is a feature of a single image, while the MIR describes features common across two or more images. We begin by introducing the IR and the MIR and motivate these features in an information-theoretical context as the ratio of the self-information of an intensity level over the information contained over the pixels of the same intensity. Notably, the relationship of the IR and MIR with the image entropy and mutual information, classic information measures, is discussed. Finally, the effectiveness of these features is tested through feature extraction over the INRIA Copydays dataset and feature matching over the Oxford Affine Covariant Regions dataset. These numerical evaluations validate the relevance of the IR and MIR in practical computer vision tasks.
Selecting a minimal feature set that is maximally informative about a target variable is a central task in machine learning and statistics. Information theory provides a powerful framework for formulating feature selection algorithms, yet a rigorous, information-theoretic definition of feature relevancy that accounts for feature interactions such as redundant and synergistic contributions is still missing. We argue that this lack is inherent to classical information theory, which does not provide measures to decompose the information a set of variables provides about a target into unique, redundant, and synergistic contributions. Such a decomposition has been introduced only recently by the partial information decomposition (PID) framework. Using PID, we clarify why feature selection is a conceptually difficult problem when approached using information theory and provide a novel definition of feature relevancy and redundancy in PID terms. From this definition, we show that the conditional mutual information (CMI) maximizes relevancy while minimizing redundancy and propose an iterative, CMI-based algorithm for practical feature selection. We demonstrate the power of our CMI-based algorithm in comparison to the unconditional mutual information on benchmark examples and provide corresponding PID estimates to highlight how PID makes it possible to quantify the information contribution of features and their interactions in feature-selection problems.
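The idea of iterative, CMI-based selection can be sketched as a greedy forward search that repeatedly adds the feature with the largest conditional mutual information with the target, given the features already chosen. The abstract does not specify the authors' algorithm, estimator, or stopping rule, so the plug-in estimator for discrete data, the `min_gain` threshold, and the toy data set below are all assumptions.

```python
import math
from collections import Counter


def conditional_mutual_information(xs, ys, zs):
    """Plug-in estimate of I(X; Y | Z) in bits from paired discrete samples."""
    n = len(xs)
    n_xyz = Counter(zip(xs, ys, zs))
    n_xz = Counter(zip(xs, zs))
    n_yz = Counter(zip(ys, zs))
    n_z = Counter(zs)
    return sum(
        (c / n) * math.log2(c * n_z[z] / (n_xz[(x, z)] * n_yz[(y, z)]))
        for (x, y, z), c in n_xyz.items()
    )


def greedy_cmi_selection(rows, target, min_gain=1e-9):
    """Forward selection: add the feature with the largest
    I(feature; target | selected features) until no feature adds information."""
    n_features = len(rows[0])
    selected = []
    while len(selected) < n_features:
        # The joint state of the selected features is the conditioning variable.
        zs = [tuple(row[j] for j in selected) for row in rows]
        gains = {
            j: conditional_mutual_information([row[j] for row in rows], target, zs)
            for j in range(n_features)
            if j not in selected
        }
        best = max(gains, key=gains.get)
        if gains[best] <= min_gain:
            break
        selected.append(best)
    return selected


# Hypothetical toy data: feature 1 is an exact copy of feature 0 (redundant),
# feature 2 is independent, and the target encodes features 0 and 2 together.
rows = [(a, a, b) for a in (0, 1) for b in (0, 1)]
target = [2 * a + b for a in (0, 1) for b in (0, 1)]

selected = greedy_cmi_selection(rows, target)
print(selected)  # [0, 2]
```

In this toy case every feature has the same unconditional mutual information with the target (1 bit), so a ranking by unconditional mutual information cannot tell the redundant copy from the complementary feature. Conditioning on feature 0 drops the gain of its copy to zero and keeps feature 2 at 1 bit. Plug-in estimates like this one are biased on small samples, and a greedy search of this kind can miss purely synergistic feature sets.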