In this exploratory note we ask what a measure of performance over all tasks looks like if tasks are weighted according to a difficulty function. This difficulty function depends on the complexity of the (acceptable) solution for the task (instead of a universal distribution over tasks or an adaptive test). The resulting aggregations and decompositions are (now retrospectively) seen as the natural (and trivial) interactive generalisation of the C-tests.
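The idea of aggregating performance by difficulty can be illustrated with a minimal Python sketch. It is not the note's actual formulation: the function name `aggregate_by_difficulty`, the representation of results as (difficulty, score) pairs, the normalisation by total weight and the example weighting `2 ** -h` are all assumptions made for illustration.

```python
from collections import defaultdict


def aggregate_by_difficulty(results, weight):
    """Aggregate task scores using a weighting over difficulty levels.

    results: iterable of (difficulty, score) pairs, one per task.
    weight:  function mapping a difficulty level to a non-negative weight
             (hypothetical; the weighting scheme is an assumption).
    Returns (aggregate, per_difficulty_means).
    """
    # Decomposition step: group the scores by difficulty level.
    buckets = defaultdict(list)
    for difficulty, score in results:
        buckets[difficulty].append(score)

    # Mean performance at each difficulty level.
    means = {h: sum(scores) / len(scores) for h, scores in buckets.items()}

    # Aggregation step: weighted average of the per-difficulty means.
    total_weight = sum(weight(h) for h in means)
    aggregate = sum(weight(h) * means[h] for h in means) / total_weight
    return aggregate, means


if __name__ == "__main__":
    # Hypothetical results: (difficulty, score in [0, 1]).
    results = [(1, 1.0), (1, 1.0), (2, 0.5), (3, 0.0)]
    aggregate, means = aggregate_by_difficulty(results, lambda h: 2.0 ** -h)
    print(means, aggregate)
```

Keeping the per-difficulty means alongside the aggregate makes the decomposition explicit: the same means can be re-aggregated under a different weighting without re-running any task.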
The analysis of the adaptive behaviour of many different kinds of systems, such as humans, animals and machines, requires more general ways of assessing their cognitive abilities. This need is strengthened by the growing number of tasks being analysed for and completed by a wider diversity of systems, including swarms and hybrids. The notion of a universal test has recently emerged in the context of machine intelligence evaluation as a way to define and use the same cognitive test for a variety of systems, using some principled tasks and adapting the interface to each particular subject. However, how far can universal tests be taken? This paper analyses this question in terms of subjects, environments, space-time resolution, rewards and interfaces. This leads to a number of findings, insights and caveats, according to several levels where universal tests may be progressively more difficult to conceive, implement and administer. One of the most significant contributions is the realisation that more universal tests are defined as maximisations of less universal tests for a variety of configurations. This means that universal tests must necessarily be adaptive.
There has been an increasing interest in inferring some personality traits from users and players in social networks and games, respectively. This goes beyond classical sentiment analysis, and also much further than customer profiling. The purpose here is to have a characterisation of users in terms of personality traits, such as openness, conscientiousness, extraversion, agreeableness, and neuroticism. While this is an incipient area of research, we ask the question of whether cognitive abilities, and intelligence in particular, are also measurable from user profiles. However, we pose the question as broadly as possible in terms of subjects, in the context of universal psychometrics, including humans, machines and hybrids. Namely, in this paper we analyse the following question: is it possible to measure the intelligence of humans and (non-human) bots in a social network or a game just from their user profiles, i.e., by observation, without the use of interactive tests, such as IQ tests, the Turing test or other more principled machine intelligence tests?
We analyse the complexity of environments according to the policies that need to be used to achieve high performance. The performance results for a population of policies lead to a distribution that is examined in terms of policy complexity and analysed through several diagrams and indicators. The notion of environment response curve is also introduced, by inverting the performance results into an ability scale. We apply all these concepts, diagrams and indicators to a minimalistic environment class, agent-populated elementary cellular automata, showing how the difficulty, discriminating power and ranges (prior to normalisation) may vary for several environments.
Regression, unlike classification, has lacked a comprehensive and effective approach to deal with cost-sensitive problems by reusing (rather than retraining) general regression models. In this paper, a wide variety of cost-sensitive problems in regression (such as bids, asymmetric losses and rejection rules) can be solved effectively by a lightweight but powerful approach, consisting of: (1) the conversion of any traditional one-parameter crisp regression model into a two-parameter soft regression model, seen as a normal conditional density estimator, by the use of newly introduced enrichment methods; and (2) the reframing of an enriched soft regression model to new contexts by an instance-dependent optimisation of the expected loss derived from the conditional normal distribution.
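Step (2) can be illustrated for one specific loss. The Python sketch below assumes an asymmetric absolute loss with per-unit costs `cost_under` and `cost_over` (hypothetical names, as are the function names); for that loss, the minimiser of the expected loss under a normal density is the quantile at level cost_under / (cost_under + cost_over). It is not the paper's method, which covers a wider family of cost-sensitive problems, and the enrichment step (1) that would supply `mean` and `std` is assumed to have been done already.

```python
from statistics import NormalDist


def reframe_prediction(mean, std, cost_under, cost_over):
    """Prediction minimising expected asymmetric absolute loss.

    The target is assumed to follow N(mean, std^2) for this instance.
    cost_under: cost per unit when the prediction is below the true value.
    cost_over:  cost per unit when the prediction is above the true value.
    """
    # For this loss, the optimum is the quantile at the cost ratio below.
    level = cost_under / (cost_under + cost_over)
    return NormalDist(mu=mean, sigma=std).inv_cdf(level)


def expected_loss(prediction, mean, std, cost_under, cost_over):
    """Closed-form expected asymmetric absolute loss under N(mean, std^2)."""
    standard = NormalDist()
    z = (prediction - mean) / std
    # E[(Y - p)+] and E[(p - Y)+] for a normal target Y.
    under = std * (standard.pdf(z) - z * (1.0 - standard.cdf(z)))
    over = std * (standard.pdf(z) + z * standard.cdf(z))
    return cost_under * under + cost_over * over


if __name__ == "__main__":
    # Hypothetical instance: under-prediction is three times as costly.
    mean, std = 10.0, 2.0
    shifted = reframe_prediction(mean, std, cost_under=3.0, cost_over=1.0)
    print(shifted, expected_loss(shifted, mean, std, 3.0, 1.0))
    print(mean, expected_loss(mean, mean, std, 3.0, 1.0))
```

Because the optimisation uses only the per-instance mean and standard deviation, the same enriched model can be reused when the costs change, without retraining.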
This paper analyses the influence of including agents of different degrees of intelligence in a multiagent system. The goal is to better understand how we can develop intelligence tests that can evaluate social intelligence. We analyse several reinforcement algorithms in several contexts of cooperation and competition. Our experimental setting is inspired by the recently developed Darwin-Wallace distribution.
In machine learning, distance-based algorithms and other approaches use information that is represented as propositional data. However, this kind of representation can be quite restrictive and, in many cases, more complex structures are required in order to represent data in a more natural way. Terms are the basis for functional and logic programming representation. Distances between terms are a useful tool not only to compare terms, but also to determine the search space in many of these applications. This dissertation applies distances between terms, exploiting the features of each distance and the possibility of comparing data ranging from propositional data types to hierarchical representations. The distances between terms are applied through the k-NN (k-nearest neighbor) classification algorithm using XML as a common language representation. To be able to represent these data in an XML structure and to take advantage of the benefits of distances between terms, it is necessary to apply some transformations. These transformations allow the conversion of flat data into hierarchical data represented in XML, using some techniques based on intuitive associations between the names and values of variables and associations based on attribute similarity. Several experiments with the distances between terms of Nienhuys-Cheng and Estruch et al. were performed. In the case of originally propositional data, these distances are compared to the Euclidean distance. In all cases, the experiments were performed with the distance-weighted k-nearest neighbor algorithm, using several exponents for the attraction function (weighted distance). It can be seen that in some cases, the term distances can significantly improve on the results of approaches applied to flat representations.
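The distance-weighted k-nearest neighbor scheme mentioned above can be sketched in Python as follows. This is an illustration, not the dissertation's implementation: the attraction function is assumed to be 1 / d^i for exponent i, the function names are hypothetical, and the Euclidean distance stands in for a term distance, which could be passed through the `distance` parameter instead.

```python
import math
from collections import defaultdict


def euclidean(a, b):
    """Euclidean distance between two equal-length numeric tuples."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def weighted_knn(query, examples, k=3, exponent=2, distance=euclidean):
    """Classify `query` by distance-weighted voting among its k nearest examples.

    examples: list of (features, label) pairs.
    exponent: exponent i of the assumed attraction function 1 / d**i;
              exponent 0 reduces to plain majority voting.
    distance: any distance function over the feature representation.
    """
    # Compute each distance once, then keep the k closest examples.
    scored = sorted(
        ((distance(query, features), label) for features, label in examples),
        key=lambda pair: pair[0],
    )[:k]

    votes = defaultdict(float)
    for d, label in scored:
        if d == 0:
            # An exact match decides the class and avoids division by zero.
            return label
        votes[label] += 1.0 / d ** exponent
    return max(votes, key=votes.get)


if __name__ == "__main__":
    # Hypothetical one-dimensional data: one close 'a', two farther 'b's.
    examples = [((0.1,), "a"), ((1.0,), "b"), ((-1.0,), "b")]
    print(weighted_knn((0.0,), examples, k=3, exponent=0))
    print(weighted_knn((0.0,), examples, k=3, exponent=2))
```

In the small example, the unweighted vote (exponent 0) follows the two farther neighbours, while exponent 2 lets the single close neighbour dominate, which is why the choice of exponent can matter.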
Today, available methods for assessing AI systems focus on empirical techniques that measure the performance of algorithms on specific tasks (e.g., playing chess, solving mazes or landing a helicopter). However, these methods are not appropriate if we want to evaluate the general intelligence of AI and, even less so, if we want to compare it with human intelligence. The ANYNT project has designed a new method of evaluation that tries to assess AI systems using well-known computational notions and problems which are as general as possible. This new method serves to assess general intelligence (which allows us to learn how to solve any new kind of problem we face) and not only to evaluate performance on a set of specific tasks. The method focuses not only on measuring the intelligence of algorithms, but also on assessing any intelligent system (human beings, animals, AI, aliens?, ...), letting us place their results on the same scale and, therefore, compare them. This new approach will allow us (in the future) to evaluate and compare any kind of intelligent system, whether already known or yet to be built or found, be it artificial or biological. This master's thesis aims to ensure that the new method provides consistent results when evaluating AI algorithms. This is done through the design and implementation of prototypes of universal intelligence tests and their application to different intelligent systems (AI algorithms and human beings). From the study, we analyse whether the results obtained by two different intelligent systems are properly located on the same scale, and we propose changes and refinements to these prototypes in order to achieve, in the future, a truly universal intelligence test.
One of the main research areas in Artificial Intelligence is the coding of agents (programs) which are able to learn by themselves in any situation. This means that agents must be useful for purposes other than those they were created for, such as playing chess. In this way we try to get closer to the original goal of Artificial Intelligence. One of the problems in deciding whether an agent is really intelligent is the measurement of its intelligence, since there is currently no reliable way to measure it. The purpose of this project is to create an interpreter that allows for the execution of several environments, including those which are generated randomly, so that an agent (a person or a program) can interact with them. Once the interaction between the agent and the environment is over, the interpreter will measure the intelligence of the agent according to the actions, states and rewards the agent has experienced inside the environment during the test. As a result, we will be able to measure agents' intelligence in any possible environment, and to make comparisons between several agents, in order to determine which of them is the most intelligent. In order to perform the tests, the interpreter must be able to randomly generate environments that are really useful for measuring agents' intelligence, since not every randomly generated environment will serve that purpose.
This document presents Annotated English, a system of diacritical symbols which turns English pronunciation into a precise and unambiguous process. The annotations are defined and located in such a way that the original English text is not altered (not even a letter), thus allowing for a consistent reading and learning of the English language with and without annotations. The annotations are based on a set of general rules that keep the frequency of annotations from being dramatically high. This lets the reader easily associate annotations with exceptions, and makes it possible to shape, internalise and consolidate some rules for the English language which are otherwise weakened by the enormous number of exceptions in English pronunciation. The advantages of this annotation system are manifold. Any existing text can be annotated without a significant increase in size. This means that we can get an annotated version of any document or book with the same number of pages and font size. Because no letter is affected, the text can be read perfectly well by a person who does not know the annotation rules, as the annotations can simply be ignored. The annotations are based on a set of rules which can be progressively learned and recognised, even in cases where the reader has no access or time to read the rules. This means that a reader can understand most of the annotations after reading a few pages of Annotated English, and can take advantage of that knowledge for any other annotated document she may read in the future.