Large language models steer their behaviors based on texts generated by others. This capacity and their increasing prevalence in online settings portend that they will intentionally or unintentionally "program" one another and form emergent AI subjectivities, relationships, and collectives. Here, we call upon the research community to investigate these "society-like" properties of interacting artificial intelligences to increase their rewards and reduce their risks for human society and the health of online environments. We use a simple model and its outputs to illustrate how such emergent, decentralized AI collectives can expand the bounds of human diversity and reduce the risk of toxic, anti-social behavior online. Finally, we discuss opportunities for AI self-moderation and address ethical issues and design challenges associated with creating and maintaining decentralized AI collectives.
Recent advances in machine learning and AI, including Generative AI and LLMs, are disrupting technological innovation, product development, and society as a whole. AI's contribution to technology can come from multiple approaches that require access to large training data sets and clear performance evaluation criteria, ranging from pattern recognition and classification to generative models. Yet, AI has contributed less to fundamental science in part because large data sets of high-quality data for scientific practice and model discovery are more difficult to access. Generative AI, in general, and Large Language Models in particular, may represent an opportunity to augment and accelerate the scientific discovery of fundamental deep science with quantitative models. Here we explore and investigate aspects of an AI-driven, automated, closed-loop approach to scientific discovery, including self-driven hypothesis generation and open-ended autonomous exploration of the hypothesis space. Integrating AI-driven automation into the practice of science would mitigate current problems, including the replication of findings, systematic production of data, and ultimately democratisation of the scientific process. Realising these possibilities requires a vision for augmented AI coupled with a diversity of AI approaches able to deal with fundamental aspects of causality analysis and model discovery while enabling unbiased search across the space of putative explanations. These advances hold the promise to unleash AI's potential for searching and discovering the fundamental structure of our world beyond what human scientists have been able to achieve. Such a vision would push the boundaries of new fundamental science rather than automatize current workflows and instead open doors for technological innovation to tackle some of the greatest challenges facing humanity today.
Artificial intelligence (AI) models trained on published scientific findings have been used to invent valuable materials and targeted therapies, but they typically ignore the human scientists who continually alter the landscape of discovery. Here we show that incorporating the distribution of human expertise by training unsupervised models on simulated inferences cognitively accessible to experts dramatically improves (up to 400%) AI prediction of future discoveries beyond those focused on research content alone, especially when relevant literature is sparse. These models succeed by predicting human predictions and the scientists who will make them. By tuning human-aware AI to avoid the crowd, we can generate scientifically promising "alien" hypotheses unlikely to be imagined or pursued without intervention until the distant future, which hold promise to punctuate scientific advance beyond questions currently pursued. Accelerating human discovery or probing its blind spots, human-aware AI enables us to move toward and beyond the contemporary scientific frontier.
Neither artificial intelligence designed to play Turing's imitation game, nor augmented intelligence built to maximize the human manipulation of information are tuned to accelerate innovation and improve humanity's collective advance against its greatest challenges. We reconceptualize and pilot beneficial AI to radically augment human understanding by complementing rather than competing with human cognitive capacity. Our approach to complementary intelligence builds on insights underlying the wisdom of crowds, which hinges on the independence and diversity of crowd members' information and approach. By programmatically incorporating information on the evolving distribution of scientific expertise from research papers, our approach follows the distribution of content in the literature while avoiding the scientific crowd and the hypotheses cognitively available to it. We use this approach to generate valuable predictions for what materials possess valuable energy-related properties (e.g., thermoelectricity), and what compounds possess valuable medical properties (e.g., asthma) that complement the human scientific crowd. We demonstrate that our complementary predictions, if identified by human scientists and inventors at all, are only discovered years further into the future. When we evaluate the promise of our predictions with first-principles equations, we demonstrate that increased complementarity of our predictions does not decrease and in some cases increases the probability that the predictions possess the targeted properties. In summary, by tuning AI to avoid the crowd, we can generate hypotheses unlikely to be imagined or pursued until the distant future and promise to punctuate scientific advance. By identifying and correcting for collective human bias, these models also suggest opportunities to improve human prediction by reformulating science education for discovery.
Historical processes manifest remarkable diversity. Nevertheless, scholars have long attempted to identify patterns and categorize historical actors and influences with some success. A stochastic process framework provides a structured approach for the analysis of large historical datasets that allows for detection of sometimes surprising patterns, identification of relevant causal actors both endogenous and exogenous to the process, and comparison between different historical cases. The combination of data, analytical tools and the organizing theoretical framework of stochastic processes complements traditional narrative approaches in history and archaeology.
The growing deluge of scientific publications demands text analysis tools that can help scientists and policy-makers navigate, forecast and beneficially guide scientific research. Recent advances in natural language understanding driven by deep transformer networks offer new possibilities for mapping science. Because the same surface text can take on multiple and sometimes contradictory specialized senses across distinct research communities, sensitivity to context is critical for infometric applications. Transformer embedding models such as BERT capture shades of association and connotation that vary across the different linguistic contexts of any particular word or span of text. Here we report a procedure for encoding scientific documents with these tools, measuring their improvement over static word embeddings in a nearest-neighbor retrieval task. We find discriminability of contextual representations is strongly influenced by choice of pooling strategy for summarizing the high-dimensional network activations. Importantly, we note that fundamentals such as domain-matched training data are more important than state-of-the-art NLP tools. Yet state-of-the-art models did offer significant gains. The best approach we investigated combined domain-matched pretraining, sound pooling, and state-of-the-art deep transformer network encoders. Finally, with the goal of leveraging contextual representations from deep encoders, we present a range of measurements for understanding and forecasting research communities in science.
Data-driven artificial intelligence models fed with published scientific findings have been used to create powerful prediction engines for scientific and technological advance, such as the discovery of novel materials with desired properties and the targeted invention of new therapies and vaccines. These AI approaches typically ignore the distribution of human prediction engines -- scientists and inventor -- who continuously alter the landscape of discovery and invention. As a result, AI hypotheses are designed to substitute for human experts, failing to complement them for punctuated collective advance. Here we show that incorporating the distribution of human expertise into self-supervised models by training on inferences cognitively available to experts dramatically improves AI prediction of future human discoveries and inventions. Including expert-awareness into models that propose (a) valuable energy-relevant materials increases the precision of materials predictions by ~100%, (b) repurposing thousands of drugs to treat new diseases increases precision by 43%, and (c) COVID-19 vaccine candidates examined in clinical trials by 260%. These models succeed by predicting human predictions and the scientists who will make them. By tuning AI to avoid the crowd, however, it generates scientifically promising "alien" hypotheses unlikely to be imagined or pursued without intervention, not only accelerating but punctuating scientific advance. By identifying and correcting for collective human bias, these models also suggest opportunities to improve human prediction by reformulating science education for discovery.
Large-scale trends in urban crime and global terrorism are well-predicted by socio-economic drivers, but focused, event-level predictions have had limited success. Standard machine learning approaches are promising, but lack interpretability, are generally interpolative, and ineffective for precise future interventions with costly and wasteful false positives. Here, we are introducing Granger Network inference as a new forecasting approach for individual infractions with demonstrated performance far surpassing past results, yet transparent enough to validate and extend social theory. Considering the problem of predicting crime in the City of Chicago, we achieve an average AUC of ~90\% for events predicted a week in advance within spatial tiles approximately $1000$ ft across. Instead of pre-supposing that crimes unfold across contiguous spaces akin to diffusive systems, we learn the local transport rules from data. As our key insights, we uncover indications of suburban bias -- how law-enforcement response is modulated by socio-economic contexts with disproportionately negative impacts in the inner city -- and how the dynamics of violent and property crimes co-evolve and constrain each other -- lending quantitative support to controversial pro-active policing policies. To demonstrate broad applicability to spatio-temporal phenomena, we analyze terror attacks in the middle-east in the recent past, and achieve an AUC of ~80% for predictions made a week in advance, and within spatial tiles measuring approximately 120 miles across. We conclude that while crime operates near an equilibrium quickly dissipating perturbations, terrorism does not. Indeed terrorism aims to destabilize social order, as shown by its dynamics being susceptible to run-away increases in event rates under small perturbations.
Breakthrough discoveries and inventions involve unexpected combinations of contents including problems, methods, and natural entities, and also diverse contexts such as journals, subfields, and conferences. Drawing on data from tens of millions of research papers, patents, and researchers, we construct models that predict more than 95% of next year's content and context combinations with embeddings constructed from high-dimensional stochastic block models, where the improbability of new combinations itself predicts up to half of the likelihood that they will gain outsized citations and major awards. Most of these breakthroughs occur when problems in one field are unexpectedly solved by researchers from a distant other. These findings demonstrate the critical role of surprise in advance, and enable evaluation of scientific institutions ranging from education and peer review to awards in supporting it.
Inspired by recent interests of developing machine learning and data mining algorithms on hypergraphs, we investigate in this paper the semi-supervised learning algorithm of propagating "soft labels" (e.g. probability distributions, class membership scores) over hypergraphs, by means of optimal transportation. Borrowing insights from Wasserstein propagation on graphs [Solomon et al. 2014], we re-formulate the label propagation procedure as a message-passing algorithm, which renders itself naturally to a generalization applicable to hypergraphs through Wasserstein barycenters. Furthermore, in a PAC learning framework, we provide generalization error bounds for propagating one-dimensional distributions on graphs and hypergraphs using 2-Wasserstein distance, by establishing the \textit{algorithmic stability} of the proposed semi-supervised learning algorithm. These theoretical results also shed new lights upon deeper understandings of the Wasserstein propagation on graphs.