Ubiquitous internet access is reshaping the way we live, but it also brings unprecedented challenges in preventing chronic diseases rooted in long-term exposure to unhealthy lifestyles. This paper proposes leveraging online shopping behaviors as a proxy for personal lifestyle choices to refresh chronic disease prevention literacy for an era in which e-commerce has become part of most people's daily life. We accessed retrospective longitudinal query logs and purchase records from millions of online shoppers and constructed a broad spectrum of lifestyle features covering assorted product categories and buyer personas. Using the lifestyle-related information preceding shoppers' first purchases of prescription drugs, we could determine associations between their past lifestyle choices and whether they suffered from a particular chronic disease. Novel lifestyle risk factors were discovered in two exemplars, depression and diabetes, and most of them were congruent with existing healthcare knowledge. Further, these empirical findings could be used to identify online shoppers at high risk of chronic diseases with fair accuracy (e.g., area under the receiver operating characteristic curve [AUC] of 0.68 for depression and 0.70 for diabetes), closely matching the performance of screening surveys benchmarked against medical diagnosis. Unobtrusive chronic disease surveillance via e-commerce sites may soon meet consenting individuals in the digital space they already inhabit.
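To make the reported AUC values easier to interpret, the following is a minimal sketch of how an area under the ROC curve can be computed from risk scores, using its pairwise-ranking definition: the probability that a randomly chosen affected shopper receives a higher score than a randomly chosen unaffected one. The function name `roc_auc` and the toy labels and scores are hypothetical and are not taken from the study; the abstract does not state how the authors computed their scores.

```python
def roc_auc(labels, scores):
    """Pairwise-ranking AUC: fraction of (positive, negative) pairs in which
    the positive case has the higher score; ties count as half a win."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: 1 = shopper later bought the prescription drug.
labels = [1, 1, 1, 0, 0, 0, 0]
scores = [0.9, 0.6, 0.3, 0.7, 0.4, 0.2, 0.1]
auc = roc_auc(labels, scores)
print(f"AUC = {auc:.2f}")
```

An AUC of 0.5 corresponds to random ranking and 1.0 to perfect separation, so values near 0.7 indicate that an affected shopper outranks an unaffected one in roughly seven of ten random pairs.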
The National Institutes of Health's (NIH) Human Biomolecular Atlas Program (HuBMAP) aims to create a comprehensive high-resolution atlas of all the cells in the healthy human body. Multiple laboratories across the United States are collecting tissue specimens from different organs of donors who vary in sex, age, and body size. Integrating and harmonizing the data derived from these samples and 'mapping' them into a common three-dimensional (3D) space is a major challenge. The key to making this possible is a 'Common Coordinate Framework' (CCF), which provides a semantically annotated, 3D reference system for the entire body. The CCF enables contributors to HuBMAP to 'register' specimens and datasets within a common spatial reference system, and it supports a standardized way to query and 'explore' data in a spatially and semantically explicit manner. [...] This paper describes the construction and usage of a CCF for the human body and its reference implementation in HuBMAP. The CCF consists of (1) a CCF Clinical Ontology, which provides metadata about the specimen and donor (the 'who'); (2) a CCF Semantic Ontology, which describes 'what' part of the body a sample came from and details anatomical structures, cell types, and biomarkers (ASCT+B); and (3) a CCF Spatial Ontology, which indicates 'where' a tissue sample is located in a 3D coordinate system. An initial version of all three CCF ontologies has been implemented for the first HuBMAP Portal release. It was successfully used by Tissue Mapping Centers to semantically annotate and spatially register 48 kidney and spleen tissue blocks. The blocks can be queried and explored in their clinical, semantic, and spatial context via the CCF user interface in the HuBMAP Portal.
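The three-part structure of the CCF (who, what, where) can be illustrated with a small data model. This is only a sketch under stated assumptions: every class name, field name, and value below is hypothetical and does not reflect the actual HuBMAP ontologies, schemas, or Portal API.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class ClinicalInfo:
    """'Who': donor and specimen metadata (hypothetical fields)."""
    donor_id: str
    sex: str
    age: int


@dataclass(frozen=True)
class SemanticInfo:
    """'What': organ and annotated anatomical structures (hypothetical fields)."""
    organ: str
    anatomical_structures: Tuple[str, ...]


@dataclass(frozen=True)
class SpatialInfo:
    """'Where': position and size in a shared 3D reference space (assumed mm)."""
    position_mm: Tuple[float, float, float]
    size_mm: Tuple[float, float, float]


@dataclass(frozen=True)
class TissueBlock:
    """One registered tissue block combining all three views."""
    block_id: str
    clinical: ClinicalInfo
    semantic: SemanticInfo
    spatial: SpatialInfo


def find_blocks(blocks: List[TissueBlock], organ: str,
                structure: Optional[str] = None) -> List[TissueBlock]:
    """Return blocks from `organ`, optionally restricted to one structure."""
    return [
        b for b in blocks
        if b.semantic.organ == organ
        and (structure is None or structure in b.semantic.anatomical_structures)
    ]


# Hypothetical example records.
blocks = [
    TissueBlock("B1", ClinicalInfo("D1", "female", 54),
                SemanticInfo("kidney", ("cortex",)),
                SpatialInfo((10.0, 20.0, 5.0), (10.0, 10.0, 3.0))),
    TissueBlock("B2", ClinicalInfo("D2", "male", 37),
                SemanticInfo("spleen", ("red pulp",)),
                SpatialInfo((4.0, 8.0, 2.0), (10.0, 10.0, 3.0))),
]
kidney_ids = [b.block_id for b in find_blocks(blocks, "kidney")]
```

Keeping the clinical, semantic, and spatial parts as separate records mirrors the separation into three ontologies described above, so a query can filter on any one view without touching the others.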
We show how faceted search using a combination of traditional classification systems and mixed-membership topic models can go beyond keyword search to inform resource discovery, hypothesis formulation, and argument extraction for interdisciplinary research. Our test domain is the history and philosophy of scientific work on animal mind and cognition. The methods can be generalized to other research areas and ultimately support a system for semi-automatic identification of argument structures. We provide a case study for the application of the methods to the problem of identifying and extracting arguments about anthropomorphism during a critical period in the development of comparative psychology. We show how a combination of classification systems and mixed-membership models trained over large digital libraries can inform resource discovery in this domain. Through a novel approach of "drill-down" topic modeling (simultaneously reducing both the size of the corpus and the unit of analysis), we are able to reduce a large collection of full-text volumes to a much smaller set of pages within six focal volumes containing arguments of interest to historians and philosophers of comparative psychology. The volumes identified in this way did not appear among the first ten results of the keyword search in the HathiTrust digital library, and the pages bear the kind of "close reading" needed to generate original interpretations, which is at the heart of scholarly work in the humanities. Zooming back out, we provide a way to place the books onto a map of science originally constructed from very different data and for different purposes. The multilevel approach advances understanding of the intellectual and societal contexts in which writings are interpreted.