Training or finetuning large-scale language models (LLMs) such as GPT-3 requires substantial computation resources, motivating recent efforts to explore parameter-efficient adaptation to downstream tasks. One practical area of research is to treat these models as black boxes and interact with them through their inference APIs. In this paper, we investigate how to optimize few-shot text classification without accessing the gradients of the LLMs. To achieve this, we treat the black-box model as a feature extractor and train a classifier with the augmented text data. Data augmentation is performed using prompt-based finetuning on an auxiliary language model with a much smaller parameter size than the black-box model. Through extensive experiments on eight text classification datasets, we show that our approach, dubbed BT-Classifier, significantly outperforms state-of-the-art black-box few-shot learners and performs on par with methods that rely on full-model tuning.
The problem of air pollution threatens public health. Air quality forecasting can provide the air quality index hours or even days later, which can help the public to prevent air pollution in advance. Previous works focus on citywide air quality forecasting and cannot solve nationwide city forecasting problem, whose difficulties lie in capturing the latent dependencies between geographically distant but highly correlated cities. In this paper, we propose the group-aware graph neural network (GAGNN), a hierarchical model for nationwide city air quality forecasting. The model constructs a city graph and a city group graph to model the spatial and latent dependencies between cities, respectively. GAGNN introduces differentiable grouping network to discover the latent dependencies among cities and generate city groups. Based on the generated city groups, a group correlation encoding module is introduced to learn the correlations between them, which can effectively capture the dependencies between city groups. After the graph construction, GAGNN implements message passing mechanism to model the dependencies between cities and city groups. The evaluation experiments on Chinese city air quality dataset indicate that our GAGNN outperforms existing forecasting models.
Accurately forecasting air quality is critical to protecting general public from lung and heart diseases. This is a challenging task due to the complicated interactions among distinct pollution sources and various other influencing factors. Existing air quality forecasting methods cannot effectively model the diffusion processes of air pollutants between cities and monitoring stations, which may suddenly deteriorate the air quality of a region. In this paper, we propose HighAir, i.e., a hierarchical graph neural network-based air quality forecasting method, which adopts an encoder-decoder architecture and considers complex air quality influencing factors, e.g., weather and land usage. Specifically, we construct a city-level graph and station-level graphs from a hierarchical perspective, which can consider city-level and station-level patterns, respectively. We design two strategies, i.e., upper delivery and lower updating, to implement the inter-level interactions, and introduce message passing mechanism to implement the intra-level interactions. We dynamically adjust edge weights based on wind direction to model the correlations between dynamic factors and air quality. We compare HighAir with the state-of-the-art air quality forecasting methods on the dataset of Yangtze River Delta city group, which covers 10 major cities within 61,500 km2. The experimental results show that HighAir significantly outperforms other methods.
Waste recycling is an important way of saving energy and materials in the production process. In general cases recyclable objects are mixed with unrecyclable objects, which raises a need for identification and classification. This paper proposes a convolutional neural network (CNN) model to complete both tasks. The model uses transfer learning from a pretrained Resnet-50 CNN to complete feature extraction. A subsequent fully connected layer for classification was trained on the augmented TrashNet dataset . In the application, sliding-window is used for image segmentation in the pre-classification stage. In the post-classification stage, the labelled sample points are integrated with Gaussian Clustering to locate the object. The resulting model has achieved an overall detection rate of 48.4% in simulation and final classification accuracy of 92.4%.
Individuals do not respond uniformly to treatments, events, or interventions. Sociologists routinely partition samples into subgroups to explore how the effects of treatments vary by covariates like race, gender, and socioeconomic status. In so doing, analysts determine the key subpopulations based on theoretical priors. Data-driven discoveries are also routine, yet the analyses by which sociologists typically go about them are problematic and seldom move us beyond our expectations, and biases, to explore new meaningful subgroups. Emerging machine learning methods allow researchers to explore sources of variation that they may not have previously considered, or envisaged. In this paper, we use causal trees to recursively partition the sample and uncover sources of treatment effect heterogeneity. We use honest estimation, splitting the sample into a training sample to grow the tree and an estimation sample to estimate leaf-specific effects. Assessing a central topic in the social inequality literature, college effects on wages, we compare what we learn from conventional approaches for exploring variation in effects to causal trees. Given our use of observational data, we use leaf-specific matching and sensitivity analyses to address confounding and offer interpretations of effects based on observed and unobserved heterogeneity. We encourage researchers to follow similar practices in their work on variation in sociological effects.