Zero-shot learning provides models for targets for which instances are not available, commonly called unobserved targets. The availability of target side information becomes crucial in this context in order to properly induce models for these targets. The literature is plenty of strategies to cope with this scenario, but specifically designed on the basis of a zero-shot classification scenario, mostly in computer vision and image classification, but they are either not applicable or easily extensible for a zero-shot regression framework for which a continuos value is required to be predicted rather than a label. In fact, there is a considerable lack of methods for zero-shot regression in the literature. Two approaches for zero-shot regression that work in a two-phase procedure were recently proposed. They first learn the observed target models through a classical regression learning ignoring the target side information. Then, they aggregate those observed target models afterwards exploiting the target side information and the models for the unobserved targets are induced. Despite both have shown quite good performance because of the different treatment they grant to the common features and to the side information, they exploit features and side information separately, avoiding a global optimization for providing the unobserved target models. The proposal of this paper is a novel method that jointly takes features and side information in a one-phase learning process, but treating side information properly and in a more deserving way than as common features. A specific kernel that properly merges features and side information is proposed for this purpose resulting in a novel approach that exhibits better performance over both artificial and real datasets.
This research arises from the need to predict the amount of air pollutants in meteorological stations. Air pollution depends on the location of the stations (weather conditions and activities in the surroundings). Frequently, the surrounding information is not considered in the learning process. This information is known beforehand in the absence of unobserved weather conditions and remains constant for the same station. Considering the surrounding information as side information facilitates the generalization for predicting pollutants in new stations, leading to a zero-shot regression scenario. Available methods in zero-shot typically lean towards classification, and are not easily extensible to regression. This paper proposes two zero-shot methods for regression. The first method is a similarity based approach that learns models from features and aggregates them using side information. However, potential knowledge of the feature models may be lost in the aggregation. The second method overcomes this drawback by replacing the aggregation procedure and learning the correspondence between side information and feature-induced models, instead. Both proposals are compared with a baseline procedure using artificial datasets, UCI repository communities and crime datasets, and the pollutants. Both approaches outperform the baseline method, but the parameter learning approach manifests its superiority over the similarity based method.