Felipe Maia Polo

Fusing Models with Complementary Expertise

Oct 02, 2023
Hongyi Wang, Felipe Maia Polo, Yuekai Sun, Souvik Kundu, Eric Xing, Mikhail Yurochkin

Training AI models that generalize across tasks and domains has long been among the open problems driving AI research. The emergence of Foundation Models made it easier to obtain expert models for a given task, but the heterogeneity of data that may be encountered at test time often means that any single expert is insufficient. We consider the Fusion of Experts (FoE) problem of fusing outputs of expert models with complementary knowledge of the data distribution and formulate it as an instance of supervised learning. Our method is applicable to both discriminative and generative tasks and leads to significant performance improvements in image and text classification, text summarization, multiple-choice QA, and automatic evaluation of generated text. We also extend our method to the "frugal" setting where it is desired to reduce the number of expert model evaluations at test time.
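As a rough illustration of the fusion idea, the sketch below trains two scikit-learn "experts" on two synthetic domains and then fits a small fuser on the concatenated expert probabilities, treating fusion as a supervised learning problem. The data, the experts, and the fuser architecture are all illustrative assumptions, not the paper's actual setup.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.neural_network import MLPClassifier

rng = np.random.default_rng(0)

def make_domain(n, shift):
    # Each "domain" shifts the feature distribution; the label rule moves with it.
    X = rng.normal(loc=shift, scale=1.0, size=(n, 5))
    y = (X[:, 0] + X[:, 1] > 2 * shift).astype(int)
    return X, y

# Each expert is trained on (and is good at) one domain only.
X_a, y_a = make_domain(2000, shift=0.0)
X_b, y_b = make_domain(2000, shift=3.0)
expert_a = LogisticRegression().fit(X_a, y_a)
expert_b = LogisticRegression().fit(X_b, y_b)

# Fusion as supervised learning: the fuser sees only the experts'
# predicted probabilities and is trained on held-out data from both domains.
Xm_a, ym_a = make_domain(1000, shift=0.0)
Xm_b, ym_b = make_domain(1000, shift=3.0)
X_mix, y_mix = np.vstack([Xm_a, Xm_b]), np.concatenate([ym_a, ym_b])

def expert_features(X):
    # Concatenate each expert's class-probability outputs.
    return np.hstack([expert_a.predict_proba(X), expert_b.predict_proba(X)])

fuser = MLPClassifier(hidden_layer_sizes=(16,), max_iter=2000, random_state=0)
fuser.fit(expert_features(X_mix), y_mix)

# On a mixed test set, neither expert alone covers both domains.
Xt_a, yt_a = make_domain(500, shift=0.0)
Xt_b, yt_b = make_domain(500, shift=3.0)
X_test, y_test = np.vstack([Xt_a, Xt_b]), np.concatenate([yt_a, yt_b])
print("expert A alone:", expert_a.score(X_test, y_test))
print("expert B alone:", expert_b.score(X_test, y_test))
print("fused experts :", fuser.score(expert_features(X_test), y_test))
```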

Conditional independence testing under model misspecification

Jul 05, 2023
Felipe Maia Polo, Yuekai Sun, Moulinath Banerjee

Conditional independence (CI) testing is fundamental and challenging in modern statistics and machine learning. Many modern methods for CI testing rely on powerful supervised learning methods to learn regression functions or Bayes predictors as an intermediate step. Although these methods are guaranteed to control Type-I error when the supervised learning methods accurately estimate the regression functions or Bayes predictors, their behavior is less understood when the estimates fail due to model misspecification. In a broader sense, model misspecification can arise even when universal approximators (e.g., deep neural nets) are employed. Motivated by this, we study the performance of regression-based CI tests under model misspecification. Namely, we propose new approximations or upper bounds for the testing errors of three regression-based tests that depend on misspecification errors. Moreover, we introduce the Rao-Blackwellized Predictor Test (RBPT), a novel regression-based CI test that is robust against model misspecification. Finally, we conduct experiments with artificial and real data, showcasing the usefulness of our theory and methods.
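To make the setting concrete, the sketch below shows one member of the regression-based family the abstract refers to: a generalised-covariance-measure-style statistic built from cross-fitted residuals of X given Z and Y given Z. It is not the proposed RBPT; the regressors and the synthetic data are illustrative assumptions.

```python
import numpy as np
from scipy import stats
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import cross_val_predict

def regression_ci_test(X, Y, Z, seed=0):
    # Cross-fitted residuals of X given Z and of Y given Z; under H0 (X ⟂ Y | Z)
    # and well-specified regressions, their product has mean zero.
    rx = X - cross_val_predict(RandomForestRegressor(random_state=seed), Z, X, cv=5)
    ry = Y - cross_val_predict(RandomForestRegressor(random_state=seed), Z, Y, cv=5)
    prod = rx * ry
    t = np.sqrt(len(prod)) * prod.mean() / prod.std(ddof=1)
    return 2 * stats.norm.sf(abs(t))  # two-sided p-value

rng = np.random.default_rng(0)
n = 2000
Z = rng.normal(size=(n, 3))
X = Z[:, 0] + 0.5 * rng.normal(size=n)
Y_null = Z[:, 0] ** 2 + 0.5 * rng.normal(size=n)  # X and Y independent given Z
Y_alt = Y_null + 0.5 * X                           # dependence on X remains given Z
print("p-value when CI holds:", regression_ci_test(X, Y_null, Z))
print("p-value when CI fails:", regression_ci_test(X, Y_alt, Z))
```

If the random forest badly misestimates E[X|Z] or E[Y|Z], the residual product no longer centers at zero under H0; that is the misspecification failure mode the paper analyzes.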

A unified framework for dataset shift diagnostics

May 17, 2022
Felipe Maia Polo, Rafael Izbicki, Evanildo Gomes Lacerda Jr, Juan Pablo Ibieta-Jimenez, Renato Vicente

Most machine learning (ML) methods assume that the data used in the training phase come from the distribution of the target population. In practice, however, one often faces dataset shift, which, if not properly taken into account, may degrade the predictive performance of ML models. In general, if the practitioner knows which type of shift is taking place - e.g., covariate shift or label shift - they may apply transfer learning methods to obtain better predictions. Unfortunately, current methods for detecting shift are only designed to detect specific types of shift, or cannot formally test for their presence. We introduce a general framework that gives insight into how to improve prediction methods by detecting the presence of different types of shift and quantifying how strong they are. Our approach can be used for any data type (tabular/image/text) and for both classification and regression tasks. Moreover, it uses formal hypothesis tests that control false alarms. We illustrate how our framework is useful in practice using both artificial and real datasets. Our package for dataset shift detection can be found at https://github.com/felipemaiapolo/detectshift.
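As a toy illustration of one building block of such diagnostics, the sketch below runs a classifier-based permutation test for covariate shift on synthetic data: if a classifier can tell source from target features better than chance, the feature marginals differ. This is an assumption-laden stand-in and does not reproduce the detectshift package's API.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

def covariate_shift_pvalue(X_source, X_target, n_perm=200, seed=0):
    rng = np.random.default_rng(seed)
    X = np.vstack([X_source, X_target])
    d = np.concatenate([np.zeros(len(X_source)), np.ones(len(X_target))])

    def domain_auc(labels):
        # How well a classifier separates source from target features.
        return cross_val_score(LogisticRegression(max_iter=1000), X, labels,
                               cv=5, scoring="roc_auc").mean()

    observed = domain_auc(d)
    # Permuting domain labels simulates the no-shift null distribution.
    null = [domain_auc(rng.permutation(d)) for _ in range(n_perm)]
    return (1 + sum(s >= observed for s in null)) / (1 + n_perm)

rng = np.random.default_rng(1)
X_src = rng.normal(size=(300, 4))
X_tgt = rng.normal(loc=0.7, size=(300, 4))  # shifted feature marginal
print("p-value:", covariate_shift_pvalue(X_src, X_tgt, n_perm=50))
```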

Effects of personality traits in predicting grade retention of Brazilian students

Jul 12, 2021
Carmen Melo Toledo, Guilherme Mendes Bassedon, Jonathan Batista Ferreira, Lucka de Godoy Gianvechio, Carlos Guatimosim, Felipe Maia Polo, Renato Vicente

Students' grade retention is a key issue faced by many education systems, especially those in developing countries. In this paper, we seek to gauge the relevance of students' personality traits in predicting grade retention in Brazil. For that, we used data collected in 2012 and 2017 in the city of Sertaozinho, in the countryside of the state of Sao Paulo, Brazil. The surveys taken in Sertaozinho included several socioeconomic questions, standardized tests, and a personality test; moreover, the students were in grades 4, 5, and 6 in 2012. Our approach was based on training machine learning models on the survey data to predict grade retention between 2012 and 2017 using information from 2012 or before, and then applying strategies to quantify the predictive power of personality traits. We concluded that personality traits, besides performing noticeably better than a random classifier when used in isolation, contribute to prediction even when socioeconomic variables and standardized test results are also used.
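A minimal sketch of the kind of comparison described: measure cross-validated performance with and without the personality-trait feature group to gauge its added predictive power. The data below is synthetic and merely stands in for the survey variables; it is not the Sertaozinho data, and the model choice is an assumption.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
n = 3000
socio = rng.normal(size=(n, 4))    # stand-in for socioeconomic variables
tests = rng.normal(size=(n, 2))    # stand-in for standardized test scores
traits = rng.normal(size=(n, 5))   # stand-in for personality traits
logit = 0.8 * tests[:, 0] + 0.5 * socio[:, 0] + 0.6 * traits[:, 0]
y = rng.binomial(1, 1 / (1 + np.exp(-logit)))  # synthetic retention indicator

def cv_auc(X):
    clf = GradientBoostingClassifier(random_state=0)
    return cross_val_score(clf, X, y, cv=5, scoring="roc_auc").mean()

base = cv_auc(np.hstack([socio, tests]))
full = cv_auc(np.hstack([socio, tests, traits]))
print(f"AUC without traits: {base:.3f}")
print(f"AUC with traits:    {full:.3f}  (gain: {full - base:.3f})")
```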

Covariate Shift Adaptation in High-Dimensional and Divergent Distributions

Oct 02, 2020
Felipe Maia Polo, Renato Vicente

In real-world applications of supervised learning methods, training and test sets are often sampled from distinct distributions, and we must resort to domain adaptation techniques. One special class of techniques is Covariate Shift Adaptation, which allows practitioners to obtain good generalization performance in the distribution of interest when the domains differ only in the marginal distribution of features. Traditionally, Covariate Shift Adaptation is implemented using Importance Weighting, which may fail in high-dimensional settings due to small Effective Sample Sizes (ESS). In this paper, we propose (i) a connection between ESS, high-dimensional settings, and generalization bounds, and (ii) a simple, general, and theoretically sound approach to combining feature selection and Covariate Shift Adaptation. The new approach yields good performance with improved ESS.
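The sketch below illustrates the classical pipeline the abstract builds on: estimate importance weights with a domain classifier and monitor the Effective Sample Size, ESS = (Σ w)² / Σ w², which collapses as the dimension (and hence the divergence between domains) grows. The data and the weight estimator are illustrative assumptions, not the paper's proposed feature-selection method.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

def importance_weights(X_train, X_test):
    # Estimate w(x) = p_test(x) / p_train(x) with a probabilistic domain classifier.
    X = np.vstack([X_train, X_test])
    d = np.concatenate([np.zeros(len(X_train)), np.ones(len(X_test))])
    clf = LogisticRegression(max_iter=1000).fit(X, d)
    p = clf.predict_proba(X_train)[:, 1]
    return (p / (1 - p)) * (len(X_train) / len(X_test))

def effective_sample_size(w):
    # ESS = (sum w)^2 / sum w^2; a small ESS signals unreliable reweighting.
    return w.sum() ** 2 / (w ** 2).sum()

rng = np.random.default_rng(0)
for dim in (2, 10, 50):
    X_tr = rng.normal(loc=0.0, size=(1000, dim))
    X_te = rng.normal(loc=0.3, size=(1000, dim))  # covariate shift in every coordinate
    w = importance_weights(X_tr, X_te)
    print(f"dim={dim:3d}  ESS={effective_sample_size(w):7.1f} out of {len(w)}")
```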

Predicting Legal Proceedings Status: an Approach Based on Sequential Text Data

Mar 13, 2020
Felipe Maia Polo, Itamar Ciochetti, Emerson Bertolo

Machine learning applications in the legal field are numerous and diverse. In order to contribute to both the machine learning and the legal communities, we developed a model for classifying sequences of legal text, with emphasis on the interpretability of the results. The purpose of this paper is to classify legal proceedings into three possible status classes: (i) archived proceedings, (ii) active proceedings, and (iii) suspended proceedings. Our approach combines natural language processing with supervised and unsupervised deep learning models, and it performed remarkably well on the classification task. Furthermore, by applying interpretability tools, we gained insights into the patterns learned by the neural network.
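As a minimal illustration of the three-way status classification task, the sketch below fits a TF-IDF plus linear baseline on a few made-up "proceeding movement" snippets. The paper's actual approach uses deep learning models; the texts, labels, and baseline here are purely illustrative assumptions.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Tiny synthetic stand-ins for text from legal proceedings, one of the
# three status classes per document.
docs = [
    "case archived after final ruling",        # archived
    "records sent to permanent archive",       # archived
    "hearing scheduled, awaiting testimony",   # active
    "new motion filed by the defendant",       # active
    "proceeding suspended pending appeal",     # suspended
    "stay granted, case suspended",            # suspended
]
labels = ["archived", "archived", "active", "active", "suspended", "suspended"]

# Baseline classifier: bag-of-words features plus a linear model.
clf = make_pipeline(TfidfVectorizer(), LogisticRegression(max_iter=1000))
clf.fit(docs, labels)
print(clf.predict(["judge suspended the case until further notice"]))
```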
