Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Mihaela van der Schaar

You can't handle the truth: Data-centric insights improve pseudo-labeling

Jun 19, 2024

Nabeel Seedat, Nicolas Huynh, Fergus Imrie, Mihaela van der Schaar

Figure 1 for You can't handle the truth: Data-centric insights improve pseudo-labeling

Figure 2 for You can't handle the truth: Data-centric insights improve pseudo-labeling

Figure 3 for You can't handle the truth: Data-centric insights improve pseudo-labeling

Figure 4 for You can't handle the truth: Data-centric insights improve pseudo-labeling

Abstract:Pseudo-labeling is a popular semi-supervised learning technique to leverage unlabeled data when labeled samples are scarce. The generation and selection of pseudo-labels heavily rely on labeled data. Existing approaches implicitly assume that the labeled data is gold standard and 'perfect'. However, this can be violated in reality with issues such as mislabeling or ambiguity. We address this overlooked aspect and show the importance of investigating labeled data quality to improve any pseudo-labeling method. Specifically, we introduce a novel data characterization and selection framework called DIPS to extend pseudo-labeling. We select useful labeled and pseudo-labeled samples via analysis of learning dynamics. We demonstrate the applicability and impact of DIPS for various pseudo-labeling methods across an extensive range of real-world tabular and image datasets. Additionally, DIPS improves data efficiency and reduces the performance distinctions between different pseudo-labelers. Overall, we highlight the significant benefits of a data-centric rethinking of pseudo-labeling in real-world settings.

* Published in the Journal of Data-centric Machine Learning Research (DMLR) *Seedat & Huynh contributed equally

Via

Access Paper or Ask Questions

Discovering Preference Optimization Algorithms with and for Large Language Models

Jun 12, 2024

Chris Lu, Samuel Holt, Claudio Fanconi, Alex J. Chan, Jakob Foerster, Mihaela van der Schaar, Robert Tjarko Lange

Figure 1 for Discovering Preference Optimization Algorithms with and for Large Language Models

Figure 2 for Discovering Preference Optimization Algorithms with and for Large Language Models

Figure 3 for Discovering Preference Optimization Algorithms with and for Large Language Models

Figure 4 for Discovering Preference Optimization Algorithms with and for Large Language Models

Abstract:Offline preference optimization is a key method for enhancing and controlling the quality of Large Language Model (LLM) outputs. Typically, preference optimization is approached as an offline supervised learning task using manually-crafted convex loss functions. While these methods are based on theoretical insights, they are inherently constrained by human creativity, so the large search space of possible loss functions remains under explored. We address this by performing LLM-driven objective discovery to automatically discover new state-of-the-art preference optimization algorithms without (expert) human intervention. Specifically, we iteratively prompt an LLM to propose and implement new preference optimization loss functions based on previously-evaluated performance metrics. This process leads to the discovery of previously-unknown and performant preference optimization algorithms. The best performing of these we call Discovered Preference Optimization (DiscoPOP), a novel algorithm that adaptively blends logistic and exponential losses. Experiments demonstrate the state-of-the-art performance of DiscoPOP and its successful transfer to held-out tasks.

Via

Access Paper or Ask Questions

Relaxed Quantile Regression: Prediction Intervals for Asymmetric Noise

Jun 05, 2024

Thomas Pouplin, Alan Jeffares, Nabeel Seedat, Mihaela van der Schaar

Abstract:Constructing valid prediction intervals rather than point estimates is a well-established approach for uncertainty quantification in the regression setting. Models equipped with this capacity output an interval of values in which the ground truth target will fall with some prespecified probability. This is an essential requirement in many real-world applications where simple point predictions' inability to convey the magnitude and frequency of errors renders them insufficient for high-stakes decisions. Quantile regression is a leading approach for obtaining such intervals via the empirical estimation of quantiles in the (non-parametric) distribution of outputs. This method is simple, computationally inexpensive, interpretable, assumption-free, and effective. However, it does require that the specific quantiles being learned are chosen a priori. This results in (a) intervals that are arbitrarily symmetric around the median which is sub-optimal for realistic skewed distributions, or (b) learning an excessive number of intervals. In this work, we propose Relaxed Quantile Regression (RQR), a direct alternative to quantile regression based interval construction that removes this arbitrary constraint whilst maintaining its strengths. We demonstrate that this added flexibility results in intervals with an improvement in desirable qualities (e.g. mean width) whilst retaining the essential coverage guarantees of quantile regression.

* Accepted at International Conference on Machine Learning (ICML) 2024

Via

Access Paper or Ask Questions

Meta-Learners for Partially-Identified Treatment Effects Across Multiple Environments

Jun 04, 2024

Jonas Schweisthal, Dennis Frauen, Mihaela van der Schaar, Stefan Feuerriegel

Figure 1 for Meta-Learners for Partially-Identified Treatment Effects Across Multiple Environments

Figure 2 for Meta-Learners for Partially-Identified Treatment Effects Across Multiple Environments

Figure 3 for Meta-Learners for Partially-Identified Treatment Effects Across Multiple Environments

Figure 4 for Meta-Learners for Partially-Identified Treatment Effects Across Multiple Environments

Abstract:Estimating the conditional average treatment effect (CATE) from observational data is relevant for many applications such as personalized medicine. Here, we focus on the widespread setting where the observational data come from multiple environments, such as different hospitals, physicians, or countries. Furthermore, we allow for violations of standard causal assumptions, namely, overlap within the environments and unconfoundedness. To this end, we move away from point identification and focus on partial identification. Specifically, we show that current assumptions from the literature on multiple environments allow us to interpret the environment as an instrumental variable (IV). This allows us to adapt bounds from the IV literature for partial identification of CATE by leveraging treatment assignment mechanisms across environments. Then, we propose different model-agnostic learners (so-called meta-learners) to estimate the bounds that can be used in combination with arbitrary machine learning models. We further demonstrate the effectiveness of our meta-learners across various experiments using both simulated and real-world data. Finally, we discuss the applicability of our meta-learners to partial identification in instrumental variable settings, such as randomized controlled trials with non-compliance.

* Accepted at ICML 2024

Via

Access Paper or Ask Questions

Inverse-RLignment: Inverse Reinforcement Learning from Demonstrations for LLM Alignment

May 24, 2024

Hao Sun, Mihaela van der Schaar

Abstract:Aligning Large Language Models (LLMs) is crucial for enhancing their safety and utility. However, existing methods, primarily based on preference datasets, face challenges such as noisy labels, high annotation costs, and privacy concerns. In this work, we introduce Alignment from Demonstrations (AfD), a novel approach leveraging high-quality demonstration data to overcome these challenges. We formalize AfD within a sequential decision-making framework, highlighting its unique challenge of missing reward signals. Drawing insights from forward and inverse reinforcement learning, we introduce divergence minimization objectives for AfD. Analytically, we elucidate the mass-covering and mode-seeking behaviors of various approaches, explaining when and why certain methods are superior. Practically, we propose a computationally efficient algorithm that extrapolates over a tailored reward model for AfD. We validate our key insights through experiments on the Harmless and Helpful tasks, demonstrating their strong empirical performance while maintaining simplicity.

Via

Access Paper or Ask Questions

A Study of Posterior Stability for Time-Series Latent Diffusion

May 22, 2024

Yangming Li, Mihaela van der Schaar

Abstract:Latent diffusion has shown promising results in image generation and permits efficient sampling. However, this framework might suffer from the problem of posterior collapse when applied to time series. In this paper, we conduct an impact analysis of this problem. With a theoretical insight, we first explain that posterior collapse reduces latent diffusion to a VAE, making it less expressive. Then, we introduce the notion of dependency measures, showing that the latent variable sampled from the diffusion model loses control of the generation process in this situation and that latent diffusion exhibits dependency illusion in the case of shuffled time series. We also analyze the causes of posterior collapse and introduce a new framework based on this analysis, which addresses the problem and supports a more expressive prior distribution. Our experiments on various real-world time-series datasets demonstrate that our new model maintains a stable posterior and outperforms the baselines in time series generation.

* Paper under review

Via

Access Paper or Ask Questions

Why Tabular Foundation Models Should Be a Research Priority

May 02, 2024

Boris van Breugel, Mihaela van der Schaar

Figure 1 for Why Tabular Foundation Models Should Be a Research Priority

Figure 2 for Why Tabular Foundation Models Should Be a Research Priority

Figure 3 for Why Tabular Foundation Models Should Be a Research Priority

Abstract:Recent text and image foundation models are incredibly impressive, and these models are attracting an ever-increasing portion of research resources. In this position piece we aim to shift the ML research community's priorities ever so slightly to a different modality: tabular data. Tabular data is the dominant modality in many fields, yet it is given hardly any research attention and significantly lags behind in terms of scale and power. We believe the time is now to start developing tabular foundation models, or what we coin a Large Tabular Model (LTM). LTMs could revolutionise the way science and ML use tabular data: not as single datasets that are analyzed in a vacuum, but contextualized with respect to related datasets. The potential impact is far-reaching: from few-shot tabular models to automating data science; from out-of-distribution synthetic data to empowering multidisciplinary scientific discovery. We intend to excite reflections on the modalities we study, and convince some researchers to study large tabular models.

* Accepted at International Conference on Machine Learning (ICML 2024)

Via

Access Paper or Ask Questions

Shape Arithmetic Expressions: Advancing Scientific Discovery Beyond Closed-Form Equations

Apr 15, 2024

Krzysztof Kacprzyk, Mihaela van der Schaar

Abstract:Symbolic regression has excelled in uncovering equations from physics, chemistry, biology, and related disciplines. However, its effectiveness becomes less certain when applied to experimental data lacking inherent closed-form expressions. Empirically derived relationships, such as entire stress-strain curves, may defy concise closed-form representation, compelling us to explore more adaptive modeling approaches that balance flexibility with interpretability. In our pursuit, we turn to Generalized Additive Models (GAMs), a widely used class of models known for their versatility across various domains. Although GAMs can capture non-linear relationships between variables and targets, they cannot capture intricate feature interactions. In this work, we investigate both of these challenges and propose a novel class of models, Shape Arithmetic Expressions (SHAREs), that fuses GAM's flexible shape functions with the complex feature interactions found in mathematical expressions. SHAREs also provide a unifying framework for both of these approaches. We also design a set of rules for constructing SHAREs that guarantee transparency of the found expressions beyond the standard constraints based on the model's size.

* To appear in the Proceedings of the 27th International Conference on Artificial Intelligence and Statistics (AISTATS) 2024, Valencia, Spain. PMLR: Volume 238

Via

Access Paper or Ask Questions

ODE Discovery for Longitudinal Heterogeneous Treatment Effects Inference

Mar 16, 2024

Krzysztof Kacprzyk, Samuel Holt, Jeroen Berrevoets, Zhaozhi Qian, Mihaela van der Schaar

Figure 1 for ODE Discovery for Longitudinal Heterogeneous Treatment Effects Inference

Figure 2 for ODE Discovery for Longitudinal Heterogeneous Treatment Effects Inference

Figure 3 for ODE Discovery for Longitudinal Heterogeneous Treatment Effects Inference

Figure 4 for ODE Discovery for Longitudinal Heterogeneous Treatment Effects Inference

Abstract:Inferring unbiased treatment effects has received widespread attention in the machine learning community. In recent years, our community has proposed numerous solutions in standard settings, high-dimensional treatment settings, and even longitudinal settings. While very diverse, the solution has mostly relied on neural networks for inference and simultaneous correction of assignment bias. New approaches typically build on top of previous approaches by proposing new (or refined) architectures and learning algorithms. However, the end result -- a neural-network-based inference machine -- remains unchallenged. In this paper, we introduce a different type of solution in the longitudinal setting: a closed-form ordinary differential equation (ODE). While we still rely on continuous optimization to learn an ODE, the resulting inference machine is no longer a neural network. Doing so yields several advantages such as interpretability, irregular sampling, and a different set of identification assumptions. Above all, we consider the introduction of a completely new type of solution to be our most important contribution as it may spark entirely new innovations in treatment effects in general. We facilitate this by formulating our contribution as a framework that can transform any ODE discovery method into a treatment effects method.

Via

Access Paper or Ask Questions

Dissecting Sample Hardness: A Fine-Grained Analysis of Hardness Characterization Methods for Data-Centric AI

Mar 07, 2024

Nabeel Seedat, Fergus Imrie, Mihaela van der Schaar

Figure 1 for Dissecting Sample Hardness: A Fine-Grained Analysis of Hardness Characterization Methods for Data-Centric AI

Figure 2 for Dissecting Sample Hardness: A Fine-Grained Analysis of Hardness Characterization Methods for Data-Centric AI

Figure 3 for Dissecting Sample Hardness: A Fine-Grained Analysis of Hardness Characterization Methods for Data-Centric AI

Figure 4 for Dissecting Sample Hardness: A Fine-Grained Analysis of Hardness Characterization Methods for Data-Centric AI

Abstract:Characterizing samples that are difficult to learn from is crucial to developing highly performant ML models. This has led to numerous Hardness Characterization Methods (HCMs) that aim to identify "hard" samples. However, there is a lack of consensus regarding the definition and evaluation of "hardness". Unfortunately, current HCMs have only been evaluated on specific types of hardness and often only qualitatively or with respect to downstream performance, overlooking the fundamental quantitative identification task. We address this gap by presenting a fine-grained taxonomy of hardness types. Additionally, we propose the Hardness Characterization Analysis Toolkit (H-CAT), which supports comprehensive and quantitative benchmarking of HCMs across the hardness taxonomy and can easily be extended to new HCMs, hardness types, and datasets. We use H-CAT to evaluate 13 different HCMs across 8 hardness types. This comprehensive evaluation encompassing over 14K setups uncovers strengths and weaknesses of different HCMs, leading to practical tips to guide HCM selection and future development. Our findings highlight the need for more comprehensive HCM evaluation, while we hope our hardness taxonomy and toolkit will advance the principled evaluation and uptake of data-centric AI methods.

* Published at International Conference on Learning Representations (ICLR) 2024

Via

Access Paper or Ask Questions