Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Daniel Khalil

Steering Generative Models with Experimental Data for Protein Fitness Optimization

May 21, 2025

Jason Yang, Wenda Chu, Daniel Khalil, Raul Astudillo, Bruce J. Wittmann, Frances H. Arnold, Yisong Yue

Abstract:Protein fitness optimization involves finding a protein sequence that maximizes desired quantitative properties in a combinatorially large design space of possible sequences. Recent developments in steering protein generative models (e.g diffusion models, language models) offer a promising approach. However, by and large, past studies have optimized surrogate rewards and/or utilized large amounts of labeled data for steering, making it unclear how well existing methods perform and compare to each other in real-world optimization campaigns where fitness is measured by low-throughput wet-lab assays. In this study, we explore fitness optimization using small amounts (hundreds) of labeled sequence-fitness pairs and comprehensively evaluate strategies such as classifier guidance and posterior sampling for guiding generation from different discrete diffusion models of protein sequences. We also demonstrate how guidance can be integrated into adaptive sequence selection akin to Thompson sampling in Bayesian optimization, showing that plug-and-play guidance strategies offer advantages compared to alternatives such as reinforcement learning with protein language models.

Via

Access Paper or Ask Questions

Learning Keypoints for Multi-Agent Behavior Analysis using Self-Supervision

Sep 14, 2024

Daniel Khalil, Christina Liu, Pietro Perona, Jennifer J. Sun, Markus Marks

Figure 1 for Learning Keypoints for Multi-Agent Behavior Analysis using Self-Supervision

Figure 2 for Learning Keypoints for Multi-Agent Behavior Analysis using Self-Supervision

Figure 3 for Learning Keypoints for Multi-Agent Behavior Analysis using Self-Supervision

Figure 4 for Learning Keypoints for Multi-Agent Behavior Analysis using Self-Supervision

Abstract:The study of social interactions and collective behaviors through multi-agent video analysis is crucial in biology. While self-supervised keypoint discovery has emerged as a promising solution to reduce the need for manual keypoint annotations, existing methods often struggle with videos containing multiple interacting agents, especially those of the same species and color. To address this, we introduce B-KinD-multi, a novel approach that leverages pre-trained video segmentation models to guide keypoint discovery in multi-agent scenarios. This eliminates the need for time-consuming manual annotations on new experimental settings and organisms. Extensive evaluations demonstrate improved keypoint regression and downstream behavioral classification in videos of flies, mice, and rats. Furthermore, our method generalizes well to other species, including ants, bees, and humans, highlighting its potential for broad applications in automated keypoint annotation for multi-agent behavior analysis. Code available under: https://danielpkhalil.github.io/B-KinD-Multi

Via

Access Paper or Ask Questions