Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Seong Joon Oh

DISCO: Diversifying Sample Condensation for Efficient Model Evaluation

Oct 09, 2025

Alexander Rubinstein, Benjamin Raible, Martin Gubri, Seong Joon Oh

Abstract:Evaluating modern machine learning models has become prohibitively expensive. Benchmarks such as LMMs-Eval and HELM demand thousands of GPU hours per model. Costly evaluation reduces inclusivity, slows the cycle of innovation, and worsens environmental impact. The typical approach follows two steps. First, select an anchor subset of data. Second, train a mapping from the accuracy on this subset to the final test result. The drawback is that anchor selection depends on clustering, which can be complex and sensitive to design choices. We argue that promoting diversity among samples is not essential; what matters is to select samples that $\textit{maximise diversity in model responses}$. Our method, $\textbf{Diversifying Sample Condensation (DISCO)}$, selects the top-k samples with the greatest model disagreements. This uses greedy, sample-wise statistics rather than global clustering. The approach is conceptually simpler. From a theoretical view, inter-model disagreement provides an information-theoretically optimal rule for such greedy selection. $\textbf{DISCO}$ shows empirical gains over prior methods, achieving state-of-the-art results in performance prediction across MMLU, Hellaswag, Winogrande, and ARC. Code is available here: https://github.com/arubique/disco-public.

Via

Access Paper or Ask Questions

First Hallucination Tokens Are Different from Conditional Ones

Jul 28, 2025

Jakob Snel, Seong Joon Oh

Abstract:Hallucination, the generation of untruthful content, is one of the major concerns regarding foundational models. Detecting hallucinations at the token level is vital for real-time filtering and targeted correction, yet the variation of hallucination signals within token sequences is not fully understood. Leveraging the RAGTruth corpus with token-level annotations and reproduced logits, we analyse how these signals depend on a token's position within hallucinated spans, contributing to an improved understanding of token-level hallucination. Our results show that the first hallucinated token carries a stronger signal and is more detectable than conditional tokens. We release our analysis framework, along with code for logit reproduction and metric computation at https://github.com/jakobsnl/RAGTruth_Xtended.

* 4.5 pages, 3 figures, Dataset, Knowledge Paper, Hallucination, Trustworthiness

Via

Access Paper or Ask Questions

Does Data Scaling Lead to Visual Compositional Generalization?

Jul 09, 2025

Arnas Uselis, Andrea Dittadi, Seong Joon Oh

Abstract:Compositional understanding is crucial for human intelligence, yet it remains unclear whether contemporary vision models exhibit it. The dominant machine learning paradigm is built on the premise that scaling data and model sizes will improve out-of-distribution performance, including compositional generalization. We test this premise through controlled experiments that systematically vary data scale, concept diversity, and combination coverage. We find that compositional generalization is driven by data diversity, not mere data scale. Increased combinatorial coverage forces models to discover a linearly factored representational structure, where concepts decompose into additive components. We prove this structure is key to efficiency, enabling perfect generalization from few observed combinations. Evaluating pretrained models (DINO, CLIP), we find above-random yet imperfect performance, suggesting partial presence of this structure. Our work motivates stronger emphasis on constructing diverse datasets for compositional generalization, and considering the importance of representational structure that enables efficient compositional learning. Code available at https://github.com/oshapio/visual-compositional-generalization.

* ICML 2025

Via

Access Paper or Ask Questions

Leaky Thoughts: Large Reasoning Models Are Not Private Thinkers

Jun 18, 2025

Tommaso Green, Martin Gubri, Haritz Puerto, Sangdoo Yun, Seong Joon Oh

Abstract:We study privacy leakage in the reasoning traces of large reasoning models used as personal agents. Unlike final outputs, reasoning traces are often assumed to be internal and safe. We challenge this assumption by showing that reasoning traces frequently contain sensitive user data, which can be extracted via prompt injections or accidentally leak into outputs. Through probing and agentic evaluations, we demonstrate that test-time compute approaches, particularly increased reasoning steps, amplify such leakage. While increasing the budget of those test-time compute approaches makes models more cautious in their final answers, it also leads them to reason more verbosely and leak more in their own thinking. This reveals a core tension: reasoning improves utility but enlarges the privacy attack surface. We argue that safety efforts must extend to the model's internal thinking, not just its outputs.

Via

Access Paper or Ask Questions

Diffusion Classifiers Understand Compositionality, but Conditions Apply

May 23, 2025

Yujin Jeong, Arnas Uselis, Seong Joon Oh, Anna Rohrbach

Abstract:Understanding visual scenes is fundamental to human intelligence. While discriminative models have significantly advanced computer vision, they often struggle with compositional understanding. In contrast, recent generative text-to-image diffusion models excel at synthesizing complex scenes, suggesting inherent compositional capabilities. Building on this, zero-shot diffusion classifiers have been proposed to repurpose diffusion models for discriminative tasks. While prior work offered promising results in discriminative compositional scenarios, these results remain preliminary due to a small number of benchmarks and a relatively shallow analysis of conditions under which the models succeed. To address this, we present a comprehensive study of the discriminative capabilities of diffusion classifiers on a wide range of compositional tasks. Specifically, our study covers three diffusion models (SD 1.5, 2.0, and, for the first time, 3-m) spanning 10 datasets and over 30 tasks. Further, we shed light on the role that target dataset domains play in respective performance; to isolate the domain effects, we introduce a new diagnostic benchmark Self-Bench comprised of images created by diffusion models themselves. Finally, we explore the importance of timestep weighting and uncover a relationship between domain gap and timestep sensitivity, particularly for SD3-m. To sum up, diffusion classifiers understand compositionality, but conditions apply! Code and dataset are available at https://github.com/eugene6923/Diffusion-Classifiers-Compositionality.

Via

Access Paper or Ask Questions

Are We Done with Object-Centric Learning?

Apr 09, 2025

Alexander Rubinstein, Ameya Prabhu, Matthias Bethge, Seong Joon Oh

Figure 1 for Are We Done with Object-Centric Learning?

Figure 2 for Are We Done with Object-Centric Learning?

Figure 3 for Are We Done with Object-Centric Learning?

Figure 4 for Are We Done with Object-Centric Learning?

Abstract:Object-centric learning (OCL) seeks to learn representations that only encode an object, isolated from other objects or background cues in a scene. This approach underpins various aims, including out-of-distribution (OOD) generalization, sample-efficient composition, and modeling of structured environments. Most research has focused on developing unsupervised mechanisms that separate objects into discrete slots in the representation space, evaluated using unsupervised object discovery. However, with recent sample-efficient segmentation models, we can separate objects in the pixel space and encode them independently. This achieves remarkable zero-shot performance on OOD object discovery benchmarks, is scalable to foundation models, and can handle a variable number of slots out-of-the-box. Hence, the goal of OCL methods to obtain object-centric representations has been largely achieved. Despite this progress, a key question remains: How does the ability to separate objects within a scene contribute to broader OCL objectives, such as OOD generalization? We address this by investigating the OOD generalization challenge caused by spurious background cues through the lens of OCL. We propose a novel, training-free probe called $\textbf{Object-Centric Classification with Applied Masks (OCCAM)}$, demonstrating that segmentation-based encoding of individual objects significantly outperforms slot-based OCL methods. However, challenges in real-world applications remain. We provide the toolbox for the OCL community to use scalable object-centric representations, and focus on practical applications and fundamental questions, such as understanding object perception in human cognition. Our code is available $\href{https://github.com/AlexanderRubinstein/OCCAM}{here}$.

Via

Access Paper or Ask Questions

Intermediate Layer Classifiers for OOD generalization

Apr 07, 2025

Arnas Uselis, Seong Joon Oh

Figure 1 for Intermediate Layer Classifiers for OOD generalization

Figure 2 for Intermediate Layer Classifiers for OOD generalization

Figure 3 for Intermediate Layer Classifiers for OOD generalization

Figure 4 for Intermediate Layer Classifiers for OOD generalization

Abstract:Deep classifiers are known to be sensitive to data distribution shifts, primarily due to their reliance on spurious correlations in training data. It has been suggested that these classifiers can still find useful features in the network's last layer that hold up under such shifts. In this work, we question the use of last-layer representations for out-of-distribution (OOD) generalisation and explore the utility of intermediate layers. To this end, we introduce \textit{Intermediate Layer Classifiers} (ILCs). We discover that intermediate layer representations frequently offer substantially better generalisation than those from the penultimate layer. In many cases, zero-shot OOD generalisation using earlier-layer representations approaches the few-shot performance of retraining on penultimate layer representations. This is confirmed across multiple datasets, architectures, and types of distribution shifts. Our analysis suggests that intermediate layers are less sensitive to distribution shifts compared to the penultimate layer. These findings highlight the importance of understanding how information is distributed across network layers and its role in OOD generalisation, while also pointing to the limits of penultimate layer representation utility. Code is available at https://github.com/oshapio/intermediate-layer-generalization

* ICLR 2025

Via

Access Paper or Ask Questions

DiCoTTA: Domain-invariant Learning for Continual Test-time Adaptation

Apr 07, 2025

Sohyun Lee, Nayeong Kim, Juwon Kang, Seong Joon Oh, Suha Kwak

Figure 1 for DiCoTTA: Domain-invariant Learning for Continual Test-time Adaptation

Figure 2 for DiCoTTA: Domain-invariant Learning for Continual Test-time Adaptation

Figure 3 for DiCoTTA: Domain-invariant Learning for Continual Test-time Adaptation

Figure 4 for DiCoTTA: Domain-invariant Learning for Continual Test-time Adaptation

Abstract:This paper studies continual test-time adaptation (CTTA), the task of adapting a model to constantly changing unseen domains in testing while preserving previously learned knowledge. Existing CTTA methods mostly focus on adaptation to the current test domain only, overlooking generalization to arbitrary test domains a model may face in the future. To tackle this limitation, we present a novel online domain-invariant learning framework for CTTA, dubbed DiCoTTA. DiCoTTA aims to learn feature representation to be invariant to both current and previous test domains on the fly during testing. To this end, we propose a new model architecture and a test-time adaptation strategy dedicated to learning domain-invariant features without corrupting semantic contents, along with a new data structure and optimization algorithm for effectively managing information from previous test domains. DiCoTTA achieved state-of-the-art performance on four public CTTA benchmarks. Moreover, it showed superior generalization to unseen test domains.

Via

Access Paper or Ask Questions

CLIP Behaves like a Bag-of-Words Model Cross-modally but not Uni-modally

Feb 05, 2025

Darina Koishigarina, Arnas Uselis, Seong Joon Oh

Figure 1 for CLIP Behaves like a Bag-of-Words Model Cross-modally but not Uni-modally

Figure 2 for CLIP Behaves like a Bag-of-Words Model Cross-modally but not Uni-modally

Figure 3 for CLIP Behaves like a Bag-of-Words Model Cross-modally but not Uni-modally

Figure 4 for CLIP Behaves like a Bag-of-Words Model Cross-modally but not Uni-modally

Abstract:CLIP (Contrastive Language-Image Pretraining) has become a popular choice for various downstream tasks. However, recent studies have questioned its ability to represent compositional concepts effectively. These works suggest that CLIP often acts like a bag-of-words (BoW) model, interpreting images and text as sets of individual concepts without grasping the structural relationships. In particular, CLIP struggles to correctly bind attributes to their corresponding objects when multiple objects are present in an image or text. In this work, we investigate why CLIP exhibits this BoW-like behavior. We find that the correct attribute-object binding information is already present in individual text and image modalities. Instead, the issue lies in the cross-modal alignment, which relies on cosine similarity. To address this, we propose Linear Attribute Binding CLIP or LABCLIP. It applies a linear transformation to text embeddings before computing cosine similarity. This approach significantly improves CLIP's ability to bind attributes to correct objects, thereby enhancing its compositional understanding.

Via

Access Paper or Ask Questions

Scaling Up Membership Inference: When and How Attacks Succeed on Large Language Models

Oct 31, 2024

Haritz Puerto, Martin Gubri, Sangdoo Yun, Seong Joon Oh

Figure 1 for Scaling Up Membership Inference: When and How Attacks Succeed on Large Language Models

Figure 2 for Scaling Up Membership Inference: When and How Attacks Succeed on Large Language Models

Figure 3 for Scaling Up Membership Inference: When and How Attacks Succeed on Large Language Models

Figure 4 for Scaling Up Membership Inference: When and How Attacks Succeed on Large Language Models

Abstract:Membership inference attacks (MIA) attempt to verify the membership of a given data sample in the training set for a model. MIA has become relevant in recent years, following the rapid development of large language models (LLM). Many are concerned about the usage of copyrighted materials for training them and call for methods for detecting such usage. However, recent research has largely concluded that current MIA methods do not work on LLMs. Even when they seem to work, it is usually because of the ill-designed experimental setup where other shortcut features enable "cheating." In this work, we argue that MIA still works on LLMs, but only when multiple documents are presented for testing. We construct new benchmarks that measure the MIA performances at a continuous scale of data samples, from sentences (n-grams) to a collection of documents (multiple chunks of tokens). To validate the efficacy of current MIA approaches at greater scales, we adapt a recent work on Dataset Inference (DI) for the task of binary membership detection that aggregates paragraph-level MIA features to enable MIA at document and collection of documents level. This baseline achieves the first successful MIA on pre-trained and fine-tuned LLMs.

* Our code is available at https://github.com/parameterlab/mia-scaling

Via

Access Paper or Ask Questions