Minjoon Seo

MambaMia: A State-Space-Model-Based Compression for Efficient Video Understanding in Large Multimodal Models

Jun 16, 2025

Differential Information: An Information-Theoretic Perspective on Preference Optimization

May 29, 2025

Let's Predict Sentence by Sentence

May 28, 2025

The Coverage Principle: A Framework for Understanding Compositional Generalization

May 26, 2025

Reasoning Models Better Express Their Confidence

May 20, 2025

The CoT Encyclopedia: Analyzing, Predicting, and Controlling how a Reasoning Model will Think

May 15, 2025

DSAI: Unbiased and Interpretable Latent Feature Extraction for Data-Centric AI

Dec 09, 2024

Generative Context Distillation

Nov 24, 2024

Latent Action Pretraining from Videos

Oct 15, 2024

How Does Vision-Language Adaptation Impact the Safety of Vision Language Models?

Oct 10, 2024