Kate Saenko

The Llama 4 Herd: Architecture, Training, Evaluation, and Deployment Notes

Jan 15, 2026
Via arXiv

Breaking the Assistant Mold: Modeling Behavioral Variation in LLM Based Procedural Character Generation

Jan 06, 2026
Via arXiv

BabyVLM-V2: Toward Developmentally Grounded Pretraining and Benchmarking of Vision Foundation Models

Dec 11, 2025
Via arXiv

Mull-Tokens: Modality-Agnostic Latent Thinking

Dec 11, 2025
Via arXiv

The SA-FARI Dataset: Segment Anything in Footage of Animals for Recognition and Identification

Nov 19, 2025
Via arXiv

Enhancing Compositional Reasoning in Vision-Language Models with Synthetic Preference Data

Apr 07, 2025
Via arXiv

Web Artifact Attacks Disrupt Vision Language Models

Mar 17, 2025
Via arXiv

SPARC: Score Prompting and Adaptive Fusion for Zero-Shot Multi-Label Recognition in Vision-Language Models

Feb 24, 2025
Via arXiv

OP-LoRA: The Blessing of Dimensionality

Dec 13, 2024
Via arXiv

SAT: Spatial Aptitude Training for Multimodal Language Models

Dec 10, 2024
Via arXiv