Tom Goldstein

The Common Pile v0.1: An 8TB Dataset of Public Domain and Openly Licensed Text

Jun 05, 2025

MORSE-500: A Programmatically Controllable Video Benchmark to Stress-Test Multimodal Reasoning

Jun 05, 2025

Zero-Shot Vision Encoder Grafting via LLM Surrogates

May 28, 2025

BLIP3-o: A Family of Fully Open Unified Multimodal Models-Architecture, Training and Dataset

May 14, 2025

Leveraging AI for Productive and Trustworthy HPC Software: Challenges and Research Directions

May 13, 2025

AegisLLM: Scaling Agentic Systems for Self-Reflective Defense in LLM Security

Apr 29, 2025

Dense Backpropagation Improves Training for Sparse Mixture-of-Experts

Apr 18, 2025

Analysis of Attention in Video Diffusion Transformers

Apr 14, 2025

LoRI: Reducing Cross-Task Interference in Multi-Task Low-Rank Adaptation

Apr 10, 2025

Using Attention Sinks to Identify and Evaluate Dormant Heads in Pretrained LLMs

Apr 04, 2025