Oncel Tuzel

LiTo: Surface Light Field Tokenization

Mar 11, 2026

ASTRA-bench: Evaluating Tool-Use Agent Reasoning and Action Planning with Personal User Context

Mar 02, 2026

TrajTok: Learning Trajectory Tokens enables better Video Understanding

Feb 26, 2026

Beyond a Single Extractor: Re-thinking HTML-to-Text Extraction for LLM Pretraining

Feb 23, 2026

RayRoPE: Projective Ray Positional Encoding for Multi-view Attention

Jan 21, 2026

AMUSE: Audio-Visual Benchmark and Alignment Framework for Agentic Multi-Speaker Understanding

Dec 18, 2025

Learning to Reason for Hallucination Span Detection

Oct 02, 2025

MobileCLIP2: Improving Multi-Modal Reinforced Training

Aug 28, 2025

Proxy-FDA: Proxy-based Feature Distribution Alignment for Fine-tuning Vision Foundation Models without Forgetting

May 30, 2025

FocalLens: Instruction Tuning Enables Zero-Shot Conditional Image Representations

Apr 11, 2025