Zsolt Kira

Let's Think in Two Steps: Mitigating Agreement Bias in MLLMs with Self-Grounded Verification

Jul 15, 2025 (via arXiv)

EscherNet++: Simultaneous Amodal Completion and Scalable View Synthesis through Masked Fine-Tuning and Enhanced Feed-Forward 3D Reconstruction

Jul 10, 2025 (via arXiv)

FindingDory: A Benchmark to Evaluate Memory in Embodied Agents

Jun 18, 2025 (via arXiv)

MedMoE: Modality-Specialized Mixture of Experts for Medical Vision-Language Understanding

Jun 11, 2025 (via arXiv)

Mimicking or Reasoning: Rethinking Multi-Modal In-Context Learning in Vision-Language Models

Jun 09, 2025 (via arXiv)

FRAMES-VQA: Benchmarking Fine-Tuning Robustness across Multi-Modal Shifts in Visual Question Answering

May 27, 2025 (via arXiv)

Barrier Function Overrides For Non-Convex Fixed Wing Flight Control and Self-Driving Cars

May 08, 2025 (via arXiv)

Grounding Multimodal LLMs to Embodied Agents that Ask for Help with Reinforcement Learning

Apr 02, 2025 (via arXiv)

When Domain Generalization meets Generalized Category Discovery: An Adaptive Task-Arithmetic Driven Approach

Mar 21, 2025 (via arXiv)

Directional Gradient Projection for Robust Fine-Tuning of Foundation Models

Feb 21, 2025 (via arXiv)