Perouz Taslakian

MosaicLeaks: Privacy Risks in Querying-in-the-Open for Deep Research Agents

May 29, 2026

How and What to Imagine? Visual Thinking in Unified Multimodal Models for Cross-View Spatial Reasoning

May 26, 2026

Mem-$π$: Adaptive Memory through Learning When and What to Generate

May 20, 2026

Grounding Computer Use Agents on Human Demonstrations

Nov 10, 2025

Rendering-Aware Reinforcement Learning for Vector Graphics Generation

May 27, 2025

StarFlow: Generating Structured Workflow Outputs From Sketch Images

Mar 27, 2025

UI-Vision: A Desktop-centric GUI Benchmark for Visual Perception and Interaction

Mar 19, 2025

Learning to Defer for Causal Discovery with Imperfect Experts

Feb 18, 2025

ReTreever: Tree-based Coarse-to-Fine Representations for Retrieval

Feb 11, 2025

AlignVLM: Bridging Vision and Language Latent Spaces for Multimodal Understanding

Feb 03, 2025