Srinivas Sunkara

Ego2Web: A Web Agent Benchmark Grounded in Egocentric Videos

Mar 23, 2026
Via arXiv

EnterpriseOps-Gym: Environments and Evaluations for Stateful Agentic Planning and Tool Use in Enterprise Settings

Mar 13, 2026
Via arXiv

AprielGuard

Dec 23, 2025
Via arXiv

BigDocs: An Open and Permissively-Licensed Dataset for Training Multimodal Models on Document and Code Tasks

Dec 05, 2024
Via arXiv

ScreenAI: A Vision-Language Model for UI and Infographics Understanding

Feb 19, 2024
Via arXiv

Towards Better Semantic Understanding of Mobile Interfaces

Oct 06, 2022
Via arXiv

A Unified Approach to Entity-Centric Context Tracking in Social Conversations

Jan 28, 2022
Via arXiv

UIBert: Learning Generic Multimodal Representations for UI Understanding

Aug 10, 2021
Via arXiv

ActionBert: Leveraging User Actions for Semantic Understanding of User Interfaces

Jan 25, 2021
Via arXiv

Schema-Guided Dialogue State Tracking Task at DSTC8

Feb 02, 2020
Via arXiv