Kristina Toutanova

Understanding the World's Museums through Vision-Language Reasoning

Dec 02, 2024

ALTA: Compiler-Based Analysis of Transformers

Oct 23, 2024

Taming CLIP for Fine-grained and Structured Visual Understanding of Museum Exhibits

Sep 03, 2024

Efficient End-to-End Visual Document Understanding with Rationale Distillation

Nov 16, 2023

From Pixels to UI Actions: Learning to Follow Instructions via Graphical User Interfaces

May 31, 2023

Anchor Prediction: Automatic Refinement of Internet Links

May 24, 2023

QUEST: A Retrieval Dataset of Entity-Seeking Queries with Implicit Set Operations

May 19, 2023

Open-domain Visual Entity Recognition: Towards Recognizing Millions of Wikipedia Entities

Feb 24, 2023

Pix2Struct: Screenshot Parsing as Pretraining for Visual Language Understanding

Oct 07, 2022

Evaluating the Impact of Model Scale for Compositional Generalization in Semantic Parsing

May 24, 2022