End-to-end Document Recognition and Understanding with Dessurt

Mar 30, 2022
Brian Davis, Bryan Morse, Bryan Price, Chris Tensmeyer, Curtis Wigington, Vlad Morariu

We introduce Dessurt, a relatively simple document understanding transformer that can be fine-tuned on a greater variety of document tasks than prior methods. It receives a document image and a task string as input and generates arbitrary text autoregressively as output. Because Dessurt is an end-to-end architecture that performs text recognition in addition to document understanding, it does not require an external recognition model as prior methods do, making it easier to fine-tune to new visual domains. We show that this model is effective on nine different dataset-task combinations.
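
The input/output contract described above (image plus task string in, autoregressively generated text out) can be illustrated with a minimal greedy decoding loop. This is only a sketch of the general technique, not Dessurt's code or API: `generate`, `toy_next_token`, the `<end>` token, and the task names `read` and `classify` are all hypothetical, and the toy function simply replays canned answers where a trained model would predict the next token.

```python
from typing import Callable, List, Sequence

END = "<end>"  # hypothetical end-of-sequence token

Image = Sequence[Sequence[int]]
# Signature assumed for illustration: (image, task, tokens so far) -> next token
NextTokenFn = Callable[[Image, str, List[str]], str]


def generate(image: Image, task: str, next_token: NextTokenFn,
             max_tokens: int = 32) -> List[str]:
    """Greedy autoregressive decoding.

    Each step conditions on the image, the task string, and every
    token generated so far, and stops at END or after max_tokens.
    """
    output: List[str] = []
    for _ in range(max_tokens):
        token = next_token(image, task, output)
        if token == END:
            break
        output.append(token)
    return output


def toy_next_token(image: Image, task: str, generated: List[str]) -> str:
    """Stand-in for a trained model: replays a canned answer per task."""
    canned = {"read": ["Invoice", "42"], "classify": ["invoice"]}
    answer = canned.get(task, [])
    if len(generated) < len(answer):
        return answer[len(generated)]
    return END


image = [[0, 255], [255, 0]]  # placeholder pixels, ignored by the toy model
print(generate(image, "read", toy_next_token))
```

The point of the sketch is that one loop serves every task: only the task string changes, which mirrors the claim that a single model can be fine-tuned across different document tasks without a separate recognition stage.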
