Title: Synthetic Document Generator for Annotation-free Layout Recognition

Nov 11, 2021

Natraj Raman, Sameena Shah, Manuela Veloso

Abstract: Analyzing the layout of a document to identify headers, sections, tables, figures, etc. is critical to understanding its content. Deep-learning-based approaches for detecting the layout structure of document images have been promising. However, these methods require a large number of annotated examples during training, which are both expensive and time-consuming to obtain. We describe here a synthetic document generator that automatically produces realistic documents with labels for the spatial positions, extents, and categories of the layout elements. The proposed generative process treats every physical component of a document as a random variable and models their intrinsic dependencies using a Bayesian network graph. Our hierarchical formulation using stochastic templates allows parameter sharing between documents to retain broad themes, yet its distributional characteristics produce visually unique samples, thereby capturing complex and diverse layouts. We empirically illustrate that a deep layout detection model trained purely on the synthetic documents can match the performance of a model that uses real documents.
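As a rough illustration of the idea in the abstract, the sketch below draws template-level variables once, then samples each document's layout elements conditioned on that template, emitting a category label and bounding box for every element. It is a toy example only and not the authors' generator: the function names, the variables in the template, the category list, and every numeric range are assumptions made for illustration.

```python
import random

# Hypothetical element categories and height ranges (fractions of page height).
BODY_CATEGORIES = ["paragraph", "table", "figure"]
HEIGHT_RANGE = {
    "header": (0.03, 0.06),
    "paragraph": (0.08, 0.20),
    "table": (0.10, 0.25),
    "figure": (0.15, 0.30),
}


def sample_template(rng):
    """Sample template-level variables shared by every document of this template."""
    return {
        "n_columns": rng.choice([1, 2]),
        "margin": rng.uniform(0.05, 0.10),  # fraction of page width
        # Unnormalised category weights: a stand-in for a learned prior.
        "weights": [rng.uniform(0.5, 2.0) for _ in BODY_CATEGORIES],
    }


def sample_document(template, rng, page_w=612.0, page_h=792.0):
    """Ancestral sampling: each element depends on the template and on
    the vertical space already used by the elements placed before it."""
    margin = template["margin"] * page_w
    col_w = (page_w - 2 * margin) / template["n_columns"]
    gap = 0.01 * page_h
    elements = []
    for col in range(template["n_columns"]):
        x0 = margin + col * col_w
        y = margin
        while True:
            if col == 0 and y == margin:
                category = "header"  # the first element on the page is a header
            else:
                category = rng.choices(BODY_CATEGORIES, weights=template["weights"])[0]
            lo, hi = HEIGHT_RANGE[category]
            h = rng.uniform(lo, hi) * page_h  # extent depends on the category
            if y + h > page_h - margin:
                break  # column is full
            elements.append({"category": category, "bbox": (x0, y, x0 + col_w, y + h)})
            y += h + gap
    return elements
```

Sampling one template and then calling `sample_document` several times yields documents that share a column count, margin, and category mix but differ in the order and size of their elements, which mirrors the parameter-sharing idea at a very small scale. The labels come for free because the sampler itself decides each element's category and box.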
