Berkay Döner

Sketch-to-Layout: Sketch-Guided Multimodal Layout Generation

Oct 31, 2025

Riccardo Brioschi, Aleksandr Alekseev, Emanuele Nevali, Berkay Döner, Omar El Malki, Blagoj Mitrevski, Leandro Kieliger, Mark Collier, Andrii Maksai, Jesse Berent(+2 more)

Abstract: Graphic layout generation is a growing research area focusing on generating aesthetically pleasing layouts ranging from poster designs to documents. While recent research has explored ways to incorporate user constraints to guide the layout generation, these constraints often require complex specifications which reduce usability. We introduce an innovative approach exploiting user-provided sketches as intuitive constraints and we demonstrate empirically the effectiveness of this new guidance method, establishing the sketch-to-layout problem as a promising research direction, which is currently under-explored. To tackle the sketch-to-layout problem, we propose a multimodal transformer-based solution using the sketch and the content assets as inputs to produce high quality layouts. Since collecting sketch training data from human annotators to train our model is very costly, we introduce a novel and efficient method to synthetically generate training sketches at scale. We train and evaluate our model on three publicly available datasets: PubLayNet, DocLayNet and SlidesVQA, demonstrating that it outperforms state-of-the-art constraint-based methods, while offering a more intuitive design experience. In order to facilitate future sketch-to-layout research, we release O(200k) synthetically-generated sketches for the public datasets above. The datasets are available at https://github.com/google-deepmind/sketch_to_layout.

* 15 pages, 18 figures, GitHub link: https://github.com/google-deepmind/sketch_to_layout, accepted at ICCV 2025 Workshop (HiGen)

Sketch-Guided Constrained Decoding for Boosting Blackbox Large Language Models without Logit Access

Jan 18, 2024

Saibo Geng, Berkay Döner, Chris Wendler, Martin Josifoski, Robert West

Abstract: Constrained decoding, a technique for enforcing constraints on language model outputs, offers a way to control text generation without retraining or architectural modifications. Its application is, however, typically restricted to models that give users access to next-token distributions (usually via softmax logits), which poses a limitation with blackbox large language models (LLMs). This paper introduces sketch-guided constrained decoding (SGCD), a novel approach to constrained decoding for blackbox LLMs, which operates without access to the logits of the blackbox LLM. SGCD utilizes a locally hosted auxiliary model to refine the output of an unconstrained blackbox LLM, effectively treating this initial output as a "sketch" for further elaboration. This approach is complementary to traditional logit-based techniques and enables the application of constrained decoding in settings where full model transparency is unavailable. We demonstrate the efficacy of SGCD through experiments in closed information extraction and constituency parsing, showing how it enhances the utility and flexibility of blackbox LLMs for complex NLP tasks.
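
The two-stage idea in the abstract above can be illustrated with a toy Python sketch. This is not the authors' implementation: the catalog `VALID_OUTPUTS`, the stub `blackbox_llm`, and the word-overlap scorer `auxiliary_score` are all hypothetical stand-ins. In particular, the overlap scorer replaces the locally hosted auxiliary language model, which in SGCD would score continuations conditioned on the sketch.

```python
import re

# Hypothetical constraint: the only outputs the refinement step may emit.
# In closed information extraction this would come from a knowledge base.
VALID_OUTPUTS = [
    "Barack Obama | place of birth | Honolulu",
    "Barack Obama | spouse | Michelle Obama",
    "Honolulu | country | United States",
]


def blackbox_llm(prompt: str) -> str:
    """Stub for a remote LLM that returns text only (no logits)."""
    return "Obama | born in | Honolulu"  # free-form, not a valid catalog entry


def words(text: str) -> set[str]:
    """Lowercased alphabetic words of a string."""
    return set(re.findall(r"[a-z]+", text.lower()))


def completions(prefix: list[str]) -> list[list[str]]:
    """Valid outputs (as token lists) that start with the given prefix."""
    return [c.split() for c in VALID_OUTPUTS if c.split()[: len(prefix)] == prefix]


def auxiliary_score(sketch: str, prefix: list[str]) -> int:
    """Toy stand-in for the auxiliary model: best word overlap between the
    sketch and any valid output that extends the prefix."""
    return max(len(words(sketch) & words(" ".join(c))) for c in completions(prefix))


def constrained_refine(sketch: str) -> str:
    """Decode token by token, only ever choosing tokens that keep the
    output extendable to a valid catalog entry."""
    prefix: list[str] = []
    while True:
        options = sorted(
            {c[len(prefix)] for c in completions(prefix) if len(c) > len(prefix)}
        )
        if not options:
            return " ".join(prefix)
        prefix.append(
            max(options, key=lambda tok: auxiliary_score(sketch, prefix + [tok]))
        )


def sgcd(prompt: str) -> str:
    sketch = blackbox_llm(prompt)      # stage 1: unconstrained sketch
    return constrained_refine(sketch)  # stage 2: local constrained refinement


print(sgcd("Extract a triple: Obama was born in Honolulu."))
```

The point of the split is that the constraint is enforced only in the second stage, which runs locally, so the blackbox model is never asked for token probabilities.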
