Markus Krimmel

Diffusion-Pretrained Dense and Contextual Embeddings

Feb 13, 2026

Sedigheh Eslami, Maksim Gaiduk, Markus Krimmel, Louis Milliken, Bo Wang, Denis Bykov

Abstract: In this report, we introduce pplx-embed, a family of multilingual embedding models that employ multi-stage contrastive learning on a diffusion-pretrained language model backbone for web-scale retrieval. By leveraging bidirectional attention through diffusion-based pretraining, our models capture comprehensive bidirectional context within passages, enabling the use of mean pooling and a late chunking strategy to better preserve global context across long documents. We release two model types: pplx-embed-v1 for standard retrieval, and pplx-embed-context-v1 for contextualized embeddings that incorporate global document context into passage representations. pplx-embed-v1 achieves competitive performance on the MTEB(Multilingual, v2), MTEB(Code), MIRACL, BERGEN, and ToolRet retrieval benchmarks, while pplx-embed-context-v1 sets new records on the ConTEB benchmark. Beyond public benchmarks, pplx-embed-v1 demonstrates strong performance on our internal evaluation suite, focusing on real-world, large-scale search scenarios constructed from 1B production web pages. These results validate the models' effectiveness in production environments where retrieval quality and efficiency are critical at scale.
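
As a rough illustration of mean pooling combined with late chunking, the sketch below assumes an encoder has already produced one contextual vector per token for the whole document, and only then pools those vectors chunk by chunk. The function names, the toy vectors, and the chunk boundaries are hypothetical and are not taken from pplx-embed.

```python
from typing import List, Sequence, Tuple

Vector = List[float]


def mean_pool(token_vectors: Sequence[Vector]) -> Vector:
    """Average a non-empty sequence of equal-length vectors."""
    if not token_vectors:
        raise ValueError("cannot pool an empty span")
    dim = len(token_vectors[0])
    count = len(token_vectors)
    return [sum(vec[i] for vec in token_vectors) / count for i in range(dim)]


def late_chunk(
    token_vectors: Sequence[Vector],
    spans: Sequence[Tuple[int, int]],
) -> List[Vector]:
    """Pool each chunk AFTER the full document was encoded.

    `spans` holds half-open token ranges [start, end), one per chunk.
    """
    return [mean_pool(token_vectors[start:end]) for start, end in spans]


# Toy stand-in for per-token encoder outputs of a 4-token document.
doc_tokens = [[1.0, 0.0], [3.0, 2.0], [0.0, 4.0], [2.0, 2.0]]
chunk_embeddings = late_chunk(doc_tokens, [(0, 2), (2, 4)])
print(chunk_embeddings)
```

Because every token vector was computed with the whole document in view, each pooled chunk vector can carry context from outside its own span, which is the motivation the abstract gives for this strategy.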

Flatten Graphs as Sequences: Transformers are Scalable Graph Generators

Feb 04, 2025

Dexiong Chen, Markus Krimmel, Karsten Borgwardt

Abstract: We introduce AutoGraph, a novel autoregressive framework for generating large attributed graphs using decoder-only transformers. At the core of our approach is a reversible "flattening" process that transforms graphs into random sequences. By sampling and learning from these sequences, AutoGraph enables transformers to model and generate complex graph structures in a manner akin to natural language. In contrast to diffusion models that rely on computationally intensive node features, our approach operates exclusively on these sequences. The sampling complexity and sequence length scale linearly with the number of edges, making AutoGraph highly scalable for generating large sparse graphs. Empirically, AutoGraph achieves state-of-the-art performance across diverse synthetic and molecular graph generation benchmarks, while delivering a 100-fold generation speedup and a 3-fold training speedup compared to leading diffusion models. Additionally, it demonstrates promising transfer capabilities and supports substructure-conditioned generation without additional fine-tuning. By extending language modeling techniques to graph generation, this work paves the way for developing graph foundation models.
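
The abstract does not spell out how AutoGraph flattens a graph, so the sketch below swaps in a much simpler stand-in: a randomly ordered edge-list serialization. It only illustrates the two properties named above, reversibility and a sequence length linear in the number of edges. The names `flatten` and `unflatten` are hypothetical, and this toy version ignores node attributes and isolated nodes.

```python
import random
from typing import List, Set, Tuple

Edge = Tuple[int, int]  # undirected edge stored as (smaller id, larger id)


def flatten(edges: Set[Edge], rng: random.Random) -> List[int]:
    """Serialize an undirected graph as a randomly ordered token sequence."""
    ordered = sorted(edges)  # fixed starting order so the seed fully decides the result
    rng.shuffle(ordered)     # random edge order
    tokens: List[int] = []
    for u, v in ordered:
        if rng.random() < 0.5:  # random endpoint order within each edge
            u, v = v, u
        tokens.extend((u, v))
    return tokens


def unflatten(tokens: List[int]) -> Set[Edge]:
    """Invert `flatten`: read tokens pairwise and restore canonical edges."""
    pairs = zip(tokens[0::2], tokens[1::2])
    return {(min(u, v), max(u, v)) for u, v in pairs}


graph = {(0, 1), (1, 2), (0, 2), (2, 3)}
sequence = flatten(graph, random.Random(0))
print(sequence, unflatten(sequence) == graph)
```

Each call with a different seed yields a different sequence for the same graph, which is one simple way to obtain many training sequences per graph.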

Towards Fast Graph Generation via Autoregressive Noisy Filtration Modeling

Feb 04, 2025

Markus Krimmel, Jenna Wiens, Karsten Borgwardt, Dexiong Chen

Abstract: Graph generative models often face a critical trade-off between learning complex distributions and achieving fast generation speed. We introduce Autoregressive Noisy Filtration Modeling (ANFM), a novel approach that addresses both challenges. ANFM leverages filtration, a concept from topological data analysis, to transform graphs into short sequences of monotonically increasing subgraphs. This formulation extends the sequence families used in previous autoregressive models. To learn from these sequences, we propose a novel autoregressive graph mixer model. Our experiments suggest that exposure bias might represent a substantial hurdle in autoregressive graph generation, and we introduce two mitigation strategies to address it: noise augmentation and a reinforcement learning approach. Incorporating these techniques leads to substantial performance gains, making ANFM competitive with state-of-the-art diffusion models across diverse synthetic and real-world datasets. Notably, ANFM produces remarkably short sequences, achieving a 100-fold speedup in generation time compared to diffusion models. This work marks a significant step toward high-throughput graph generation.

* 32 pages, 27 tables, 6 figures
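
To make "sequences of monotonically increasing subgraphs" concrete, here is a minimal sketch of an edge filtration: every edge gets a scalar weight, and each step keeps the edges whose weight is at or below a threshold. The weights, the thresholds, and the function name are assumptions for illustration; the abstract does not say how ANFM picks its filtration function or where noise is injected.

```python
from typing import Dict, FrozenSet, List, Sequence, Tuple

Edge = Tuple[int, int]


def filtration(
    edge_weights: Dict[Edge, float],
    thresholds: Sequence[float],
) -> List[FrozenSet[Edge]]:
    """Return nested edge sets, one per threshold in ascending order.

    Step k keeps every edge whose weight is <= the k-th smallest threshold,
    so each step is a superset of the one before it.
    """
    steps: List[FrozenSet[Edge]] = []
    for limit in sorted(thresholds):
        kept = frozenset(e for e, w in edge_weights.items() if w <= limit)
        steps.append(kept)
    return steps


# Hypothetical weights for a 4-edge graph.
weights = {(0, 1): 0.1, (1, 2): 0.4, (0, 2): 0.7, (2, 3): 0.9}
steps = filtration(weights, [0.2, 0.5, 1.0])
for index, edge_set in enumerate(steps):
    print(index, sorted(edge_set))
```

With three thresholds the four-edge graph is described by a sequence of only three subgraphs, which hints at why such sequences can be far shorter than one-edge-at-a-time orderings.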

jina-embeddings-v3: Multilingual Embeddings With Task LoRA

Sep 17, 2024

Saba Sturua, Isabelle Mohr, Mohammad Kalim Akram, Michael Günther, Bo Wang, Markus Krimmel, Feng Wang, Georgios Mastrapas, Andreas Koukounas, Nan Wang(+1 more)

Abstract: We introduce jina-embeddings-v3, a novel text embedding model with 570 million parameters that achieves state-of-the-art performance on multilingual data and long-context retrieval tasks, supporting context lengths of up to 8192 tokens. The model includes a set of task-specific Low-Rank Adaptation (LoRA) adapters to generate high-quality embeddings for query-document retrieval, clustering, classification, and text matching. Additionally, Matryoshka Representation Learning is integrated into the training process, allowing flexible truncation of embedding dimensions without compromising performance. Evaluation on the MTEB benchmark shows that jina-embeddings-v3 outperforms the latest proprietary embeddings from OpenAI and Cohere on English tasks, while achieving superior performance compared to multilingual-e5-large-instruct across all multilingual tasks.

* 20 pages, pp11-13 references, pp14-20 appendix and experiment tables
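
On the consumer side, "flexible truncation of embedding dimensions" usually amounts to keeping the leading coordinates of a vector and re-normalizing. The sketch below shows that step only; the function name and the toy vector are hypothetical, and it makes no claim about how jina-embeddings-v3 is trained or served.

```python
import math
from typing import List


def truncate_embedding(vector: List[float], dim: int) -> List[float]:
    """Keep the first `dim` coordinates and rescale them to unit L2 norm."""
    if not 0 < dim <= len(vector):
        raise ValueError("dim must be between 1 and len(vector)")
    head = vector[:dim]
    norm = math.sqrt(sum(x * x for x in head))
    if norm == 0.0:
        raise ValueError("cannot normalize an all-zero prefix")
    return [x / norm for x in head]


# Toy 4-dimensional embedding, shortened to 2 dimensions.
full = [3.0, 4.0, 12.0, 0.5]
short = truncate_embedding(full, 2)
print(short)
```

Re-normalizing after truncation keeps dot products between shortened vectors interpretable as cosine similarities.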

Gymnasium: A Standard Interface for Reinforcement Learning Environments

Jul 24, 2024

Mark Towers, Ariel Kwiatkowski, Jordan Terry, John U. Balis, Gianluca De Cola, Tristan Deleu, Manuel Goulão, Andreas Kallinteris, Markus Krimmel, Arjun KG(+6 more)

Abstract: Gymnasium is an open-source library providing an API for reinforcement learning environments. Its main contribution is a central abstraction for wide interoperability between benchmark environments and training algorithms. Gymnasium comes with various built-in environments and utilities to simplify researchers' work, and it is supported by most training libraries. This paper outlines the main design decisions for Gymnasium, its key features, and the differences from alternative APIs.

* 6 pages, 1 figure, preprint
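
The agent-environment loop that such an API standardizes can be sketched as follows. To stay dependency-free, the sketch does not import Gymnasium: `CountdownEnv` is a hypothetical toy class that only mirrors the `reset`/`step` call shape, returning `(observation, info)` and `(observation, reward, terminated, truncated, info)`. That shape is our reading of the library's interface and is not stated in the abstract.

```python
from typing import Any, Dict, Optional, Tuple


class CountdownEnv:
    """Hypothetical toy environment: action 1 decrements a counter, action 0 waits."""

    def __init__(self, start: int = 3, max_steps: int = 10) -> None:
        self.start = start
        self.max_steps = max_steps
        self.state = start
        self.steps = 0

    def reset(
        self, seed: Optional[int] = None, options: Optional[Dict[str, Any]] = None
    ) -> Tuple[int, Dict[str, Any]]:
        # The toy dynamics are deterministic, so `seed` is accepted but unused.
        self.state = self.start
        self.steps = 0
        return self.state, {}

    def step(self, action: int) -> Tuple[int, float, bool, bool, Dict[str, Any]]:
        self.steps += 1
        if action == 1:
            self.state -= 1
        terminated = self.state == 0  # goal reached
        truncated = not terminated and self.steps >= self.max_steps  # step budget spent
        reward = 1.0 if terminated else 0.0
        return self.state, reward, terminated, truncated, {}


env = CountdownEnv(start=3)
observation, info = env.reset(seed=0)
total_reward = 0.0
done = False
while not done:
    action = 1  # fixed policy: always decrement
    observation, reward, terminated, truncated, info = env.step(action)
    total_reward += reward
    done = terminated or truncated
print(observation, total_reward)
```

Any training loop written against this call shape can swap in a different environment without changes, which is the interoperability the abstract describes.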

Attention Normalization Impacts Cardinality Generalization in Slot Attention

Jul 04, 2024

Markus Krimmel, Jan Achterhold, Joerg Stueckler

Abstract: Object-centric scene decompositions are important representations for downstream tasks in fields such as computer vision and robotics. The recently proposed Slot Attention module, already leveraged by several derivative works for image segmentation and object tracking in videos, is a deep learning component which performs unsupervised object-centric scene decomposition on input images. It is based on an attention architecture, in which latent slot vectors, which hold compressed information on objects, attend to localized perceptual features from the input image. In this paper, we show that design decisions on normalizing the aggregated values in the attention architecture have a considerable impact on the capabilities of Slot Attention to generalize to a higher number of slots and objects than seen during training. We argue that the original Slot Attention normalization scheme discards information on the prior assignment probability of pixels to slots, which impairs its generalization capabilities. Based on these findings, we propose and investigate alternative normalization approaches which increase the generalization capabilities of Slot Attention to varying slot and object counts, resulting in performance gains on the task of unsupervised image segmentation.

* 24 pages, 10 figures, 5 tables
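
The point about discarded assignment information can be illustrated with a small numeric sketch. It assumes attention weights come from a softmax across slots and compares a per-slot weighted mean with a sum scaled by a constant (here the number of inputs). The weighted mean is our reading of the original scheme, and the constant scaling is just one conceivable alternative; the abstract does not say which alternatives the paper proposes, and all names and numbers below are hypothetical.

```python
import math
from typing import List


def softmax_over_slots(logits: List[List[float]]) -> List[List[float]]:
    """logits[k][n] -> attention[k][n], normalized across slots k for each input n."""
    num_slots, num_inputs = len(logits), len(logits[0])
    attn = [[0.0] * num_inputs for _ in range(num_slots)]
    for n in range(num_inputs):
        exps = [math.exp(logits[k][n]) for k in range(num_slots)]
        total = sum(exps)
        for k in range(num_slots):
            attn[k][n] = exps[k] / total
    return attn


def weighted_mean(attn_row: List[float], values: List[float]) -> float:
    """Aggregate values with weights rescaled to sum to 1 for this slot."""
    return sum(a * v for a, v in zip(attn_row, values)) / sum(attn_row)


def scaled_sum(attn_row: List[float], values: List[float]) -> float:
    """Aggregate values with a constant scale, keeping the slot's total attention mass."""
    return sum(a * v for a, v in zip(attn_row, values)) / len(values)


# Two slots, four inputs with identical scalar values; slot 0 claims most of the mass.
logits = [[2.0, 2.0, 2.0, 2.0], [0.0, 0.0, 0.0, 0.0]]
values = [1.0, 1.0, 1.0, 1.0]
attn = softmax_over_slots(logits)
means = [weighted_mean(row, values) for row in attn]
sums = [scaled_sum(row, values) for row in attn]
print(means, sums)
```

In this toy case both slots receive the same weighted-mean update even though one slot claims far more of the input, while the constantly scaled sum differs between the slots and so retains how much attention mass each slot received.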

Multi-Task Contrastive Learning for 8192-Token Bilingual Text Embeddings

Feb 26, 2024

Isabelle Mohr, Markus Krimmel, Saba Sturua, Mohammad Kalim Akram, Andreas Koukounas, Michael Günther, Georgios Mastrapas, Vinit Ravishankar, Joan Fontanals Martínez, Feng Wang(+9 more)

Abstract: We introduce a novel suite of state-of-the-art bilingual text embedding models that are designed to support English and another target language. These models are capable of processing lengthy text inputs with up to 8192 tokens, making them highly versatile for a range of natural language processing tasks such as text retrieval, clustering, and semantic textual similarity (STS) calculations. By focusing on bilingual models and introducing a unique multi-task learning objective, we have significantly improved model performance on STS tasks, outperforming existing multilingual models in both target language understanding and cross-lingual evaluation tasks. Moreover, our bilingual models are more efficient, requiring fewer parameters and less memory due to their smaller vocabulary needs. Furthermore, we have expanded the Massive Text Embedding Benchmark (MTEB) to include benchmarks for German and Spanish embedding models. This integration aims to stimulate further research and advancement in text embedding technologies for these languages.

Learning Temporally Extended Skills in Continuous Domains as Symbolic Actions for Planning

Jul 11, 2022

Jan Achterhold, Markus Krimmel, Joerg Stueckler

Abstract: Problems which require both long-horizon planning and continuous control capabilities pose significant challenges to existing reinforcement learning agents. In this paper, we introduce a novel hierarchical reinforcement learning agent which links temporally extended skills for continuous control with a forward model in a symbolic discrete abstraction of the environment's state for planning. We term our agent SEADS for Symbolic Effect-Aware Diverse Skills. We formulate an objective and corresponding algorithm which leads to unsupervised learning of a diverse set of skills through intrinsic motivation given a known state abstraction. The skills are jointly learned with the symbolic forward model which captures the effect of skill execution in the state abstraction. After training, we can leverage the skills as symbolic actions using the forward model for long-horizon planning and subsequently execute the plan using the learned continuous-action control skills. The proposed algorithm learns skills and forward models that can be used to solve complex tasks which require both continuous control and long-horizon planning capabilities with a high success rate. It compares favorably with other flat and hierarchical reinforcement learning baseline agents and is successfully demonstrated with a real robot.

* Project website (including video) is available at https://seads.is.tue.mpg.de/
