Xueyu Zhou

DOS: Dependency-Oriented Sampler for Masked Diffusion Language Models

Mar 16, 2026

Xueyu Zhou, Yangrong Hu, Jian Huang

Abstract: Masked diffusion language models (MDLMs) have recently emerged as a new paradigm in language modeling, offering flexible generation dynamics and enabling efficient parallel decoding. However, existing decoding strategies for pre-trained MDLMs predominantly rely on token-level uncertainty criteria, while largely overlooking sequence-level information and inter-token dependencies. To address this limitation, we propose Dependency-Oriented Sampler (DOS), a training-free decoding strategy that leverages inter-token dependencies to inform token updates during generation. Specifically, DOS exploits attention matrices from transformer blocks to approximate inter-token dependencies, emphasizing information from unmasked tokens when updating masked positions. Empirical results demonstrate that DOS consistently achieves superior performance on both code generation and mathematical reasoning tasks. Moreover, DOS can be seamlessly integrated with existing parallel sampling methods, leading to improved generation efficiency without sacrificing generation quality.

* 16 pages, 5 figures

Transfer Learning through Enhanced Sufficient Representation: Enriching Source Domain Knowledge with Target Data

Feb 22, 2025

Yeheng Ge, Xueyu Zhou, Jian Huang

Abstract: Transfer learning is an important approach for addressing the challenges posed by limited data availability in various applications. It accomplishes this by transferring knowledge from well-established source domains to a less familiar target domain. However, traditional transfer learning methods often face difficulties due to rigid model assumptions and the need for a high degree of similarity between source and target domain models. In this paper, we introduce a novel method for transfer learning called Transfer learning through Enhanced Sufficient Representation (TESR). Our approach begins by estimating a sufficient and invariant representation from the source domains. This representation is then enhanced with an independent component derived from the target data, ensuring that it is sufficient for the target domain and adaptable to its specific characteristics. A notable advantage of TESR is that it does not rely on assuming similar model structures across different tasks. For example, the source domain models can be regression models, while the target domain task can be classification. This flexibility makes TESR applicable to a wide range of supervised learning problems. We explore the theoretical properties of TESR and validate its performance through simulation studies and real-world data applications, demonstrating its effectiveness in finite sample settings.

* 44 pages
