Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Michael Galarnyk

Evolutionary Task Discovery: Advancing Reasoning Frontiers via Skill Composition and Complexity Scaling

May 12, 2026

Liqin Ye, Yanbin Yin, Michael Galarnyk, Yuzhao Heng, Sudheer Chava, Chao Zhang

Abstract:The reasoning frontier of Large Language Models (LLMs) has advanced significantly through modern post-training paradigms (e.g., Reinforcement Learning from Verifiable Rewards (RLVR)). However, the efficacy of these methods remains fundamentally constrained by the diversity and complexity of the training data. One practical solution is data synthesis; yet, prevalent methods relying on unstructured mutation or exploration suffer from homogeneity collapse, failing to systematically expand the reasoning frontier. To overcome this, we propose Evoutionary Task Discovery (EvoTD), a framework that treats data synthesis as a directed search over a dual-axis manifold of Algorithmic Skills and Complexity Attributes. We introduce structured evolutionary operators to navigate this space: a Crossover operator that synthesizes novel skill compositions to enhance diversity, and a Parametric Mutation operator that scales structural constraints (e.g., input size, tree depth) to drive robust generalization. Crucially, we integrate a dynamic Zone of Proximal Development filter, ensuring tasks lie within the learnable region of the model. Empirically, EvoTD delivers substantial reasoning gains that generalize consistently across model architectures, pretraining regimes, and scales, demonstrating that structured evolutionary curricula can effectively support reasoning improvement. We release our code on https://github.com/liqinye/EvoTD.

Via

Access Paper or Ask Questions

How Inclusively do LMs Perceive Social and Moral Norms?

Feb 04, 2025

Michael Galarnyk, Agam Shah, Dipanwita Guhathakurta, Poojitha Nandigam, Sudheer Chava

Figure 1 for How Inclusively do LMs Perceive Social and Moral Norms?

Figure 2 for How Inclusively do LMs Perceive Social and Moral Norms?

Figure 3 for How Inclusively do LMs Perceive Social and Moral Norms?

Figure 4 for How Inclusively do LMs Perceive Social and Moral Norms?

Abstract:This paper discusses and contains offensive content. Language models (LMs) are used in decision-making systems and as interactive assistants. However, how well do these models making judgements align with the diversity of human values, particularly regarding social and moral norms? In this work, we investigate how inclusively LMs perceive norms across demographic groups (e.g., gender, age, and income). We prompt 11 LMs on rules-of-thumb (RoTs) and compare their outputs with the existing responses of 100 human annotators. We introduce the Absolute Distance Alignment Metric (ADA-Met) to quantify alignment on ordinal questions. We find notable disparities in LM responses, with younger, higher-income groups showing closer alignment, raising concerns about the representation of marginalized perspectives. Our findings highlight the importance of further efforts to make LMs more inclusive of diverse human values. The code and prompts are available on GitHub under the CC BY-NC 4.0 license.

* Accepted at NAACL 2025 Findings

Via

Access Paper or Ask Questions

ACL Ready: RAG Based Assistant for the ACL Checklist

Aug 07, 2024

Michael Galarnyk, Rutwik Routu, Kosha Bheda, Priyanshu Mehta, Agam Shah, Sudheer Chava

Figure 1 for ACL Ready: RAG Based Assistant for the ACL Checklist

Figure 2 for ACL Ready: RAG Based Assistant for the ACL Checklist

Figure 3 for ACL Ready: RAG Based Assistant for the ACL Checklist

Figure 4 for ACL Ready: RAG Based Assistant for the ACL Checklist

Abstract:The ARR Responsible NLP Research checklist website states that the "checklist is designed to encourage best practices for responsible research, addressing issues of research ethics, societal impact and reproducibility." Answering the questions is an opportunity for authors to reflect on their work and make sure any shared scientific assets follow best practices. Ideally, considering the checklist before submission can favorably impact the writing of a research paper. However, the checklist is often filled out at the last moment. In this work, we introduce ACLReady, a retrieval-augmented language model application that can be used to empower authors to reflect on their work and assist authors with the ACL checklist. To test the effectiveness of the system, we conducted a qualitative study with 13 users which shows that 92% of users found the application useful and easy to use as well as 77% of the users found that the application provided the information they expected. Our code is publicly available under the CC BY-NC 4.0 license on GitHub.

Via

Access Paper or Ask Questions