Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Lav R. Varshney

University of Illinois at Urbana-Champaign

Energy-Aware Routing to Large Reasoning Models

Dec 23, 2025

Austin R. Ellis-Mohr, Max Hartman, Lav R. Varshney

Abstract:Large reasoning models (LRMs) have heterogeneous inference energy costs based on which model is used and how much it reasons. To reduce energy, it is important to choose the right LRM and operate it in the right way. As a result, the performance of systems that dispatch tasks to different individual LRMs depend on the balance between mean energy provisioning and stochastic fluctuations. The critical regime is the unique operating point at which neither auxiliary energy nor baseline energy is systematically wasted. Increasing baseline supply shifts the system toward persistent over-supply and baseline-energy waste, while reducing supply induces persistent reliance on auxiliary energy. Yet in this regime, performance remains volatility-limited and so a second-order characterization provides further insights that we develop. Here, performance is governed by how variability is absorbed across time, models, and execution choices. This perspective highlights variance-aware routing and dispatch as a principled design axis, and provides a theoretical basis for developing energy-aware model routing policies. Routing behavior is characterized when dispatch policies are based on training-compute and inference-compute scaling laws for LRMs.

Via

Access Paper or Ask Questions

Nonlinear Federated System Identification

Aug 20, 2025

Omkar Tupe, Max Hartman, Lav R. Varshney, Saurav Prakash

Abstract:We consider federated learning of linearly-parameterized nonlinear systems. We establish theoretical guarantees on the effectiveness of federated nonlinear system identification compared to centralized approaches, demonstrating that the convergence rate improves as the number of clients increases. Although the convergence rates in the linear and nonlinear cases differ only by a constant, this constant depends on the feature map $\phi$, which can be carefully chosen in the nonlinear setting to increase excitation and improve performance. We experimentally validate our theory in physical settings where client devices are driven by i.i.d. control inputs and control policies exhibiting i.i.d. random perturbations, ensuring non-active exploration. Experiments use trajectories from nonlinear dynamical systems characterized by real-analytic feature functions, including polynomial and trigonometric components, representative of physical systems including pendulum and quadrotor dynamics. We analyze the convergence behavior of the proposed method under varying noise levels and data distributions. Results show that federated learning consistently improves convergence of any individual client as the number of participating clients increases.

Via

Access Paper or Ask Questions

Concealment of Intent: A Game-Theoretic Analysis

May 27, 2025

Xinbo Wu, Abhishek Umrawal, Lav R. Varshney

Abstract:As large language models (LLMs) grow more capable, concerns about their safe deployment have also grown. Although alignment mechanisms have been introduced to deter misuse, they remain vulnerable to carefully designed adversarial prompts. In this work, we present a scalable attack strategy: intent-hiding adversarial prompting, which conceals malicious intent through the composition of skills. We develop a game-theoretic framework to model the interaction between such attacks and defense systems that apply both prompt and response filtering. Our analysis identifies equilibrium points and reveals structural advantages for the attacker. To counter these threats, we propose and analyze a defense mechanism tailored to intent-hiding attacks. Empirically, we validate the attack's effectiveness on multiple real-world LLMs across a range of malicious behaviors, demonstrating clear advantages over existing adversarial prompting techniques.

Via

Access Paper or Ask Questions

Platonic Grounding for Efficient Multimodal Language Models

Apr 27, 2025

Moulik Choraria, Xinbo Wu, Akhil Bhimaraju, Nitesh Sekhar, Yue Wu, Xu Zhang, Prateek Singhal, Lav R. Varshney

Figure 1 for Platonic Grounding for Efficient Multimodal Language Models

Figure 2 for Platonic Grounding for Efficient Multimodal Language Models

Figure 3 for Platonic Grounding for Efficient Multimodal Language Models

Figure 4 for Platonic Grounding for Efficient Multimodal Language Models

Abstract:The hyperscaling of data and parameter count in Transformer-based models is yielding diminishing performance improvement, especially when weighed against training costs. Such plateauing indicates the importance of methods for more efficient finetuning and inference, while retaining similar performance. This is especially relevant for multimodal learning paradigms, where inference costs of processing multimodal tokens can determine the model's practical viability. At the same time, research on representations and mechanistic interpretability has improved our understanding of the inner workings of Transformer-based models; one such line of work reveals an implicit alignment in the deeper layers of pretrained models, across modalities. Taking inspiration from this, we motivate and propose a simple modification to existing multimodal frameworks that rely on aligning pretrained models. We demonstrate that our approach maintains and, in some cases, even improves performance of baseline methods while achieving significant gains in both training and inference-time compute. Our work also has implications for combining pretrained models into larger systems efficiently.

Via

Access Paper or Ask Questions

Spark: A System for Scientifically Creative Idea Generation

Apr 25, 2025

Aishik Sanyal, Samuel Schapiro, Sumuk Shashidhar, Royce Moon, Lav R. Varshney, Dilek Hakkani-Tur

Figure 1 for Spark: A System for Scientifically Creative Idea Generation

Figure 2 for Spark: A System for Scientifically Creative Idea Generation

Figure 3 for Spark: A System for Scientifically Creative Idea Generation

Abstract:Recently, large language models (LLMs) have shown promising abilities to generate novel research ideas in science, a direction which coincides with many foundational principles in computational creativity (CC). In light of these developments, we present an idea generation system named Spark that couples retrieval-augmented idea generation using LLMs with a reviewer model named Judge trained on 600K scientific reviews from OpenReview. Our work is both a system demonstration and intended to inspire other CC researchers to explore grounding the generation and evaluation of scientific ideas within foundational CC principles. To this end, we release the annotated dataset used to train Judge, inviting other researchers to explore the use of LLMs for idea generation and creative evaluations.

Via

Access Paper or Ask Questions

Transformational Creativity in Science: A Graphical Theory

Apr 25, 2025

Samuel Schapiro, Jonah Black, Lav R. Varshney

Figure 1 for Transformational Creativity in Science: A Graphical Theory

Figure 2 for Transformational Creativity in Science: A Graphical Theory

Abstract:Creative processes are typically divided into three types: combinatorial, exploratory, and transformational. Here, we provide a graphical theory of transformational scientific creativity, synthesizing Boden's insight that transformational creativity arises from changes in the "enabling constraints" of a conceptual space and Kuhn's structure of scientific revolutions as resulting from paradigm shifts. We prove that modifications made to axioms of our graphical model have the most transformative potential and then illustrate how several historical instances of transformational creativity can be captured by our framework.

Via

Access Paper or Ask Questions

Fed-SB: A Silver Bullet for Extreme Communication Efficiency and Performance in (Private) Federated LoRA Fine-Tuning

Feb 21, 2025

Raghav Singhal, Kaustubh Ponkshe, Rohit Vartak, Lav R. Varshney, Praneeth Vepakomma

Abstract:Low-Rank Adaptation (LoRA) has become ubiquitous for efficiently fine-tuning foundation models. However, federated fine-tuning using LoRA is challenging due to suboptimal updates arising from traditional federated averaging of individual adapters. Existing solutions either incur prohibitively high communication cost that scales linearly with the number of clients or suffer from performance degradation due to limited expressivity. We introduce Federated Silver Bullet (Fed-SB), a novel approach for federated fine-tuning of LLMs using LoRA-SB, a recently proposed low-rank adaptation method. LoRA-SB optimally aligns the optimization trajectory with the ideal low-rank full fine-tuning projection by learning a small square matrix (R) between adapters B and A, keeping other components fixed. Direct averaging of R guarantees exact updates, substantially reducing communication cost, which remains independent of the number of clients, and enables scalability. Fed-SB achieves state-of-the-art performance across commonsense reasoning, arithmetic reasoning, and language inference tasks while reducing communication costs by up to 230x. In private settings, Fed-SB further improves performance by (1) reducing trainable parameters, thereby lowering the noise required for differential privacy and (2) avoiding noise amplification introduced by other methods. Overall, Fed-SB establishes a new Pareto frontier in the tradeoff between communication and performance, offering an efficient and scalable solution for both private and non-private federated fine-tuning. Our code is publicly available at https://github.com/CERT-Lab/fed-sb.

* Raghav Singhal and Kaustubh Ponkshe contributed equally to this work

Via

Access Paper or Ask Questions

ITBench: Evaluating AI Agents across Diverse Real-World IT Automation Tasks

Feb 07, 2025

Saurabh Jha, Rohan Arora, Yuji Watanabe, Takumi Yanagawa, Yinfang Chen, Jackson Clark, Bhavya Bhavya, Mudit Verma, Harshit Kumar, Hirokuni Kitahara(+33 more)

Figure 1 for ITBench: Evaluating AI Agents across Diverse Real-World IT Automation Tasks

Figure 2 for ITBench: Evaluating AI Agents across Diverse Real-World IT Automation Tasks

Figure 3 for ITBench: Evaluating AI Agents across Diverse Real-World IT Automation Tasks

Figure 4 for ITBench: Evaluating AI Agents across Diverse Real-World IT Automation Tasks

Abstract:Realizing the vision of using AI agents to automate critical IT tasks depends on the ability to measure and understand effectiveness of proposed solutions. We introduce ITBench, a framework that offers a systematic methodology for benchmarking AI agents to address real-world IT automation tasks. Our initial release targets three key areas: Site Reliability Engineering (SRE), Compliance and Security Operations (CISO), and Financial Operations (FinOps). The design enables AI researchers to understand the challenges and opportunities of AI agents for IT automation with push-button workflows and interpretable metrics. ITBench includes an initial set of 94 real-world scenarios, which can be easily extended by community contributions. Our results show that agents powered by state-of-the-art models resolve only 13.8% of SRE scenarios, 25.2% of CISO scenarios, and 0% of FinOps scenarios. We expect ITBench to be a key enabler of AI-driven IT automation that is correct, safe, and fast.

Via

Access Paper or Ask Questions

Online Reinforcement Learning with Passive Memory

Oct 18, 2024

Anay Pattanaik, Lav R. Varshney

Abstract:This paper considers an online reinforcement learning algorithm that leverages pre-collected data (passive memory) from the environment for online interaction. We show that using passive memory improves performance and further provide theoretical guarantees for regret that turns out to be near-minimax optimal. Results show that the quality of passive memory determines sub-optimality of the incurred regret. The proposed approach and results hold in both continuous and discrete state-action spaces.

Via

Access Paper or Ask Questions

Compute-Update Federated Learning: A Lattice Coding Approach

Sep 10, 2024

Seyed Mohammad Azimi-Abarghouyi, Lav R. Varshney

Abstract:This paper introduces a federated learning framework that enables over-the-air computation via digital communications, using a new joint source-channel coding scheme. Without relying on channel state information at devices, this scheme employs lattice codes to both quantize model parameters and exploit interference from the devices. We propose a novel receiver structure at the server, designed to reliably decode an integer combination of the quantized model parameters as a lattice point for the purpose of aggregation. We present a mathematical approach to derive a convergence bound for the proposed scheme and offer design remarks. In this context, we suggest an aggregation metric and a corresponding algorithm to determine effective integer coefficients for the aggregation in each communication round. Our results illustrate that, regardless of channel dynamics and data heterogeneity, our scheme consistently delivers superior learning accuracy across various parameters and markedly surpasses other over-the-air methodologies.

* Extended version of the preprint available at arXiv:2403.01023

Via

Access Paper or Ask Questions