Oguzhan Gungordu

Scaling Search-Augmented LLM Reasoning via Adaptive Information Control

Feb 02, 2026

Siheng Xiong, Oguzhan Gungordu, Blair Johnson, James C. Kerce, Faramarz Fekri

Abstract: Search-augmented reasoning agents interleave multi-step reasoning with external information retrieval, but uncontrolled retrieval often leads to redundant evidence, context saturation, and unstable learning. Existing approaches rely on outcome-based reinforcement learning (RL), which provides limited guidance for regulating information acquisition. We propose DeepControl, a framework for adaptive information control based on a formal notion of information utility, which measures the marginal value of retrieved evidence under a given reasoning state. Building on this utility, we introduce retrieval continuation and granularity control mechanisms that selectively regulate when to continue or stop retrieval and how much information to expand. An annealed control strategy enables the agent to internalize effective information acquisition behaviors during training. Extensive experiments across seven benchmarks demonstrate that our method consistently outperforms strong baselines. In particular, our approach achieves average performance improvements of 9.4% and 8.6% on Qwen2.5-7B and Qwen2.5-3B, respectively, over strong outcome-based RL baselines, and consistently outperforms both retrieval-free and retrieval-based reasoning methods that lack explicit information control. These results highlight the importance of adaptive information control for scaling search-augmented reasoning agents to complex, real-world information environments.

* Work in progress

PathWise: Planning through World Model for Automated Heuristic Design via Self-Evolving LLMs

Jan 29, 2026

Oguzhan Gungordu, Siheng Xiong, Faramarz Fekri

Abstract: Large Language Models (LLMs) have enabled automated heuristic design (AHD) for combinatorial optimization problems (COPs), but existing frameworks rely on fixed evolutionary rules and static prompt templates, which often leads to myopic heuristic generation, redundant evaluations, and limited reasoning about how new heuristics should be derived. We propose a novel multi-agent reasoning framework, referred to as Planning through World Model for Automated Heuristic Design via Self-Evolving LLMs (PathWise), which formulates heuristic generation as a sequential decision process over an entailment graph serving as a compact, stateful memory of the search trajectory. This approach allows the system to carry forward past decisions and reuse or avoid derivation information across generations. A policy agent plans evolutionary actions, a world model agent generates heuristic rollouts conditioned on those actions, and critic agents provide routed reflections summarizing lessons from prior steps, shifting LLM-based AHD from trial-and-error evolution toward state-aware planning through reasoning. Experiments across diverse COPs show that PathWise converges faster to better heuristics, generalizes across different LLM backbones, and scales to larger problem sizes.

Saliency-aware End-to-end Learned Variable-Bitrate 360-degree Image Compression

Feb 14, 2024

Oguzhan Gungordu, A. Murat Tekalp

Abstract: Effective compression of 360$^\circ$ images, also referred to as omnidirectional images (ODIs), is of high interest for various virtual reality (VR) and related applications. 2D image compression methods ignore the equator-biased nature of ODIs and fail to address oversampling near the poles, leading to inefficient compression when applied to ODIs. We present a new learned saliency-aware 360$^\circ$ image compression architecture that prioritizes bit allocation to more significant regions, considering the unique properties of ODIs. By assigning fewer bits to less important regions, significant data size reduction can be achieved while maintaining high visual quality in the significant regions. To the best of our knowledge, this is the first study that proposes an end-to-end variable-rate model to compress 360$^\circ$ images leveraging saliency information. The results show significant bit-rate savings over the state-of-the-art learned and traditional ODI compression methods at similar perceptual visual quality.

* 7 pages in double-column format (including 1.5 pages of references), 6 figures, and 4 tables; submitted to IEEE ICIP 2024
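
The oversampling near the poles mentioned in the abstract above can be illustrated with a small sketch. It assumes the ODI is stored in equirectangular projection, where every pixel row spans the full 360 degrees of longitude, so the spherical area a row covers shrinks with the cosine of its latitude. The function name `erp_row_weights` is hypothetical and this is not the paper's saliency model, only a picture of the geometry it has to account for.

```python
import math


def erp_row_weights(height: int) -> list[float]:
    """Relative spherical area per pixel row of an equirectangular image.

    Assumption: rows are evenly spaced in latitude, from +90 degrees at
    the top of the image to -90 degrees at the bottom.
    """
    weights = []
    for row in range(height):
        # Latitude of the row centre in radians.
        latitude = (0.5 - (row + 0.5) / height) * math.pi
        # A row near a pole covers far less of the sphere than one at the equator.
        weights.append(math.cos(latitude))
    return weights


weights = erp_row_weights(8)
for row, weight in enumerate(weights):
    print(f"row {row}: relative area {weight:.3f}")
```

Rows near the top and bottom receive weights close to zero even though they hold as many pixels as the equatorial rows, which is why a codec that treats all rows equally spends bits on redundant samples.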
