Dimitris N. Metaxas

Rutgers University

Latent Reward Steering: An Adaptive Inference-Time Framework that Implicitly Promotes Cognitive Behaviors in Reasoning LLMs

May 30, 2026
Via arXiv

Weak Critics Make Strong Learners: On-Policy Critique Distillation for Scalable Oversight

May 29, 2026
Via arXiv

PR2: Predictive Routing Replay for MoE-Based LLM Reinforcement Learning

May 29, 2026
Via arXiv

MemGym: a Long-Horizon Memory Environment for LLM Agents

May 20, 2026
Via arXiv

MemEye: A Visual-Centric Evaluation Framework for Multimodal Agent Memory

May 14, 2026
Via arXiv

All Circuits Lead to Rome: Rethinking Functional Anisotropy in Circuit and Sheaf Discovery for LLMs

May 12, 2026
Via arXiv

DARE: Difficulty-Adaptive Reinforcement Learning with Co-Evolved Difficulty Estimation

May 09, 2026
Via arXiv

Evidence Over Plans: Online Trajectory Verification for Skill Distillation

May 09, 2026
Via arXiv

SignVerse-2M: A Two-Million-Clip Pose-Native Universe of 55+ Sign Languages

May 06, 2026
Via arXiv

AEL: Agent Evolving Learning for Open-Ended Environments

Apr 23, 2026
Via arXiv