Marc-Antoine Allard

Experiential Reflective Learning for Self-Improving LLM Agents

Mar 25, 2026

Marc-Antoine Allard, Arnaud Teinturier, Victor Xing, Gautier Viaud

Abstract: Recent advances in large language models (LLMs) have enabled the development of autonomous agents capable of complex reasoning and multi-step problem solving. However, these agents struggle to adapt to specialized environments and do not leverage past interactions, approaching each new task from scratch regardless of their accumulated experience. We introduce Experiential Reflective Learning (ERL), a simple self-improvement framework that enables rapid environment adaptation through experiential learning. ERL reflects on task trajectories and outcomes to generate heuristics, capturing actionable lessons that transfer across tasks. At test time, relevant heuristics are retrieved based on the current task and injected into the agent's context to guide execution. On the Gaia2 benchmark, ERL improves success rate by 7.8% over a ReAct baseline, with large gains in task completion reliability, and outperforms prior experiential learning methods. Through systematic ablations, we find that selective retrieval is essential and that heuristics provide more transferable abstractions than few-shot trajectory prompting. These results demonstrate that reflecting on single-attempt experiences to extract transferable heuristics enables effective agent self-improvement.

* Published as a conference paper at the ICLR 2026 MemAgents Workshop
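
The retrieve-then-inject step described in the abstract can be illustrated with a minimal sketch. The abstract does not say how ERL scores relevance or formats its prompts, so everything here is an assumption: the `Heuristic` structure, the Jaccard word-overlap scoring (a stand-in for whatever retriever the authors use), and the prompt layout are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Heuristic:
    """A lesson distilled from one past task attempt (hypothetical structure)."""
    source_task: str  # description of the task the lesson came from
    lesson: str       # the actionable advice itself


def tokenize(text: str) -> set[str]:
    """Lowercase word set with surrounding punctuation stripped."""
    words = (word.strip(".,:;!?").lower() for word in text.split())
    return {word for word in words if word}


def relevance(task: str, heuristic: Heuristic) -> float:
    """Jaccard overlap between the new task and the heuristic's source task.

    Assumption: a simple lexical stand-in, not the paper's retriever.
    """
    task_tokens = tokenize(task)
    source_tokens = tokenize(heuristic.source_task)
    union = task_tokens | source_tokens
    return len(task_tokens & source_tokens) / len(union) if union else 0.0


def retrieve(task: str, store: list[Heuristic], k: int = 2) -> list[Heuristic]:
    """Return up to k heuristics with non-zero relevance, best first."""
    ranked = sorted(store, key=lambda h: relevance(task, h), reverse=True)
    return [h for h in ranked[:k] if relevance(task, h) > 0]


def build_prompt(task: str, heuristics: list[Heuristic]) -> str:
    """Inject the retrieved lessons into the agent's context ahead of the task."""
    lessons = "\n".join(f"- {h.lesson}" for h in heuristics)
    return f"Lessons from past tasks:\n{lessons}\n\nCurrent task: {task}"


store = [
    Heuristic("Schedule a calendar meeting with Bob",
              "Check attendee availability before creating the event."),
    Heuristic("Send an email to the team",
              "Confirm the recipient list before sending."),
    Heuristic("Order a ride to the airport",
              "Verify the pickup address first."),
]
task = "Reschedule the calendar meeting with Alice"
prompt = build_prompt(task, retrieve(task, store, k=1))
```

Retrieving only the top-scoring lessons, rather than injecting the whole store, mirrors the abstract's finding that selective retrieval is essential.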

Enhancing Inflation Nowcasting with LLM: Sentiment Analysis on News

Oct 26, 2024

Marc-Antoine Allard, Paul Teiletche, Adam Zinebi

Abstract: This study explores the integration of large language models (LLMs) into classic inflation nowcasting frameworks, particularly in light of high inflation volatility periods such as the COVID-19 pandemic. We propose InflaBERT, a BERT-based LLM fine-tuned to predict inflation-related sentiment in news. We use this model to produce NEWS, an index capturing the monthly sentiment of the news regarding inflation. Incorporating our expectation index into the Cleveland Fed's model, which is only based on macroeconomic autoregressive processes, shows a marginal improvement in nowcast accuracy during the pandemic. This highlights the potential of combining sentiment analysis with traditional economic indicators, suggesting further research to refine these methodologies for better real-time inflation monitoring. The source code is available at https://github.com/paultltc/InflaBERT.
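
As a rough illustration of how per-article sentiment might be rolled up into a monthly index such as NEWS, the sketch below averages scores by calendar month. The abstract does not describe the aggregation, so the plain mean, the score range of -1 to 1, and the function name `monthly_sentiment_index` are assumptions for illustration and not the authors' method.

```python
from collections import defaultdict
from datetime import date
from statistics import mean


def monthly_sentiment_index(articles: list[tuple[date, float]]) -> dict[str, float]:
    """Average per-article sentiment scores within each calendar month.

    Each article is (publication date, sentiment score); scores are assumed
    to lie in [-1, 1], e.g. the output of a sentiment classifier.
    """
    by_month: dict[str, list[float]] = defaultdict(list)
    for published, score in articles:
        by_month[published.strftime("%Y-%m")].append(score)
    # Sort by month key so the index reads chronologically.
    return {month: mean(scores) for month, scores in sorted(by_month.items())}


articles = [
    (date(2020, 3, 2), 1.0),
    (date(2020, 3, 20), 0.0),
    (date(2020, 4, 1), -1.0),
]
index = monthly_sentiment_index(articles)
```

A series like this could then enter a nowcasting regression as one additional monthly regressor alongside the macroeconomic inputs.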

LLaMa-SciQ: An Educational Chatbot for Answering Science MCQ

Sep 25, 2024

Marc-Antoine Allard, Matin Ansaripour, Maria Yuffa, Paul Teiletche

Abstract: Large Language Models (LLMs) often struggle with tasks requiring mathematical reasoning, particularly multiple-choice questions (MCQs). To address this issue, we developed LLaMa-SciQ, an educational chatbot designed to assist college students in solving and understanding MCQs in STEM fields. We begin by fine-tuning and aligning the models to human preferences. After comparing the performance of Mistral-7B and LLaMa-8B, we selected the latter as the base model due to its higher evaluation accuracy. To further enhance accuracy, we implement Retrieval-Augmented Generation (RAG) and apply quantization to compress the model, reducing inference time and increasing accessibility for students. For mathematical reasoning, LLaMa-SciQ achieved 74.5% accuracy on the GSM8k dataset and 30% on the MATH dataset. However, RAG does not improve performance and even reduces it, likely due to retriever issues or the model's unfamiliarity with context. Despite this, the quantized model shows only a 5% loss in performance, demonstrating significant efficiency improvements.
