Lucas Pereira

Meta-Evaluating Local LLMs: Rethinking Performance Metrics for Serious Games

Apr 13, 2025

Andrés Isaza-Giraldo, Paulo Bala, Lucas Pereira

Abstract: The evaluation of open-ended responses in serious games presents a unique challenge, as correctness is often subjective. Large Language Models (LLMs) are increasingly being explored as evaluators in such contexts, yet their accuracy and consistency remain uncertain, particularly for smaller models intended for local execution. This study investigates the reliability of five small-scale LLMs when assessing player responses in En-join, a game that simulates decision-making within energy communities. By leveraging traditional binary classification metrics (including accuracy, true positive rate, and true negative rate), we systematically compare these models across different evaluation scenarios. Our results highlight the strengths and limitations of each model, revealing trade-offs between sensitivity, specificity, and overall performance. We demonstrate that while some models excel at identifying correct responses, others struggle with false positives or inconsistent evaluations. The findings highlight the need for context-aware evaluation frameworks and careful model selection when deploying LLMs as evaluators. This work contributes to the broader discourse on the trustworthiness of AI-driven assessment tools, offering insights into how different LLM architectures handle subjective evaluation tasks.

* 2nd HEAL Workshop at CHI Conference on Human Factors in Computing Systems. April 26, 2025. Yokohama, Japan
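
The binary classification metrics named in the abstract above (accuracy, true positive rate, true negative rate) can be illustrated with a minimal sketch. It assumes each player response has a human reference label and an LLM verdict, both boolean; the function name `binary_metrics` and the sample labels are hypothetical and are not taken from the paper.

```python
from typing import Dict, Sequence


def binary_metrics(reference: Sequence[bool], verdicts: Sequence[bool]) -> Dict[str, float]:
    """Compare LLM verdicts against reference labels (hypothetical helper).

    True means "response judged correct". Returns accuracy, true positive
    rate (sensitivity), and true negative rate (specificity).
    """
    if len(reference) != len(verdicts):
        raise ValueError("reference and verdicts must have the same length")
    if not reference:
        raise ValueError("at least one labeled response is required")

    pairs = list(zip(reference, verdicts))
    tp = sum(1 for r, v in pairs if r and v)          # correct, judged correct
    tn = sum(1 for r, v in pairs if not r and not v)  # incorrect, judged incorrect
    fp = sum(1 for r, v in pairs if not r and v)      # incorrect, judged correct
    fn = sum(1 for r, v in pairs if r and not v)      # correct, judged incorrect

    def ratio(num: int, den: int) -> float:
        # A rate is undefined when its class has no examples.
        return num / den if den else float("nan")

    return {
        "accuracy": ratio(tp + tn, len(pairs)),
        "tpr": ratio(tp, tp + fn),
        "tnr": ratio(tn, tn + fp),
    }


# Made-up labels for illustration only (not data from the study).
human = [True, True, True, False, False]
model = [True, True, False, False, True]
metrics = binary_metrics(human, model)
print(metrics)
```

With these made-up labels the evaluator gets 3 of 5 verdicts right, recognizes 2 of 3 correct responses, and rejects 1 of 2 incorrect ones, which shows how sensitivity and specificity can diverge from overall accuracy.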


Improving accuracy and convergence of federated learning edge computing methods for generalized DER forecasting applications in power grid

Oct 13, 2024

Vineet Jagadeesan Nair, Lucas Pereira

Abstract: This proposal aims to develop more accurate federated learning (FL) methods with faster convergence properties and lower communication requirements, specifically for forecasting distributed energy resources (DER) such as renewables, energy storage, and loads in modern, low-carbon power grids. This will be achieved by (i) leveraging recently developed extensions of FL such as hierarchical and iterative clustering to improve performance with non-IID data, (ii) experimenting with different types of FL global models well-suited to time-series data, and (iii) incorporating domain-specific knowledge from power systems to build more general FL frameworks and architectures that can be applied to diverse types of DERs beyond just load forecasting, and with heterogeneous clients.

* Presented at the NeurIPS 2022 Tackling Climate Change with Machine Learning workshop


Federated Learning Forecasting for Strengthening Grid Reliability and Enabling Markets for Resilience

Jul 16, 2024

Lucas Pereira, Vineet Jagadeesan Nair, Bruno Dias, Hugo Morais, Anuradha Annaswamy

Abstract: We propose a comprehensive approach to increase the reliability and resilience of future power grids rich in distributed energy resources. Our distributed scheme combines federated learning-based attack detection with a local electricity market-based attack mitigation method. We validate the scheme by applying it to a real-world distribution grid rich in solar PV. Simulation results demonstrate that the approach is feasible and can successfully mitigate the grid impacts of cyber-physical attacks.

* Submitted to CIRED 2024 USA: Workshop on Resilience of Electric Distribution Systems
