Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Marcelo Lotif

FedRAG: A Framework for Fine-Tuning Retrieval-Augmented Generation Systems

Jun 12, 2025

Val Andrei Fajardo, David B. Emerson, Amandeep Singh, Veronica Chatrath, Marcelo Lotif, Ravi Theja, Alex Cheung, Izuki Matsuba

Abstract:Retrieval-augmented generation (RAG) systems have been shown to be effective in addressing many of the drawbacks of relying solely on the parametric memory of large language models. Recent work has demonstrated that RAG systems can be improved via fine-tuning of their retriever and generator models. In this work, we introduce FedRAG, a framework for fine-tuning RAG systems across centralized and federated architectures. FedRAG supports state-of-the-art fine-tuning methods, offering a simple and intuitive interface and a seamless conversion from centralized to federated training tasks. FedRAG is also deeply integrated with the modern RAG ecosystem, filling a critical gap in available tools.

* 9 pages, 4 figures, 2 tables. Accepted for the CODEML Workshop at ICML 2025. Framework code available at https://github.com/VectorInstitute/fed-rag

Via

Access Paper or Ask Questions

Just as Humans Need Vaccines, So Do Models: Model Immunization to Combat Falsehoods

May 23, 2025

Shaina Raza, Rizwan Qureshi, Marcelo Lotif, Aman Chadha, Deval Pandya, Christos Emmanouilidis

Abstract:Generative AI models often learn and reproduce false information present in their training corpora. This position paper argues that, analogous to biological immunization, where controlled exposure to a weakened pathogen builds immunity, AI models should be fine tuned on small, quarantined sets of explicitly labeled falsehoods as a "vaccine" against misinformation. These curated false examples are periodically injected during finetuning, strengthening the model ability to recognize and reject misleading claims while preserving accuracy on truthful inputs. An illustrative case study shows that immunized models generate substantially less misinformation than baselines. To our knowledge, this is the first training framework that treats fact checked falsehoods themselves as a supervised vaccine, rather than relying on input perturbations or generic human feedback signals, to harden models against future misinformation. We also outline ethical safeguards and governance controls to ensure the safe use of false data. Model immunization offers a proactive paradigm for aligning AI systems with factuality.

Via

Access Paper or Ask Questions

FairSense-AI: Responsible AI Meets Sustainability

Mar 05, 2025

Shaina Raza, Mukund Sayeeganesh Chettiar, Matin Yousefabadi, Tahniat Khan, Marcelo Lotif

Figure 1 for FairSense-AI: Responsible AI Meets Sustainability

Figure 2 for FairSense-AI: Responsible AI Meets Sustainability

Abstract:In this paper, we introduce FairSense-AI: a multimodal framework designed to detect and mitigate bias in both text and images. By leveraging Large Language Models (LLMs) and Vision-Language Models (VLMs), FairSense-AI uncovers subtle forms of prejudice or stereotyping that can appear in content, providing users with bias scores, explanatory highlights, and automated recommendations for fairness enhancements. In addition, FairSense-AI integrates an AI risk assessment component that aligns with frameworks like the MIT AI Risk Repository and NIST AI Risk Management Framework, enabling structured identification of ethical and safety concerns. The platform is optimized for energy efficiency via techniques such as model pruning and mixed-precision computation, thereby reducing its environmental footprint. Through a series of case studies and applications, we demonstrate how FairSense-AI promotes responsible AI use by addressing both the social dimension of fairness and the pressing need for sustainability in large-scale AI deployments. https://vectorinstitute.github.io/FairSense-AI, https://pypi.org/project/fair-sense-ai/ (Sustainability , Responsible AI , Large Language Models , Vision Language Models , Ethical AI , Green AI)

Via

Access Paper or Ask Questions

ViLBias: A Framework for Bias Detection using Linguistic and Visual Cues

Dec 22, 2024

Shaina Raza, Caesar Saleh, Emrul Hasan, Franklin Ogidi, Maximus Powers, Veronica Chatrath, Marcelo Lotif, Roya Javadi, Anam Zahid, Vahid Reza Khazaie

Figure 1 for ViLBias: A Framework for Bias Detection using Linguistic and Visual Cues

Figure 2 for ViLBias: A Framework for Bias Detection using Linguistic and Visual Cues

Figure 3 for ViLBias: A Framework for Bias Detection using Linguistic and Visual Cues

Figure 4 for ViLBias: A Framework for Bias Detection using Linguistic and Visual Cues

Abstract:The integration of Large Language Models (LLMs) and Vision-Language Models (VLMs) opens new avenues for addressing complex challenges in multimodal content analysis, particularly in biased news detection. This study introduces ViLBias, a framework that leverages state of the art LLMs and VLMs to detect linguistic and visual biases in news content, addressing the limitations of traditional text-only approaches. Our contributions include a novel dataset pairing textual content with accompanying visuals from diverse news sources and a hybrid annotation framework, combining LLM-based annotations with human review to enhance quality while reducing costs and improving scalability. We evaluate the efficacy of LLMs and VLMs in identifying biases, revealing their strengths in detecting subtle framing and text-visual inconsistencies. Empirical analysis demonstrates that incorporating visual cues alongside text enhances bias detection accuracy by 3 to 5 %, showcasing the complementary strengths of LLMs in generative reasoning and Small Language Models (SLMs) in classification. This study offers a comprehensive exploration of LLMs and VLMs as tools for detecting multimodal biases in news content, highlighting both their potential and limitations. Our research paves the way for more robust, scalable, and nuanced approaches to media bias detection, contributing to the broader field of natural language processing and multimodal analysis. (The data and code will be made available for research purposes).

* Under review

Via

Access Paper or Ask Questions

Fact or Fiction? Can LLMs be Reliable Annotators for Political Truths?

Nov 08, 2024

Veronica Chatrath, Marcelo Lotif, Shaina Raza

Figure 1 for Fact or Fiction? Can LLMs be Reliable Annotators for Political Truths?

Figure 2 for Fact or Fiction? Can LLMs be Reliable Annotators for Political Truths?

Figure 3 for Fact or Fiction? Can LLMs be Reliable Annotators for Political Truths?

Figure 4 for Fact or Fiction? Can LLMs be Reliable Annotators for Political Truths?

Abstract:Political misinformation poses significant challenges to democratic processes, shaping public opinion and trust in media. Manual fact-checking methods face issues of scalability and annotator bias, while machine learning models require large, costly labelled datasets. This study investigates the use of state-of-the-art large language models (LLMs) as reliable annotators for detecting political factuality in news articles. Using open-source LLMs, we create a politically diverse dataset, labelled for bias through LLM-generated annotations. These annotations are validated by human experts and further evaluated by LLM-based judges to assess the accuracy and reliability of the annotations. Our approach offers a scalable and robust alternative to traditional fact-checking, enhancing transparency and public trust in media.

* Accepted at Socially Responsible Language Modelling Research (SoLaR) Workshop at NeurIPS 2024

Via

Access Paper or Ask Questions