Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Amit Sheth

Visual Counter Turing Test (VCT^2): Discovering the Challenges for AI-Generated Image Detection and Introducing Visual AI Index (V_AI)

Nov 24, 2024

Nasrin Imanpour, Shashwat Bajpai, Subhankar Ghosh, Sainath Reddy Sankepally, Abhilekh Borah, Hasnat Md Abdullah, Nishoak Kosaraju, Shreyas Dixit, Ashhar Aziz, Shwetangshu Biswas(+4 more)

Figure 1 for Visual Counter Turing Test (VCT^2): Discovering the Challenges for AI-Generated Image Detection and Introducing Visual AI Index (V_AI)

Figure 2 for Visual Counter Turing Test (VCT^2): Discovering the Challenges for AI-Generated Image Detection and Introducing Visual AI Index (V_AI)

Figure 3 for Visual Counter Turing Test (VCT^2): Discovering the Challenges for AI-Generated Image Detection and Introducing Visual AI Index (V_AI)

Figure 4 for Visual Counter Turing Test (VCT^2): Discovering the Challenges for AI-Generated Image Detection and Introducing Visual AI Index (V_AI)

Abstract:The proliferation of AI techniques for image generation, coupled with their increasing accessibility, has raised significant concerns about the potential misuse of these images to spread misinformation. Recent AI-generated image detection (AGID) methods include CNNDetection, NPR, DM Image Detection, Fake Image Detection, DIRE, LASTED, GAN Image Detection, AIDE, SSP, DRCT, RINE, OCC-CLIP, De-Fake, and Deep Fake Detection. However, we argue that the current state-of-the-art AGID techniques are inadequate for effectively detecting contemporary AI-generated images and advocate for a comprehensive reevaluation of these methods. We introduce the Visual Counter Turing Test (VCT^2), a benchmark comprising ~130K images generated by contemporary text-to-image models (Stable Diffusion 2.1, Stable Diffusion XL, Stable Diffusion 3, DALL-E 3, and Midjourney 6). VCT^2 includes two sets of prompts sourced from tweets by the New York Times Twitter account and captions from the MS COCO dataset. We also evaluate the performance of the aforementioned AGID techniques on the VCT$^2$ benchmark, highlighting their ineffectiveness in detecting AI-generated images. As image-generative AI models continue to evolve, the need for a quantifiable framework to evaluate these models becomes increasingly critical. To meet this need, we propose the Visual AI Index (V_AI), which assesses generated images from various visual perspectives, including texture complexity and object coherence, setting a new standard for evaluating image-generative AI models. To foster research in this domain, we make our https://huggingface.co/datasets/anonymous1233/COCO_AI and https://huggingface.co/datasets/anonymous1233/twitter_AI datasets publicly available.

* 13 pages, 9 figures

Via

Access Paper or Ask Questions

A Domain-Agnostic Neurosymbolic Approach for Big Social Data Analysis: Evaluating Mental Health Sentiment on Social Media during COVID-19

Nov 11, 2024

Vedant Khandelwal, Manas Gaur, Ugur Kursuncu, Valerie Shalin, Amit Sheth

Figure 1 for A Domain-Agnostic Neurosymbolic Approach for Big Social Data Analysis: Evaluating Mental Health Sentiment on Social Media during COVID-19

Figure 2 for A Domain-Agnostic Neurosymbolic Approach for Big Social Data Analysis: Evaluating Mental Health Sentiment on Social Media during COVID-19

Figure 3 for A Domain-Agnostic Neurosymbolic Approach for Big Social Data Analysis: Evaluating Mental Health Sentiment on Social Media during COVID-19

Figure 4 for A Domain-Agnostic Neurosymbolic Approach for Big Social Data Analysis: Evaluating Mental Health Sentiment on Social Media during COVID-19

Abstract:Monitoring public sentiment via social media is potentially helpful during health crises such as the COVID-19 pandemic. However, traditional frequency-based, data-driven neural network-based approaches can miss newly relevant content due to the evolving nature of language in a dynamically evolving environment. Human-curated symbolic knowledge sources, such as lexicons for standard language and slang terms, can potentially elevate social media signals in evolving language. We introduce a neurosymbolic method that integrates neural networks with symbolic knowledge sources, enhancing the detection and interpretation of mental health-related tweets relevant to COVID-19. Our method was evaluated using a corpus of large datasets (approximately 12 billion tweets, 2.5 million subreddit data, and 700k news articles) and multiple knowledge graphs. This method dynamically adapts to evolving language, outperforming purely data-driven models with an F1 score exceeding 92\%. This approach also showed faster adaptation to new data and lower computational demands than fine-tuning pre-trained large language models (LLMs). This study demonstrates the benefit of neurosymbolic methods in interpreting text in a dynamic environment for tasks such as health surveillance.

* 13 Pages, 5 Figures, 5 Tables, 2024 IEEE International Conference on Big Data, Regular Paper

Via

Access Paper or Ask Questions

Knowledge Graphs of Driving Scenes to Empower the Emerging Capabilities of Neurosymbolic AI

Nov 05, 2024

Ruwan Wickramarachchi, Cory Henson, Amit Sheth

Figure 1 for Knowledge Graphs of Driving Scenes to Empower the Emerging Capabilities of Neurosymbolic AI

Figure 2 for Knowledge Graphs of Driving Scenes to Empower the Emerging Capabilities of Neurosymbolic AI

Figure 3 for Knowledge Graphs of Driving Scenes to Empower the Emerging Capabilities of Neurosymbolic AI

Abstract:In the era of Generative AI, Neurosymbolic AI is emerging as a powerful approach for tasks spanning from perception to cognition. The use of Neurosymbolic AI has been shown to achieve enhanced capabilities, including improved grounding, alignment, explainability, and reliability. However, due to its nascent stage, there is a lack of widely available real-world benchmark datasets tailored to Neurosymbolic AI tasks. To address this gap and support the evaluation of current and future methods, we introduce DSceneKG -- a suite of knowledge graphs of driving scenes built from real-world, high-quality scenes from multiple open autonomous driving datasets. In this article, we detail the construction process of DSceneKG and highlight its application in seven different tasks. DSceneKG is publicly accessible at: https://github.com/ruwantw/DSceneKG

* 8 pages

Via

Access Paper or Ask Questions

Overview of Factify5WQA: Fact Verification through 5W Question-Answering

Oct 05, 2024

Suryavardan Suresh, Anku Rani, Parth Patwa, Aishwarya Reganti, Vinija Jain, Aman Chadha, Amitava Das, Amit Sheth, Asif Ekbal

Figure 1 for Overview of Factify5WQA: Fact Verification through 5W Question-Answering

Figure 2 for Overview of Factify5WQA: Fact Verification through 5W Question-Answering

Figure 3 for Overview of Factify5WQA: Fact Verification through 5W Question-Answering

Figure 4 for Overview of Factify5WQA: Fact Verification through 5W Question-Answering

Abstract:Researchers have found that fake news spreads much times faster than real news. This is a major problem, especially in today's world where social media is the key source of news for many among the younger population. Fact verification, thus, becomes an important task and many media sites contribute to the cause. Manual fact verification is a tedious task, given the volume of fake news online. The Factify5WQA shared task aims to increase research towards automated fake news detection by providing a dataset with an aspect-based question answering based fact verification method. Each claim and its supporting document is associated with 5W questions that help compare the two information sources. The objective performance measure in the task is done by comparing answers using BLEU score to measure the accuracy of the answers, followed by an accuracy measure of the classification. The task had submissions using custom training setup and pre-trained language-models among others. The best performing team posted an accuracy of 69.56%, which is a near 35% improvement over the baseline.

* Accepted at defactify3@aaai2024

Via

Access Paper or Ask Questions

The Brittleness of AI-Generated Image Watermarking Techniques: Examining Their Robustness Against Visual Paraphrasing Attacks

Aug 19, 2024

Niyar R Barman, Krish Sharma, Ashhar Aziz, Shashwat Bajpai, Shwetangshu Biswas, Vasu Sharma, Vinija Jain, Aman Chadha, Amit Sheth, Amitava Das

Abstract:The rapid advancement of text-to-image generation systems, exemplified by models like Stable Diffusion, Midjourney, Imagen, and DALL-E, has heightened concerns about their potential misuse. In response, companies like Meta and Google have intensified their efforts to implement watermarking techniques on AI-generated images to curb the circulation of potentially misleading visuals. However, in this paper, we argue that current image watermarking methods are fragile and susceptible to being circumvented through visual paraphrase attacks. The proposed visual paraphraser operates in two steps. First, it generates a caption for the given image using KOSMOS-2, one of the latest state-of-the-art image captioning systems. Second, it passes both the original image and the generated caption to an image-to-image diffusion system. During the denoising step of the diffusion pipeline, the system generates a visually similar image that is guided by the text caption. The resulting image is a visual paraphrase and is free of any watermarks. Our empirical findings demonstrate that visual paraphrase attacks can effectively remove watermarks from images. This paper provides a critical assessment, empirically revealing the vulnerability of existing watermarking techniques to visual paraphrase attacks. While we do not propose solutions to this issue, this paper serves as a call to action for the scientific community to prioritize the development of more robust watermarking techniques. Our first-of-its-kind visual paraphrase dataset and accompanying code are publicly available.

* 23 pages and 10 figures

Via

Access Paper or Ask Questions

AssemAI: Interpretable Image-Based Anomaly Detection for Manufacturing Pipelines

Aug 05, 2024

Renjith Prasad, Chathurangi Shyalika, Ramtin Zand, Fadi El Kalach, Revathy Venkataramanan, Ramy Harik, Amit Sheth

Figure 1 for AssemAI: Interpretable Image-Based Anomaly Detection for Manufacturing Pipelines

Figure 2 for AssemAI: Interpretable Image-Based Anomaly Detection for Manufacturing Pipelines

Figure 3 for AssemAI: Interpretable Image-Based Anomaly Detection for Manufacturing Pipelines

Figure 4 for AssemAI: Interpretable Image-Based Anomaly Detection for Manufacturing Pipelines

Abstract:Anomaly detection in manufacturing pipelines remains a critical challenge, intensified by the complexity and variability of industrial environments. This paper introduces AssemAI, an interpretable image-based anomaly detection system tailored for smart manufacturing pipelines. Our primary contributions include the creation of a tailored image dataset and the development of a custom object detection model, YOLO-FF, designed explicitly for anomaly detection in manufacturing assembly environments. Utilizing the preprocessed image dataset derived from an industry-focused rocket assembly pipeline, we address the challenge of imbalanced image data and demonstrate the importance of image-based methods in anomaly detection. The proposed approach leverages domain knowledge in data preparation, model development and reasoning. We compare our method against several baselines, including simple CNN and custom Visual Transformer (ViT) models, showcasing the effectiveness of our custom data preparation and pretrained CNN integration. Additionally, we incorporate explainability techniques at both user and model levels, utilizing ontology for user-friendly explanations and SCORE-CAM for in-depth feature and model analysis. Finally, the model was also deployed in a real-time setting. Our results include ablation studies on the baselines, providing a comprehensive evaluation of the proposed system. This work highlights the broader impact of advanced image-based anomaly detection in enhancing the reliability and efficiency of smart manufacturing processes.

* 8 Pages, 6 Figures, 4 Tables

Via

Access Paper or Ask Questions

Neurosymbolic AI for Enhancing Instructability in Generative AI

Jul 26, 2024

Amit Sheth, Vishal Pallagani, Kaushik Roy

Figure 1 for Neurosymbolic AI for Enhancing Instructability in Generative AI

Figure 2 for Neurosymbolic AI for Enhancing Instructability in Generative AI

Figure 3 for Neurosymbolic AI for Enhancing Instructability in Generative AI

Figure 4 for Neurosymbolic AI for Enhancing Instructability in Generative AI

Abstract:Generative AI, especially via Large Language Models (LLMs), has transformed content creation across text, images, and music, showcasing capabilities in following instructions through prompting, largely facilitated by instruction tuning. Instruction tuning is a supervised fine-tuning method where LLMs are trained on datasets formatted with specific tasks and corresponding instructions. This method systematically enhances the model's ability to comprehend and execute the provided directives. Despite these advancements, LLMs still face challenges in consistently interpreting complex, multi-step instructions and generalizing them to novel tasks, which are essential for broader applicability in real-world scenarios. This article explores why neurosymbolic AI offers a better path to enhance the instructability of LLMs. We explore the use a symbolic task planner to decompose high-level instructions into structured tasks, a neural semantic parser to ground these tasks into executable actions, and a neuro-symbolic executor to implement these actions while dynamically maintaining an explicit representation of state. We also seek to show that neurosymbolic approach enhances the reliability and context-awareness of task execution, enabling LLMs to dynamically interpret and respond to a wider range of instructional contexts with greater precision and flexibility.

Via

Access Paper or Ask Questions

Counter Turing Test ($CT^2$): Investigating AI-Generated Text Detection for Hindi -- Ranking LLMs based on Hindi AI Detectability Index ($ADI_{hi}$)

Jul 22, 2024

Ishan Kavathekar, Anku Rani, Ashmit Chamoli, Ponnurangam Kumaraguru, Amit Sheth, Amitava Das

$Figure 1 for Counter Turing Test ($CT^2$): Investigating AI-Generated Text Detection for Hindi -- Ranking LLMs based on Hindi AI Detectability Index ($ADI_{hi}$)$

$Figure 2 for Counter Turing Test ($CT^2$): Investigating AI-Generated Text Detection for Hindi -- Ranking LLMs based on Hindi AI Detectability Index ($ADI_{hi}$)$

$Figure 3 for Counter Turing Test ($CT^2$): Investigating AI-Generated Text Detection for Hindi -- Ranking LLMs based on Hindi AI Detectability Index ($ADI_{hi}$)$

$Figure 4 for Counter Turing Test ($CT^2$): Investigating AI-Generated Text Detection for Hindi -- Ranking LLMs based on Hindi AI Detectability Index ($ADI_{hi}$)$

Abstract:The widespread adoption of large language models (LLMs) and awareness around multilingual LLMs have raised concerns regarding the potential risks and repercussions linked to the misapplication of AI-generated text, necessitating increased vigilance. While these models are primarily trained for English, their extensive training on vast datasets covering almost the entire web, equips them with capabilities to perform well in numerous other languages. AI-Generated Text Detection (AGTD) has emerged as a topic that has already received immediate attention in research, with some initial methods having been proposed, soon followed by the emergence of techniques to bypass detection. In this paper, we report our investigation on AGTD for an indic language Hindi. Our major contributions are in four folds: i) examined 26 LLMs to evaluate their proficiency in generating Hindi text, ii) introducing the AI-generated news article in Hindi ($AG_{hi}$) dataset, iii) evaluated the effectiveness of five recently proposed AGTD techniques: ConDA, J-Guard, RADAR, RAIDAR and Intrinsic Dimension Estimation for detecting AI-generated Hindi text, iv) proposed Hindi AI Detectability Index ($ADI_{hi}$) which shows a spectrum to understand the evolving landscape of eloquence of AI-generated text in Hindi. We will make the codes and datasets available to encourage further research.

Via

Access Paper or Ask Questions

Evaluating the Role of Data Enrichment Approaches Towards Rare Event Analysis in Manufacturing

Jul 01, 2024

Chathurangi Shyalika, Ruwan Wickramarachchi, Fadi El Kalach, Ramy Harik, Amit Sheth

Abstract:Rare events are occurrences that take place with a significantly lower frequency than more common regular events. In manufacturing, predicting such events is particularly important, as they lead to unplanned downtime, shortening equipment lifespan, and high energy consumption. The occurrence of events is considered frequently-rare if observed in more than 10% of all instances, very-rare if it is 1-5%, moderately-rare if it is 5-10%, and extremely-rare if less than 1%. The rarity of events is inversely correlated with the maturity of a manufacturing industry. Typically, the rarity of events affects the multivariate data generated within a manufacturing process to be highly imbalanced, which leads to bias in predictive models. This paper evaluates the role of data enrichment techniques combined with supervised machine-learning techniques for rare event detection and prediction. To address the data scarcity, we use time series data augmentation and sampling methods to amplify the dataset with more multivariate features and data points while preserving the underlying time series patterns in the combined alterations. Imputation techniques are used in handling null values in datasets. Considering 15 learning models ranging from statistical learning to machine learning to deep learning methods, the best-performing model for the selected datasets is obtained and the efficacy of data enrichment is evaluated. Based on this evaluation, our results find that the enrichment procedure enhances up to 48% of F1 measure in rare failure event detection and prediction of supervised prediction models. We also conduct empirical and ablation experiments on the datasets to derive dataset-specific novel insights. Finally, we investigate the interpretability aspect of models for rare event prediction, considering multiple methods.

* 27 pages, 11 figures, 16 tables

Via

Access Paper or Ask Questions

Towards Learning Foundation Models for Heuristic Functions to Solve Pathfinding Problems

Jun 01, 2024

Vedant Khandelwal, Amit Sheth, Forest Agostinelli

Abstract:Pathfinding problems are found throughout robotics, computational science, and natural sciences. Traditional methods to solve these require training deep neural networks (DNNs) for each new problem domain, consuming substantial time and resources. This study introduces a novel foundation model, leveraging deep reinforcement learning to train heuristic functions that seamlessly adapt to new domains without further fine-tuning. Building upon DeepCubeA, we enhance the model by providing the heuristic function with the domain's state transition information, improving its adaptability. Utilizing a puzzle generator for the 15-puzzle action space variation domains, we demonstrate our model's ability to generalize and solve unseen domains. We achieve a strong correlation between learned and ground truth heuristic values across various domains, as evidenced by robust R-squared and Concordance Correlation Coefficient metrics. These results underscore the potential of foundation models to establish new standards in efficiency and adaptability for AI-driven solutions in complex pathfinding problems.

Via

Access Paper or Ask Questions