Kishor Datta Gupta

LiteVLA-Edge: Quantized On-Device Multimodal Control for Embedded Robotics

Mar 03, 2026

Justin Williams, Kishor Datta Gupta, Roy George, Mrinmoy Sarkar

Abstract:Vision-Language-Action (VLA) models provide a unified framework for perception, language conditioning, and action generation, but many existing systems remain difficult to deploy in embedded robotic settings because of their computational requirements and inference latency. In this paper, we present LiteVLA-Edge, a deployment-oriented VLA pipeline for fully on-device inference on Jetson Orin-class hardware. Our approach combines supervised image-to-action fine-tuning in FP32 with post-training 4-bit GGUF quantization and GPU-accelerated inference through the llama.cpp runtime. Under our deployment configuration, LiteVLA-Edge achieves a mean end-to-end latency of 150.5 ms (approximately 6.6 Hz) while operating entirely offline within a ROS 2-integrated perception-reasoning-action pipeline. Rather than introducing a new policy objective, our contribution is a practical systems path for executing compact multimodal control models locally on embedded hardware while preserving modular interfaces between perception, reasoning, and actuation. These results establish timing feasibility for reactive language-conditioned control and provide a reproducible baseline for future task-level evaluation of on-device VLAs in robotics.
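The relationship between the reported mean latency and the approximate control rate can be checked with a short calculation. The sketch below is illustrative only; the function name `latency_to_rate_hz` is hypothetical and is not taken from the paper.

```python
def latency_to_rate_hz(latency_ms: float) -> float:
    """Convert a mean per-cycle latency in milliseconds to a rate in Hz."""
    if latency_ms <= 0:
        raise ValueError("latency must be positive")
    # One cycle every latency_ms milliseconds means 1000 / latency_ms cycles per second.
    return 1000.0 / latency_ms


# Figure quoted in the abstract: 150.5 ms mean end-to-end latency.
rate = latency_to_rate_hz(150.5)
print(f"{rate:.1f} Hz")  # prints "6.6 Hz"
```

This assumes each inference cycle runs back to back with no pipelining, so the rate is simply the reciprocal of the mean latency.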


Physics-Based Benchmarking Metrics for Multimodal Synthetic Images

Nov 19, 2025

Kishor Datta Gupta, Marufa Kamal, Md. Mahfuzur Rahman, Fahad Rahman, Mohd Ariful Haque, Sunzida Siddique


Abstract:Current state-of-the-art measures such as BLEU, CIDEr, VQA score, SigLIP-2, and CLIPScore often fail to capture semantic or structural accuracy, especially in domain-specific or context-dependent scenarios. To overcome these limitations, this paper proposes a Physics-Constrained Multimodal Data Evaluation (PCMDE) metric that combines large language models with reasoning, knowledge-based mapping, and vision-language models. The architecture comprises three main stages: (1) extraction of spatial and semantic information as multimodal features through object detection and VLMs; (2) Confidence-Weighted Component Fusion for adaptive component-level validation; and (3) physics-guided reasoning using large language models to enforce structural and relational constraints (e.g., alignment, position, consistency).


Lite VLA: Efficient Vision-Language-Action Control on CPU-Bound Edge Robots

Nov 07, 2025

Justin Williams, Kishor Datta Gupta, Roy George, Mrinmoy Sarkar

Abstract:The deployment of artificial intelligence models at the edge is increasingly critical for autonomous robots operating in GPS-denied environments where local, resource-efficient reasoning is essential. This work demonstrates the feasibility of deploying small Vision-Language Models (VLMs) on mobile robots to achieve real-time scene understanding and reasoning under strict computational constraints. Unlike prior approaches that separate perception from mobility, the proposed framework enables simultaneous movement and reasoning in dynamic environments using only on-board hardware. The system integrates a compact VLM with multimodal perception to perform contextual interpretation directly on embedded hardware, eliminating reliance on cloud connectivity. Experimental validation highlights the balance between computational efficiency, task accuracy, and system responsiveness. Implementation on a mobile robot confirms one of the first successful deployments of small VLMs for concurrent reasoning and mobility at the edge. This work establishes a foundation for scalable, assured autonomy in applications such as service robotics, disaster response, and defense operations.


Continuous Monitoring of Large-Scale Generative AI via Deterministic Knowledge Graph Structures

Sep 04, 2025

Kishor Datta Gupta, Mohd Ariful Haque, Hasmot Ali, Marufa Kamal, Syed Bahauddin Alam, Mohammad Ashiqur Rahman


Abstract:Generative AI (GEN AI) models have revolutionized diverse application domains but present substantial challenges due to reliability concerns, including hallucinations, semantic drift, and inherent biases. These models typically operate as black-boxes, complicating transparent and objective evaluation. Current evaluation methods primarily depend on subjective human assessment, limiting scalability, transparency, and effectiveness. This research proposes a systematic methodology using deterministic and Large Language Model (LLM)-generated Knowledge Graphs (KGs) to continuously monitor and evaluate GEN AI reliability. We construct two parallel KGs: (i) a deterministic KG built using explicit rule-based methods, predefined ontologies, domain-specific dictionaries, and structured entity-relation extraction rules, and (ii) an LLM-generated KG dynamically derived from real-time textual data streams such as live news articles. Utilizing real-time news streams ensures authenticity, mitigates biases from repetitive training, and prevents adaptive LLMs from bypassing predefined benchmarks through feedback memorization. To quantify structural deviations and semantic discrepancies, we employ several established KG metrics, including Instantiated Class Ratio (ICR), Instantiated Property Ratio (IPR), and Class Instantiation (CI). An automated real-time monitoring framework continuously computes deviations between deterministic and LLM-generated KGs. By establishing dynamic anomaly thresholds based on historical structural metric distributions, our method proactively identifies and flags significant deviations, thus promptly detecting semantic anomalies or hallucinations. This structured, metric-driven comparison between deterministic and dynamically generated KGs delivers a robust and scalable evaluation framework.
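The monitoring idea described in this abstract (compute a structural metric on both graphs, take the deviation, and flag it against a threshold derived from history) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the ratio is assumed to be "instantiated items divided by declared items", the threshold is assumed to be mean plus `k` standard deviations of past deviations, and all function names, class names, and sample values are hypothetical.

```python
from statistics import mean, stdev


def instantiated_ratio(used: set, declared: set) -> float:
    """Fraction of declared ontology items (classes or properties) used in a graph.

    Assumed reading of ICR/IPR: |used ∩ declared| / |declared|.
    """
    if not declared:
        return 0.0
    return len(used & declared) / len(declared)


def is_anomalous(current: float, history: list, k: float = 3.0) -> bool:
    """Flag a deviation that exceeds mean + k * stdev of past deviations (assumed rule)."""
    if len(history) < 2:
        return False  # not enough history to estimate spread
    return current > mean(history) + k * stdev(history)


# Hypothetical ontology and the classes each graph actually instantiates.
declared_classes = {"Person", "Organization", "Location", "Event"}
deterministic_kg_classes = {"Person", "Organization", "Location"}
llm_kg_classes = {"Person"}

icr_det = instantiated_ratio(deterministic_kg_classes, declared_classes)  # 0.75
icr_llm = instantiated_ratio(llm_kg_classes, declared_classes)            # 0.25
deviation = abs(icr_det - icr_llm)                                        # 0.5

past_deviations = [0.05, 0.10, 0.00, 0.05, 0.10]  # made-up history for illustration
print(is_anomalous(deviation, past_deviations))   # prints True
```

The same `instantiated_ratio` helper applies to properties by passing property sets instead of class sets; Class Instantiation (CI) is omitted because the abstract does not define it.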


Advanced Tool Learning and Selection System (ATLASS): A Closed-Loop Framework Using LLM

Mar 13, 2025

Mohd Ariful Haque, Justin Williams, Sunzida Siddique, Md. Hujaifa Islam, Hasmot Ali, Kishor Datta Gupta, Roy George


Abstract:The combination of LLM agents with external tools enables models to solve complex tasks beyond their knowledge base. Human-designed tools are inflexible and restricted to solutions within the scope of pre-existing tools created by experts. To address this problem, we propose ATLASS, an advanced tool learning and selection system designed as a closed-loop framework. It enables the LLM to solve problems by dynamically generating external tools on demand. In this framework, agents play a crucial role in orchestrating tool selection, execution, and refinement, ensuring adaptive problem-solving capabilities. The operation of ATLASS follows three phases: The first phase, Understanding Tool Requirements, involves the Agents determining whether tools are required and specifying their functionality; the second phase, Tool Retrieval/Generation, involves the Agents retrieving or generating tools based on their availability; and the third phase, Task Solving, involves combining all the component tools necessary to complete the initial task. The Tool Dataset stores the generated tools, ensuring reusability and minimizing inference cost. Current LLM-based tool generation systems have difficulty creating complex tools that need APIs or external packages. In ATLASS, we solve the problem by automatically setting up the environment, fetching relevant API documentation online, and using a Python interpreter to create a reliable, versatile tool that works in a wider range of situations. OpenAI GPT-4.0 is used as the LLM agent, and safety and ethical concerns are handled through human feedback before executing generated code. By addressing the limitations of predefined toolsets and enhancing adaptability, ATLASS serves as a real-world solution that empowers users with dynamically generated tools for complex problem-solving.
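The three phases named in this abstract can be sketched as a small closed loop. The code below is a toy illustration, not ATLASS itself: keyword rules and fixed lambdas stand in for the LLM agents and for on-demand tool generation, the dictionary `store` stands in for the Tool Dataset, and every name here is hypothetical.

```python
from typing import Callable, Dict, List

Tool = Callable[[str], str]


def understand_requirements(task: str) -> List[str]:
    """Phase 1 stand-in: decide which tools the task needs (keyword rules, no LLM)."""
    needed = []
    if "reverse" in task:
        needed.append("reverse_text")
    if "upper" in task:
        needed.append("uppercase_text")
    return needed


def generate_tool(name: str) -> Tool:
    """Phase 2 stand-in for generation: pick from fixed templates instead of writing code."""
    templates: Dict[str, Tool] = {
        "reverse_text": lambda s: s[::-1],
        "uppercase_text": lambda s: s.upper(),
    }
    return templates[name]


def retrieve_or_generate(name: str, store: Dict[str, Tool]) -> Tool:
    """Phase 2: reuse a stored tool if available, otherwise generate and store it."""
    if name not in store:
        store[name] = generate_tool(name)
    return store[name]


def solve(task: str, payload: str, store: Dict[str, Tool]) -> str:
    """Phase 3: apply the required tools in sequence to complete the task."""
    result = payload
    for name in understand_requirements(task):
        result = retrieve_or_generate(name, store)(result)
    return result


store: Dict[str, Tool] = {}
print(solve("reverse then upper", "atlass", store))  # prints "SSALTA"
print(sorted(store))  # prints "['reverse_text', 'uppercase_text']"
```

A second call with the same `store` reuses the cached tools rather than regenerating them, which mirrors the reusability point made in the abstract.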


A Comprehensive Review on Understanding the Decentralized and Collaborative Approach in Machine Learning

Mar 12, 2025

Sarwar Saif, Md Jahirul Islam, Md. Zihad Bin Jahangir, Parag Biswas, Abdur Rashid, MD Abdullah Al Nasim, Kishor Datta Gupta


Abstract:The arrival of Machine Learning (ML) completely changed how we can unlock valuable information from data. Traditional methods, where everything was stored in one place, had big problems with keeping information private, handling large amounts of data, and avoiding unfair advantages. Machine Learning has become a powerful tool that uses Artificial Intelligence (AI) to overcome these challenges. We started by learning the basics of Machine Learning, including the different types like supervised, unsupervised, and reinforcement learning. We also explored the important steps involved, such as preparing the data, choosing the right model, training it, and then checking its performance. Next, we examined some key challenges in Machine Learning, such as models learning too much from specific examples (overfitting), not learning enough (underfitting), and reflecting biases in the data used. Moving beyond centralized systems, we looked at decentralized Machine Learning and its benefits, like keeping data private, getting answers faster, and using a wider variety of data sources. We then focused on a specific type called federated learning, where models are trained without directly sharing sensitive information. Real-world examples from healthcare and finance were used to show how collaborative Machine Learning can solve important problems while still protecting information security. Finally, we discussed challenges like communication efficiency, dealing with different types of data, and security. We also explored using a Zero Trust framework, which provides an extra layer of protection for collaborative Machine Learning systems. This approach is paving the way for a bright future for this groundbreaking technology.
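As a rough illustration of how federated learning can train a shared model without exchanging raw data, the sketch below implements the aggregation step of federated averaging (FedAvg), a common baseline that the abstract does not name explicitly. Clients send only their locally trained weights and sample counts; the function name and the example numbers are hypothetical.

```python
from typing import List


def federated_average(client_weights: List[List[float]],
                      client_sizes: List[int]) -> List[float]:
    """Average client weight vectors, weighting each client by its sample count."""
    if not client_weights or len(client_weights) != len(client_sizes):
        raise ValueError("need one size per client weight vector")
    total = sum(client_sizes)
    dim = len(client_weights[0])
    return [
        sum(w[i] * n for w, n in zip(client_weights, client_sizes)) / total
        for i in range(dim)
    ]


# Two hypothetical clients: one trained on 1 sample, the other on 3 samples.
global_weights = federated_average([[1.0, 2.0], [3.0, 4.0]], [1, 3])
print(global_weights)  # prints "[2.5, 3.5]"
```

Only the weight vectors and counts cross the network in this scheme; the training examples themselves stay on each client.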


An Extensive and Methodical Review of Smart Grids for Sustainable Energy Management-Addressing Challenges with AI, Renewable Energy Integration and Leading-edge Technologies

Jan 23, 2025

Parag Biswas, Abdur Rashid, abdullah al masum, MD Abdullah Al Nasim, A. S. M Anas Ferdous, Kishor Datta Gupta, Angona Biswas


Abstract:Energy management decreases energy expenditures and consumption while simultaneously increasing energy efficiency, reducing carbon emissions, and enhancing operational performance. Smart grids are a sophisticated energy infrastructure that uses digital communication technologies to improve the sustainability, dependability, and efficiency of electricity generation and distribution. They combine a number of cutting-edge techniques and technologies to improve energy resource management. A large body of research on smart grids for energy management has been completed in the last several years. The present study covers a number of topics, including smart grid benefits and components, technical developments, integration of renewable energy sources, use of artificial intelligence and data analytics, cybersecurity, and privacy. Smart grids for energy management are an innovative field of study that aims to tackle various difficulties and improve the efficiency, dependability, and sustainability of energy systems, including: 1) the intermittency and unpredictability of renewable power sources such as solar and wind; 2) defending smart grid systems against cyber-attacks; and 3) incorporating an increasing number of electric vehicles into the power grid without overwhelming it. The study also examines how AI and data analytics can be used to optimize grid performance, enhance reliability, and improve energy management. The authors explore these significant challenges and ongoing research. Lastly, significant issues in this field are noted, and recommendations for further work are provided.


Trustworthy XAI and Application

Oct 22, 2024

MD Abdullah Al Nasim, Parag Biswas, Abdur Rashid, Angona Biswas, Kishor Datta Gupta


Abstract:One of today's most significant and transformative technologies is the rapidly developing field of artificial intelligence (AI). Defined as a computer system that simulates human cognitive processes, AI is present in many aspects of our daily lives, from the self-driving cars on the road to the intelligence (AI) because some AI systems are so complex and opaque. With millions of parameters and layers, these systems, deep neural networks in particular, make it difficult for humans to comprehend accountability, prejudice, and justice are raised by the opaqueness of its decision-making process. AI has a lot of potential, but it also comes with a lot of difficulties and moral dilemmas. In the context of explainable artificial intelligence (XAI), trust is crucial as it ensures that AI systems behave consistently, fairly, and ethically. In the present article, we explore XAI, reliable XAI, and several practical uses for reliable XAI. Once more, we go over the three main components of XAI (transparency, explainability, and trustworthiness) that we determined are pertinent in this situation. We present an overview of recent scientific studies that employ trustworthy XAI in various application fields. In the end, trustworthiness is crucial for establishing and maintaining trust between humans and AI systems, facilitating the integration of AI systems into various applications and domains for the benefit of society.

* 28 pages, 14 figures


Power Plays: Unleashing Machine Learning Magic in Smart Grids

Oct 20, 2024

Abdur Rashid, Parag Biswas, abdullah al masum, MD Abdullah Al Nasim, Kishor Datta Gupta


Abstract:The integration of machine learning into smart grid systems represents a transformative step in enhancing the efficiency, reliability, and sustainability of modern energy networks. By adding advanced data analytics, these systems can better manage the complexities of renewable energy integration, demand response, and predictive maintenance. Machine learning algorithms analyze vast amounts of data from smart meters, sensors, and other grid components to optimize energy distribution, forecast demand, and detect irregularities that could indicate potential failures. This enables more precise load balancing, reduces operational costs, and enhances the resilience of the grid against disturbances. Furthermore, the use of predictive models helps in anticipating equipment failures, thereby improving the reliability of the energy supply. As smart grids continue to evolve, the role of machine learning in managing decentralized energy sources and enabling real-time decision-making will become increasingly critical. However, the deployment of these technologies also raises challenges related to data privacy, security, and the need for robust infrastructure. Addressing these issues, this research focuses on realizing the full potential of smart grids, ensuring they meet growing energy demands while maintaining sustainability and efficiency through machine learning techniques. Furthermore, this research helps determine how essential smart grids are with the aid of machine learning. Multiple ML algorithms are covered along with their pros and cons, and the future scope of these algorithms is also discussed.

* 16 pages, 1 figure


UAV (Unmanned Aerial Vehicles): Diverse Applications of UAV Datasets in Segmentation, Classification, Detection, and Tracking

Sep 05, 2024

Md. Mahfuzur Rahman, Sunzida Siddique, Marufa Kamal, Rakib Hossain Rifat, Kishor Datta Gupta


Abstract:Unmanned Aerial Vehicles (UAVs) have greatly revolutionized the process of gathering and analyzing data in diverse research domains, providing unmatched adaptability and effectiveness. This paper presents a thorough examination of UAV datasets, emphasizing their wide range of applications and progress. UAV datasets consist of various types of data, such as satellite imagery, images captured by drones, and videos. These datasets can be categorized as either unimodal or multimodal, offering a wide range of detailed and comprehensive information. These datasets play a crucial role in disaster damage assessment, aerial surveillance, object recognition, and tracking. They facilitate the development of sophisticated models for tasks like semantic segmentation, pose estimation, vehicle re-identification, and gesture recognition. By leveraging UAV datasets, researchers can significantly enhance the capabilities of computer vision models, thereby advancing technology and improving our understanding of complex, dynamic environments from an aerial perspective. This review aims to encapsulate the multifaceted utility of UAV datasets, emphasizing their pivotal role in driving innovation and practical applications in multiple domains.
