Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Sumit Kumar

The Orchestration of Multi-Agent Systems: Architectures, Protocols, and Enterprise Adoption

Jan 20, 2026

Apoorva Adimulam, Rajesh Gupta, Sumit Kumar

Abstract:Orchestrated multi-agent systems represent the next stage in the evolution of artificial intelligence, where autonomous agents collaborate through structured coordination and communication to achieve complex, shared objectives. This paper consolidates and formalizes the technical composition of such systems, presenting a unified architectural framework that integrates planning, policy enforcement, state management, and quality operations into a coherent orchestration layer. Another primary contribution of this work is the in-depth technical delineation of two complementary communication protocols - the Model Context Protocol, which standardizes how agents access external tools and contextual data, and the Agent2Agent protocol, which governs peer coordination, negotiation, and delegation. Together, these protocols establish an interoperable communication substrate that enables scalable, auditable, and policy-compliant reasoning across distributed agent collectives. Beyond protocol design, the paper details how orchestration logic, governance frameworks, and observability mechanisms collectively sustain system coherence, transparency, and accountability. By synthesizing these elements into a cohesive technical blueprint, this paper provides comprehensive treatments of orchestrated multi-agent systems - bridging conceptual architectures with implementation-ready design principles for enterprise-scale AI ecosystems.

Via

Access Paper or Ask Questions

LABELING COPILOT: A Deep Research Agent for Automated Data Curation in Computer Vision

Sep 26, 2025

Debargha Ganguly, Sumit Kumar, Ishwar Balappanawar, Weicong Chen, Shashank Kambhatla, Srinivasan Iyengar, Shivkumar Kalyanaraman, Ponnurangam Kumaraguru, Vipin Chaudhary

Abstract:Curating high-quality, domain-specific datasets is a major bottleneck for deploying robust vision systems, requiring complex trade-offs between data quality, diversity, and cost when researching vast, unlabeled data lakes. We introduce Labeling Copilot, the first data curation deep research agent for computer vision. A central orchestrator agent, powered by a large multimodal language model, uses multi-step reasoning to execute specialized tools across three core capabilities: (1) Calibrated Discovery sources relevant, in-distribution data from large repositories; (2) Controllable Synthesis generates novel data for rare scenarios with robust filtering; and (3) Consensus Annotation produces accurate labels by orchestrating multiple foundation models via a novel consensus mechanism incorporating non-maximum suppression and voting. Our large-scale validation proves the effectiveness of Labeling Copilot's components. The Consensus Annotation module excels at object discovery: on the dense COCO dataset, it averages 14.2 candidate proposals per image-nearly double the 7.4 ground-truth objects-achieving a final annotation mAP of 37.1%. On the web-scale Open Images dataset, it navigated extreme class imbalance to discover 903 new bounding box categories, expanding its capability to over 1500 total. Concurrently, our Calibrated Discovery tool, tested at a 10-million sample scale, features an active learning strategy that is up to 40x more computationally efficient than alternatives with equivalent sample efficiency. These experiments validate that an agentic workflow with optimized, scalable tools provides a robust foundation for curating industrial-scale datasets.

Via

Access Paper or Ask Questions

Optimizing Recall or Relevance? A Multi-Task Multi-Head Approach for Item-to-Item Retrieval in Recommendation

Jun 06, 2025

Jiang Zhang, Sumit Kumar, Wei Chang, Yubo Wang, Feng Zhang, Weize Mao, Hanchao Yu, Aashu Singh, Min Li, Qifan Wang

Abstract:The task of item-to-item (I2I) retrieval is to identify a set of relevant and highly engaging items based on a given trigger item. It is a crucial component in modern recommendation systems, where users' previously engaged items serve as trigger items to retrieve relevant content for future engagement. However, existing I2I retrieval models in industry are primarily built on co-engagement data and optimized using the recall measure, which overly emphasizes co-engagement patterns while failing to capture semantic relevance. This often leads to overfitting short-term co-engagement trends at the expense of long-term benefits such as discovering novel interests and promoting content diversity. To address this challenge, we propose MTMH, a Multi-Task and Multi-Head I2I retrieval model that achieves both high recall and semantic relevance. Our model consists of two key components: 1) a multi-task learning loss for formally optimizing the trade-off between recall and semantic relevance, and 2) a multi-head I2I retrieval architecture for retrieving both highly co-engaged and semantically relevant items. We evaluate MTMH using proprietary data from a commercial platform serving billions of users and demonstrate that it can improve recall by up to 14.4% and semantic relevance by up to 56.6% compared with prior state-of-the-art models. We also conduct live experiments to verify that MTMH can enhance both short-term consumption metrics and long-term user-experience-related metrics. Our work provides a principled approach for jointly optimizing I2I recall and semantic relevance, which has significant implications for improving the overall performance of recommendation systems.

* KDD 2025

Via

Access Paper or Ask Questions

SEMMA: A Semantic Aware Knowledge Graph Foundation Model

May 26, 2025

Arvindh Arun, Sumit Kumar, Mojtaba Nayyeri, Bo Xiong, Ponnurangam Kumaraguru, Antonio Vergari, Steffen Staab

Abstract:Knowledge Graph Foundation Models (KGFMs) have shown promise in enabling zero-shot reasoning over unseen graphs by learning transferable patterns. However, most existing KGFMs rely solely on graph structure, overlooking the rich semantic signals encoded in textual attributes. We introduce SEMMA, a dual-module KGFM that systematically integrates transferable textual semantics alongside structure. SEMMA leverages Large Language Models (LLMs) to enrich relation identifiers, generating semantic embeddings that subsequently form a textual relation graph, which is fused with the structural component. Across 54 diverse KGs, SEMMA outperforms purely structural baselines like ULTRA in fully inductive link prediction. Crucially, we show that in more challenging generalization settings, where the test-time relation vocabulary is entirely unseen, structural methods collapse while SEMMA is 2x more effective. Our findings demonstrate that textual semantics are critical for generalization in settings where structure alone fails, highlighting the need for foundation models that unify structural and linguistic signals in knowledge reasoning.

Via

Access Paper or Ask Questions

Recognizing Ornaments in Vocal Indian Art Music with Active Annotation

May 07, 2025

Sumit Kumar, Parampreet Singh, Vipul Arora

Abstract:Ornamentations, embellishments, or microtonal inflections are essential to melodic expression across many musical traditions, adding depth, nuance, and emotional impact to performances. Recognizing ornamentations in singing voices is key to MIR, with potential applications in music pedagogy, singer identification, genre classification, and controlled singing voice generation. However, the lack of annotated datasets and specialized modeling approaches remains a major obstacle for progress in this research area. In this work, we introduce R\=aga Ornamentation Detection (ROD), a novel dataset comprising Indian classical music recordings curated by expert musicians. The dataset is annotated using a custom Human-in-the-Loop tool for six vocal ornaments marked as event-based labels. Using this dataset, we develop an ornamentation detection model based on deep time-series analysis, preserving ornament boundaries during the chunking of long audio recordings. We conduct experiments using different train-test configurations within the ROD dataset and also evaluate our approach on a separate, manually annotated dataset of Indian classical concert recordings. Our experimental results support the superior performance of our proposed approach over the baseline CRNN.

Via

Access Paper or Ask Questions

Artificial Intelligence implementation of onboard flexible payload and adaptive beamforming using commercial off-the-shelf devices

May 03, 2025

Luis Manuel Garcés-Socarrás, Amirhosein Nik, Flor Ortiz, Juan A. Vásquez-Peralvo, Jorge Luis González Rios, Mouhamad Chehailty, Marcele Kuhfuss, Eva Lagunas, Jan Thoemel, Sumit Kumar(+6 more)

Figure 1 for Artificial Intelligence implementation of onboard flexible payload and adaptive beamforming using commercial off-the-shelf devices

Figure 2 for Artificial Intelligence implementation of onboard flexible payload and adaptive beamforming using commercial off-the-shelf devices

Figure 3 for Artificial Intelligence implementation of onboard flexible payload and adaptive beamforming using commercial off-the-shelf devices

Figure 4 for Artificial Intelligence implementation of onboard flexible payload and adaptive beamforming using commercial off-the-shelf devices

Abstract:Very High Throughput satellites typically provide multibeam coverage, however, a common problem is that there can be a mismatch between the capacity of each beam and the traffic demand: some beams may fall short, while others exceed the requirements. This challenge can be addressed by integrating machine learning with flexible payload and adaptive beamforming techniques. These methods allow for dynamic allocation of payload resources based on real-time capacity needs. As artificial intelligence advances, its ability to automate tasks, enhance efficiency, and increase precision is proving invaluable, especially in satellite communications, where traditional optimization methods are often computationally intensive. AI-driven solutions offer faster, more effective ways to handle complex satellite communication tasks. Artificial intelligence in space has more constraints than other fields, considering the radiation effects, the spaceship power capabilities, mass, and area. Current onboard processing uses legacy space-certified general-purpose processors, costly application-specific integrated circuits, or field-programmable gate arrays subjected to a highly stringent certification process. The increased performance demands of onboard processors to satisfy the accelerated data rates and autonomy requirements have rendered current space-graded processors obsolete. This work is focused on transforming the satellite payload using artificial intelligence and machine learning methodologies over available commercial off-the-shelf chips for onboard processing. The objectives include validating artificial intelligence-driven scenarios, focusing on flexible payload and adaptive beamforming as machine learning models onboard. Results show that machine learning models significantly improve signal quality, spectral efficiency, and throughput compared to conventional payload.

* 5th ESA Workshop on Advanced Flexible Telecom Payloads, 10 pages, 6 figures

Via

Access Paper or Ask Questions

A Cognac shot to forget bad memories: Corrective Unlearning in GNNs

Dec 01, 2024

Varshita Kolipaka, Akshit Sinha, Debangan Mishra, Sumit Kumar, Arvindh Arun, Shashwat Goel, Ponnurangam Kumaraguru

Figure 1 for A Cognac shot to forget bad memories: Corrective Unlearning in GNNs

Figure 2 for A Cognac shot to forget bad memories: Corrective Unlearning in GNNs

Figure 3 for A Cognac shot to forget bad memories: Corrective Unlearning in GNNs

Figure 4 for A Cognac shot to forget bad memories: Corrective Unlearning in GNNs

Abstract:Graph Neural Networks (GNNs) are increasingly being used for a variety of ML applications on graph data. As graph data does not follow the independently and identically distributed (i.i.d) assumption, adversarial manipulations or incorrect data can propagate to other data points through message passing, deteriorating the model's performance. To allow model developers to remove the adverse effects of manipulated entities from a trained GNN, we study the recently formulated problem of Corrective Unlearning. We find that current graph unlearning methods fail to unlearn the effect of manipulations even when the whole manipulated set is known. We introduce a new graph unlearning method, Cognac, which can unlearn the effect of the manipulation set even when only 5% of it is identified. It recovers most of the performance of a strong oracle with fully corrected training data, even beating retraining from scratch without the deletion set while being 8x more efficient. We hope our work guides GNN developers in fixing harmful effects due to issues in real-world data post-training.

Via

Access Paper or Ask Questions

ECHO: Environmental Sound Classification with Hierarchical Ontology-guided Semi-Supervised Learning

Sep 21, 2024

Pranav Gupta, Raunak Sharma, Rashmi Kumari, Sri Krishna Aditya, Shwetank Choudhary, Sumit Kumar, Kanchana M, Thilagavathy R

Abstract:Environment Sound Classification has been a well-studied research problem in the field of signal processing and up till now more focus has been laid on fully supervised approaches. Over the last few years, focus has moved towards semi-supervised methods which concentrate on the utilization of unlabeled data, and self-supervised methods which learn the intermediate representation through pretext task or contrastive learning. However, both approaches require a vast amount of unlabelled data to improve performance. In this work, we propose a novel framework called Environmental Sound Classification with Hierarchical Ontology-guided semi-supervised Learning (ECHO) that utilizes label ontology-based hierarchy to learn semantic representation by defining a novel pretext task. In the pretext task, the model tries to predict coarse labels defined by the Large Language Model (LLM) based on ground truth label ontology. The trained model is further fine-tuned in a supervised way to predict the actual task. Our proposed novel semi-supervised framework achieves an accuracy improvement in the range of 1\% to 8\% over baseline systems across three datasets namely UrbanSound8K, ESC-10, and ESC-50.

* IEEE CONECCT 2024, Signal Processing and Pattern Recognition, Environmental Sound Classification, ESC

Via

Access Paper or Ask Questions

Demonstration of Safe Electromagnetic Radiation Emitted by 5G Active Antenna Systems

Jun 12, 2024

Sumit Kumar, Chandan Kumar Sheemar, Abdelrahman Astro, Jorge Querol, Symeon Chatzinotas

Figure 1 for Demonstration of Safe Electromagnetic Radiation Emitted by 5G Active Antenna Systems

Figure 2 for Demonstration of Safe Electromagnetic Radiation Emitted by 5G Active Antenna Systems

Figure 3 for Demonstration of Safe Electromagnetic Radiation Emitted by 5G Active Antenna Systems

Figure 4 for Demonstration of Safe Electromagnetic Radiation Emitted by 5G Active Antenna Systems

Abstract:The careful planning and safe deployment of 5G technologies will bring enormous benefits to society and the economy. Higher frequency, beamforming, and small-cells are key technologies that will provide unmatched throughput and seamless connectivity to 5G users. Superficial knowledge of these technologies has raised concerns among the general public about the harmful effects of radiation. Several standardization bodies are active to put limits on the emissions which are based on a defined set of radiation measurement methodologies. However, due to the peculiarity of 5G such as dynamicity of the beams, network densification, Time Division Duplexing mode of operation, etc, using existing EMF measurement methods may provide inaccurate results. In this context, we discuss our experimental studies aimed towards the measurement of radiation caused by beam-based transmissions from a 5G base station equipped with an Active Antenna System(AAS). We elaborate on the shortcomings of current measurement methodologies and address several open questions. Next, we demonstrate that using user-specific downlink beamforming, not only better performance is achieved compared to non-beamformed downlink, but also the radiation in the vicinity of the intended user is significantly decreased. Further, we show that under weak reception conditions, an uplink transmission can cause significantly high radiation in the vicinity of the user equipment. We believe that our work will help in clearing several misleading concepts about the 5G EMF radiation effects. We conclude the work by providing guidelines to improve the methodology of EMF measurement by considering the spatiotemporal dynamicity of the 5G transmission.

Via

Access Paper or Ask Questions

Artificial Intelligence Satellite Telecommunication Testbed using Commercial Off-The-Shelf Chipsets

May 28, 2024

Luis M. Garces, Amirhossein Nik, Flor Ortiz, Juan A. Vásquez-Peralvo, Jorge L. Gonzalez, Mouhamad Chehailty, Marcele Kuhfuss, Eva Lagunas, Jan Thoemel, Sumit Kumar(+6 more)

Figure 1 for Artificial Intelligence Satellite Telecommunication Testbed using Commercial Off-The-Shelf Chipsets

Figure 2 for Artificial Intelligence Satellite Telecommunication Testbed using Commercial Off-The-Shelf Chipsets

Figure 3 for Artificial Intelligence Satellite Telecommunication Testbed using Commercial Off-The-Shelf Chipsets

Figure 4 for Artificial Intelligence Satellite Telecommunication Testbed using Commercial Off-The-Shelf Chipsets

Abstract:The Artificial Intelligence Satellite Telecommunications Testbed (AISTT), part of the ESA project SPAICE, is focused on the transformation of the satellite payload by using artificial intelligence (AI) and machine learning (ML) methodologies over available commercial off-the-shelf (COTS) AI chips for on-board processing. The objectives include validating artificial intelligence-driven SATCOM scenarios such as interference detection, spectrum sharing, radio resource management, decoding, and beamforming. The study highlights hardware selection and payload architecture. Preliminary results show that ML models significantly improve signal quality, spectral efficiency, and throughput compared to conventional payload. Moreover, the testbed aims to evaluate the performance and application of AI-capable COTS chips in onboard SATCOM contexts.

* Submitted to SPAICE Conference 2024: AI in and for Space, 5 pages, 3 figures

Via

Access Paper or Ask Questions