Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Vaishnav Ramesh

HiRQA: Hierarchical Ranking and Quality Alignment for Opinion-Unaware Image Quality Assessment

Aug 20, 2025

Vaishnav Ramesh, Haining Wang, Md Jahidul Islam

Abstract:Despite significant progress in no-reference image quality assessment (NR-IQA), dataset biases and reliance on subjective labels continue to hinder their generalization performance. We propose HiRQA, Hierarchical Ranking and Quality Alignment), a self-supervised, opinion-unaware framework that offers a hierarchical, quality-aware embedding through a combination of ranking and contrastive learning. Unlike prior approaches that depend on pristine references or auxiliary modalities at inference time, HiRQA predicts quality scores using only the input image. We introduce a novel higher-order ranking loss that supervises quality predictions through relational ordering across distortion pairs, along with an embedding distance loss that enforces consistency between feature distances and perceptual differences. A training-time contrastive alignment loss, guided by structured textual prompts, further enhances the learned representation. Trained only on synthetic distortions, HiRQA generalizes effectively to authentic degradations, as demonstrated through evaluation on various distortions such as lens flare, haze, motion blur, and low-light conditions. For real-time deployment, we introduce \textbf{HiRQA-S}, a lightweight variant with an inference time of only 3.5 ms per image. Extensive experiments across synthetic and authentic benchmarks validate HiRQA's state-of-the-art (SOTA) performance, strong generalization ability, and scalability.

* 10 pages, 8 figures

Via

Access Paper or Ask Questions

NemeSys: An Online Underwater Explorer with Goal-Driven Adaptive Autonomy

Jul 16, 2025

Adnan Abdullah, Alankrit Gupta, Vaishnav Ramesh, Shivali Patel, Md Jahidul Islam

Abstract:Adaptive mission control and dynamic parameter reconfiguration are essential for autonomous underwater vehicles (AUVs) operating in GPS-denied, communication-limited marine environments. However, most current AUV platforms execute static, pre-programmed missions or rely on tethered connections and high-latency acoustic channels for mid-mission updates, significantly limiting their adaptability and responsiveness. In this paper, we introduce NemeSys, a novel AUV system designed to support real-time mission reconfiguration through compact optical and magnetoelectric (OME) signaling facilitated by floating buoys. We present the full system design, control architecture, and a semantic mission encoding framework that enables interactive exploration and task adaptation via low-bandwidth communication. The proposed system is validated through analytical modeling, controlled experimental evaluations, and open-water trials. Results confirm the feasibility of online mission adaptation and semantic task updates, highlighting NemeSys as an online AUV platform for goal-driven adaptive autonomy in dynamic and uncertain underwater environments.

* 10 pages, V1

Via

Access Paper or Ask Questions

DGIQA: Depth-guided Feature Attention and Refinement for Generalizable Image Quality Assessment

May 29, 2025

Vaishnav Ramesh, Junliang Liu, Haining Wang, Md Jahidul Islam

Abstract:A long-held challenge in no-reference image quality assessment (NR-IQA) learning from human subjective perception is the lack of objective generalization to unseen natural distortions. To address this, we integrate a novel Depth-Guided cross-attention and refinement (Depth-CAR) mechanism, which distills scene depth and spatial features into a structure-aware representation for improved NR-IQA. This brings in the knowledge of object saliency and relative contrast of the scene for more discriminative feature learning. Additionally, we introduce the idea of TCB (Transformer-CNN Bridge) to fuse high-level global contextual dependencies from a transformer backbone with local spatial features captured by a set of hierarchical CNN (convolutional neural network) layers. We implement TCB and Depth-CAR as multimodal attention-based projection functions to select the most informative features, which also improve training time and inference efficiency. Experimental results demonstrate that our proposed DGIQA model achieves state-of-the-art (SOTA) performance on both synthetic and authentic benchmark datasets. More importantly, DGIQA outperforms SOTA models on cross-dataset evaluations as well as in assessing natural image distortions such as low-light effects, hazy conditions, and lens flares.

* 18 pages

Via

Access Paper or Ask Questions

Demonstrating CavePI: Autonomous Exploration of Underwater Caves by Semantic Guidance

Feb 07, 2025

Alankrit Gupta, Adnan Abdullah, Xianyao Li, Vaishnav Ramesh, Ioannis Rekleitis, Md Jahidul Islam

Abstract:Enabling autonomous robots to safely and efficiently navigate, explore, and map underwater caves is of significant importance to water resource management, hydrogeology, archaeology, and marine robotics. In this work, we demonstrate the system design and algorithmic integration of a visual servoing framework for semantically guided autonomous underwater cave exploration. We present the hardware and edge-AI design considerations to deploy this framework on a novel AUV (Autonomous Underwater Vehicle) named CavePI. The guided navigation is driven by a computationally light yet robust deep visual perception module, delivering a rich semantic understanding of the environment. Subsequently, a robust control mechanism enables CavePI to track the semantic guides and navigate within complex cave structures. We evaluate the system through field experiments in natural underwater caves and spring-water sites and further validate its ROS (Robot Operating System)-based digital twin in a simulation environment. Our results highlight how these integrated design choices facilitate reliable navigation under feature-deprived, GPS-denied, and low-visibility conditions.

* V1, 15 pages

Via

Access Paper or Ask Questions