Yihang Lu

Beluga Whale Detection from Satellite Imagery with Point Labels

May 17, 2025

Yijie Zheng, Jinxuan Yang, Yu Chen, Yaxuan Wang, Yihang Lu, Guoqing Li

Abstract: Very high-resolution (VHR) satellite imagery has emerged as a powerful tool for monitoring marine animals on a large scale. However, existing deep learning-based whale detection methods usually require manually created, high-quality bounding box annotations, which are labor-intensive to produce. Moreover, existing studies often exclude "uncertain whales", individuals that have ambiguous appearances in satellite imagery, limiting the applicability of these models in real-world scenarios. To address these limitations, this study introduces an automated pipeline for detecting beluga whales and harp seals in VHR satellite imagery. The pipeline leverages point annotations and the Segment Anything Model (SAM) to generate precise bounding box annotations, which are used to train YOLOv8 for multiclass detection of certain whales, uncertain whales, and harp seals. Experimental results demonstrated that SAM-generated annotations significantly improved detection performance, achieving higher $\text{F}_\text{1}$-scores than traditional buffer-based annotations. YOLOv8 trained on SAM-labeled boxes achieved an $\text{F}_\text{1}$-score of 72.2% for whales overall and 70.3% for harp seals, with superior performance in dense scenes. The proposed approach not only reduces the manual effort required for annotation but also enhances the detection of uncertain whales, offering a more comprehensive solution for marine animal monitoring. This method holds great potential for extension to other species, habitats, and remote sensing platforms, as well as for estimating whale biometrics, thereby advancing ecological monitoring and conservation efforts. The code for our labeling and detection pipeline is publicly available at http://github.com/voyagerxvoyagerx/beluga-seeker.

* Accepted for oral presentation at IGARSS 2025. Session at https://www.2025.ieeeigarss.org/view_paper.php?PaperNum=2430&SessionID=1426
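The abstract contrasts SAM-generated boxes with "buffer-based" annotations and reports results as F1-scores. As a rough illustration of that baseline only, the sketch below turns a point label into a fixed-size square box, converts it to the normalized `class cx cy w h` layout used by YOLO-style label files, and computes F1 from detection counts. The function names and the 10-pixel buffer are hypothetical and are not taken from the paper or its repository.

```python
from typing import Tuple

Box = Tuple[float, float, float, float]  # (x_min, y_min, x_max, y_max) in pixels


def point_to_buffer_box(x: float, y: float, half_size: float,
                        img_w: int, img_h: int) -> Box:
    """Square box centred on a point label, clipped to the image bounds.

    `half_size` is an assumed, hand-picked buffer, not a value from the paper.
    """
    return (max(0.0, x - half_size), max(0.0, y - half_size),
            min(float(img_w), x + half_size), min(float(img_h), y + half_size))


def to_yolo(box: Box, img_w: int, img_h: int) -> Tuple[float, float, float, float]:
    """Convert a pixel box to normalized (cx, cy, w, h)."""
    x0, y0, x1, y1 = box
    return ((x0 + x1) / 2 / img_w, (y0 + y1) / 2 / img_h,
            (x1 - x0) / img_w, (y1 - y0) / img_h)


def f1_score(tp: int, fp: int, fn: int) -> float:
    """F1 = 2*TP / (2*TP + FP + FN); defined as 0 when there are no counts."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


if __name__ == "__main__":
    box = point_to_buffer_box(x=5, y=50, half_size=10, img_w=100, img_h=100)
    print(box, to_yolo(box, 100, 100))
    print(f1_score(tp=8, fp=2, fn=2))
```

A fixed buffer gives every animal the same box size regardless of its actual extent, which is the limitation the abstract says SAM-derived boxes address.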


TimeCapsule: Solving the Jigsaw Puzzle of Long-Term Time Series Forecasting with Compressed Predictive Representations

Apr 17, 2025

Yihang Lu, Yangyang Xu, Qitao Qing, Xianwei Meng

Abstract: Recent deep learning models for Long-term Time Series Forecasting (LTSF) often emphasize complex, handcrafted designs, yet simpler architectures such as linear models or MLPs have frequently outperformed these intricate solutions. In this paper, we revisit and organize the core ideas behind several key techniques, such as redundancy reduction and multi-scale modeling, which are frequently employed in advanced LTSF models. Our goal is to streamline these ideas for more efficient deep learning utilization. To this end, we introduce TimeCapsule, a model built around the principle of high-dimensional information compression that unifies these techniques in a generalized yet simplified framework. Specifically, we model time series as a 3D tensor, incorporating temporal, variate, and level dimensions, and leverage mode production to capture multi-mode dependencies while achieving dimensionality compression. We propose an internal forecast within the compressed representation domain, supported by the Joint-Embedding Predictive Architecture (JEPA), to monitor the learning of predictive representations. Extensive experiments on challenging benchmarks demonstrate the versatility of our method, showing that TimeCapsule can achieve state-of-the-art performance.
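To make the compression idea concrete, here is a minimal NumPy sketch that shrinks a (time, variate, level) tensor along each axis. It assumes "mode production" refers to the standard tensor mode-n product; the random factor matrices and the chosen shapes are placeholders (in a trained model they would presumably be learned), and this is not the authors' implementation.

```python
import numpy as np


def mode_product(tensor: np.ndarray, matrix: np.ndarray, mode: int) -> np.ndarray:
    """Mode-n product: multiply `matrix` (new_dim x old_dim) into axis `mode`."""
    # tensordot contracts matrix axis 1 with the chosen tensor axis and puts
    # the new axis first; moveaxis restores it to its original position.
    contracted = np.tensordot(matrix, tensor, axes=(1, mode))
    return np.moveaxis(contracted, 0, mode)


def compress(tensor: np.ndarray, factors: list) -> np.ndarray:
    """Apply one factor matrix per mode, in order, to shrink every dimension."""
    for mode, matrix in enumerate(factors):
        tensor = mode_product(tensor, matrix, mode)
    return tensor


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    series = rng.standard_normal((96, 7, 4))      # (time, variate, level)
    factors = [rng.standard_normal((24, 96)),     # time: 96 -> 24
               rng.standard_normal((4, 7)),       # variate: 7 -> 4
               rng.standard_normal((2, 4))]       # level: 4 -> 2
    print(compress(series, factors).shape)
```

Each product mixes information along one mode while reducing its size, so three products compress all dimensions of the tensor.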


MNT-TNN: Spatiotemporal Traffic Data Imputation via Compact Multimode Nonlinear Transform-based Tensor Nuclear Norm

Mar 29, 2025

Yihang Lu, Mahwish Yousaf, Xianwei Meng, Enhong Chen

Abstract: Imputation of random or non-random missing data is a long-standing research topic and a crucial application for Intelligent Transportation Systems (ITS). However, with the advent of modern communication technologies such as Global Navigation Satellite Systems (GNSS), traffic data collection has outpaced traditional methods, introducing new challenges in random missing value imputation and increasing demands for spatiotemporal dependency modeling. To address these issues, we propose a novel spatiotemporal traffic imputation method, Multimode Nonlinear Transformed Tensor Nuclear Norm (MNT-TNN), grounded in the Transform-based Tensor Nuclear Norm (TTNN) optimization framework, which exhibits efficient mathematical representations and theoretical guarantees for the recovery of random missing values. Specifically, we strictly extend the single-mode transform in TTNN to a multimode transform with nonlinear activation, effectively capturing the intrinsic multimode spatiotemporal correlations and low-rankness of the traffic tensor, represented as location $\times$ location $\times$ time. To solve the nonconvex optimization problem, we design a proximal alternating minimization (PAM) algorithm with theoretical convergence guarantees. We also propose an Augmented Transform-based Tensor Nuclear Norm Families (ATTNNs) framework to enhance the imputation results of TTNN techniques, especially at very high missing rates. Extensive experiments on real datasets demonstrate that our proposed MNT-TNN and ATTNNs can outperform the compared state-of-the-art imputation methods, completing the benchmark of random missing traffic value imputation.
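For readers unfamiliar with the TTNN starting point, the sketch below computes one common form of a transform-based tensor nuclear norm: apply a linear transform along the third (time) mode, then sum the matrix nuclear norms of the resulting frontal slices. It covers only the linear, single-mode case; the multimode transform, the nonlinear activation, and the PAM solver described in the abstract are not reproduced. The function name and the toy 2-point orthogonal transform are assumptions for illustration.

```python
import numpy as np


def transformed_tnn(tensor: np.ndarray, transform: np.ndarray) -> float:
    """Sum of nuclear norms of frontal slices after a mode-3 linear transform.

    tensor:    array of shape (n1, n2, n3), e.g. location x location x time
    transform: array of shape (k, n3) applied along the third mode
    """
    transformed = np.einsum('kt,ijt->ijk', transform, tensor)
    return float(sum(np.linalg.norm(transformed[:, :, k], ord='nuc')
                     for k in range(transformed.shape[2])))


if __name__ == "__main__":
    # Two identical frontal slices: the data are perfectly correlated in time.
    tensor = np.stack([np.eye(2), np.eye(2)], axis=2)

    identity = np.eye(2)
    # Orthogonal 2-point transform (average and difference of the two slices).
    haar = np.array([[1.0, 1.0], [1.0, -1.0]]) / np.sqrt(2.0)

    print(transformed_tnn(tensor, identity))  # no transform along time
    print(transformed_tnn(tensor, haar))      # correlation concentrated in one slice
```

With the orthogonal transform the temporally redundant tensor gets a smaller norm than with no transform, which is the sense in which a well-chosen transform exposes low-rankness along the time mode.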
