Abstract: Cell-free networks leverage distributed access points (APs) to achieve macro-diversity, yet their performance is often constrained by large disparities in channel quality arising from user geometry and blockages. To address this, rotatable antennas (RAs) add a lightweight hardware degree of freedom by steering the antenna boresight toward dominant propagation directions to strengthen unfavorable links, enabling the network to better exploit macro-diversity for higher and more uniform performance. This paper investigates an RA-enabled cell-free downlink network and formulates a max-min rate problem that jointly optimizes transmit beamforming and antenna orientations. To tackle this challenging problem, we develop an alternating-optimization-based algorithm that iteratively updates the beamformers via a second-order cone program (SOCP) and optimizes the antenna orientations using successive convex approximation. To reduce complexity, we further propose an efficient two-stage scheme that first designs the orientations by maximizing a proportional-fair log-utility with manifold-aware Frank-Wolfe updates, and then computes the beamformers with an SOCP-based design. Simulation results demonstrate that the proposed orientation-aware designs achieve a substantially higher worst-user rate than conventional beamforming-only benchmarks. Furthermore, larger antenna directivity enhances fairness under proper orientation but can degrade worst-user performance otherwise.
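The beamforming step in this style of max-min design is typically solved by bisecting over a common SINR target, with each feasibility check cast as an SOCP. Below is a minimal cvxpy sketch of that step under stated assumptions: the antenna orientations are held fixed so the RA directivity is folded into an effective channel matrix H, a single total-power constraint stands in for per-AP limits, and the dimensions, noise power, and bisection range are placeholders rather than the paper's exact formulation.

```python
import numpy as np
import cvxpy as cp

# Placeholder setup: M_ant stacked AP antennas, K users, random effective channels.
M_ant, K = 8, 4
rng = np.random.default_rng(0)
H = (rng.standard_normal((K, M_ant)) + 1j * rng.standard_normal((K, M_ant))) / np.sqrt(2)
P_max, sigma2 = 1.0, 1e-2  # total power budget and noise power (assumed values)

def feasible(gamma):
    """SOCP feasibility check: can every user reach SINR >= gamma?"""
    W = cp.Variable((M_ant, K), complex=True)  # column k beamforms to user k
    cons = [cp.sum_squares(W) <= P_max]        # total-power constraint
    for k in range(K):
        interf = cp.hstack([H[k] @ W[:, j] for j in range(K) if j != k])
        # Rotating the phase so h_k^H w_k is real turns the SINR requirement
        # into a second-order cone constraint.
        cons += [cp.real(H[k] @ W[:, k]) >= np.sqrt(gamma)
                 * cp.norm(cp.hstack([interf, np.sqrt(sigma2) * np.ones(1)])),
                 cp.imag(H[k] @ W[:, k]) == 0]
    prob = cp.Problem(cp.Minimize(0), cons)
    prob.solve(solver=cp.SCS)
    return prob.status in ("optimal", "optimal_inaccurate"), W.value

lo, hi, W_best = 0.0, 10.0, None  # bisection over the common SINR target
for _ in range(20):
    mid = 0.5 * (lo + hi)
    ok, W = feasible(mid)
    if ok:
        lo, W_best = mid, W
    else:
        hi = mid
```

In the paper's alternating scheme, the orientation update (successive convex approximation, or the Frank-Wolfe log-utility stage of the two-stage design) would refresh H between outer iterations of this beamforming step.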
Abstract: Remote sensing (RS) large vision-language models (LVLMs) have shown strong promise across visual grounding (VG) tasks. However, existing RS VG datasets predominantly rely on explicit referring expressions, such as relative position, relative size, and color cues, thereby constraining performance on implicit VG tasks that require scenario-specific domain knowledge. This article introduces DVGBench, a high-quality implicit VG benchmark for drones covering six major application scenarios: traffic, disaster, security, sport, social activity, and productive activity. Each object is annotated with both explicit and implicit queries. Building on this dataset, we design DroneVG-R1, an LVLM that integrates a novel Implicit-to-Explicit Chain-of-Thought (I2E-CoT) within a reinforcement learning paradigm. This enables the model to exploit scene-specific expertise, converting implicit references into explicit ones and thereby reducing grounding difficulty. Finally, an evaluation of mainstream models on both explicit and implicit VG tasks reveals substantial limitations in their reasoning capabilities. These findings provide actionable insights for advancing the reasoning capacity of LVLMs for drone-based agents. The code and datasets will be released at https://github.com/zytx121/DVGBench.
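To make the reinforcement-learning side concrete, grounding models in this setting are commonly trained with a verifiable reward combining format compliance and box IoU. The sketch below is one such toy reward; the <think>/<answer> tags, the parsing, and the 0.2/0.8 weighting are illustrative assumptions, not DroneVG-R1's published reward design.

```python
import re

def grounding_reward(response: str, gt_box: tuple) -> float:
    """Toy reward for an RL-trained grounding model: format bonus plus IoU.

    Assumes the model reasons inside <think>...</think> (the implicit-to-
    explicit conversion step) and emits a box as <answer>x1,y1,x2,y2</answer>.
    Tag names and the reward mix are hypothetical.
    """
    fmt_ok = bool(re.search(r"<think>.*</think>.*<answer>.*</answer>", response, re.S))
    m = re.search(r"<answer>\s*([\d.]+),\s*([\d.]+),\s*([\d.]+),\s*([\d.]+)\s*</answer>",
                  response)
    if m is None:
        return 0.0
    x1, y1, x2, y2 = map(float, m.groups())
    gx1, gy1, gx2, gy2 = gt_box
    # Intersection-over-union between predicted and ground-truth boxes.
    ix1, iy1 = max(x1, gx1), max(y1, gy1)
    ix2, iy2 = min(x2, gx2), min(y2, gy2)
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    union = (x2 - x1) * (y2 - y1) + (gx2 - gx1) * (gy2 - gy1) - inter
    iou = inter / union if union > 0 else 0.0
    return 0.2 * float(fmt_ok) + 0.8 * iou
```

A reward of this shape lets policy-gradient training credit both a well-formed reasoning trace and an accurate final box, without requiring token-level supervision of the chain of thought.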
Abstract: The recently developed Neural Radiance Fields (NeRF) and 3D Gaussian Splatting (3DGS) have shown encouraging and impressive results for visual SLAM. However, most representative methods require RGBD sensors and are limited to indoor environments, and the robustness of reconstruction in large-scale outdoor scenarios remains unexplored. This paper introduces a large-scale 3DGS-based visual SLAM system with stereo cameras, termed LSG-SLAM. LSG-SLAM employs a multi-modality strategy to estimate prior poses under large view changes. In tracking, we introduce feature-alignment warping constraints to alleviate the adverse effects of appearance similarity on rendering losses. To scale to large scenes, we introduce continuous Gaussian Splatting submaps that handle unbounded environments with limited memory. Loops are detected between GS submaps via place recognition, and the relative pose between looped keyframes is optimized using rendering and feature-warping losses. After global optimization of the camera poses and Gaussian points, a structure refinement module enhances reconstruction quality. In extensive evaluations on the EuRoC and KITTI datasets, LSG-SLAM achieves superior performance over existing NeRF-based, 3DGS-based, and even traditional approaches. Project page: https://lsg-slam.github.io.
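To make the tracking objective concrete, the sketch below combines a photometric rendering loss with a feature-alignment warping term, which is the role such constraints play when repeated, similar-looking textures make a raw rendering loss ambiguous. The tensor shapes, grid_sample-based warp, and loss weight are illustrative assumptions, not LSG-SLAM's exact formulation.

```python
import torch
import torch.nn.functional as F

def tracking_loss(rendered_rgb, target_rgb, feat_ref, feat_cur, uv_warped,
                  w_feat=0.5):
    """Toy tracking objective: photometric rendering loss plus a
    feature-alignment warping term. All tensors and the weighting are
    placeholders.

    rendered_rgb, target_rgb: (3, H, W) rendered and observed images
    feat_ref, feat_cur:       (C, H, W) feature maps from the two frames
    uv_warped:                (H, W, 2) normalized grid warping reference
                              features into the current frame using the
                              current pose/depth estimate
    """
    # Photometric term: penalize rendering error under the current pose.
    l_render = (rendered_rgb - target_rgb).abs().mean()
    # Warp reference features into the current frame and compare; unlike raw
    # colors, learned features are less confused by visually similar regions.
    feat_ref_warped = F.grid_sample(feat_ref[None], uv_warped[None],
                                    align_corners=True)[0]
    l_feat = (feat_ref_warped - feat_cur).abs().mean()
    return l_render + w_feat * l_feat
```

In such a design, uv_warped is differentiable with respect to the camera pose, so both terms backpropagate pose gradients during tracking.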