Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Qiong Wu

Resource Allocation for Transmissive RIS Transceiver Enabled SWIPT Systems

Nov 15, 2025

Yuan Guo, Wen Chen, Xudong Bai, Chong He, Qiong Wu

Figure 1 for Resource Allocation for Transmissive RIS Transceiver Enabled SWIPT Systems

Figure 2 for Resource Allocation for Transmissive RIS Transceiver Enabled SWIPT Systems

Figure 3 for Resource Allocation for Transmissive RIS Transceiver Enabled SWIPT Systems

Figure 4 for Resource Allocation for Transmissive RIS Transceiver Enabled SWIPT Systems

Abstract:A novel transmissive reconfigurable intelligent surface (TRIS) transceiver-empowered simultaneous wireless information and power transfer (SWIPT) framework is proposed. The sum-rate of the information decoding (ID) users is maximized by optimizing the TRIS transceiver's beamforming, subject to the energy harvesting (EH) users' quality-of-harvest and the per-antenna power constraints. To solve this non-convex problem, we develop an efficient optimization algorithm. First, the original problem is reformulated as a semi-definite programming (SDP) problem. The resulting SDP problem is then addressed using successive convex approximation (SCA) combined with a penalty-based method. Numerical results demonstrate the effectiveness of the algorithm.

Via

Access Paper or Ask Questions

GLEAM: Learning to Match and Explain in Cross-View Geo-Localization

Sep 09, 2025

Xudong Lu, Zhi Zheng, Yi Wan, Yongxiang Yao, Annan Wang, Renrui Zhang, Panwang Xia, Qiong Wu, Qingyun Li, Weifeng Lin(+3 more)

Abstract:Cross-View Geo-Localization (CVGL) focuses on identifying correspondences between images captured from distinct perspectives of the same geographical location. However, existing CVGL approaches are typically restricted to a single view or modality, and their direct visual matching strategy lacks interpretability: they merely predict whether two images correspond, without explaining the rationale behind the match. In this paper, we present GLEAM-C, a foundational CVGL model that unifies multiple views and modalities-including UAV imagery, street maps, panoramic views, and ground photographs-by aligning them exclusively with satellite imagery. Our framework enhances training efficiency through optimized implementation while achieving accuracy comparable to prior modality-specific CVGL models through a two-phase training strategy. Moreover, to address the lack of interpretability in traditional CVGL methods, we leverage the reasoning capabilities of multimodal large language models (MLLMs) to propose a new task, GLEAM-X, which combines cross-view correspondence prediction with explainable reasoning. To support this task, we construct a bilingual benchmark using GPT-4o and Doubao-1.5-Thinking-Vision-Pro to generate training and testing data. The test set is further refined through detailed human revision, enabling systematic evaluation of explainable cross-view reasoning and advancing transparency and scalability in geo-localization. Together, GLEAM-C and GLEAM-X form a comprehensive CVGL pipeline that integrates multi-modal, multi-view alignment with interpretable correspondence analysis, unifying accurate cross-view matching with explainable reasoning and advancing Geo-Localization by enabling models to better Explain And Match. Code and datasets used in this work will be made publicly accessible at https://github.com/Lucky-Lance/GLEAM.

* 18 pages

Via

Access Paper or Ask Questions

Quality-of-Service Aware LLM Routing for Edge Computing with Multiple Experts

Aug 01, 2025

Jin Yang, Qiong Wu, Zhiying Feng, Zhi Zhou, Deke Guo, Xu Chen

Figure 1 for Quality-of-Service Aware LLM Routing for Edge Computing with Multiple Experts

Figure 2 for Quality-of-Service Aware LLM Routing for Edge Computing with Multiple Experts

Figure 3 for Quality-of-Service Aware LLM Routing for Edge Computing with Multiple Experts

Figure 4 for Quality-of-Service Aware LLM Routing for Edge Computing with Multiple Experts

Abstract:Large Language Models (LLMs) have demonstrated remarkable capabilities, leading to a significant increase in user demand for LLM services. However, cloud-based LLM services often suffer from high latency, unstable responsiveness, and privacy concerns. Therefore, multiple LLMs are usually deployed at the network edge to boost real-time responsiveness and protect data privacy, particularly for many emerging smart mobile and IoT applications. Given the varying response quality and latency of LLM services, a critical issue is how to route user requests from mobile and IoT devices to an appropriate LLM service (i.e., edge LLM expert) to ensure acceptable quality-of-service (QoS). Existing routing algorithms fail to simultaneously address the heterogeneity of LLM services, the interference among requests, and the dynamic workloads necessary for maintaining long-term stable QoS. To meet these challenges, in this paper we propose a novel deep reinforcement learning (DRL)-based QoS-aware LLM routing framework for sustained high-quality LLM services. Due to the dynamic nature of the global state, we propose a dynamic state abstraction technique to compactly represent global state features with a heterogeneous graph attention network (HAN). Additionally, we introduce an action impact estimator and a tailored reward function to guide the DRL agent in maximizing QoS and preventing latency violations. Extensive experiments on both Poisson and real-world workloads demonstrate that our proposed algorithm significantly improves average QoS and computing resource efficiency compared to existing baselines.

* Accepted by IEEE Transactions on Mobile Computing

Via

Access Paper or Ask Questions

Toward Safety-First Human-Like Decision Making for Autonomous Vehicles in Time-Varying Traffic Flow

Jun 17, 2025

Xiao Wang, Junru Yu, Jun Huang, Qiong Wu, Ljubo Vacic, Changyin Sun

Figure 1 for Toward Safety-First Human-Like Decision Making for Autonomous Vehicles in Time-Varying Traffic Flow

Figure 2 for Toward Safety-First Human-Like Decision Making for Autonomous Vehicles in Time-Varying Traffic Flow

Figure 3 for Toward Safety-First Human-Like Decision Making for Autonomous Vehicles in Time-Varying Traffic Flow

Figure 4 for Toward Safety-First Human-Like Decision Making for Autonomous Vehicles in Time-Varying Traffic Flow

Abstract:Despite the recent advancements in artificial intelligence technologies have shown great potential in improving transport efficiency and safety, autonomous vehicles(AVs) still face great challenge of driving in time-varying traffic flow, especially in dense and interactive situations. Meanwhile, human have free wills and usually do not make the same decisions even situate in the exactly same scenarios, leading to the data-driven methods suffer from poor migratability and high search cost problems, decreasing the efficiency and effectiveness of the behavior policy. In this research, we propose a safety-first human-like decision-making framework(SF-HLDM) for AVs to drive safely, comfortably, and social compatiblely in effiency. The framework integrates a hierarchical progressive framework, which combines a spatial-temporal attention (S-TA) mechanism for other road users' intention inference, a social compliance estimation module for behavior regulation, and a Deep Evolutionary Reinforcement Learning(DERL) model for expanding the search space efficiently and effectively to make avoidance of falling into the local optimal trap and reduce the risk of overfitting, thus make human-like decisions with interpretability and flexibility. The SF-HLDM framework enables autonomous driving AI agents dynamically adjusts decision parameters to maintain safety margins and adhering to contextually appropriate driving behaviors at the same time.

Via

Access Paper or Ask Questions

Federated Learning Assisted Edge Caching Scheme Based on Lightweight Architecture DDPM

Jun 05, 2025

Xun Li, Qiong Wu

Abstract:Edge caching is an emerging technology that empowers caching units at edge nodes, allowing users to fetch contents of interest that have been pre-cached at the edge nodes. The key to pre-caching is to maximize the cache hit percentage for cached content without compromising users' privacy. In this letter, we propose a federated learning (FL) assisted edge caching scheme based on lightweight architecture denoising diffusion probabilistic model (LDPM). Our simulation results verify that our proposed scheme achieves a higher cache hit percentage compared to existing FL-based methods and baseline methods.

* This paper has been submitted to IEEE letters. The source code has been released at: https://github.com/qiongwu86/Federated-Learning-Assisted-Edge-Caching-Scheme-Based-on-Lightweight-Architecture-DDPM

Via

Access Paper or Ask Questions

Diagnosing and Addressing Pitfalls in KG-RAG Datasets: Toward More Reliable Benchmarking

May 29, 2025

Liangliang Zhang, Zhuorui Jiang, Hongliang Chi, Haoyang Chen, Mohammed Elkoumy, Fali Wang, Qiong Wu, Zhengyi Zhou, Shirui Pan, Suhang Wang(+1 more)

Figure 1 for Diagnosing and Addressing Pitfalls in KG-RAG Datasets: Toward More Reliable Benchmarking

Figure 2 for Diagnosing and Addressing Pitfalls in KG-RAG Datasets: Toward More Reliable Benchmarking

Figure 3 for Diagnosing and Addressing Pitfalls in KG-RAG Datasets: Toward More Reliable Benchmarking

Figure 4 for Diagnosing and Addressing Pitfalls in KG-RAG Datasets: Toward More Reliable Benchmarking

Abstract:Knowledge Graph Question Answering (KGQA) systems rely on high-quality benchmarks to evaluate complex multi-hop reasoning. However, despite their widespread use, popular datasets such as WebQSP and CWQ suffer from critical quality issues, including inaccurate or incomplete ground-truth annotations, poorly constructed questions that are ambiguous, trivial, or unanswerable, and outdated or inconsistent knowledge. Through a manual audit of 16 popular KGQA datasets, including WebQSP and CWQ, we find that the average factual correctness rate is only 57 %. To address these issues, we introduce KGQAGen, an LLM-in-the-loop framework that systematically resolves these pitfalls. KGQAGen combines structured knowledge grounding, LLM-guided generation, and symbolic verification to produce challenging and verifiable QA instances. Using KGQAGen, we construct KGQAGen-10k, a ten-thousand scale benchmark grounded in Wikidata, and evaluate a diverse set of KG-RAG models. Experimental results demonstrate that even state-of-the-art systems struggle on this benchmark, highlighting its ability to expose limitations of existing models. Our findings advocate for more rigorous benchmark construction and position KGQAGen as a scalable framework for advancing KGQA evaluation.

* 9 pages

Via

Access Paper or Ask Questions

StereoINR: Cross-View Geometry Consistent Stereo Super Resolution with Implicit Neural Representation

May 07, 2025

Yi Liu, Xinyi Liu, Panwang Xia, Qiong Wu, Yi Wan, Yongjun Zhang

Abstract:Stereo image super-resolution (SSR) aims to enhance high-resolution details by leveraging information from stereo image pairs. However, existing stereo super-resolution (SSR) upsampling methods (e.g., pixel shuffle) often overlook cross-view geometric consistency and are limited to fixed-scale upsampling. The key issue is that previous upsampling methods use convolution to independently process deep features of different views, lacking cross-view and non-local information perception, making it difficult to select beneficial information from multi-view scenes adaptively. In this work, we propose Stereo Implicit Neural Representation (StereoINR), which innovatively models stereo image pairs as continuous implicit representations. This continuous representation breaks through the scale limitations, providing a unified solution for arbitrary-scale stereo super-resolution reconstruction of left-right views. Furthermore, by incorporating spatial warping and cross-attention mechanisms, StereoINR enables effective cross-view information fusion and achieves significant improvements in pixel-level geometric consistency. Extensive experiments across multiple datasets show that StereoINR outperforms out-of-training-distribution scale upsampling and matches state-of-the-art SSR methods within training-distribution scales.

Via

Access Paper or Ask Questions

Grounded Chain-of-Thought for Multimodal Large Language Models

Mar 17, 2025

Qiong Wu, Xiangcong Yang, Yiyi Zhou, Chenxin Fang, Baiyang Song, Xiaoshuai Sun, Rongrong Ji

Abstract:Despite great progress, existing multimodal large language models (MLLMs) are prone to visual hallucination, greatly impeding their trustworthy applications. In this paper, we study this problem from the perspective of visual-spatial reasoning, and propose a new learning task for MLLMs, termed Grounded Chain-of-Thought (GCoT). Different from recent visual CoT studies, which focus more on visual knowledge reasoning, GCoT is keen to helping MLLMs to recognize and ground the relevant visual cues step by step, thereby predicting the correct answer with grounding coordinates as the intuitive basis. To facilitate this task, we also carefully design and construct a dataset called multimodal grounded chain-of-thought (MM-GCoT) consisting of 24,022 GCoT examples for 5,033 images. Besides, a comprehensive consistency evaluation system is also introduced, including the metrics of answer accuracy, grounding accuracy and answer-grounding consistency. We further design and conduct a bunch of experiments on 12 advanced MLLMs, and reveal some notable findings: i. most MLLMs performs poorly on the consistency evaluation, indicating obvious visual hallucination; ii. visual hallucination is not directly related to the parameter size and general multimodal performance, i.e., a larger and stronger MLLM is not less affected by this issue. Lastly, we also demonstrate that the proposed dataset can help existing MLLMs to well cultivate their GCoT capability and reduce the inconsistent answering significantly. Moreover, their GCoT can be also generalized to exiting multimodal tasks, such as open-world QA and REC.

Via

Access Paper or Ask Questions

Unifying and Optimizing Data Values for Selection via Sequential-Decision-Making

Feb 06, 2025

Hongliang Chi, Qiong Wu, Zhengyi Zhou, Jonathan Light, Emily Dodwell, Yao Ma

Figure 1 for Unifying and Optimizing Data Values for Selection via Sequential-Decision-Making

Figure 2 for Unifying and Optimizing Data Values for Selection via Sequential-Decision-Making

Figure 3 for Unifying and Optimizing Data Values for Selection via Sequential-Decision-Making

Figure 4 for Unifying and Optimizing Data Values for Selection via Sequential-Decision-Making

Abstract:Data selection has emerged as a crucial downstream application of data valuation. While existing data valuation methods have shown promise in selection tasks, the theoretical foundations and full potential of using data values for selection remain largely unexplored. In this work, we first demonstrate that data values applied for selection can be naturally reformulated as a sequential-decision-making problem, where the optimal data value can be derived through dynamic programming. We show this framework unifies and reinterprets existing methods like Data Shapley through the lens of approximate dynamic programming, specifically as myopic reward function approximations to this sequential problem. Furthermore, we analyze how sequential data selection optimality is affected when the ground-truth utility function exhibits monotonic submodularity with curvature. To address the computational challenges in obtaining optimal data values, we propose an efficient approximation scheme using learned bipartite graphs as surrogate utility models, ensuring greedy selection is still optimal when the surrogate utility is correctly specified and learned. Extensive experiments demonstrate the effectiveness of our approach across diverse datasets.

Via

Access Paper or Ask Questions

Enhanced SPS Velocity-adaptive Scheme: Access Fairness in 5G NR V2I Networks

Jan 16, 2025

Xiao Xu, Qiong Wu, Pingyi Fan, Kezhi Wang

Figure 1 for Enhanced SPS Velocity-adaptive Scheme: Access Fairness in 5G NR V2I Networks

Figure 2 for Enhanced SPS Velocity-adaptive Scheme: Access Fairness in 5G NR V2I Networks

Figure 3 for Enhanced SPS Velocity-adaptive Scheme: Access Fairness in 5G NR V2I Networks

Figure 4 for Enhanced SPS Velocity-adaptive Scheme: Access Fairness in 5G NR V2I Networks

Abstract:Vehicle-to-Infrastructure (V2I) technology enables information exchange between vehicles and road infrastructure. Specifically, when a vehicle approaches a roadside unit (RSU), it can exchange information with the RSU to obtain accurate data that assists in driving. With the release of the 3rd Generation Partnership Project (3GPP) Release 16, which includes the 5G New Radio (NR) Vehicle-to-Everything (V2X) standards, vehicles typically adopt mode-2 communication using sensing-based semi-persistent scheduling (SPS) for resource allocation. In this approach, vehicles identify candidate resources within a selection window and exclude ineligible resources based on information from a sensing window. However, vehicles often drive at different speeds, resulting in varying amounts of data transmission with RSUs as they pass by, which leads to unfair access. Therefore, it is essential to design an access scheme that accounts for different vehicle speeds to achieve fair access across the network. This paper formulates an optimization problem for vehicular networks and proposes a multi-objective optimization scheme to address it by adjusting the selection window in the SPS mechanism of 5G NR V2I mode-2. Simulation results demonstrate the effectiveness of the proposed scheme

* This paper has been submitted to IEEE Journal. The source code has been released at: https://github.com/qiongwu86/Enhanced-SPS-Velocity-adaptiveScheme-Access-Fariness-in-5G-NR-V2I-Networks

Via

Access Paper or Ask Questions