Abstract:Cell-free networks leverage distributed access points (APs) to achieve macro-diversity, yet their performance is often constrained by large disparities in channel quality arising from user geometry and blockages. To address this, rotatable antennas (RAs) add a lightweight hardware degree of freedom by steering the antenna boresight toward dominant propagation directions to strengthen unfavorable links, thereby enabling the network to better exploit macro-diversity for higher and more uniform performance. This paper investigates an RA-enabled cell-free downlink network and formulates a max-min rate problem that jointly optimizes transmit beamforming and antenna orientations. To tackle this challenging problem, we develop an alternating-optimization-based algorithm that iteratively updates the beamformers via a second-order cone program (SOCP) and optimizes the antenna orientations using successive convex approximation. To reduce complexity, we further propose an efficient two-stage scheme that first designs orientations by maximizing a proportional-fair log-utility using manifold-aware Frank-Wolfe updates, and then computes the beamformers using an SOCP-based design. Simulation results demonstrate that the proposed orientation-aware designs achieve a substantially higher worst-user rate than conventional beamforming-only benchmarks. Furthermore, larger antenna directivity enhances fairness with proper orientation but can degrade the worst-user performance otherwise.
Abstract:This paper proposes a two-scale spatial deployment strategy to ensure reliable coverage for multiple target areas, integrating macroscopic intelligent reflecting surfaces (IRSs) and fine-grained movable antennas (MAs). Specifically, IRSs are selectively deployed from candidate sites to shape the propagation geometry, while MAs are locally repositioned among discretized locations to exploit small-scale channel variations. The objective is to minimize the total deployment cost of MAs and IRSs by jointly optimizing the IRS site selection, MA positions, transmit precoding, and IRS phase shifts, subject to the signal-to-noise ratio (SNR) requirements for all target areas. This leads to a challenging mixed-integer non-convex optimization problem that is intractable to solve directly. To address this, we first formulate an auxiliary problem to verify the feasibility. A penalty-based double-loop algorithm integrating alternating optimization and successive convex approximation (SCA) is developed to solve this feasibility issue, which is subsequently adapted to obtain a suboptimal solution for the original cost minimization problem. Finally, based on the obtained solution, we formulate an element refinement problem to further reduce the deployment cost, which is solved by a penalty-based SCA algorithm. Simulation results demonstrate that the proposed designs consistently outperform benchmarks relying on independent area planning or full IRS deployment in terms of cost-efficiency. Moreover, for cost minimization, MA architectures are preferable in large placement apertures, whereas fully populated FPA architectures excel in compact ones; for worst-case SNR maximization, MA architectures exhibit a lower cost threshold for feasibility, while FPA architectures can attain peak SNR at a lower total cost.
Abstract:Capturing complex user preferences from sparse behavioral sequences remains a fundamental challenge in sequential recommendation. Recent latent reasoning methods have shown promise by extending test-time computation through multi-step reasoning, yet they exclusively rely on depth-level scaling along a single trajectory, suffering from diminishing returns as reasoning depth increases. To address this limitation, we propose \textbf{Parallel Latent Reasoning (PLR)}, a novel framework that pioneers width-level computational scaling by exploring multiple diverse reasoning trajectories simultaneously. PLR constructs parallel reasoning streams through learnable trigger tokens in continuous latent space, preserves diversity across streams via global reasoning regularization, and adaptively synthesizes multi-stream outputs through mixture-of-reasoning-streams aggregation. Extensive experiments on three real-world datasets demonstrate that PLR substantially outperforms state-of-the-art baselines while maintaining real-time inference efficiency. Theoretical analysis further validates the effectiveness of parallel reasoning in improving generalization capability. Our work opens new avenues for enhancing reasoning capacity in sequential recommendation beyond existing depth scaling.
Abstract:This paper investigates a low-altitude integrated sensing and communication (ISAC) system that leverages cooperative rotatable active and passive arrays. We consider a downlink scenario where a base station (BS) with an active rotatable array serves multiple communication users and senses low-altitude targets, assisted by a rotatable reconfigurable intelligent surface (RIS). A rotation-aware geometry-based multipath model is developed to capture the impact of three-dimensional (3D) array orientations on both steering vectors and direction-dependent element gains. On this basis, we formulate a new optimization problem that maximizes the downlink sum rate subject to a transmit power budget, RIS unit-modulus constraints, mechanical rotation limits, and a sensing beampattern mean-squared-error (MSE) constraint. To address the resulting highly non-convex problem, we propose a penalty-based alternating-optimization (AO) framework that alternately updates the BS precoder, RIS phase shifts, and BS/RIS array rotation angles. The three blocks are efficiently handled via a convex optimization method based on quadratic-transform (QT) and majorization-minorization (MM), Riemannian conjugate gradient (RCG) on the unit-modulus manifold, and projected gradient descent (PGD) with Barzilai-Borwein step sizes, respectively. Numerical results in low-altitude geometries demonstrate that the proposed jointly rotatable BS-RIS architecture achieves significant sum-rate gains over fixed or partially rotatable baselines while guaranteeing sensing requirements, especially with directional antennas and in interference-limited regimes.
Abstract:Vision-Language Navigation in Continuous Environments (VLN-CE) requires an embodied agent to navigate towards target in continuous environments, following natural language instructions. While current graph-based methods offer an efficient, structured approach by abstracting the environment into a topological map and simplifying the action space to waypoint selection, they lag behind methods based on Large Vision-Language Models (LVLMs) in leveraging large-scale data and advanced training paradigms. In this paper, we try to bridge this gap by introducing ETP-R1, a framework that applies the paradigm of scaling up data and Reinforcement Fine-Tuning (RFT) to a graph-based VLN-CE model. To build a strong foundation, we first construct a high-quality, large-scale pretraining dataset using the Gemini API. This dataset consists of diverse, low-hallucination instructions for topological trajectories, providing rich supervision for our graph-based policy to map language to topological paths. This foundation is further strengthened by unifying data from both R2R and RxR tasks for joint pretraining. Building on this, we introduce a three-stage training paradigm, which culminates in the first application of closed-loop, online RFT to a graph-based VLN-CE model, powered by the Group Relative Policy Optimization (GRPO) algorithm. Extensive experiments demonstrate that our approach is highly effective, establishing new state-of-the-art performance across all major metrics on both the R2R-CE and RxR-CE benchmarks. Our code is available at https://github.com/Cepillar/ETP-R1.
Abstract:Industrial recommender systems face two fundamental limitations under the log-driven paradigm: (1) knowledge poverty in ID-based item representations that causes brittle interest modeling under data sparsity, and (2) systemic blindness to beyond-log user interests that constrains model performance within platform boundaries. These limitations stem from an over-reliance on shallow interaction statistics and close-looped feedback while neglecting the rich world knowledge about product semantics and cross-domain behavioral patterns that Large Language Models have learned from vast corpora. To address these challenges, we introduce ReaSeq, a reasoning-enhanced framework that leverages world knowledge in Large Language Models to address both limitations through explicit and implicit reasoning. Specifically, ReaSeq employs explicit Chain-of-Thought reasoning via multi-agent collaboration to distill structured product knowledge into semantically enriched item representations, and latent reasoning via Diffusion Large Language Models to infer plausible beyond-log behaviors. Deployed on Taobao's ranking system serving hundreds of millions of users, ReaSeq achieves substantial gains: >6.0% in IPV and CTR, >2.9% in Orders, and >2.5% in GMV, validating the effectiveness of world-knowledge-enhanced reasoning over purely log-driven approaches.




Abstract:Large language models (LLMs) have demonstrated remarkable potential in transforming recommender systems from implicit behavioral pattern matching to explicit intent reasoning. While RecGPT-V1 successfully pioneered this paradigm by integrating LLM-based reasoning into user interest mining and item tag prediction, it suffers from four fundamental limitations: (1) computational inefficiency and cognitive redundancy across multiple reasoning routes; (2) insufficient explanation diversity in fixed-template generation; (3) limited generalization under supervised learning paradigms; and (4) simplistic outcome-focused evaluation that fails to match human standards. To address these challenges, we present RecGPT-V2 with four key innovations. First, a Hierarchical Multi-Agent System restructures intent reasoning through coordinated collaboration, eliminating cognitive duplication while enabling diverse intent coverage. Combined with Hybrid Representation Inference that compresses user-behavior contexts, our framework reduces GPU consumption by 60% and improves exclusive recall from 9.39% to 10.99%. Second, a Meta-Prompting framework dynamically generates contextually adaptive prompts, improving explanation diversity by +7.3%. Third, constrained reinforcement learning mitigates multi-reward conflicts, achieving +24.1% improvement in tag prediction and +13.0% in explanation acceptance. Fourth, an Agent-as-a-Judge framework decomposes assessment into multi-step reasoning, improving human preference alignment. Online A/B tests on Taobao demonstrate significant improvements: +2.98% CTR, +3.71% IPV, +2.19% TV, and +11.46% NER. RecGPT-V2 establishes both the technical feasibility and commercial viability of deploying LLM-powered intent reasoning at scale, bridging the gap between cognitive exploration and industrial utility.



Abstract:Rotatable intelligent reflecting surfaces (IRSs) introduce a new degree of freedom (DoF) for shaping wireless propagation by adaptively adjusting the orientation of IRSs. This paper considers an angle-dependent reflection model in a wireless communication system aided by two rotatable IRSs. Specifically, we study the joint design of the base station transmit beamforming, as well as the cooperative passive beamforming and orientation of the two IRSs, to maximize the received signal-to-noise ratio (SNR). Under the light-of-sight (LoS) channels, we first develop a particle swarm optimization (PSO) based method to determine the IRS rotation and derive an optimal rotation in a closed-form expression for a two-dimensional IRS deployment. Then, we extend the design to the general Rician fading channels by proposing an efficient alternating optimization and PSO (AO-PSO) algorithm. Numerical results validate the substantial gains achieved by the IRS rotation over fixed-IRS schemes and also demonstrate the superior performance of the double rotatable IRSs over a single rotatable IRS given a sufficient total number of IRS elements.
Abstract:Semantic Communication (SC) combined with Vehicular edge computing (VEC) provides an efficient edge task processing paradigm for Internet of Vehicles (IoV). Focusing on highway scenarios, this paper proposes a Tripartite Cooperative Semantic Communication (TCSC) framework, which enables Vehicle Users (VUs) to perform semantic task offloading via Vehicle-to-Infrastructure (V2I) and Vehicle-to-Vehicle (V2V) communications. Considering task latency and the number of semantic symbols, the framework constructs a Mixed-Integer Nonlinear Programming (MINLP) problem, which is transformed into two subproblems. First, we innovatively propose a multi-agent proximal policy optimization task offloading optimization method based on parametric distribution noise (MAPPO-PDN) to solve the optimization problem of the number of semantic symbols; second, linear programming (LP) is used to solve offloading ratio. Simulations show that performance of this scheme is superior to that of other algorithms.




Abstract:Vehicle edge caching is a promising technology that can significantly reduce the latency for vehicle users (VUs) to access content by pre-caching user-interested content at edge nodes. It is crucial to accurately predict the content that VUs are interested in without exposing their privacy. Traditional federated learning (FL) can protect user privacy by sharing models rather than raw data. However, the training of FL requires frequent model transmission, which can result in significant communication overhead. Additionally, vehicles may leave the road side unit (RSU) coverage area before training is completed, leading to training failures. To address these issues, in this letter, we propose a federated distillation-assisted vehicle edge caching scheme based on lightweight denoising diffusion probabilistic model (LDPM). The simulation results demonstrate that the proposed vehicle edge caching scheme has good robustness to variations in vehicle speed, significantly reducing communication overhead and improving cache hit percentage.