Low Earth orbit (LEO) satellites and reconfigurable intelligent surfaces (RISs) have recently drawn significant attention as two transformative technologies, and the synergy between them emerges as a promising paradigm for providing cross-environment communication and positioning services. This paper investigates an integrated terrestrial and non-terrestrial wireless network that leverages LEO satellites and RISs to achieve simultaneous tracking of the 3D position, 3D velocity, and 3D orientation of user equipment (UE). To address inherent challenges including a nonlinear observation function, a constrained UE state, and unknown observation statistics, we develop a Riemannian manifold-based unscented Kalman filter (UKF) method. This method propagates statistics through nonlinear functions using generated sigma points and maintains state constraints through projection onto the defined manifold space. Additionally, by employing Fisher information matrices (FIMs) of the sigma points, a belief assignment principle is proposed to approximate the unknown observation covariance matrix, thereby ensuring accurate measurement updates in the UKF procedure. Numerical results demonstrate a substantial enhancement in tracking accuracy facilitated by RIS integration, despite urban signal reception challenges from LEO satellites. In addition, extensive simulations underscore the superior performance of the proposed tracking method and FIM-based belief assignment over the adopted benchmarks. Furthermore, the robustness of the proposed UKF is verified across various uncertainty levels.
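The sigma-point idea mentioned in the abstract above can be illustrated with a minimal sketch. This is a plain Euclidean, scalar unscented transform, not the Riemannian manifold-based filter of the paper; the function name `unscented_transform_1d` and the scaling parameter `kappa` are assumptions chosen for illustration only.

```python
import math


def unscented_transform_1d(mean, var, f, kappa=2.0):
    """Propagate a scalar Gaussian (mean, var) through f using 3 sigma points.

    Hypothetical helper for illustration: a textbook Euclidean unscented
    transform, without any manifold projection or state constraints.
    """
    n = 1  # state dimension
    spread = math.sqrt((n + kappa) * var)
    # One central point plus one point on each side of the mean.
    points = [mean, mean + spread, mean - spread]
    weights = [kappa / (n + kappa), 0.5 / (n + kappa), 0.5 / (n + kappa)]
    # Push every sigma point through the nonlinear function.
    ys = [f(p) for p in points]
    # Recover the transformed mean and variance as weighted sums.
    y_mean = sum(w * y for w, y in zip(weights, ys))
    y_var = sum(w * (y - y_mean) ** 2 for w, y in zip(weights, ys))
    return y_mean, y_var


# Example: propagate a standard normal through the nonlinearity f(x) = x^2.
y_mean, y_var = unscented_transform_1d(0.0, 1.0, lambda x: x * x)
```

For this particular choice of `kappa`, the sketch returns a mean close to 1 and a variance close to 2, which match the exact moments of the square of a standard normal variable. A full UKF would additionally run a measurement update and, in the paper's setting, keep the state on the manifold.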
In the field of urban planning, general-purpose large language models often struggle to meet the specific needs of planners. Tasks like generating urban planning texts, retrieving related information, and evaluating planning documents pose unique challenges. To enhance the efficiency of urban professionals and overcome these obstacles, we introduce PlanGPT, the first specialized Large Language Model tailored for urban and spatial planning. Developed through collaborative efforts with institutions like the Chinese Academy of Urban Planning, PlanGPT leverages a customized local database retrieval framework, domain-specific fine-tuning of base models, and advanced tooling capabilities. Empirical tests demonstrate that PlanGPT has achieved advanced performance, delivering responses of superior quality precisely tailored to the intricacies of urban planning.
Large-scale recommendation systems are characterized by their reliance on high-cardinality, heterogeneous features and the need to handle tens of billions of user actions on a daily basis. Despite being trained on huge volumes of data with thousands of features, most Deep Learning Recommendation Models (DLRMs) in industry fail to scale with compute. Inspired by the success of Transformers in the language and vision domains, we revisit fundamental design choices in recommendation systems. We reformulate recommendation problems as sequential transduction tasks within a generative modeling framework (``Generative Recommenders''), and propose a new architecture, HSTU, designed for high-cardinality, non-stationary streaming recommendation data. HSTU outperforms baselines on synthetic and public datasets by up to 65.8\% in NDCG, and is 5.3x to 15.2x faster than FlashAttention2-based Transformers on sequences of length 8192. HSTU-based Generative Recommenders, with 1.5 trillion parameters, improve metrics in online A/B tests by 12.4\% and have been deployed on multiple surfaces of a large internet platform with billions of users. More importantly, the model quality of Generative Recommenders empirically scales as a power law of training compute across three orders of magnitude, up to GPT-3/LLaMa-2 scale, which reduces the carbon footprint needed for future model development and further paves the way for the first foundational models in recommendations.
This paper presents GEA, a novel method for creating expressive 3D avatars with high-fidelity reconstructions of body and hands based on 3D Gaussians. The key contributions are twofold. First, we design a two-stage pose estimation method to obtain an accurate SMPL-X pose from input images, providing a correct mapping between the pixels of a training image and the SMPL-X model. It uses an attention-aware network and an optimization scheme to align the normals and silhouette of the estimated SMPL-X body with those of the real body in the image. Second, we propose an iterative re-initialization strategy to handle the unbalanced aggregation and initialization bias faced by the Gaussian representation. This strategy iteratively redistributes the avatar's Gaussian points by applying meshing, resampling and re-Gaussian operations, so that they are evenly distributed near the human body surface. As a result, higher-quality rendering can be achieved. Extensive experimental analyses validate the effectiveness of the proposed model, demonstrating that it achieves state-of-the-art performance in photorealistic novel view synthesis while offering fine-grained control over the human body and hand pose. Project page: https://3d-aigc.github.io/GEA/.
The contemporary landscape of wireless technology underscores the critical role of precise localization services. Traditional global navigation satellite system (GNSS)-based solutions, however, fall short when it comes to indoor environments, and existing indoor localization techniques such as electromagnetic fingerprinting methods face challenges of high implementation costs and limited coverage. This article explores an innovative solution that blends low Earth orbit (LEO) satellites with reconfigurable intelligent surfaces (RISs), unlocking its potential for realizing uninterrupted indoor and outdoor localization with global coverage. By leveraging the strong reception of LEO satellite signals and capitalizing on the radio environment-reshaping capability of RISs, the integration of these two technologies presents a vision of a future where localization services transcend existing constraints. After a comprehensive review of the distinctive attributes of LEO satellites and RISs, we evaluate the localization error bounds for the proposed collaborative system, showcasing its promising performance in simultaneous indoor and outdoor localization. To conclude, we discuss open problems and future research directions for LEO satellite and RIS-enabled localization.
This paper presents GIR, a 3D Gaussian Inverse Rendering method for relightable scene factorization. Compared to existing methods leveraging discrete meshes or neural implicit fields for inverse rendering, our method utilizes 3D Gaussians to estimate the material properties, illumination, and geometry of an object from multi-view images. Our study is motivated by evidence showing that 3D Gaussians are a more promising backbone than neural fields in terms of performance, versatility, and efficiency. In this paper, we aim to answer the question: ``How can 3D Gaussians be applied to improve the performance of inverse rendering?'' To address the complexity of estimating normals based on discrete and often inhomogeneously distributed 3D Gaussian representations, we propose an efficient self-regularization method that facilitates the modeling of surface normals without the need for additional supervision. To reconstruct indirect illumination, we propose an approach that simulates ray tracing. Extensive experiments demonstrate the superior performance of GIR over existing methods across multiple tasks on a variety of widely used inverse rendering datasets. This substantiates its efficacy and broad applicability, highlighting its potential as an influential tool in relighting and reconstruction. Project page: https://3dgir.github.io
To address the challenge of identifying and understanding hidden dangers in substations from unstructured text data, a novel dynamic analysis method is proposed. This approach begins by analyzing and extracting data from the unstructured text related to hidden dangers. It then leverages a flexible, distributed data search engine built on Elasticsearch to handle this information. Following this, a hidden Markov model is trained on the data within the engine. The Viterbi algorithm is integrated to decode the hidden state sequences, facilitating the segmentation and labeling of entities related to hidden dangers. The final step involves using the Neo4j graph database to dynamically create a knowledge graph that visualizes hidden dangers in the substation. The method's effectiveness is demonstrated through an example analysis using hidden-danger data from a specific substation.
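The decoding step described in the abstract above can be sketched as follows. This is a generic log-space Viterbi decoder, not the authors' implementation; the B/I/O tag set, the toy probabilities, and the example tokens are all hypothetical values chosen for illustration, whereas a real system would estimate them from labeled substation text.

```python
import math


def viterbi(obs, states, start_p, trans_p, emit_p, floor=1e-12):
    """Return the most likely hidden-state sequence for a non-empty obs list."""

    def lg(p):
        # Floor zero probabilities so that log() stays defined.
        return math.log(max(p, floor))

    # best[t][s] = (log-score of the best path ending in s at time t, previous state)
    best = [{s: (lg(start_p[s]) + lg(emit_p[s].get(obs[0], 0.0)), None)
             for s in states}]
    for t in range(1, len(obs)):
        row = {}
        for s in states:
            prev, score = max(
                ((p, best[t - 1][p][0] + lg(trans_p[p][s])) for p in states),
                key=lambda item: item[1],
            )
            row[s] = (score + lg(emit_p[s].get(obs[t], 0.0)), prev)
        best.append(row)

    # Backtrack from the best final state to recover the full path.
    state = max(states, key=lambda s: best[-1][s][0])
    path = [state]
    for t in range(len(obs) - 1, 0, -1):
        state = best[t][state][1]
        path.append(state)
    return path[::-1]


# Hypothetical toy model: B = entity start, I = inside entity, O = outside.
states = ["B", "I", "O"]
start_p = {"B": 0.5, "I": 0.0, "O": 0.5}
trans_p = {
    "B": {"B": 0.1, "I": 0.7, "O": 0.2},
    "I": {"B": 0.1, "I": 0.5, "O": 0.4},
    "O": {"B": 0.3, "I": 0.0, "O": 0.7},
}
emit_p = {
    "B": {"oil": 0.6, "transformer": 0.2, "leak": 0.1, "found": 0.1},
    "I": {"leak": 0.7, "oil": 0.2, "found": 0.1},
    "O": {"found": 0.5, "transformer": 0.4, "oil": 0.05, "leak": 0.05},
}
tokens = ["transformer", "oil", "leak", "found"]
tags = viterbi(tokens, states, start_p, trans_p, emit_p)
print(list(zip(tokens, tags)))
```

With these toy parameters the decoder marks "oil leak" as a single entity span (tags B and I) and labels the other two tokens O. The resulting spans are the kind of labeled entities that could then be written to a graph database as nodes.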
Robotic manipulation holds the potential to replace humans in the execution of tedious or dangerous tasks. However, control-based approaches are ill-suited because open-world manipulation is difficult to describe formally, and existing learning methods are inefficient. Thus, applying manipulation in a wide range of scenarios presents significant challenges. In this study, we propose a novel method for skill learning in robotic manipulation called Tactile Active Inference Reinforcement Learning (Tactile-AIRL), aimed at achieving efficient training. To enhance the performance of reinforcement learning (RL), we introduce active inference, which integrates model-based techniques and intrinsic curiosity into the RL process. This integration improves the algorithm's training efficiency and adaptability to sparse rewards. Additionally, we utilize a vision-based tactile sensor to provide detailed perception for manipulation tasks. Finally, we employ a model-based approach to imagine and plan appropriate actions through free energy minimization. Simulation results demonstrate that our method achieves high training efficiency in non-prehensile object-pushing tasks. It enables agents to excel in both dense and sparse reward tasks with just a few interaction episodes, surpassing the SAC baseline. Furthermore, we conduct physical experiments on a gripper screwing task using our method, which showcase the algorithm's rapid learning capability and its potential for practical applications.
The growing availability of low-Earth orbit (LEO) satellites, coupled with the anticipated widespread deployment of reconfigurable intelligent surfaces (RISs), opens up promising prospects for new localization paradigms. This paper studies RIS-aided localization using LEO satellite signals. The Cram\'er-Rao bound of the considered localization problem is derived, based on which an optimal RIS beamforming design that minimizes the derived bound is proposed. Numerical results demonstrate the superiority of the proposed beamforming scheme over benchmark alternatives, while also revealing that the synergy between LEO satellites and RISs holds the promise of achieving localization accuracy at the meter or even sub-meter level.
The recently released artificial intelligence conversational agent, ChatGPT, has gained significant attention in academia and real life. A multitude of early ChatGPT users eagerly explore its capabilities and share their opinions on it via social media. Both user queries and social media posts express public concerns regarding this advanced dialogue system. To mine public concerns about ChatGPT, this paper proposes a novel Self-Supervised neural Topic Model (SSTM), which formalizes topic modeling as a representation learning procedure. Extensive experiments have been conducted on Twitter posts about ChatGPT and queries asked by ChatGPT users. Experimental results demonstrate that the proposed approach extracts higher-quality public concerns with improved interpretability and diversity, surpassing the performance of state-of-the-art approaches.