Abstract:Fine-grained urban flow inference is crucial for urban planning and intelligent transportation systems, enabling precise traffic management and resource allocation. However, the practical deployment of existing methods is hindered by two key challenges: the prohibitive computational cost of over-parameterized models and the suboptimal performance of conventional loss functions on the highly skewed distribution of urban flows. To address these challenges, we propose a unified solution that synergizes architectural efficiency with adaptive optimization. Specifically, we first introduce PLGF, a lightweight yet powerful architecture that employs a Progressive Local-Global Fusion strategy to effectively capture both fine-grained details and global contextual dependencies. Second, we propose DualFocal Loss, a novel function that integrates dual-space supervision with a difficulty-aware focusing mechanism, enabling the model to adaptively concentrate on hard-to-predict regions. Extensive experiments on 4 real-world scenarios validate the effectiveness and scalability of our method. Notably, while achieving state-of-the-art performance, PLGF reduces the model size by up to 97% compared to current high-performing methods. Furthermore, under comparable parameter budgets, our model yields an accuracy improvement of over 10% against strong baselines. The implementation is included in the https://github.com/Yasoz/PLGF.
Abstract:Temporal non-stationarity, the phenomenon that time series distributions change over time, poses fundamental challenges to reliable time series forecasting. Intuitively, the complex time series can be decomposed into two factors, \ie time-invariant and time-varying components, which indicate static and dynamic patterns, respectively. Nonetheless, existing methods often conflate the time-varying and time-invariant components, and jointly learn the combined long-term patterns and short-term fluctuations, leading to suboptimal performance facing distribution shifts. To address this issue, we initiatively propose a lightweight static-dynamic decomposition framework, TimeEmb, for time series forecasting. TimeEmb innovatively separates time series into two complementary components: (1) time-invariant component, captured by a novel global embedding module that learns persistent representations across time series, and (2) time-varying component, processed by an efficient frequency-domain filtering mechanism inspired by full-spectrum analysis in signal processing. Experiments on real-world datasets demonstrate that TimeEmb outperforms state-of-the-art baselines and requires fewer computational resources. We conduct comprehensive quantitative and qualitative analyses to verify the efficacy of static-dynamic disentanglement. This lightweight framework can also improve existing time-series forecasting methods with simple integration. To ease reproducibility, the code is available at https://github.com/showmeon/TimeEmb.
Abstract:The widespread adoption of mobile devices and data collection technologies has led to an exponential increase in trajectory data, presenting significant challenges in spatio-temporal data mining, particularly for efficient and accurate trajectory retrieval. However, existing methods for trajectory retrieval face notable limitations, including inefficiencies in large-scale data, lack of support for condition-based queries, and reliance on trajectory similarity measures. To address the above challenges, we propose OmniTraj, a generalized and flexible omni-semantic trajectory retrieval framework that integrates four complementary modalities or semantics -- raw trajectories, topology, road segments, and regions -- into a unified system. Unlike traditional approaches that are limited to computing and processing trajectories as a single modality, OmniTraj designs dedicated encoders for each modality, which are embedded and fused into a shared representation space. This design enables OmniTraj to support accurate and flexible queries based on any individual modality or combination thereof, overcoming the rigidity of traditional similarity-based methods. Extensive experiments on two real-world datasets demonstrate the effectiveness of OmniTraj in handling large-scale data, providing flexible, multi-modality queries, and supporting downstream tasks and applications.
Abstract:Human trajectory modeling is essential for deciphering movement patterns and supporting advanced applications across various domains. However, existing methods are often tailored to specific tasks and regions, resulting in limitations related to task specificity, regional dependency, and data quality sensitivity. Addressing these challenges requires a universal human trajectory foundation model capable of generalizing and scaling across diverse tasks and geographic contexts. To this end, we propose UniTraj, a Universal human Trajectory foundation model that is task-adaptive, region-independent, and highly generalizable. To further enhance performance, we construct WorldTrace, the first large-scale, high-quality, globally distributed dataset sourced from open web platforms, encompassing 2.45 million trajectories with billions of points across 70 countries. Through multiple resampling and masking strategies designed for pre-training, UniTraj effectively overcomes geographic and task constraints, adapting to heterogeneous data quality. Extensive experiments across multiple trajectory analysis tasks and real-world datasets demonstrate that UniTraj consistently outperforms existing approaches in terms of scalability and adaptability. These results underscore the potential of UniTraj as a versatile, robust solution for a wide range of trajectory analysis applications, with WorldTrace serving as an ideal but non-exclusive foundation for training.




Abstract:Sequential Recommender Systems (SRS) are extensively applied across various domains to predict users' next interaction by modeling their interaction sequences. However, these systems typically grapple with the long-tail problem, where they struggle to recommend items that are less popular. This challenge results in a decline in user discovery and reduced earnings for vendors, negatively impacting the system as a whole. Large Language Model (LLM) has the potential to understand the semantic connections between items, regardless of their popularity, positioning them as a viable solution to this dilemma. In our paper, we present LLMEmb, an innovative technique that harnesses LLM to create item embeddings that bolster the performance of SRS. To align the capabilities of general-purpose LLM with the needs of the recommendation domain, we introduce a method called Supervised Contrastive Fine-Tuning (SCFT). This method involves attribute-level data augmentation and a custom contrastive loss designed to tailor LLM for enhanced recommendation performance. Moreover, we highlight the necessity of incorporating collaborative filtering signals into LLM-generated embeddings and propose Recommendation Adaptation Training (RAT) for this purpose. RAT refines the embeddings to be optimally suited for SRS. The embeddings derived from LLMEmb can be easily integrated with any SRS model, showcasing its practical utility. Extensive experimentation on three real-world datasets has shown that LLMEmb significantly improves upon current methods when applied across different SRS models.
Abstract:Generating trajectory data is among promising solutions to addressing privacy concerns, collection costs, and proprietary restrictions usually associated with human mobility analyses. However, existing trajectory generation methods are still in their infancy due to the inherent diversity and unpredictability of human activities, grappling with issues such as fidelity, flexibility, and generalizability. To overcome these obstacles, we propose ControlTraj, a Controllable Trajectory generation framework with the topology-constrained diffusion model. Distinct from prior approaches, ControlTraj utilizes a diffusion model to generate high-fidelity trajectories while integrating the structural constraints of road network topology to guide the geographical outcomes. Specifically, we develop a novel road segment autoencoder to extract fine-grained road segment embedding. The encoded features, along with trip attributes, are subsequently merged into the proposed geographic denoising UNet architecture, named GeoUNet, to synthesize geographic trajectories from white noise. Through experimentation across three real-world data settings, ControlTraj demonstrates its ability to produce human-directed, high-fidelity trajectory generation with adaptability to unexplored geographical contexts.




Abstract:Trajectory modeling refers to characterizing human movement behavior, serving as a pivotal step in understanding mobility patterns. Nevertheless, existing studies typically ignore the confounding effects of geospatial context, leading to the acquisition of spurious correlations and limited generalization capabilities. To bridge this gap, we initially formulate a Structural Causal Model (SCM) to decipher the trajectory representation learning process from a causal perspective. Building upon the SCM, we further present a Trajectory modeling framework (TrajCL) based on Causal Learning, which leverages the backdoor adjustment theory as an intervention tool to eliminate the spurious correlations between geospatial context and trajectories. Extensive experiments on two real-world datasets verify that TrajCL markedly enhances performance in trajectory classification tasks while showcasing superior generalization and interpretability.




Abstract:Trajectory computing is a pivotal domain encompassing trajectory data management and mining, garnering widespread attention due to its crucial role in various practical applications such as location services, urban traffic, and public safety. Traditional methods, focusing on simplistic spatio-temporal features, face challenges of complex calculations, limited scalability, and inadequate adaptability to real-world complexities. In this paper, we present a comprehensive review of the development and recent advances in deep learning for trajectory computing (DL4Traj). We first define trajectory data and provide a brief overview of widely-used deep learning models. Systematically, we explore deep learning applications in trajectory management (pre-processing, storage, analysis, and visualization) and mining (trajectory-related forecasting, trajectory-related recommendation, trajectory classification, travel time estimation, anomaly detection, and mobility generation). Notably, we encapsulate recent advancements in Large Language Models (LLMs) that hold the potential to augment trajectory computing. Additionally, we summarize application scenarios, public datasets, and toolkits. Finally, we outline current challenges in DL4Traj research and propose future directions. Relevant papers and open-source resources have been collated and are continuously updated at: \href{https://github.com/yoshall/Awesome-Trajectory-Computing}{DL4Traj Repo}.




Abstract:The recommendation of medication is a vital aspect of intelligent healthcare systems, as it involves prescribing the most suitable drugs based on a patient's specific health needs. Unfortunately, many sophisticated models currently in use tend to overlook the nuanced semantics of medical data, while only relying heavily on identities. Furthermore, these models face significant challenges in handling cases involving patients who are visiting the hospital for the first time, as they lack prior prescription histories to draw upon. To tackle these issues, we harness the powerful semantic comprehension and input-agnostic characteristics of Large Language Models (LLMs). Our research aims to transform existing medication recommendation methodologies using LLMs. In this paper, we introduce a novel approach called Large Language Model Distilling Medication Recommendation (LEADER). We begin by creating appropriate prompt templates that enable LLMs to suggest medications effectively. However, the straightforward integration of LLMs into recommender systems leads to an out-of-corpus issue specific to drugs. We handle it by adapting the LLMs with a novel output layer and a refined tuning loss function. Although LLM-based models exhibit remarkable capabilities, they are plagued by high computational costs during inference, which is impractical for the healthcare sector. To mitigate this, we have developed a feature-level knowledge distillation technique, which transfers the LLM's proficiency to a more compact model. Extensive experiments conducted on two real-world datasets, MIMIC-III and MIMIC-IV, demonstrate that our proposed model not only delivers effective results but also is efficient. To ease the reproducibility of our experiments, we release the implementation code online.




Abstract:Modeling future traffic conditions often relies heavily on complex spatial-temporal neural networks to capture spatial and temporal correlations, which can overlook the inherent noise in the data. This noise, often manifesting as unexpected short-term peaks or drops in traffic observation, is typically caused by traffic accidents or inherent sensor vibration. In practice, such noise can be challenging to model due to its stochastic nature and can lead to overfitting risks if a neural network is designed to learn this behavior. To address this issue, we propose a learnable filter module to filter out noise in traffic data adaptively. This module leverages the Fourier transform to convert the data to the frequency domain, where noise is filtered based on its pattern. The denoised data is then recovered to the time domain using the inverse Fourier transform. Our approach focuses on enhancing the quality of the input data for traffic prediction models, which is a critical yet often overlooked aspect in the field. We demonstrate that the proposed module is lightweight, easy to integrate with existing models, and can significantly improve traffic prediction performance. Furthermore, we validate our approach with extensive experimental results on real-world datasets, showing that it effectively mitigates noise and enhances prediction accuracy.