Abstract:Even though thermodynamic energy-based crystal structure prediction (CSP) has revolutionized materials discovery, the energy-driven CSP approaches often struggle to identify experimentally realizable metastable materials synthesized through kinetically controlled pathways, creating a critical gap between theoretical predictions and experimental synthesis. Here, we propose a synthesizability-driven CSP framework that integrates symmetry-guided structure derivation with a Wyckoff encode-based machine-learning model, allowing for the efficient localization of subspaces likely to yield highly synthesizable structures. Within the identified promising subspaces, a structure-based synthesizability evaluation model, fine-tuned using recently synthesized structures to enhance predictive accuracy, is employed in conjunction with ab initio calculations to systematically identify synthesizable candidates. The framework successfully reproduces 13 experimentally known XSe (X = Sc, Ti, Mn, Fe, Ni, Cu, Zn) structures, demonstrating its effectiveness in predicting synthesizable structures. Notably, 92,310 structures are filtered from the 554,054 candidates predicted by GNoME, exhibiting great potential for promising synthesizability. Additionally, eight thermodynamically favorable Hf-X-O (X = Ti, V, and Mn) structures have been identified, among which three HfV$_2$O$_7$ candidates exhibit high synthesizability, presenting viable candidates for experimental realization and potentially associated with experimentally observed temperature-induced phase transitions. This work establishes a data-driven paradigm for machine-learning-assisted inorganic materials synthesis, highlighting its potential to bridge the gap between computational predictions and experimental realization while unlocking new opportunities for the targeted discovery of novel functional materials.
Abstract:Multi-object tracking (MOT) is a rising topic in video processing technologies and has important application value in consumer electronics. Currently, tracking-by-detection (TBD) is the dominant paradigm for MOT, which performs target detection and association frame by frame. However, the association performance of TBD methods degrades in complex scenes with heavy occlusions, which hinders the application of such methods in real-world scenarios.To this end, we incorporate pseudo-depth cues to enhance the association performance and propose Pseudo-Depth SORT (PD-SORT). First, we extend the Kalman filter state vector with pseudo-depth states. Second, we introduce a novel depth volume IoU (DVIoU) by combining the conventional 2D IoU with pseudo-depth. Furthermore, we develop a quantized pseudo-depth measurement (QPDM) strategy for more robust data association. Besides, we also integrate camera motion compensation (CMC) to handle dynamic camera situations. With the above designs, PD-SORT significantly alleviates the occlusion-induced ambiguous associations and achieves leading performances on DanceTrack, MOT17, and MOT20. Note that the improvement is especially obvious on DanceTrack, where objects show complex motions, similar appearances, and frequent occlusions. The code is available at https://github.com/Wangyc2000/PD_SORT.
Abstract:Ubiquitous mobile devices are generating vast amounts of location-based service data that reveal how individuals navigate and utilize urban spaces in detail. In this study, we utilize these extensive, unlabeled sequences of user trajectories to develop a foundation model for understanding urban space and human mobility. We introduce the \textbf{P}retrained \textbf{M}obility \textbf{T}ransformer (PMT), which leverages the transformer architecture to process user trajectories in an autoregressive manner, converting geographical areas into tokens and embedding spatial and temporal information within these representations. Experiments conducted in three U.S. metropolitan areas over a two-month period demonstrate PMT's ability to capture underlying geographic and socio-demographic characteristics of regions. The proposed PMT excels across various downstream tasks, including next-location prediction, trajectory imputation, and trajectory generation. These results support PMT's capability and effectiveness in decoding complex patterns of human mobility, offering new insights into urban spatial functionality and individual mobility preferences.