Real-Time Strategy (RTS) game unit generation is an unexplored area of Procedural Content Generation (PCG) research, which leaves the question of how to automatically generate interesting and balanced units unanswered. Creating unique and balanced units can be a difficult task when designing an RTS game, even for humans. Having an automated method of designing units could help developers speed up the creation process as well as find new ideas. In this work we propose a method of generating balanced and useful RTS units. We draw on Search-Based PCG and a fitness function based on Monte Carlo Tree Search (MCTS). We present ten units generated by our system designed to be used in the game microRTS, as well as results demonstrating that these units are unique, useful, and balanced.
ML platforms help enable intelligent data-driven applications and maintain them with limited engineering effort. Upon sufficiently broad adoption, such platforms reach economies of scale that bring greater component reuse while improving the efficiency of system development and maintenance. For an end-to-end ML platform with broad adoption, scaling relies on pervasive ML automation and system integration to reach a quality we term self-serve, which we define with ten requirements and six optional capabilities. With this in mind, we identify long-term goals for platform development and discuss related tradeoffs and future work. Our reasoning is illustrated on two commercially-deployed end-to-end ML platforms that host hundreds of real-time use cases -- one general-purpose and one specialized.
We present a novel algorithm for learning-based loop closure in SLAM (simultaneous localization and mapping) applications. Our approach is designed for general 3D point cloud data, including lidar data, and is used to prevent drift from accumulating over time in autonomous driving. We voxelize the point clouds into coarse voxels and calculate the overlap to estimate whether the vehicle has driven in a loop. We then perform point-level registration to compute the current pose accurately. We have evaluated our approach on the well-known datasets KITTI, KITTI-360, nuScenes, Complex Urban, NCLT, and MulRan. We show up to a 2x improvement in the accuracy of translation and rotation estimation. On some challenging sequences, our method is the first approach that can obtain a 100% success rate.
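The coarse overlap check described above can be sketched as follows. This is a minimal, non-learned illustration, not the authors' method: the abstract does not say how overlap is measured, so intersection-over-union of occupied voxels is an assumption, as are the names `voxelize`, `voxel_overlap`, and `is_loop_candidate`, the default voxel size, and the threshold. Both scans are assumed to be expressed in a common coordinate frame.

```python
import math
from typing import Iterable, Set, Tuple

Point = Tuple[float, float, float]
Voxel = Tuple[int, int, int]


def voxelize(points: Iterable[Point], voxel_size: float) -> Set[Voxel]:
    """Map each 3D point to the integer index of the coarse voxel containing it."""
    if voxel_size <= 0:
        raise ValueError("voxel_size must be positive")
    return {
        (
            math.floor(x / voxel_size),
            math.floor(y / voxel_size),
            math.floor(z / voxel_size),
        )
        for x, y, z in points
    }


def voxel_overlap(a: Set[Voxel], b: Set[Voxel]) -> float:
    """Intersection-over-union of two occupied-voxel sets (0.0 if both are empty)."""
    union = a | b
    if not union:
        return 0.0
    return len(a & b) / len(union)


def is_loop_candidate(
    scan_a: Iterable[Point],
    scan_b: Iterable[Point],
    voxel_size: float = 2.0,   # hypothetical default, in the scans' length unit
    threshold: float = 0.5,    # hypothetical decision threshold
) -> bool:
    """Flag a possible loop closure when the coarse voxel overlap is high enough."""
    overlap = voxel_overlap(voxelize(scan_a, voxel_size), voxelize(scan_b, voxel_size))
    return overlap >= threshold
```

In a pipeline of this shape, a cheap check like this would only select candidate pairs; the point-level registration step would then refine the pose for the pairs that pass.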
Although a variety of methods have been proposed for sequential recommendation, the problem is still far from solved, partly due to two challenges. First, existing methods often fail to consider the global stability and local fluctuation of user preference simultaneously, which can degrade the learning of a user's current preference. Second, existing methods often use a scalar-based weighting scheme to fuse the long-term and short-term preferences, which is too coarse to learn an expressive embedding of the current preference. To address these two challenges, we propose a novel model called Time Lag aware Sequential Recommendation (TLSRec), which integrates hierarchical modeling of user preference with a time-lag-sensitive, fine-grained fusion of the long-term and short-term preferences. TLSRec employs a hierarchical self-attention network to learn users' preferences at both global and local time scales, and a neural time gate that adaptively regulates the contributions of the long-term and short-term preferences to a user's current preference at the aspect level, based on the lag between the current time and the time of the user's last behavior. Extensive experiments conducted on real datasets verify the effectiveness of TLSRec.
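To make the contrast between scalar weighting and aspect-level gating concrete, here is a minimal sketch of a per-dimension gate driven by the time lag. It is not TLSRec's architecture, which the abstract does not specify: the element-wise parameterization, the `log1p` lag feature, and the names `time_gate_fuse`, `w_long`, `w_short`, `w_lag`, and `bias` are all assumptions made for illustration.

```python
import math
from typing import List, Sequence


def sigmoid(x: float) -> float:
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


def time_gate_fuse(
    long_pref: Sequence[float],
    short_pref: Sequence[float],
    lag: float,
    w_long: Sequence[float],
    w_short: Sequence[float],
    w_lag: Sequence[float],
    bias: Sequence[float],
) -> List[float]:
    """Fuse long- and short-term preference vectors with one gate per dimension.

    A scalar scheme would use a single weight for every dimension; here each
    dimension ("aspect") gets its own gate value, which also depends on the
    time lag since the user's last behavior.
    """
    if lag < 0:
        raise ValueError("lag must be non-negative")
    lag_feature = math.log1p(lag)  # assumed squashing of the raw lag
    fused = []
    for l, s, wl, ws, wt, b in zip(long_pref, short_pref, w_long, w_short, w_lag, bias):
        gate = sigmoid(wl * l + ws * s + wt * lag_feature + b)
        fused.append(gate * l + (1.0 - gate) * s)
    return fused
```

With all parameters set to zero the gate is 0.5 everywhere and the result is a plain average; a learned `w_lag` with different signs per dimension would let some aspects fall back to the long-term preference after a long gap while others keep following the short-term one.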
We develop a framework for non-asymptotic analysis of deterministic samplers used for diffusion generative modeling. Several recent works have analyzed stochastic samplers using tools like Girsanov's theorem and a chain rule variant of the interpolation argument. Unfortunately, these techniques give vacuous bounds when applied to deterministic samplers. We give a new operational interpretation for deterministic sampling by showing that one step along the probability flow ODE can be expressed as two steps: 1) a restoration step that runs gradient ascent on the conditional log-likelihood at some infinitesimally previous time, and 2) a degradation step that runs the forward process using noise pointing back towards the current iterate. This perspective allows us to extend denoising diffusion implicit models to general, non-linear forward processes. We then develop the first polynomial convergence bounds for these samplers under mild conditions on the data distribution.
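For reference, the probability flow ODE mentioned above has the following standard form in the score-based diffusion literature. The notation is an assumption, since the abstract introduces none: a forward SDE with drift f and scalar diffusion coefficient g, Brownian motion w, and marginal density p_t at time t.

```latex
% Assumed notation (not from the abstract): forward SDE with drift f,
% scalar diffusion coefficient g, and marginal density p_t at time t.
\begin{align}
  \mathrm{d}x &= f(x, t)\,\mathrm{d}t + g(t)\,\mathrm{d}w, \\
  \frac{\mathrm{d}x}{\mathrm{d}t} &= f(x, t) - \tfrac{1}{2}\, g(t)^2\, \nabla_x \log p_t(x).
\end{align}
```

The first line is the stochastic forward process and the second is the deterministic ODE that shares its marginals. A deterministic sampler integrates the second line backward in time using a learned approximation of the score; the restoration and degradation decomposition is the paper's own contribution and is not reproduced here.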
In this letter, an orthogonal time frequency space (OTFS) based non-orthogonal multiple access (NOMA) scheme is investigated for the coordinated direct and relay transmission system, where a source communicates directly with a near user moving at high speed and needs relaying assistance to serve a far user that also has high mobility. Due to the coexistence of signal superposition coding and multi-domain transformation, the performance of OTFS-based NOMA is usually challenging to measure from a theoretical perspective. To accurately evaluate the system performance of the proposed scheme, we derive closed-form expressions for the outage probability and the outage sum rate by using the inversion formula and the characteristic function. Numerical results verify the performance superiority and the effectiveness of the proposed scheme.
Building a scalable and real-time recommendation system is vital for many businesses driven by time-sensitive customer feedback, such as short-video ranking or online ads. Despite the ubiquitous adoption of production-scale deep learning frameworks like TensorFlow or PyTorch, these general-purpose frameworks fall short of business demands in recommendation scenarios for various reasons: on one hand, tweaking systems based on static parameters and dense computations for recommendation with dynamic and sparse features is detrimental to model quality; on the other hand, such frameworks are designed with the batch-training stage and the serving stage completely separated, preventing the model from interacting with customer feedback in real time. These issues led us to reexamine traditional approaches and explore radically different design choices. In this paper, we present Monolith, a system tailored for online training. Our design has been driven by observations of our application workloads and production environment that reflects a marked departure from other recommendation systems. Our contributions are manifold: first, we crafted a collisionless embedding table with optimizations such as expirable embeddings and frequency filtering to reduce its memory footprint; second, we provide a production-ready online training architecture with high fault tolerance; finally, we proved that system reliability could be traded off for real-time learning. Monolith has successfully landed in the BytePlus Recommend product.
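The three embedding-table ideas named above (no collisions between IDs, frequency filtering, and expirable embeddings) can be illustrated with a small in-memory sketch. This is not Monolith's implementation, which the abstract does not describe: the class `EmbeddingTable`, its parameters `min_count` and `ttl`, and the initialization range are hypothetical, and a plain Python dict stands in for whatever hash structure a production system would use.

```python
import random
from typing import Dict, List, Optional


class EmbeddingTable:
    """Toy embedding table: one row per feature ID, admitted and evicted lazily."""

    def __init__(self, dim: int, min_count: int = 2, ttl: float = 3600.0, seed: int = 0):
        self.dim = dim
        self.min_count = min_count  # occurrences required before a row is created
        self.ttl = ttl              # seconds of inactivity before a row expires
        self._rng = random.Random(seed)
        self._counts: Dict[int, int] = {}         # occurrence counts of filtered IDs
        self._rows: Dict[int, List[float]] = {}   # one row per admitted ID (no sharing)
        self._last_seen: Dict[int, float] = {}    # last access time per admitted ID

    def lookup(self, feature_id: int, now: float) -> Optional[List[float]]:
        """Return the row for feature_id, or None while the ID is still filtered out."""
        if feature_id not in self._rows:
            count = self._counts.get(feature_id, 0) + 1
            if count < self.min_count:
                # Frequency filtering: rare IDs do not get a row yet.
                self._counts[feature_id] = count
                return None
            self._counts.pop(feature_id, None)
            self._rows[feature_id] = [
                self._rng.uniform(-0.01, 0.01) for _ in range(self.dim)
            ]
        self._last_seen[feature_id] = now
        return self._rows[feature_id]

    def expire(self, now: float) -> int:
        """Drop rows not accessed within ttl seconds; return how many were dropped."""
        stale = [fid for fid, seen in self._last_seen.items() if now - seen > self.ttl]
        for fid in stale:
            del self._rows[fid]
            del self._last_seen[fid]
        return len(stale)

    def __len__(self) -> int:
        return len(self._rows)
```

Because every admitted ID owns its row, two IDs never share parameters, while the admission counter and the expiry pass bound how many rows are kept. In this sketch the counters for filtered IDs are never evicted, which a real system would also need to handle.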
Multiple Instance Learning (MIL), a powerful strategy for weakly supervised learning, is able to perform various prediction tasks on gigapixel Whole Slide Images (WSIs). However, the tens of thousands of patches in WSIs usually incur a vast computational burden for image augmentation, limiting the MIL model's improvement in performance. Currently, the feature augmentation-based MIL framework is a promising solution, but existing methods such as Mixup often produce unrealistic features. To explore a more efficient and practical augmentation method, we introduce the Diffusion Model (DM) into MIL for the first time and propose a feature augmentation framework called AugDiff. Specifically, we employ the generation diversity of DM to improve the quality of feature augmentation and the step-by-step generation property to control the retention of semantic information. We conduct extensive experiments over three distinct cancer datasets, two different feature extractors, and three prevalent MIL algorithms to evaluate the performance of AugDiff. An ablation study and visualizations further verify its effectiveness. Moreover, we highlight AugDiff's higher-quality augmented features over image augmentation and its superiority over self-supervised learning. The generalization over external datasets indicates its broader applications.
Multiple access is the basis for increasing capacity in multi-user communication networks. However, the growing demand for higher data rates and the growing number of users who require communication services have led to a scarcity of orthogonal resources in current wireless communications. On the other hand, integrating satellites within terrestrial networks, a 3GPP initiative since Release 15, entails the need for new forms of multiple access between terrestrial and non-terrestrial users. This paper studies constellation schemes as a new domain to enhance the state-of-the-art multiple-access techniques for future communication technologies employing non-coherent communications with massive MIMO. In addition, we propose a hybrid model that combines classic access methods such as Time Division Multiple Access (TDMA) or Frequency Division Multiple Access (FDMA), the emerging models of non-orthogonal multiple access (NOMA), and the proposed constellation domain based on non-coherent massive multiple-input multiple-output (MIMO) schemes. This model is discussed for different scenarios in satellite communications, where it helps increase the system's capacity and avoid interference between terrestrial and non-terrestrial users.
Newly arriving nodes in dynamic networks gradually cause the node embedding space to drift, making the retraining of node embeddings and downstream models indispensable. An exact threshold on the number of these new nodes, below which the node embedding space will be predictably maintained, is, however, rarely considered in either theory or experiment. From the viewpoint of matrix perturbation theory, a threshold on the maximum number of new nodes that keeps the node embedding space approximately equivalent is analytically provided and empirically validated. It is therefore theoretically guaranteed that, as long as the number of newly arriving nodes is below this threshold, embeddings of these new nodes can be quickly derived from the embeddings of the original nodes. A generation framework, Space-Invariant Projection (SIP), is accordingly proposed to enable arbitrary static MF-based embedding schemes to embed new nodes in dynamic networks quickly. The time complexity of SIP is linear in the network size. By combining SIP with four state-of-the-art MF-based schemes, we show that SIP exhibits not only wide adaptability but also strong empirical performance in terms of efficiency and efficacy on the node classification task on three real datasets.