Physics-informed neural networks (PINNs) have shown promising potential for solving partial differential equations (PDEs) using deep learning. However, PINNs face training difficulties for evolutionary PDEs, particularly for dynamical systems whose solutions exhibit multi-scale or turbulent behavior over time. The reason is that PINNs may violate the temporal causality property since all the temporal features in the PINNs loss are trained simultaneously. This paper proposes to use implicit time differencing schemes to enforce temporal causality, and use transfer learning to sequentially update the PINNs in space as surrogates for PDE solutions in different time frames. The evolving PINNs are better able to capture the varying complexities of the evolutionary equations, while only requiring minor updates between adjacent time frames. Our method is theoretically proven to be convergent if the time step is small and each PINN in different time frames is well-trained. In addition, we provide state-of-the-art (SOTA) numerical results for a variety of benchmarks for which existing PINNs formulations may fail or be inefficient. We demonstrate that the proposed method improves the accuracy of PINNs approximation for evolutionary PDEs and improves efficiency by a factor of 4-40x.
We present a new method for text-driven motion transfer - synthesizing a video that complies with an input text prompt describing the target objects and scene while maintaining an input video's motion and scene layout. Prior methods are confined to transferring motion across two subjects within the same or closely related object categories and are applicable for limited domains (e.g., humans). In this work, we consider a significantly more challenging setting in which the target and source objects differ drastically in shape and fine-grained motion characteristics (e.g., translating a jumping dog into a dolphin). To this end, we leverage a pre-trained and fixed text-to-video diffusion model, which provides us with generative and motion priors. The pillar of our method is a new space-time feature loss derived directly from the model. This loss guides the generation process to preserve the overall motion of the input video while complying with the target object in terms of shape and fine-grained motion traits.
This paper introduces the Poker Hand History (PHH) file format, designed to standardize the recording of poker hands across different game variants. Despite poker's widespread popularity in the mainstream culture as a mind sport and its prominence in the field of artificial intelligence (AI) research as a benchmark for imperfect information AI agents, it lacks a consistent format that humans can use to document poker hands across different variants that can also easily be parsed by machines. To address this gap in the literature, we propose the PHH format which provides a concise human-readable machine-friendly representation of hand history that comprehensively captures various details of the hand, ranging from initial game parameters and actions to contextual parameters including but not limited to the venue, players, and time control information. In the supplementary, we provide over 10,000 hands covering 11 different variants in the PHH format. Building on our previous work on PokerKit, a premier poker hand simulation tool, we demonstrate the usages of our open-source Python implementation of the PHH parser. The source code of the parser is available on GitHub: https://github.com/uoftcprg/pokerkit
Federated Learning (FL) enables collaborative model training among participants while guaranteeing the privacy of raw data. Mainstream FL methodologies overlook the dynamic nature of real-world data, particularly its tendency to grow in volume and diversify in classes over time. This oversight results in FL methods suffering from catastrophic forgetting, where models inadvertently discard previously learned information upon assimilating new data. In response to this challenge, we propose a novel Federated Class-Incremental Learning (FCIL) method, named FCIL with New-Class Augmented Self-Distillation (FedNASD). FedNASD combines new class scores, which are inferred from current models, with historical models' predictions. Based on the combined past and present knowledge, it incorporates self-distillation over models on clients, aiming to achieve effective knowledge transfer from historical models to current models. Theoretical analysis demonstrates that FedNASD is equivalent to modeling old class scores as conditional probabilities in the absence of new classes. Additionally, it reconciles the predictions of new classes with current models to refine the conditional probabilities of historical scores where new classes do not exist. Empirical experiments demonstrate the superiority of FedNASD over four baseline algorithms in reducing the average forgetting rate and boosting global accuracy.
Chest X-ray imaging is a critical diagnostic tool for identifying pulmonary diseases. However, manual interpretation of these images is time-consuming and error-prone. Automated systems utilizing convolutional neural networks (CNNs) have shown promise in improving the accuracy and efficiency of chest X-ray image classification. While previous work has mainly focused on using feature maps from the final convolution layer, there is a need to explore the benefits of leveraging additional layers for improved disease classification. Extracting robust features from limited medical image datasets remains a critical challenge. In this paper, we propose a novel deep learning-based multilayer multimodal fusion model that emphasizes extracting features from different layers and fusing them. Our disease detection model considers the discriminatory information captured by each layer. Furthermore, we propose the fusion of different-sized feature maps (FDSFM) module to effectively merge feature maps from diverse layers. The proposed model achieves a significantly higher accuracy of 97.21% and 99.60% for both three-class and two-class classifications, respectively. The proposed multilayer multimodal fusion model, along with the FDSFM module, holds promise for accurate disease classification and can also be extended to other disease classifications in chest X-ray images.
We propose a Safe Adversarial Trained Actor Critic (SATAC) algorithm for offline reinforcement learning (RL) with general function approximation in the presence of limited data coverage. SATAC operates as a two-player Stackelberg game featuring a refined objective function. The actor (leader player) optimizes the policy against two adversarially trained value critics (follower players), who focus on scenarios where the actor's performance is inferior to the behavior policy. Our framework provides both theoretical guarantees and a robust deep-RL implementation. Theoretically, we demonstrate that when the actor employs a no-regret optimization oracle, SATAC achieves two guarantees: (i) For the first time in the offline RL setting, we establish that SATAC can produce a policy that outperforms the behavior policy while maintaining the same level of safety, which is critical to designing an algorithm for offline RL. (ii) We demonstrate that the algorithm guarantees policy improvement across a broad range of hyperparameters, indicating its practical robustness. Additionally, we offer a practical version of SATAC and compare it with existing state-of-the-art offline safe-RL algorithms in continuous control environments. SATAC outperforms all baselines across a range of tasks, thus validating the theoretical performance.
Pulmonary embolism (PE) is a prevalent lung disease that can lead to right ventricular hypertrophy and failure in severe cases, ranking second in severity only to myocardial infarction and sudden death. Pulmonary artery CT angiography (CTPA) is a widely used diagnostic method for PE. However, PE detection presents challenges in clinical practice due to limitations in imaging technology. CTPA can produce noises similar to PE, making confirmation of its presence time-consuming and prone to overdiagnosis. Nevertheless, the traditional segmentation method of PE can not fully consider the hierarchical structure of features, local and global spatial features of PE CT images. In this paper, we propose an automatic PE segmentation method called SCUNet++ (Swin Conv UNet++). This method incorporates multiple fusion dense skip connections between the encoder and decoder, utilizing the Swin Transformer as the encoder. And fuses features of different scales in the decoder subnetwork to compensate for spatial information loss caused by the inevitable downsampling in Swin-UNet or other state-of-the-art methods, effectively solving the above problem. We provide a theoretical analysis of this method in detail and validate it on publicly available PE CT image datasets FUMPE and CAD-PE. The experimental results indicate that our proposed method achieved a Dice similarity coefficient (DSC) of 83.47% and a Hausdorff distance 95th percentile (HD95) of 3.83 on the FUMPE dataset, as well as a DSC of 83.42% and an HD95 of 5.10 on the CAD-PE dataset. These findings demonstrate that our method exhibits strong performance in PE segmentation tasks, potentially enhancing the accuracy of automatic segmentation of PE and providing a powerful diagnostic tool for clinical physicians. Our source code and new FUMPE dataset are available at https://github.com/JustlfC03/SCUNet-plusplus.
Two-sided matching markets have been widely studied in the literature due to their rich applications. Since participants are usually uncertain about their preferences, online algorithms have recently been adopted to learn them through iterative interactions. \citet{wang2022bandit} initiate the study of this problem in a many-to-one setting with \textit{responsiveness}. However, their results are far from optimal and lack guarantees of incentive compatibility. An extension of \citet{kong2023player} to this more general setting achieves a near-optimal bound for player-optimal regret. Nevertheless, due to the substantial requirement for collaboration, a single player's deviation could lead to a huge increase in its own cumulative rewards and an $O(T)$ regret for others. In this paper, we aim to enhance the regret bound in many-to-one markets while ensuring incentive compatibility. We first propose the adaptively explore-then-deferred-acceptance (AETDA) algorithm for responsiveness setting and derive an $O(N\min\left\{N,K\right\}C\log T/\Delta^2)$ upper bound for player-optimal stable regret while demonstrating its guarantee of incentive compatibility, where $N$ represents the number of players, $K$ is the number of arms, $T$ denotes the time horizon, $C$ is arms' total capacities and $\Delta$ signifies the minimum preference gap among players. This result is a significant improvement over \citet{wang2022bandit}. And to the best of our knowledge, it constitutes the first player-optimal guarantee in matching markets that offers such robust assurances. We also consider broader \textit{substitutable} preferences, one of the most general conditions to ensure the existence of a stable matching and cover responsiveness. We devise an online DA (ODA) algorithm and establish an $O(NK\log T/\Delta^2)$ player-pessimal stable regret bound for this setting.
Stochastic Gradient Descent (SGD) and its variants are almost universally used to train neural networks and to fit a variety of other parametric models. An important hyperparameter in this context is the batch size, which determines how many samples are processed before an update of the parameters occurs. Previous studies have demonstrated the benefits of using variable batch sizes. In this work, we will theoretically derive optimal batch size schedules for SGD and similar algorithms, up to an error that is quadratic in the learning rate. To achieve this, we approximate the discrete process of parameter updates using a family of stochastic differential equations indexed by the learning rate. To better handle the state-dependent diffusion coefficient, we further expand the solution of this family into a series with respect to the learning rate. Using this setup, we derive a continuous-time optimal batch size schedule for a large family of diffusion coefficients and then apply the results in the setting of linear regression.
Intelligent transportation systems (ITS) have been developed to improve traffic flow, efficiency, and safety in transportation. Technological advancements in communication such as the Vehicle-to-Everything (V2X), Vehicle-to-Vehicle (V2V) and Vehicle-to Infrastructure (V2I) enable the real-time exchange of information between vehicles and other entities on the road network, and thus play a significant role in their safety and efficiency. This paper presents a simulation study that models V2V and V2I communication to identify the most suitable range of data transmission between vehicles and infrastructure. The provincial city of Xanthi, Greece is used as a cases study, and the goal is to evaluate whether the proposed placement of Road Side Unit (RSU) provided adequate communication coverage on the city's road network. An analysis through different scenarios identified improvements in traffic management, driving behavior and environmental conditions under different RSU coverage. The results highlight that the communication range of 400 meters is the most adequate option for optimum traffic management in the city of Xanthi.