This paper proposes a gaze correction and animation method for high-resolution, unconstrained portrait images, which can be trained without the gaze angle and the head pose annotations. Common gaze-correction methods usually require annotating training data with precise gaze, and head pose information. Solving this problem using an unsupervised method remains an open problem, especially for high-resolution face images in the wild, which are not easy to annotate with gaze and head pose labels. To address this issue, we first create two new portrait datasets: CelebGaze and high-resolution CelebHQGaze. Second, we formulate the gaze correction task as an image inpainting problem, addressed using a Gaze Correction Module (GCM) and a Gaze Animation Module (GAM). Moreover, we propose an unsupervised training strategy, i.e., Synthesis-As-Training, to learn the correlation between the eye region features and the gaze angle. As a result, we can use the learned latent space for gaze animation with semantic interpolation in this space. Moreover, to alleviate both the memory and the computational costs in the training and the inference stage, we propose a Coarse-to-Fine Module (CFM) integrated with GCM and GAM. Extensive experiments validate the effectiveness of our method for both the gaze correction and the gaze animation tasks in both low and high-resolution face datasets in the wild and demonstrate the superiority of our method with respect to the state of the arts. Code is available at https://github.com/zhangqianhui/GazeAnimationV2
The second-order optimization methods, notably the D-KFAC (Distributed Kronecker Factored Approximate Curvature) algorithms, have gained traction on accelerating deep neural network (DNN) training on GPU clusters. However, existing D-KFAC algorithms require to compute and communicate a large volume of second-order information, i.e., Kronecker factors (KFs), before preconditioning gradients, resulting in large computation and communication overheads as well as a high memory footprint. In this paper, we propose DP-KFAC, a novel distributed preconditioning scheme that distributes the KF constructing tasks at different DNN layers to different workers. DP-KFAC not only retains the convergence property of the existing D-KFAC algorithms but also enables three benefits: reduced computation overhead in constructing KFs, no communication of KFs, and low memory footprint. Extensive experiments on a 64-GPU cluster show that DP-KFAC reduces the computation overhead by 1.55x-1.65x, the communication cost by 2.79x-3.15x, and the memory footprint by 1.14x-1.47x in each second-order update compared to the state-of-the-art D-KFAC methods.
This paper presents the summary report on our DFGC 2022 competition. The DeepFake is rapidly evolving, and realistic face-swaps are becoming more deceptive and difficult to detect. On the contrary, methods for detecting DeepFakes are also improving. There is a two-party game between DeepFake creators and defenders. This competition provides a common platform for benchmarking the game between the current state-of-the-arts in DeepFake creation and detection methods. The main research question to be answered by this competition is the current state of the two adversaries when competed with each other. This is the second edition after the last year's DFGC 2021, with a new, more diverse video dataset, a more realistic game setting, and more reasonable evaluation metrics. With this competition, we aim to stimulate research ideas for building better defenses against the DeepFake threats. We also release our DFGC 2022 dataset contributed by both our participants and ourselves to enrich the DeepFake data resources for the research community (https://github.com/NiCE-X/DFGC-2022).
Social media has been rapidly developing in the public sphere due to its ease of spreading new information, which leads to the circulation of rumors. However, detecting rumors from such a massive amount of information is becoming an increasingly arduous challenge. Previous work generally obtained valuable features from propagation information. It should be noted that most methods only target the propagation structure while ignoring the rumor transmission pattern. This limited focus severely restricts the collection of spread data. To solve this problem, the authors of the present study are motivated to explore the regionalized propagation patterns of rumors. Specifically, a novel region-enhanced deep graph convolutional network (RDGCN) that enhances the propagation features of rumors by learning regionalized propagation patterns and trains to learn the propagation patterns by unsupervised learning is proposed. In addition, a source-enhanced residual graph convolution layer (SRGCL) is designed to improve the graph neural network (GNN) oversmoothness and increase the depth limit of the rumor detection methods-based GNN. Experiments on Twitter15 and Twitter16 show that the proposed model performs better than the baseline approach on rumor detection and early rumor detection.
The morphological changes in knee cartilage (especially femoral and tibial cartilages) are closely related to the progression of knee osteoarthritis, which is expressed by magnetic resonance (MR) images and assessed on the cartilage segmentation results. Thus, it is necessary to propose an effective automatic cartilage segmentation model for longitudinal research on osteoarthritis. In this research, to relieve the problem of inaccurate discontinuous segmentation caused by the limited receptive field in convolutional neural networks, we proposed a novel position-prior clustering-based self-attention module (PCAM). In PCAM, long-range dependency between each class center and feature point is captured by self-attention allowing contextual information re-allocated to strengthen the relative features and ensure the continuity of segmentation result. The clutsering-based method is used to estimate class centers, which fosters intra-class consistency and further improves the accuracy of segmentation results. The position-prior excludes the false positives from side-output and makes center estimation more precise. Sufficient experiments are conducted on OAI-ZIB dataset. The experimental results show that the segmentation performance of combination of segmentation network and PCAM obtains an evident improvement compared to original model, which proves the potential application of PCAM in medical segmentation tasks. The source code is publicly available from link: https://github.com/LeongDong/PCAMNet
Federated learning (FL) is typically performed in a synchronous parallel manner, where the involvement of a slow client delays a training iteration. Current FL systems employ a participant selection strategy to select fast clients with quality data in each iteration. However, this is not always possible in practice, and the selection strategy often has to navigate an unpleasant trade-off between the speed and the data quality of clients. In this paper, we present Pisces, an asynchronous FL system with intelligent participant selection and model aggregation for accelerated training. To avoid incurring excessive resource cost and stale training computation, Pisces uses a novel scoring mechanism to identify suitable clients to participate in a training iteration. It also adapts the pace of model aggregation to dynamically bound the progress gap between the selected clients and the server, with a provable convergence guarantee in a smooth non-convex setting. We have implemented Pisces in an open-source FL platform called Plato, and evaluated its performance in large-scale experiments with popular vision and language models. Pisces outperforms the state-of-the-art synchronous and asynchronous schemes, accelerating the time-to-accuracy by up to 2.0x and 1.9x, respectively.
GSM-R is predicted to be obsoleted by 2030, and a suitable successor is needed. Defined by the International Union of Railways (UIC), the Future Railway Mobile Communication System (FRMCS) contains many future use cases with strict requirements. These use cases should ensure regular communication not only in network coverage but also uncovered scenarios. There is still a lack of standards on off-network communication in FRMCS, so this article focuses on off-network communication and intends to provide reference and direction for standardization. We first provide a comprehensive summary and analysis of off-network use cases in FRMCS. Then we give an overview of existing technologies (GSM-R, TETRA, DMR, LTE-V2X, and NR-V2X) that may support off-network communication. In addition, we simulate and evaluate the performance of existing technologies. Simulation results show that it is possible to satisfy the off-network communication requirements in FRMCS with enhancements based on LTE-V2X or NR-V2X. Finally, we give some future research directions to provide insights for industry and academia.
Neoadjuvant therapy (NAT) for breast cancer is a common treatment option in clinical practice. Tumor cellularity (TC), which represents the percentage of invasive tumors in the tumor bed, has been widely used to quantify the response of breast cancer to NAT. Therefore, automatic TC estimation is significant in clinical practice. However, existing state-of-the-art methods usually take it as a TC score regression problem, which ignores the ambiguity of TC labels caused by subjective assessment or multiple raters. In this paper, to efficiently leverage the label ambiguities, we proposed an Uncertainty-aware Label disTRibution leArning (ULTRA) framework for automatic TC estimation. The proposed ULTRA first converted the single-value TC labels to discrete label distributions, which effectively models the ambiguity among all possible TC labels. Furthermore, the network learned TC label distributions by minimizing the Kullback-Leibler (KL) divergence between the predicted and ground-truth TC label distributions, which better supervised the model to leverage the ambiguity of TC labels. Moreover, the ULTRA mimicked the multi-rater fusion process in clinical practice with a multi-branch feature fusion module to further explore the uncertainties of TC labels. We evaluated the ULTRA on the public BreastPathQ dataset. The experimental results demonstrate that the ULTRA outperformed the regression-based methods for a large margin and achieved state-of-the-art results. The code will be available from https://github.com/PerceptionComputingLab/ULTRA
Following SimCSE, contrastive learning based methods have achieved the state-of-the-art (SOTA) performance in learning sentence embeddings. However, the unsupervised contrastive learning methods still lag far behind the supervised counterparts. We attribute this to the quality of positive and negative samples, and aim to improve both. Specifically, for positive samples, we propose switch-case augmentation to flip the case of the first letter of randomly selected words in a sentence. This is to counteract the intrinsic bias of pre-trained token embeddings to frequency, word cases and subwords. For negative samples, we sample hard negatives from the whole dataset based on a pre-trained language model. Combining the above two methods with SimCSE, our proposed Contrastive learning with Augmented and Retrieved Data for Sentence embedding (CARDS) method significantly surpasses the current SOTA on STS benchmarks in the unsupervised setting.
In this paper, we investigate secure transmissions in integrated satellite-terrestrial communications and the green interference based symbiotic security scheme is proposed. Particularly, the co-channel interference induced by the spectrum sharing between satellite and terrestrial networks and the inter-beam interference due to frequency reuse among satellite multi-beam serve as the green interference to assist the symbiotic secure transmission, where the secure transmissions of both satellite and terrestrial links are guaranteed simultaneously. Specifically, to realize the symbiotic security, we formulate a problem to maximize the sum secrecy rate of satellite users by cooperatively beamforming optimizing and a constraint of secrecy rate of each terrestrial user is guaranteed. Since the formulated problem is non-convex and intractable, the Taylor expansion and semi-definite relaxation (SDR) are adopted to further reformulate this problem, and the successive convex approximation (SCA) algorithm is designed to solve it. Finally, the tightness of the relaxation is proved. In addition, numerical results verify the efficiency of our proposed approach.