With the increasing popularity of E-sport live, Highlight Flashback has been a critical functionality of live platforms, which aggregates the overall exciting fighting scenes in a few seconds. In this paper, we introduce a novel training strategy without any additional annotation to automatically generate highlights for game video live. Considering that the existing manual edited clips contain more highlights than long game live videos, we perform pair-wise ranking constraints across clips from edited and long live videos. A multi-stream framework is also proposed to fuse spatial, temporal as well as audio features extracted from videos. To evaluate our method, we test on long game live videos with an average length of about 15 minutes. Extensive experimental results on videos demonstrate its satisfying performance on highlights generation and effectiveness by the fusion of three streams.
Bayesian deep learning is recently regarded as an intrinsic way to characterize the weight uncertainty of deep neural networks~(DNNs). Stochastic Gradient Langevin Dynamics~(SGLD) is an effective method to enable Bayesian deep learning on large-scale datasets. Previous theoretical studies have shown various appealing properties of SGLD, ranging from the convergence properties to the generalization bounds. In this paper, we study the properties of SGLD from a novel perspective of membership privacy protection (i.e., preventing the membership attack). The membership attack, which aims to determine whether a specific sample is used for training a given DNN model, has emerged as a common threat against deep learning algorithms. To this end, we build a theoretical framework to analyze the information leakage (w.r.t. the training dataset) of a model trained using SGLD. Based on this framework, we demonstrate that SGLD can prevent the information leakage of the training dataset to a certain extent. Moreover, our theoretical analysis can be naturally extended to other types of Stochastic Gradient Markov Chain Monte Carlo (SG-MCMC) methods. Empirical results on different datasets and models verify our theoretical findings and suggest that the SGLD algorithm can not only reduce the information leakage but also improve the generalization ability of the DNN models in real-world applications.
We propose an efficient algorithm for solving orthogonal canonical correlation analysis (OCCA) in the form of trace-fractional structure and orthogonal linear projections. Even though orthogonality has been widely used and proved to be a useful criterion for pattern recognition and feature extraction, existing methods for solving OCCA problem are either numerical unstable by relying on a deflation scheme, or less efficient by directly using generic optimization methods. In this paper, we propose an alternating numerical scheme whose core is the sub-maximization problem in the trace-fractional form with an orthogonal constraint. A customized self-consistent-field (SCF) iteration for this sub-maximization problem is devised. It is proved that the SCF iteration is globally convergent to a KKT point and that the alternating numerical scheme always converges. We further formulate a new trace-fractional maximization problem for orthogonal multiset CCA (OMCCA) and then propose an efficient algorithm with an either Jacobi-style or Gauss-Seidel-style updating scheme based on the same SCF iteration. Extensive experiments are conducted to evaluate the proposed algorithms against existing methods including two real world applications: multi-label classification and multi-view feature extraction. Experimental results show that our methods not only perform competitively to or better than baselines but also are more efficient.
In this paper, we aim to understand the generalization properties of generative adversarial networks (GANs) from a new perspective of privacy protection. Theoretically, we prove that a differentially private learning algorithm used for training the GAN does not overfit to a certain degree, i.e., the generalization gap can be bounded. Moreover, some recent works, such as the Bayesian GAN, can be re-interpreted based on our theoretical insight from privacy protection. Quantitatively, to evaluate the information leakage of well-trained GAN models, we perform various membership attacks on these models. The results show that previous Lipschitz regularization techniques are effective in not only reducing the generalization gap but also alleviating the information leakage of the training dataset.
Brain source imaging is an important method for noninvasively characterizing brain activity using Electroencephalogram (EEG) or Magnetoencephalography (MEG) recordings. Traditional EEG/MEG Source Imaging (ESI) methods usually assume that either source activity at different time points is unrelated, or that similar spatiotemporal patterns exist across an entire study period. The former assumption makes ESI analyses sensitive to noise, while the latter renders ESI analyses unable to account for time-varying patterns of activity. To effectively deal with noise while maintaining flexibility and continuity among brain activation patterns, we propose a novel probabilistic ESI model based on a hierarchical graph prior. Under our method, a spanning tree constraint ensures that activity patterns have spatiotemporal continuity. An efficient algorithm based on alternating convex search is presented to solve the proposed model and is provably convergent. Comprehensive numerical studies using synthetic data on a real brain model are conducted under different levels of signal-to-noise ratio (SNR) from both sensor and source spaces. We also examine the EEG/MEG data in a real application, in which our ESI reconstructions are neurologically plausible. All the results demonstrate significant improvements of the proposed algorithm over the benchmark methods in terms of source localization performance, especially at high noise levels.
Value functions are crucial for model-free Reinforcement Learning (RL) to obtain a policy implicitly or guide the policy updates. Value estimation heavily depends on the stochasticity of environmental dynamics and the quality of reward signals. In this paper, we propose a two-step understanding of value estimation from the perspective of future prediction, through decomposing the value function into a reward-independent future dynamics part and a policy-independent trajectory return part. We then derive a practical deep RL algorithm from the above decomposition, consisting of a convolutional trajectory representation model, a conditional variational dynamics model to predict the expected representation of future trajectory and a convex trajectory return model that maps a trajectory representation to its return. Our algorithm is evaluated in MuJoCo continuous control tasks and shows superior results under both common settings and delayed reward settings.
Glioma grading before the surgery is very critical for the prognosis prediction and treatment plan making. In this paper, we present a novel scattering wavelet-based radiomics method to predict noninvasively and accurately the glioma grades. The multimodal magnetic resonance images of 285 patients were used, with the intratumoral and peritumoral regions well labeled. The wavelet scattering-based features and traditional radiomics features were firstly extracted from both intratumoral and peritumoral regions respectively. The support vector machine (SVM), logistic regression (LR) and random forest (RF) were then trained with 5-fold cross validation to predict the glioma grades. The prediction obtained with different features was finally evaluated in terms of quantitative metrics. The area under the receiver operating characteristic curve (AUC) of glioma grade prediction based on scattering wavelet features was up to 0.99 when considering both intratumoral and peritumoral features in multimodal images, which increases by about 17% compared to traditional radiomics. Such results shown that the local invariant features extracted from the scattering wavelet transform allows improving the prediction accuracy for glioma grading. In addition, the features extracted from peritumoral regions further increases the accuracy of glioma grading.
We proposed a novel convolutional restricted Boltzmann machine CRBM-based radiomic method for predicting pathologic complete response (pCR) to neoadjuvant chemotherapy treatment (NACT) in breast cancer. The method consists of extracting semantic features from CRBM network, and pCR prediction. It was evaluated on the dynamic contrast-enhanced magnetic resonance imaging (DCE-MRI) data of 57 patients and using the area under the receiver operating characteristic curve (AUC). Traditional radiomics features and the semantic features learned from CRBM network were extracted from the images acquired before and after the administration of NACT. After the feature selection, the support vector machine (SVM), logistic regression (LR) and random forest (RF) were trained to predict the pCR status. Compared to traditional radiomic methods, the proposed CRBM-based radiomic method yielded an AUC of 0.92 for the prediction with the images acquired before and after NACT, and an AUC of 0.87 for the pretreatment prediction, which was increased by about 38%. The results showed that the CRBM-based radiomic method provided a potential means for accurately predicting the pCR to NACT in breast cancer before the treatment, which is very useful for making more appropriate and personalized treatment regimens.