Since the deep learning model is highly dependent on hyperparameters, hyperparameter optimization is essential in developing deep learning model-based applications, even if it takes a long time. As service development using deep learning models has gradually become competitive, many developers highly demand rapid hyperparameter optimization algorithms. In order to keep pace with the needs of faster hyperparameter optimization algorithms, researchers are focusing on improving the speed of hyperparameter optimization algorithm. However, the huge time consumption of hyperparameter optimization due to the high computational cost of the deep learning model itself has not been dealt with in-depth. Like using surrogate model in Bayesian optimization, to solve this problem, it is necessary to consider proxy model for a neural network (N_B) to be used for hyperparameter optimization. Inspired by the main goal of neural network pruning, i.e., high computational cost reduction and performance preservation, we presumed that the neural network (N_P) obtained through neural network pruning would be a good proxy model of N_B. In order to verify our idea, we performed extensive experiments by using CIFAR10, CFIAR100, and TinyImageNet datasets and three generally-used neural networks and three representative hyperparameter optmization methods. Through these experiments, we verified that N_P can be a good proxy model of N_B for rapid hyperparameter optimization. The proposed hyperparameter optimization framework can reduce the amount of time up to 37%.
We attempt for the first time to address the problem of gait transfer. In contrast to motion transfer, the objective here is not to imitate the source's normal motions, but rather to transform the source's motion into a typical gait pattern for the target. Using gait recognition models, we demonstrate that existing techniques yield a discrepancy that can be easily detected. We introduce a novel model, Cycle Transformers GAN (CTrGAN), that can successfully generate the target's natural gait. CTrGAN's generators consist of a decoder and encoder, both Transformers, where the attention is on the temporal domain between complete images rather than the spatial domain between patches. While recent Transformer studies in computer vision mainly focused on discriminative tasks, we introduce an architecture that can be applied to synthesis tasks. Using a widely-used gait recognition dataset, we demonstrate that our approach is capable of producing over an order of magnitude more realistic personalized gaits than existing methods, even when used with sources that were not available during training.
Modeling time-evolving preferences of users with their sequential item interactions, has attracted increasing attention in many online applications. Hence, sequential recommender systems have been developed to learn the dynamic user interests from the historical interactions for suggesting items. However, the interaction pattern encoding functions in most existing sequential recommender systems have focused on single type of user-item interactions. In many real-life online platforms, user-item interactive behaviors are often multi-typed (e.g., click, add-to-favorite, purchase) with complex cross-type behavior inter-dependencies. Learning from informative representations of users and items based on their multi-typed interaction data, is of great importance to accurately characterize the time-evolving user preference. In this work, we tackle the dynamic user-item relation learning with the awareness of multi-behavior interactive patterns. Towards this end, we propose a new Temporal Graph Transformer (TGT) recommendation framework to jointly capture dynamic short-term and long-range user-item interactive patterns, by exploring the evolving correlations across different types of behaviors. The new TGT method endows the sequential recommendation architecture to distill dedicated knowledge for type-specific behavior relational context and the implicit behavior dependencies. Experiments on the real-world datasets indicate that our method TGT consistently outperforms various state-of-the-art recommendation methods. Our model implementation codes are available at https://github.com/akaxlh/TGT.
We present a real-time neural radiance caching method for path-traced global illumination. Our system is designed to handle fully dynamic scenes, and makes no assumptions about the lighting, geometry, and materials. The data-driven nature of our approach sidesteps many difficulties of caching algorithms, such as locating, interpolating, and updating cache points. Since pretraining neural networks to handle novel, dynamic scenes is a formidable generalization challenge, we do away with pretraining and instead achieve generalization via adaptation, i.e. we opt for training the radiance cache while rendering. We employ self-training to provide low-noise training targets and simulate infinite-bounce transport by merely iterating few-bounce training updates. The updates and cache queries incur a mild overhead -- about 2.6ms on full HD resolution -- thanks to a streaming implementation of the neural network that fully exploits modern hardware. We demonstrate significant noise reduction at the cost of little induced bias, and report state-of-the-art, real-time performance on a number of challenging scenarios.
Mental health disorders may cause severe consequences on all the countries' economies and health. For example, the impacts of the COVID-19 pandemic, such as isolation and travel ban, can make us feel depressed. Identifying early signs of mental health disorders is vital. For example, depression may increase an individual's risk of suicide. The state-of-the-art research in identifying mental disorder patterns from textual data, uses hand-labelled training sets, especially when a domain expert's knowledge is required to analyse various symptoms. This task could be time-consuming and expensive. To address this challenge, in this paper, we study and analyse the various clinical and non-clinical approaches to identifying mental health disorders. We leverage the domain knowledge and expertise in cognitive science to build a domain-specific Knowledge Base (KB) for the mental health disorder concepts and patterns. We present a weaker form of supervision by facilitating the generating of training data from a domain-specific Knowledge Base (KB). We adopt a typical scenario for analysing social media to identify major depressive disorder symptoms from the textual content generated by social users. We use this scenario to evaluate how our knowledge-based approach significantly improves the quality of results.
The goal of a recommendation system is to model the relevance between each user and each item through the user-item interaction history, so that maximize the positive samples score and minimize negative samples. Currently, two popular loss functions are widely used to optimize recommender systems: the pointwise and the pairwise. Although these loss functions are widely used, however, there are two problems. (1) These traditional loss functions do not fit the goals of recommendation systems adequately and utilize prior knowledge information sufficiently. (2) The slow convergence speed of these traditional loss functions makes the practical application of various recommendation models difficult. To address these issues, we propose a novel loss function named Supervised Personalized Ranking (SPR) Based on Prior Knowledge. The proposed method improves the BPR loss by exploiting the prior knowledge on the interaction history of each user or item in the raw data. Unlike BPR, instead of constructing <user, positive item, negative item> triples, the proposed SPR constructs <user, similar user, positive item, negative item> quadruples. Although SPR is very simple, it is very effective. Extensive experiments show that our proposed SPR not only achieves better recommendation performance, but also significantly accelerates the convergence speed, resulting in a significant reduction in the required training time.
Coronavirus disease or COVID-19 is an infectious disease caused by the SARS-CoV-2 virus. The first confirmed case caused by this virus was found at the end of December 2019 in Wuhan City, China. This case then spread throughout the world, including Indonesia. Therefore, the COVID-19 case was designated as a global pandemic by WHO. The growth of COVID-19 cases, especially in Indonesia, can be predicted using several approaches, such as the Deep Neural Network (DNN). One of the DNN models that can be used is Deep Transformer which can predict time series. The model is trained with several test scenarios to get the best model. The evaluation is finding the best hyperparameters. Then, further evaluation was carried out using the best hyperparameters setting of the number of prediction days, the optimizer, the number of features, and comparison with the former models of the Long Short-Term Memory (LSTM) and Recurrent Neural Network (RNN). All evaluations used metric of the Mean Absolute Percentage Error (MAPE). Based on the results of the evaluations, Deep Transformer produces the best results when using the Pre-Layer Normalization and predicting one day ahead with a MAPE value of 18.83. Furthermore, the model trained with the Adamax optimizer obtains the best performance among other tested optimizers. The performance of the Deep Transformer also exceeds other test models, which are LSTM and RNN.
Multivariate time series prediction has attracted a lot of attention because of its wide applications such as intelligence transportation, AIOps. Generative models have achieved impressive results in time series modeling because they can model data distribution and take noise into consideration. However, many existing works can not be widely used because of the constraints of functional form of generative models or the sensitivity to hyperparameters. In this paper, we propose ScoreGrad, a multivariate probabilistic time series forecasting framework based on continuous energy-based generative models. ScoreGrad is composed of time series feature extraction module and conditional stochastic differential equation based score matching module. The prediction can be achieved by iteratively solving reverse-time SDE. To the best of our knowledge, ScoreGrad is the first continuous energy based generative model used for time series forecasting. Furthermore, ScoreGrad achieves state-of-the-art results on six real-world datasets. The impact of hyperparameters and sampler types on the performance are also explored. Code is available at https://github.com/yantijin/ScoreGradPred.
Freehand 3D ultrasound (US) has important clinical value due to its low cost and unrestricted field of view. Recently deep learning algorithms have removed its dependence on bulky and expensive external positioning devices. However, improving reconstruction accuracy is still hampered by difficult elevational displacement estimation and large cumulative drift. In this context, we propose a novel deep motion network (MoNet) that integrates images and a lightweight sensor known as the inertial measurement unit (IMU) from a velocity perspective to alleviate the obstacles mentioned above. Our contribution is two-fold. First, we introduce IMU acceleration for the first time to estimate elevational displacements outside the plane. We propose a temporal and multi-branch structure to mine the valuable information of low signal-to-noise ratio (SNR) acceleration. Second, we propose a multi-modal online self-supervised strategy that leverages IMU information as weak labels for adaptive optimization to reduce drift errors and further ameliorate the impacts of acceleration noise. Experiments show that our proposed method achieves the superior reconstruction performance, exceeding state-of-the-art methods across the board.
In this paper, we propose a supervised learning approach based on an Artificial Neural Network (ANN) model for real-time classification of subtasks in a physical human-robot interaction (pHRI) task involving contact with a stiff environment. In this regard, we consider three subtasks for a given pHRI task: Idle, Driving, and Contact. Based on this classification, the parameters of an admittance controller that regulates the interaction between human and robot are adjusted adaptively in real time to make the robot more transparent to the operator (i.e. less resistant) during the Driving phase and more stable during the Contact phase. The Idle phase is primarily used to detect the initiation of task. Experimental results have shown that the ANN model can learn to detect the subtasks under different admittance controller conditions with an accuracy of 98% for 12 participants. Finally, we show that the admittance adaptation based on the proposed subtask classifier leads to 20% lower human effort (i.e. higher transparency) in the Driving phase and 25% lower oscillation amplitude (i.e. higher stability) during drilling in the Contact phase compared to an admittance controller with fixed parameters.