In this work, we present a learning-based approach to analyzing cyberspace security configurations. Unlike prior methods, our approach can learn from past experience and improve over time. In particular, as we train over a greater number of agents acting as attackers, our method becomes better at discovering attack paths that remain hidden to previous methods, especially in multi-domain cyberspace. To achieve these results, we pose attack path discovery as a Reinforcement Learning (RL) problem and train an agent to discover multi-domain cyberspace attack paths. To enable our RL policy to discover more hidden attack paths and shorter attack paths, we introduce a multi-domain action selection module into RL. Our objective is to discover more hidden attack paths and shorter attack paths, and thereby to analyze the weaknesses of a cyberspace security configuration. Finally, we design a simulated cyberspace experimental environment to verify the proposed method. The experimental results show that our method discovers more hidden multi-domain attack paths and shorter attack paths than existing baseline methods.
In evolutionary robotics, several approaches have been shown to be capable of the joint optimisation of body-plans and controllers, either by using only evolution or by combining evolution and learning. When working in rich morphological spaces, it is common for offspring to have body-plans that are very different from those of either parent, which can cause difficulties with respect to inheriting a suitable controller. To address this, we propose a framework that combines an evolutionary algorithm to generate body-plans and a learning algorithm to optimise the parameters of a neural controller, where the topology of this controller is created once the body-plan of each offspring is generated. The key novelty of the approach is to add an external archive for storing learned controllers that map to explicit `types' of robots (where a type is defined with respect to the features of the body-plan). By inheriting an appropriate controller from the archive rather than learning from a randomly initialised one, we show that both the speed and magnitude of learning increase over time when compared to an approach that starts from scratch, using three different test-beds. The framework also provides new insights into the complex interactions between evolution and learning, and the role of morphological intelligence in robot design.
We present a novel method for single-image depth estimation using surface normal constraints. Existing depth estimation methods either suffer from a lack of geometric constraints or are limited by the difficulty of reliably capturing geometric context, which creates a bottleneck for depth estimation quality. We therefore introduce a simple yet effective method, named the Adaptive Surface Normal (ASN) constraint, to correlate depth estimation with geometric consistency. Our key idea is to adaptively determine the reliable local geometry from a set of randomly sampled candidates in order to derive the surface normal constraint, for which we measure the consistency of the geometric contextual features. As a result, our method can faithfully reconstruct the 3D geometry and is robust to local shape variations, such as boundaries, sharp corners, and noise. We conduct extensive evaluations and comparisons using public datasets. The experimental results demonstrate that our method outperforms state-of-the-art methods and has superior efficiency and robustness.
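As a rough illustration of deriving a surface normal from sampled candidates, the sketch below computes one normal per candidate point triplet and combines them with per-candidate weights. The function names, the use of point triplets, and the externally supplied weights are all assumptions made for illustration; the abstract does not specify how candidates are formed or how the consistency of geometric contextual features is turned into a weighting.

```python
import math

def triangle_normal(p0, p1, p2):
    """Unit normal of the plane through three 3D points (cross product of two edges)."""
    u = [b - a for a, b in zip(p0, p1)]
    v = [b - a for a, b in zip(p0, p2)]
    n = [
        u[1] * v[2] - u[2] * v[1],
        u[2] * v[0] - u[0] * v[2],
        u[0] * v[1] - u[1] * v[0],
    ]
    length = math.sqrt(sum(c * c for c in n))
    if length == 0.0:
        raise ValueError("degenerate (collinear) triplet")
    return [c / length for c in n]

def weighted_normal(triplets, weights):
    """Combine candidate normals using hypothetical reliability weights.

    `weights` stands in for whatever consistency score a model would assign
    to each candidate; here it is simply passed in by the caller.
    """
    acc = [0.0, 0.0, 0.0]
    for (p0, p1, p2), w in zip(triplets, weights):
        n = triangle_normal(p0, p1, p2)
        acc = [a + w * c for a, c in zip(acc, n)]
    length = math.sqrt(sum(c * c for c in acc))
    if length == 0.0:
        raise ValueError("weighted candidates cancel out")
    return [c / length for c in acc]
```

Giving a low weight to a candidate that straddles a depth boundary would reduce its influence on the combined normal, which is the intuition behind selecting reliable local geometry.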
Domain adaptation (DA) enables knowledge transfer from a labeled source domain to an unlabeled target domain by reducing the cross-domain distribution discrepancy. Most prior DA approaches leverage complicated and powerful deep neural networks to improve the adaptation capacity and have shown remarkable success. However, they may lack applicability to real-world situations such as real-time interaction, where low target inference latency is an essential requirement under a limited computational budget. In this paper, we tackle the problem by proposing a dynamic domain adaptation (DDA) framework, which can simultaneously achieve efficient target inference in low-resource scenarios and inherit the favorable cross-domain generalization brought by DA. In contrast to static models, DDA is a simple yet generic method that can integrate various domain confusion constraints into any typical adaptive network, where multiple intermediate classifiers can be equipped to infer "easier" and "harder" target data dynamically. Moreover, we present two novel strategies to further boost the adaptation performance of multiple prediction exits: 1) a confidence score learning strategy to derive accurate target pseudo labels by fully exploring the prediction consistency of different classifiers; 2) a class-balanced self-training strategy to explicitly adapt multi-stage classifiers from source to target without losing prediction diversity. Extensive experiments on multiple benchmarks are conducted to verify that DDA can consistently improve the adaptation performance and accelerate target inference under domain shift and limited-resource scenarios.
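The idea of letting intermediate classifiers handle "easier" inputs can be sketched as a generic early-exit loop. This is a minimal sketch under stated assumptions, not the DDA implementation: the exit rule (maximum softmax probability against a fixed `threshold`), the function names, and the representation of each exit as a plain callable returning logits are all hypothetical.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def predict_with_early_exit(x, exits, threshold=0.9):
    """Run exit classifiers from shallow to deep and stop at the first confident one.

    `exits` is a list of callables mapping an input to class logits (a stand-in
    for intermediate classifiers). Returns (predicted_class, exit_index).
    """
    if not exits:
        raise ValueError("at least one exit classifier is required")
    for index, exit_fn in enumerate(exits):
        probs = softmax(exit_fn(x))
        confidence = max(probs)
        is_last = index == len(exits) - 1
        # Deeper exits are only evaluated when earlier ones are not confident.
        if confidence >= threshold or is_last:
            return probs.index(confidence), index
```

With this rule, an input that the first exit classifies confidently never reaches the deeper (more expensive) exits, which is where the inference savings come from.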
Domain adaptation has been widely explored by transferring knowledge from a label-rich source domain to a related but unlabeled target domain. Most existing domain adaptation algorithms focus on adapting feature representations across the two domains with the guidance of a shared source-supervised classifier. However, such a classifier limits the generalization ability towards unlabeled target recognition. To remedy this, we propose a Transferable Semantic Augmentation (TSA) approach to enhance the classifier adaptation ability by implicitly generating source features towards target semantics. Specifically, TSA is inspired by the fact that a deep feature transformation towards a certain direction can be represented as a meaningful semantic alteration in the original input space. Thus, source features can be augmented so that they are equipped with target semantics, in order to train a more transferable classifier. To achieve this, for each class, we first use the inter-domain feature mean difference and the target intra-class feature covariance to construct a multivariate normal distribution. Then we augment source features in a class-wise manner with random directions sampled from the distribution. Interestingly, such source augmentation is implicitly implemented through an expected transferable cross-entropy loss over the augmented source distribution, where an upper bound of the expected loss is derived and minimized, introducing negligible computational overhead. As a light-weight and general technique, TSA can be easily plugged into various domain adaptation methods, bringing remarkable improvements. Comprehensive experiments on cross-domain benchmarks validate the efficacy of TSA.
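The class-wise augmentation step can be illustrated with explicit sampling. Note the assumptions: this sketch samples directions directly, whereas the abstract states that TSA performs the augmentation implicitly through an upper bound on the expected loss; it also simplifies the covariance to a diagonal (per-dimension variance), introduces a hypothetical `strength` scale, and assumes the target features passed in already belong to the class in question (how unlabeled target samples are assigned to classes is not covered here).

```python
import random
from statistics import fmean, pvariance

def class_stats(features):
    """Per-dimension mean and variance of equal-length feature vectors."""
    dims = list(zip(*features))
    means = [fmean(d) for d in dims]
    variances = [pvariance(d) for d in dims]
    return means, variances

def augment_source(source_feats, target_feats, rng, strength=1.0):
    """Shift each source feature of one class by a random direction.

    Directions are drawn per dimension from a normal distribution with
    mean strength * (mu_target - mu_source) and variance strength * var_target
    (diagonal simplification of the target intra-class covariance).
    """
    mu_s, _ = class_stats(source_feats)
    mu_t, var_t = class_stats(target_feats)
    augmented = []
    for feat in source_feats:
        direction = [
            rng.gauss(strength * (mt - ms), (strength * vt) ** 0.5)
            for ms, mt, vt in zip(mu_s, mu_t, var_t)
        ]
        augmented.append([x + d for x, d in zip(feat, direction)])
    return augmented
```

A call such as `augment_source(src, tgt, random.Random(0))` would be repeated once per class, so that each class gets its own mean shift and spread.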
We study the problem of word-level confidence estimation in subword-based end-to-end (E2E) models for automatic speech recognition (ASR). Although prior works have proposed training auxiliary confidence models for ASR systems, they do not extend naturally to systems that operate on word-pieces (WP) as their vocabulary. In particular, ground truth WP correctness labels are needed for training confidence models, but the non-unique tokenization from word to WP causes inaccurate labels to be generated. This paper proposes and studies two confidence models of increasing complexity to solve this problem. The final model uses self-attention to directly learn word-level confidence without needing subword tokenization, and exploits full context features from multiple hypotheses to improve confidence accuracy. Experiments on Voice Search and long-tail test sets show standard metrics (e.g., NCE, AUC, RMSE) improving substantially. The proposed confidence module also enables a model selection approach to combine an on-device E2E model with a hybrid model on the server to address the rare word recognition problem for the E2E model.
In this work, we consider the time-variant, doubly-spread underwater acoustic channel. Since conventional channel estimation and decision feedback equalizers (DFE) cannot work well for this type of channel, we propose path-specific underwater acoustic channel tracking based on the Kalman filter framework. We provide a simplified sound propagation model as the state transition model and propose a multipath tracker that is tolerant of model mismatch. With these, we obtain the time-variant number of paths and path-specific parameters such as delay and Doppler scaling factor. We also consider an application of the proposed channel tracking: we propose two types of passive time reversal mirror (PTRM) that use our path-specific parameters for time-variant and doubly-spread underwater acoustic channels. With the path-specific parameters obtained by the proposed channel tracking, the proposed PTRM can match not only the time dispersion, as a conventional PTRM does, but also the doubly-spread channel, since the path-specific delay and Doppler scaling factor help to match the channel in both the time and frequency domains. For extensively doubly-spread channels, we can further apply path-specific compensation to the PTRM. Both simulations and experimental results using data from the 2016 Qiandao Lake experiment show the efficiency of the proposed path-specific channel tracking and the proposed PTRMs with path-specific parameters.
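For readers unfamiliar with the Kalman filter framework mentioned above, the following is a minimal scalar sketch of the predict/update cycle applied to a single tracked quantity such as one path delay. The random-walk state transition, the scalar state, and all names and noise parameters are assumptions chosen for brevity; they do not reproduce the simplified sound propagation model or the multipath tracker of the abstract.

```python
def kalman_step(x, p, z, q, r):
    """One predict/update cycle for a scalar random-walk state.

    x, p: previous state estimate and its variance
    z:    new measurement of the state
    q, r: process and measurement noise variances (assumed known)
    """
    # Predict: random-walk model, so the state carries over and uncertainty grows.
    x_pred = x
    p_pred = p + q
    # Update: blend prediction and measurement using the Kalman gain.
    gain = p_pred / (p_pred + r)
    x_new = x_pred + gain * (z - x_pred)
    p_new = (1.0 - gain) * p_pred
    return x_new, p_new

def track(measurements, x0, p0, q, r):
    """Filter a sequence of measurements and return the state estimates."""
    x, p = x0, p0
    estimates = []
    for z in measurements:
        x, p = kalman_step(x, p, z, q, r)
        estimates.append(x)
    return estimates
```

A larger `q` lets the estimate follow fast channel variations more closely at the cost of passing through more measurement noise, which is one simple way a filter can be made more tolerant of an imperfect state model.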
In this work, we focus on the model-mismatch problem for model-based subspace channel tracking in the correlated underwater acoustic channel. A model based on the underwater acoustic channel's correlation can be used as the state-space model in the Kalman filter to improve underwater acoustic channel tracking compared with tracking without a model. Even though the data support the assumption that the model is slow-varying and uncorrelated to some degree, we cannot ignore the model-mismatch problem if we want to improve the tracking performance further, because most channel models encounter this problem in the underwater acoustic channel. Therefore, in this work, we provide a dynamic time-variant state-space model for underwater acoustic channel tracking. This model is tolerant of the slight correlation that remains after decorrelation. Moreover, a forward-backward Kalman filter is incorporated to further improve the tracking performance. The performance of our proposed algorithm is demonstrated with the same at-sea data as that used for conventional channel tracking. Compared with the conventional algorithms, the proposed algorithm shows significant improvement, especially in rough sea conditions in which the channels are fast-varying.
The research community has proposed copious modifications to the Transformer architecture since it was introduced over three years ago, relatively few of which have seen widespread adoption. In this paper, we comprehensively evaluate many of these modifications in a shared experimental setting that covers most of the common uses of the Transformer in natural language processing. Surprisingly, we find that most modifications do not meaningfully improve performance. Furthermore, most of the Transformer variants we found beneficial were either developed in the same codebase that we used or are relatively minor changes. We conjecture that performance improvements may strongly depend on implementation details and correspondingly make some recommendations for improving the generality of experimental results.
In this paper, we study the problem of automatic singer identification (SID) in popular music recordings, which aims to recognize who sang a given song. The main challenge lies in the fact that a singer's voice changes over time and is intertwined with the signal of the background accompaniment in the time domain. To handle this challenge, we propose KNN-Net for SID, a deep neural network model whose goal is to learn local timbre feature representations from the mixture of singing voice and background music. Unlike other deep neural networks that use a softmax layer as the output layer, we instead utilize KNN as a more interpretable layer to output target singer labels. Moreover, an attention mechanism is introduced for the first time to highlight crucial timbre features for SID. Experiments on the existing artist20 dataset show that the proposed approach outperforms the state-of-the-art method by 4%. We also create the singer32 and singer60 datasets, consisting of Chinese pop music, to evaluate the reliability of the proposed method. These more extensive experiments additionally indicate that our proposed model achieves a significant performance improvement compared to the state-of-the-art methods.
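Using KNN in place of a softmax output can be sketched as a majority vote among the nearest stored embeddings. This is a generic k-nearest-neighbour classifier under stated assumptions, not the KNN-Net layer itself: the Euclidean distance, the value of `k`, and the toy two-dimensional embeddings are illustrative choices, and the feature extractor that would produce the embeddings is omitted.

```python
import math
from collections import Counter

def knn_predict(query, embeddings, labels, k=3):
    """Label a query embedding by majority vote among its k nearest neighbours.

    `embeddings` and `labels` are parallel lists of reference feature vectors
    and their singer labels; distances are Euclidean.
    """
    if k < 1 or k > len(embeddings):
        raise ValueError("k must be between 1 and the number of references")
    ranked = sorted(
        zip(embeddings, labels),
        key=lambda pair: math.dist(query, pair[0]),
    )
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]
```

Because the prediction can be traced back to the specific reference clips that voted for it, this kind of output is easier to inspect than a softmax score, which is one reading of the interpretability argument above.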