Terahertz (THz) systems are capable of supporting ultra-high data rates thanks to large bandwidth, and the potential to harness high-gain beamforming to combat high pathloss. In this paper, a novel quantum sensing (Ghost Imaging (GI)) based beam training is proposed for Simultaneously Transmitting and Reflecting Reconfigurable Intelligent Surface (STAR RIS) aided THz multi-user massive MIMO systems. We first conduct GI by surrounding 5G downlink signals to obtain 3D images of the environment including users and obstacles. Based on the information, we calculate the optimal position of the UAV-mounted STAR by the proposed algorithm. Thus the position-based beam training can be performed. To enhance the beam-forming gain, we further combine with channel estimation and propose a semi-passive structure of the STAR and ambiguity elimination scheme for separated channel estimation. Thus the ambiguity in cascaded channel estimation, which may affect optimal passive beamforming, is avoided. The optimal active and passive beamforming are then carried out and data transmission is initiated. The proposed BS sub-array and sub-STAR spatial multiplexing architecture, optimal active and passive beamforming, digital precoding, and optimal position of the UAV- mounted STAR are investigated jointly to maximize the average achievable sum rate of the users. Moreover, the cloud radio access networks (CRAN) structured 5G downlink signal is proposed for GI with enhanced resolution. The simulation results show that the proposed scheme achieves beam training and separated channel estimation efficiently, and increases the spectral efficiency dramatically compared to the case when the STAR operates with random phase.
Computing is a critical driving force in the development of human civilization. In recent years, we have witnessed the emergence of intelligent computing, a new computing paradigm that is reshaping traditional computing and promoting digital revolution in the era of big data, artificial intelligence and internet-of-things with new computing theories, architectures, methods, systems, and applications. Intelligent computing has greatly broadened the scope of computing, extending it from traditional computing on data to increasingly diverse computing paradigms such as perceptual intelligence, cognitive intelligence, autonomous intelligence, and human-computer fusion intelligence. Intelligence and computing have undergone paths of different evolution and development for a long time but have become increasingly intertwined in recent years: intelligent computing is not only intelligence-oriented but also intelligence-driven. Such cross-fertilization has prompted the emergence and rapid advancement of intelligent computing. Intelligent computing is still in its infancy and an abundance of innovations in the theories, systems, and applications of intelligent computing are expected to occur soon. We present the first comprehensive survey of literature on intelligent computing, covering its theory fundamentals, the technological fusion of intelligence and computing, important applications, challenges, and future perspectives. We believe that this survey is highly timely and will provide a comprehensive reference and cast valuable insights into intelligent computing for academic and industrial researchers and practitioners.
Faced with the threat of identity leakage during voice data publishing, users are engaged in a privacy-utility dilemma when enjoying convenient voice services. Existing studies employ direct modification or text-based re-synthesis to de-identify users' voices, but resulting in inconsistent audibility in the presence of human participants. In this paper, we propose a voice de-identification system, which uses adversarial examples to balance the privacy and utility of voice services. Instead of typical additive examples inducing perceivable distortions, we design a novel convolutional adversarial example that modulates perturbations into real-world room impulse responses. Benefit from this, our system could preserve user identity from exposure by Automatic Speaker Identification (ASI) while remaining the voice perceptual quality for non-intrusive de-identification. Moreover, our system learns a compact speaker distribution through a conditional variational auto-encoder to sample diverse target embeddings on demand. Combining diverse target generation and input-specific perturbation construction, our system enables any-to-any identify transformation for adaptive de-identification. Experimental results show that our system could achieve 98% and 79% successful de-identification on mainstream ASIs and commercial systems with an objective Mel cepstral distortion of 4.31dB and a subjective mean opinion score of 4.48.
Terahertz (THz) systems have the benefit of high bandwidth and hence are capable of supporting ultra-high data rates, albeit at the cost of high pathloss. Hence they tend to harness high-gain beamforming. Therefore a novel hybrid 3D beamformer relying on sophisticated sensor-based beam training and channel estimation is proposed for Reconfigurable Intelligent Surface (RIS) aided THz MIMO systems. A so-called array-of-subarray based THz BS architecture is adopted and the corresponding sub-RIS structure is proposed. The BS, RIS and receiver antenna arrays of the users are all uniform planar arrays (UPAs). The Ultra-wideband (UWB) sensors are integrated into the RIS and the user location information obtained by the UWB sensors is exploited for channel estimation and beamforming. Furthermore, the novel concept of a Precise Beamforming Algorithm (PBA) is proposed, which further improves the beam-forming accuracy by circumventing the performance limitations imposed by positioning errors. Moreover, the conditions of maintaining the orthogonality of the RIS-aided THz channel are derived in support of spatial multiplexing. The closed-form expressions of the near-field and far-field path-loss are also derived. Our simulation results show that the proposed scheme accurately estimates the RIS-aided THz channel and the spectral efficiency is much improved, despite its low complexity. This makes our solution eminently suitable for delay-sensitive applications.
Fine-grained human action recognition is a core research topic in computer vision. Inspired by the recently proposed hierarchy representation of fine-grained actions in FineGym and SlowFast network for action recognition, we propose a novel multi-task network which exploits the FineGym hierarchy representation to achieve effective joint learning and prediction for fine-grained human action recognition. The multi-task network consists of three pathways of SlowOnly networks with gradually increased frame rates for events, sets and elements of fine-grained actions, followed by our proposed integration layers for joint learning and prediction. It is a two-stage approach, where it first learns deep feature representation at each hierarchical level, and is followed by feature encoding and fusion for multi-task learning. Our empirical results on the FineGym dataset achieve a new state-of-the-art performance, with 91.80% Top-1 accuracy and 88.46% mean accuracy for element actions, which are 3.40% and 7.26% higher than the previous best results.
This paper aims to solve a distributed learning problem under Byzantine attacks. In the underlying distributed system, a number of unknown but malicious workers (termed as Byzantine workers) can send arbitrary messages to the master and bias the learning process, due to data corruptions, computation errors or malicious attacks. Prior work has considered a total variation (TV) norm-penalized approximation formulation to handle the Byzantine attacks, where the TV norm penalty forces the regular workers' local variables to be close, and meanwhile, tolerates the outliers sent by the Byzantine workers. To solve the TV norm-penalized approximation formulation, we propose a Byzantine-robust stochastic alternating direction method of multipliers (ADMM) that fully utilizes the separable problem structure. Theoretically, we prove that the proposed method converges to a bounded neighborhood of the optimal solution at a rate of O(1/k) under mild assumptions, where k is the number of iterations and the size of neighborhood is determined by the number of Byzantine workers. Numerical experiments on the MNIST and COVERTYPE datasets demonstrate the effectiveness of the proposed method to various Byzantine attacks.
This work presents FG-Net, a general deep learning framework for large-scale point clouds understanding without voxelizations, which achieves accurate and real-time performance with a single NVIDIA GTX 1080 GPU. First, a novel noise and outlier filtering method is designed to facilitate subsequent high-level tasks. For effective understanding purpose, we propose a deep convolutional neural network leveraging correlated feature mining and deformable convolution based geometric-aware modelling, in which the local feature relationships and geometric patterns can be fully exploited. For the efficiency issue, we put forward an inverse density sampling operation and a feature pyramid based residual learning strategy to save the computational cost and memory consumption respectively. Extensive experiments on real-world challenging datasets demonstrated that our approaches outperform state-of-the-art approaches in terms of accuracy and efficiency. Moreover, weakly supervised transfer learning is also conducted to demonstrate the generalization capacity of our method.
Named entity recognition (NER) plays an essential role in natural language processing systems. Judicial NER is a fundamental component of judicial information retrieval, entity relation extraction, and knowledge map building. However, Chinese judicial NER remains to be more challenging due to the characteristics of Chinese and high accuracy requirements in the judicial filed. Thus, in this paper, we propose a deep learning-based method named BiLSTM-CRF which consists of bi-directional long short-term memory (BiLSTM) and conditional random fields (CRF). For further accuracy promotion, we propose to use Adaptive moment estimation (Adam) for optimization of the model. To validate our method, we perform experiments on judgment documents including commutation, parole and temporary service outside prison, which is acquired from China Judgments Online. Experimental results achieve the accuracy of 0.876, recall of 0.856 and F1 score of 0.855, which suggests the superiority of the proposed BiLSTM-CRF with Adam optimizer.
Wireless signal-based gesture recognition has promoted the developments of VR game, smart home, etc. However, traditional approaches suffer from the influence of the domain gap. Low recognition accuracy occurs when the recognition model is trained in one domain but is used in another domain. Though some solutions, such as adversarial learning, transfer learning and body-coordinate velocity profile, have been proposed to achieve cross-domain recognition, these solutions more or less have flaws. In this paper, we define the concept of domain gap and then propose a more promising solution, namely DI, to eliminate domain gap and further achieve domain-independent gesture recognition. DI leverages the sign map of the gradient map as the domain gap eliminator to improve the recognition accuracy. We conduct experiments with ten domains and ten gestures. The experiment results show that DI can achieve the recognition accuracies of 87.13%, 90.12% and 94.45% on KNN, SVM and CNN, which outperforms existing solutions.