Dust storms can severely degrade the imaging quality of Martian orbiters and delay the mapping of global topography and geomorphology. To address this issue, this paper presents an approach that reuses image-dehazing knowledge obtained on Earth to solve the dust-removal problem on Mars. In this approach, we collect remote-sensing images captured by Tianwen-1 and manually select hundreds of clean and dusty images. Inspired by the haze formation process on Earth, we formulate a similar visual degradation process on clean images and synthesize dusty images that share a similar feature distribution with real dusty images. These pairs of real clean images and synthetic dusty images are used to train a deep model that inherently encodes dust-irrelevant features and decodes them into dust-free images. Qualitative and quantitative results show that the effects of dust storms can be effectively removed by the proposed approach, yielding clearly improved topographical and geomorphological details of Mars.
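The abstract does not spell out the degradation process. As a minimal sketch, assuming it resembles the standard atmospheric scattering model for terrestrial haze, I(x) = J(x) t(x) + A (1 - t(x)), a dusty image could be synthesized from a clean one roughly as follows. The function name `synthesize_dust`, the spatially uniform transmission `t`, and the dust colour `airlight` are illustrative assumptions, not the authors' settings.

```python
from typing import List, Tuple

Pixel = Tuple[float, float, float]  # RGB values in [0, 1]


def synthesize_dust(clean: List[List[Pixel]],
                    transmission: float,
                    airlight: Pixel) -> List[List[Pixel]]:
    """Apply I = J * t + A * (1 - t) to every pixel.

    Assumption: one transmission value for the whole image. A real
    pipeline would more likely use a per-pixel transmission map.
    """
    if not 0.0 <= transmission <= 1.0:
        raise ValueError("transmission must lie in [0, 1]")
    t = transmission
    return [
        [tuple(j * t + a * (1.0 - t) for j, a in zip(pixel, airlight))
         for pixel in row]
        for row in clean
    ]
```

With `transmission=1.0` the clean image is returned unchanged, and with `transmission=0.0` every pixel collapses to the assumed dust colour; intermediate values blend the two.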
Reconfigurable intelligent surfaces (RISs) are envisioned as a promising technology for future wireless communications. With various hardware realizations, RISs can work under different modes (reflective/transmissive/hybrid) or have different architectures (single-/group-/fully-connected). However, most existing research has focused on either reflective RISs or single-connected hybrid RISs, and a comprehensive study that unifies the different modes and architectures is lacking. In this paper, we address this gap by proposing and analyzing a general RIS-aided communication model that unifies the reflective/transmissive/hybrid modes and the single-/group-/fully-connected architectures. With the proposed model, we consider jointly designing the transmit precoder and RIS beamformer to maximize the sum-rate of RIS-aided systems. Leveraging fractional programming theory, the original sum-rate maximization problem is equivalently transformed into a multi-block optimization problem, which can be solved by block coordinate descent methods. We also provide simulation results to compare the performance of RISs with different modes and architectures. Compared with single-connected hybrid RISs, fully- and group-connected hybrid RISs can increase the sum-rate by around 75% and 37%, respectively, for Rayleigh fading channels.
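The objective being maximized can be illustrated with a small sketch. It assumes single-antenna users and takes each user's effective channel (with the RIS response already folded in) as given; the function name `sum_rate` and its arguments are hypothetical, and the code only evaluates the sum-rate rather than reproducing the fractional-programming or block coordinate descent steps.

```python
import math
from typing import Sequence


def sum_rate(eff_channels: Sequence[Sequence[complex]],
             precoders: Sequence[Sequence[complex]],
             noise_power: float) -> float:
    """Sum over users k of log2(1 + SINR_k), in bit/s/Hz.

    eff_channels[k] is user k's effective channel vector h_k
    (assumed to already include the RIS); precoders[k] is w_k.
    """
    def gain(h: Sequence[complex], w: Sequence[complex]) -> float:
        # |h^H w|^2 for one channel/precoder pair
        return abs(sum(hc.conjugate() * wc for hc, wc in zip(h, w))) ** 2

    total = 0.0
    for k, h_k in enumerate(eff_channels):
        signal = gain(h_k, precoders[k])
        interference = sum(gain(h_k, w_j)
                           for j, w_j in enumerate(precoders) if j != k)
        total += math.log2(1.0 + signal / (interference + noise_power))
    return total
```

For two users with orthogonal unit channels, matched unit precoders, and unit noise power, each user sees an SINR of 1 and the function returns 2 bit/s/Hz.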
This letter is the third part of a three-part tutorial that focuses on rate-splitting multiple access (RSMA) for 6G. As Part III of the tutorial, this letter provides an overview of integrating RSMA with reconfigurable intelligent surfaces (RIS). We first introduce the two PHY-layer techniques, namely RSMA and RIS, including the need for integrating RSMA with RIS and how they can help each other. Next, we provide a general model of an RIS-aided RSMA system and summarize some key performance metrics. Then, we discuss the major advantages of RIS-aided RSMA networks and illustrate the rate region of RIS-aided RSMA for both perfect and imperfect channel conditions. Finally, we summarize the research challenges and open problems for RIS-aided RSMA systems. In conclusion, RSMA is a promising technology for next-generation multiple access (NGMA) and future networks such as 6G and beyond.
The coin-tap test is a convenient and primary method for non-destructive testing, but its manual on-site operation is laborious and costly. With the help of convolutional neural networks (CNNs), a recent intelligent signal processing method, we achieve an intelligent coin-tap test that exhibits superior performance in recognizing defects. However, this success of CNNs relies on plenty of well-labeled data from the identical scenario, which can be difficult to obtain in many real industrial settings. This paper further develops transfer learning strategies for this issue, that is, strategies to transfer a model trained on data from one scenario to another. In experiments, domain adaptation and pseudo-label learning strategies yield a notable improvement. Hence, with the transfer learning strategies proposed herein, it becomes possible to apply the model to scenarios with no or little (less than 10%) labeled data. In addition, we use a benchmark dataset that we constructed ourselves throughout this study. This benchmark dataset for the coin-tap test, containing around 100,000 sound signals, is published at https://github.com/PPhub-hy/torch-tapnet.
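The abstract does not describe how pseudo labels are chosen. One common form of pseudo-label learning, sketched here as an assumption rather than as the authors' procedure, keeps only the unlabeled target-scenario samples that the source-trained model classifies with high confidence. The function name and the 0.9 threshold are hypothetical.

```python
from typing import List, Sequence, Tuple


def select_pseudo_labels(probabilities: Sequence[Sequence[float]],
                         threshold: float = 0.9) -> List[Tuple[int, int]]:
    """Return (sample_index, predicted_class) for confident predictions.

    probabilities[i] holds the model's class probabilities for the
    i-th unlabeled sample. Samples whose top probability is below
    `threshold` are left unlabeled.
    """
    selected = []
    for index, probs in enumerate(probabilities):
        best_class = max(range(len(probs)), key=probs.__getitem__)
        if probs[best_class] >= threshold:
            selected.append((index, best_class))
    return selected
```

The selected pairs would then be added to the training set for the new scenario and the model fine-tuned on them, possibly repeating the selection as the model improves.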
In this paper, we present DuReader_retrieval, a large-scale Chinese dataset for passage retrieval. DuReader_retrieval contains more than 90K queries and over 8M unique passages from Baidu search. To ensure the quality of our benchmark and address the shortcomings of existing datasets, we (1) reduce the false negatives in the development and testing sets by pooling the results from multiple retrievers for human annotation, and (2) remove semantically similar questions between the training set and the development and testing sets. We further introduce two extra out-of-domain testing sets for benchmarking domain generalization capability. Our experimental results demonstrate that DuReader_retrieval is challenging and that there is still plenty of room for the community to improve, for example in generalization across domains, in handling salient-phrase and syntax mismatch between query and paragraph, and in robustness. DuReader_retrieval will be publicly available at https://github.com/baidu/DuReader/tree/master/DuReader-Retrieval
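The pooling step in (1) can be sketched as follows, assuming each retriever returns a ranked list of passage IDs for a query. The function name, the `top_k` cut-off, and the retriever names in the test data are hypothetical; the abstract does not state the actual pooling depth or which retrievers were used.

```python
from typing import Dict, List, Set


def pool_candidates(rankings: Dict[str, List[str]],
                    top_k: int,
                    already_labeled: Set[str]) -> List[str]:
    """Union of each retriever's top-k passages, minus those already judged.

    The returned passages are the ones to send to human annotators;
    any that turn out to be relevant were false negatives in the
    original labels.
    """
    pooled: List[str] = []
    seen = set(already_labeled)
    for ranked_ids in rankings.values():
        for passage_id in ranked_ids[:top_k]:
            if passage_id not in seen:
                seen.add(passage_id)
                pooled.append(passage_id)
    return pooled
```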
For decades, enormous computational resources have been poured into solving transient partial differential equations (PDEs) for a wide variety of physical fields. Recent artificial intelligence methods have shown great potential in accelerating these computations, but their road to wide application is hindered by the variety of computational domains and boundary conditions. Here, we overcome this obstacle by constructing a learning framework capable of purely representing the transient PDEs with local neural operators (LNOs). This framework is demonstrated by learning several transient PDEs, especially the Navier-Stokes equations, and is successfully applied to problems with quite different domains and boundaries, including internal flow, external flow, and, remarkably, flow across a cascade of airfoils. In these applications, our LNOs are over 1000 times faster than the conventional numerical solver, which could be significant for scientific computations and engineering simulations.
Recently, the Vision Transformer (ViT) has shown impressive performance on high-level and low-level vision tasks. In this paper, we propose a new ViT architecture, named Hybrid Local-Global Vision Transformer (HyLoG-ViT), for single image dehazing. The HyLoG-ViT block consists of two paths, a local ViT path and a global ViT path, which capture local and global dependencies, respectively. The hybrid features are fused via convolution layers. As a result, HyLoG-ViT reduces the computational complexity and introduces locality into the network. The HyLoG-ViT blocks are then incorporated into our dehazing network, which jointly learns intrinsic image decomposition and image dehazing. Specifically, the network consists of one shared encoder and three decoders for reflectance prediction, shading prediction, and haze-free image generation. The reflectance and shading prediction tasks produce meaningful intermediate features that serve as complementary features for haze-free image generation. To effectively aggregate the complementary features, we propose a complementary features selection module (CFSM) to select the useful ones for image dehazing. Extensive experiments on homogeneous, non-homogeneous, and nighttime dehazing tasks reveal that our proposed Transformer-based dehazing network can achieve comparable or even better performance than CNN-based dehazing models.
Learning how to adapt and make informed real-time decisions in dynamic and complex environments is a challenging problem. To learn this task, Reinforcement Learning (RL) relies on an agent interacting with an environment and learning through trial and error to maximize the cumulative sum of rewards it receives. In the multi-player game Monopoly, players have to make several decisions every turn, some of which involve complex actions such as making trades. This makes decision-making harder and thus poses a highly complicated task for an RL agent that must play the game and learn winning strategies. In this paper, we introduce a Hybrid Model-Free Deep RL (DRL) approach that is capable of playing and learning winning strategies for the popular board game Monopoly. To achieve this, our DRL agent (1) starts its learning process by imitating a rule-based agent (that resembles human logic) to initialize its policy, and (2) learns the successful actions and improves its policy using DRL. Experimental results demonstrate intelligent behavior of our proposed agent, which shows high win rates against different types of agent-players.
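The quantity an RL agent maximizes, the cumulative sum of rewards, can be made concrete with a short sketch. The discount factor `gamma` is an assumption added for generality (the abstract mentions only a cumulative sum, which corresponds to `gamma = 1.0`), and the function name is hypothetical.

```python
from typing import List, Sequence


def returns_to_go(rewards: Sequence[float], gamma: float = 1.0) -> List[float]:
    """Compute G_t = r_t + gamma * G_{t+1} for every step of one episode.

    Working backwards from the final step lets each return reuse the
    one after it, so the whole episode is processed in a single pass.
    """
    out = [0.0] * len(rewards)
    running = 0.0
    for t in range(len(rewards) - 1, -1, -1):
        running = rewards[t] + gamma * running
        out[t] = running
    return out
```

The first entry of the result is the return of the whole episode, which is what the agent's policy is trained to increase.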
Intelligent reflecting surface (IRS), composed of a large number of hardware-efficient passive elements, is regarded as a potential technique for future wireless communications since it can adaptively enhance the propagation environment. To effectively utilize an IRS to achieve beamforming gains, the problem of channel state information (CSI) acquisition needs to be carefully considered. However, most recent works assume an ideal IRS, i.e., each reflecting element has constant amplitude, variable phase shifts, and the same response for signals at different frequencies, which causes severe estimation error due to the mismatch between the ideal IRS and the practical one. In this paper, we study channel estimation in practical IRS-aided orthogonal frequency division multiplexing (OFDM) systems with discrete phase shifts. Unlike prior works that assume an ideal reflection model, we perform channel estimation by considering the amplitude-phase shift-frequency relationship of the practical IRS response. Aiming to minimize the normalized mean square error (NMSE) of the estimated channel, a novel IRS time-varying reflection pattern is designed by leveraging the alternating optimization (AO) algorithm for the case of low-resolution phase shifters. Moreover, for the high-resolution IRS case, we provide another practical reflection pattern scheme to further reduce the complexity. Simulation results demonstrate the necessity of considering the practical IRS model for channel estimation and the effectiveness of our proposed channel estimation methods.
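Two ingredients named above, the NMSE criterion and discrete phase shifts, can be sketched in a few lines. This assumes the usual per-realization definition NMSE = ||h_est - h||^2 / ||h||^2 (typically averaged over many realizations) and a uniform phase quantizer with 2^b levels; both function names are hypothetical, and the practical amplitude-phase shift-frequency model itself is not reproduced here.

```python
import math
from typing import Sequence


def nmse(estimate: Sequence[complex], truth: Sequence[complex]) -> float:
    """||estimate - truth||^2 / ||truth||^2 for one channel realization."""
    error = sum(abs(e - t) ** 2 for e, t in zip(estimate, truth))
    power = sum(abs(t) ** 2 for t in truth)
    if power == 0.0:
        raise ValueError("true channel has zero power")
    return error / power


def quantize_phase(phase: float, bits: int) -> float:
    """Map a phase to the nearest of 2**bits uniform levels in [0, 2*pi)."""
    levels = 2 ** bits
    step = 2.0 * math.pi / levels
    # The modulo wraps phases near 2*pi back to the 0 level.
    return (round(phase / step) % levels) * step
```

With `bits=1` the only available phase shifts are 0 and pi, which is the kind of low-resolution setting for which the reflection pattern has to be optimized rather than chosen freely.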
A reasonable evaluation standard underlies the construction of effective deep learning models. However, we find in experiments that automatic crack detectors based on deep learning are clearly underestimated by the widely used mean Average Precision (mAP) standard. This paper presents a study on the evaluation standard. We show that the random fractal nature of cracks invalidates the mAP standard, because the strict box matching in the mAP calculation is unreasonable for fractal features. As a solution, a fractal-available evaluation standard named CovEval is proposed to correct the underestimation in crack detection. CovEval adopts a different matching process based on the idea of covering-box matching. Specifically, the Cover Area rate (CAr) is designed as a covering overlap, and a multi-match strategy is employed to relax the one-to-one matching restriction in mAP. Extended Recall (XR), Extended Precision (XP), and Extended F-score (Fext) are defined for scoring crack detectors. In experiments using several common object detection frameworks, models obtain much higher scores in crack detection under CovEval, which agrees better with their visual performance. Moreover, based on the Faster R-CNN framework, we present a case study that optimizes a crack detector under the CovEval standard. The Extended Recall (XR) of our best model reaches an industrial level of 95.8, which implies that, with a reasonable evaluation standard, object detection methods have great potential for automatic industrial inspection.
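The abstract does not give the formulas for CAr, XR, XP, or Fext, so the following is only an illustration of why covering-based, many-to-many matching can score a crack detector higher than strict IoU matching. Everything here is an assumption: `cover_rate` (intersection over the smaller box's area) is a stand-in for CAr, and `extended_scores` uses a simple "any box on the other side covers it" rule with a harmonic-mean F-score. These are not the paper's definitions.

```python
from typing import Sequence, Tuple

Box = Tuple[float, float, float, float]  # (x_min, y_min, x_max, y_max)


def area(box: Box) -> float:
    return max(0.0, box[2] - box[0]) * max(0.0, box[3] - box[1])


def intersection(a: Box, b: Box) -> float:
    return area((max(a[0], b[0]), max(a[1], b[1]),
                 min(a[2], b[2]), min(a[3], b[3])))


def iou(a: Box, b: Box) -> float:
    inter = intersection(a, b)
    union = area(a) + area(b) - inter
    return inter / union if union > 0 else 0.0


def cover_rate(a: Box, b: Box) -> float:
    """ASSUMED stand-in for CAr: intersection over the smaller box's area."""
    smaller = min(area(a), area(b))
    return intersection(a, b) / smaller if smaller > 0 else 0.0


def extended_scores(truths: Sequence[Box], preds: Sequence[Box],
                    threshold: float = 0.5) -> Tuple[float, float, float]:
    """ASSUMED many-to-many matching: a box counts if any box on the
    other side reaches the cover-rate threshold with it."""
    hit_truths = sum(any(cover_rate(t, p) >= threshold for p in preds)
                     for t in truths)
    hit_preds = sum(any(cover_rate(t, p) >= threshold for t in truths)
                    for p in preds)
    xr = hit_truths / len(truths) if truths else 0.0
    xp = hit_preds / len(preds) if preds else 0.0
    f_ext = 2 * xr * xp / (xr + xp) if xr + xp > 0 else 0.0
    return xr, xp, f_ext
```

For one long annotated crack box `(0, 0, 12, 2)` detected as three adjacent boxes of width 4, each prediction has an IoU of about 0.33 with the annotation and would fail a 0.5 IoU match, while under the assumed covering rule all three predictions and the annotation count as matched.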