The large-scale reflector array of programmable metasurfaces can increase the power efficiency of backscatter communications via passive beamforming and thus has the potential to overcome the low-data-rate nature of backscatter communications. In this paper, we propose to design a power-efficient higher-order constellation and reflection pattern under the amplitude constraint imposed by backscatter communications. For constellation design, we adopt the amplitude and phase-shift keying (APSK) constellation and optimize its parameters, such as the ring number, ring radius, and inter-ring phase difference. Specifically, we derive closed-form solutions for the optimal ring radius and inter-ring phase difference for an arbitrary modulation order. For reflection pattern design, we propose to optimize the passive beamforming vector by solving a multi-objective optimization problem that maximizes reflection power and guarantees beam homogenization within the angle range of interest. To solve the problem, we propose a constant-modulus power iteration method, under which the objective function is proven to increase monotonically from one iteration to the next. Numerical results show that the proposed APSK constellation design and reflection pattern design outperform the existing modulation and beam pattern designs in programmable metasurface-enabled backscatter communications.
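As a rough illustration of what a constant-modulus power iteration can look like, the sketch below maximizes the quadratic form w^H R w over vectors with unit-modulus entries. It is not the paper's algorithm: the single-objective form, the Hermitian positive semidefinite matrix `R`, the random test data, and the function name are all assumptions made for illustration. For such an `R`, alternating a power step with a projection onto the unit-modulus set does not decrease the objective.

```python
import numpy as np

def constant_modulus_power_iteration(R, num_iters=50, seed=0):
    """Heuristically maximize w^H R w subject to |w_i| = 1.

    Assumes R is Hermitian positive semidefinite (hypothetical setup,
    not the multi-objective problem from the paper).
    """
    rng = np.random.default_rng(seed)
    n = R.shape[0]
    # Random unit-modulus starting point
    w = np.exp(1j * rng.uniform(0.0, 2.0 * np.pi, size=n))
    history = [float(np.real(np.vdot(w, R @ w)))]
    for _ in range(num_iters):
        # Power step R @ w, then keep only the phase of each entry
        w = np.exp(1j * np.angle(R @ w))
        history.append(float(np.real(np.vdot(w, R @ w))))
    return w, history

# Illustrative data: a random rank-4 PSD matrix for a 16-element array
rng = np.random.default_rng(1)
A = rng.standard_normal((16, 4)) + 1j * rng.standard_normal((16, 4))
R = A @ A.conj().T
w, history = constant_modulus_power_iteration(R)
```

The `history` list records the objective after each iteration, so its non-decreasing behavior can be inspected directly.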
Generally, humans are more skilled at perceiving differences between high-quality (HQ) and low-quality (LQ) images than at directly judging the quality of a single LQ image. This situation also applies to image quality assessment (IQA). Although recent no-reference (NR-IQA) methods have made great progress in predicting image quality without a reference image, they still have room for improvement since HQ image information is not fully exploited. In contrast, full-reference (FR-IQA) methods tend to provide more reliable quality evaluation, but their practicality is limited by the requirement for pixel-level aligned reference images. To address this, we propose the first content-variant reference method via knowledge distillation (CVRKD-IQA). Specifically, we use non-aligned reference (NAR) images to introduce various prior distributions of high-quality images. Comparing the distribution differences between HQ and LQ images helps our model better assess image quality. Further, knowledge distillation transfers more HQ-LQ distribution difference information from the FR-teacher to the NAR-student and stabilizes CVRKD-IQA performance. Moreover, to fully mine the local-global combined information while achieving faster inference speed, our model directly processes multiple image patches from the input with the MLP-mixer. Cross-dataset experiments verify that our model outperforms all NAR/NR-IQA SOTAs and even reaches performance comparable to FR-IQA methods in some cases. Since content-variant and non-aligned reference HQ images are easy to obtain, our model can support more IQA applications with its relative robustness to content variations. Our code and supplementary material are available at https://github.com/guanghaoyin/CVRKD-IQA.
In this paper, a queue-aware simultaneously transmitting and reflecting reconfigurable intelligent surface (STAR-RIS) assisted non-orthogonal multiple access (NOMA) communication system is investigated to ensure system stability, where the long-term stability-oriented problem is reformulated to maximize the per-slot queue-weighted sum rate (QWSR) of users based on Lyapunov drift theory. The NOMA decoding order, the active beamforming coefficients at the base station (BS), and the passive transmission and reflection coefficients at the STAR-RIS are jointly optimized under three STAR-RIS operating protocols, namely energy splitting (ES), mode switching (MS), and time switching (TS). For ES, the block coordinate descent and successive convex approximation methods are invoked to handle the highly coupled and non-convex problem. For MS, the proposed algorithm is further extended to a penalty-based two-loop algorithm to solve the binary amplitude-constrained problem. For TS, the formulated problem is decomposed into two subproblems, each of which can be solved in a similar manner to ES. Simulation results show that: i) our proposed STAR-RIS assisted NOMA communication achieves better performance than the conventional schemes; ii) solving the reformulated QWSR maximization problem ensures system stability; and iii) TS achieves superior performance with respect to both the QWSR and the average queue length.
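To make the per-slot QWSR objective concrete, the sketch below scores two candidate rate allocations by the queue-weighted sum of user rates and then updates the queues. The queue dynamics `Q(t+1) = max(Q(t) - R(t), 0) + A(t)` is the standard form used with Lyapunov drift arguments and is assumed here, as are the function names and all numbers; the abstract does not specify them.

```python
import numpy as np

def queue_weighted_sum_rate(queues, rates):
    """Per-slot QWSR: sum over users of queue length times achievable rate."""
    return float(np.dot(queues, rates))

def update_queues(queues, rates, arrivals):
    """Assumed queue dynamics: serve up to `rates`, then add new arrivals."""
    return np.maximum(queues - rates, 0.0) + arrivals

# Hypothetical backlog of three users and two candidate rate allocations
queues = np.array([4.0, 1.0, 0.5])
candidates = {
    "favor_user_3": np.array([1.0, 2.0, 3.0]),
    "favor_user_1": np.array([3.0, 2.0, 1.0]),
}

# Choose the allocation with the largest QWSR in this slot
best = max(candidates, key=lambda name: queue_weighted_sum_rate(queues, candidates[name]))
next_queues = update_queues(queues, candidates[best], arrivals=np.array([1.0, 1.0, 1.0]))
```

Both candidates have the same plain sum rate, but the queue weighting favors the allocation that serves the most backlogged user, which is the intuition behind using QWSR as a stability-oriented objective.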
In the upcoming next-generation (5G-Advanced and 6G) wireless networks, sensing as a service will play a more important role than ever before. Recently, the concept of the perceptive network has been proposed as a paradigm shift that provides sensing and communication (S&C) services simultaneously. This type of technology is typically referred to as Integrated Sensing and Communications (ISAC). In this paper, we propose the concept of sensing quality of service (QoS) for diverse applications. Specifically, the probability of detection, the Cramér-Rao bound (CRB) for parameter estimation, and the posterior CRB for moving target indication are employed to measure the sensing QoS for detection, localization, and tracking, respectively. Then, we establish a unified framework for ISAC resource allocation, where the fairness and the comprehensiveness optimization criteria are considered for the aforementioned sensing services. The proposed schemes can flexibly allocate the limited power and bandwidth resources according to both the sensing and the communication QoS. Finally, we study the performance trade-off between S&C services under different resource allocation schemes through numerical simulations.
Salient object detection has long been studied to identify the most visually attractive objects in images/videos. Recently, a growing number of approaches have been proposed, all of which rely on contour/edge information to improve detection performance. The edge labels are either put into the loss directly or used as extra supervision. The edge and body can also be learned separately and then fused afterward. Both approaches either lead to high prediction errors near the edge or cannot be trained in an end-to-end manner. Another problem is that existing methods may fail to detect objects of various sizes due to the lack of efficient and effective feature fusion mechanisms. In this work, we propose to decompose the saliency detection task into two cascaded sub-tasks, \emph{i.e.}, detail modeling and body filling. Specifically, detail modeling focuses on capturing the object edges under the supervision of an explicitly decomposed detail label that consists of the pixels on and near the edge. Body filling then learns the body part, which is filled into the detail map to generate a more accurate saliency map. To effectively fuse the features and handle objects at different scales, we also propose two novel multi-scale detail attention and body attention blocks for precise detail and body modeling. Experimental results show that our method achieves state-of-the-art performance on six public datasets.
Computing the matrix square root and its inverse in a differentiable manner is important in a variety of computer vision tasks. Previous methods either adopt the Singular Value Decomposition (SVD) to explicitly factorize the matrix or use the Newton-Schulz iteration (NS iteration) to derive the approximate solution. However, both methods fall short in computational efficiency in either the forward pass or the backward pass. In this paper, we propose two more efficient variants to compute the differentiable matrix square root and the inverse square root. For the forward propagation, one method uses the Matrix Taylor Polynomial (MTP), and the other uses Matrix Pad\'e Approximants (MPA). The backward gradient is computed by iteratively solving the continuous-time Lyapunov equation using the matrix sign function. A series of numerical tests shows that both methods yield considerable speed-up compared with the SVD or the NS iteration. Moreover, we validate the effectiveness of our methods in several real-world applications, including de-correlated batch normalization, second-order vision transformer, global covariance pooling for large-scale and fine-grained recognition, attentive covariance pooling for video recognition, and neural style transfer. The experimental results demonstrate that our methods can also achieve competitive and even slightly better performance. The PyTorch implementation is available at \href{https://github.com/KingJamesSong/FastDifferentiableMatSqrt}{https://github.com/KingJamesSong/FastDifferentiableMatSqrt}.
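The forward pass based on a matrix Taylor polynomial can be sketched as follows. This is a minimal NumPy illustration, not the authors' PyTorch code: it assumes a symmetric positive definite input, normalizes by the Frobenius norm so that the power series of (I - Z)^{1/2} converges, and truncates the series at a hypothetical `degree`. The function name and the test matrix are illustrative.

```python
import numpy as np

def sqrt_taylor(A, degree=40):
    """Approximate sqrt(A) for symmetric positive definite A.

    Uses (I - Z)^(1/2) = sum_k binom(1/2, k) (-Z)^k with Z = I - A / ||A||_F,
    truncated after `degree` terms (illustrative choice).
    """
    n = A.shape[0]
    norm = np.linalg.norm(A, "fro")
    Z = np.eye(n) - A / norm
    result = np.eye(n)   # k = 0 term
    power = np.eye(n)    # holds (-Z)^k
    coeff = 1.0          # holds binom(1/2, k)
    for k in range(1, degree + 1):
        coeff *= (0.5 - (k - 1)) / k
        power = power @ (-Z)
        result = result + coeff * power
    # Undo the normalization: sqrt(A) = sqrt(||A||_F) * sqrt(A / ||A||_F)
    return np.sqrt(norm) * result

A = np.array([[4.0, 1.0], [1.0, 3.0]])
S = sqrt_taylor(A)
```

The series converges more slowly as the smallest eigenvalue of the normalized matrix approaches zero, so the required degree depends on the conditioning of the input.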
Subgraph matching is a fundamental problem in various fields that use graph-structured data. Subgraph matching algorithms enumerate all isomorphic embeddings of a query graph q in a data graph G. An important branch of matching algorithms exploits the backtracking search approach, which recursively extends intermediate results following a matching order of query vertices. It has been shown that the matching order plays a critical role in the time efficiency of these backtracking-based subgraph matching algorithms. In recent years, many advanced techniques for query vertex ordering (i.e., matching order generation) have been proposed to reduce unpromising intermediate results according to preset heuristic rules. In this paper, for the first time, we apply Reinforcement Learning (RL) and Graph Neural Network (GNN) techniques to generate a high-quality matching order for subgraph matching algorithms. Instead of using fixed heuristics to generate the matching order, our model captures and makes full use of the graph information, and thus determines the query vertex order with an adaptive learning-based rule that can significantly reduce the number of redundant enumerations. With the help of the reinforcement learning framework, our model is able to consider long-term benefits rather than only the local information at the current ordering step. Extensive experiments on six real-life data graphs demonstrate that our proposed matching order generation technique can reduce query processing time by up to two orders of magnitude compared to state-of-the-art algorithms.
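The role of the matching order in backtracking search can be seen in a small sketch. The code below enumerates embeddings of an unlabeled query graph in a data graph by extending partial mappings in a given vertex order, and counts how many partial mappings it visits. It is a generic textbook-style backtracking routine under assumed adjacency-set inputs, with hypothetical names and toy graphs; it does not implement the learned ordering described in the abstract.

```python
def enumerate_embeddings(query_adj, data_adj, order):
    """Return (embeddings, visited) for a given matching order.

    query_adj / data_adj: dict mapping each vertex to a set of neighbours.
    order: list of query vertices giving the matching order.
    """
    embeddings = []
    mapping = {}
    used = set()
    visited = 0  # number of partial mappings explored

    def extend(depth):
        nonlocal visited
        if depth == len(order):
            embeddings.append(dict(mapping))
            return
        u = order[depth]
        for v in data_adj:
            if v in used:
                continue
            # Every already-matched neighbour of u must map to a neighbour of v
            if all(mapping[w] in data_adj[v] for w in query_adj[u] if w in mapping):
                mapping[u] = v
                used.add(v)
                visited += 1
                extend(depth + 1)
                del mapping[u]
                used.discard(v)

    extend(0)
    return embeddings, visited

# Toy example: a triangle query in a triangle a-b-c with a pendant vertex d
query = {0: {1, 2}, 1: {0, 2}, 2: {0, 1}}
data = {"a": {"b", "c"}, "b": {"a", "c"}, "c": {"a", "b", "d"}, "d": {"c"}}
emb_1, visited_1 = enumerate_embeddings(query, data, order=[0, 1, 2])
emb_2, visited_2 = enumerate_embeddings(query, data, order=[2, 0, 1])
```

Any order yields the same set of embeddings; only the number of intermediate results explored along the way changes, which is the quantity that ordering heuristics and learned orderings try to reduce.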
Cybersecurity breaches expose surveillance video streams to forgery attacks, under which authentic streams are falsified to hide unauthorized activities. Traditional video forensics approaches can localize forgery traces using spatial-temporal analysis on relatively long video clips, but fall short in real-time forgery detection. Recent work correlates time-series camera and wireless signals to detect looped videos but cannot realize fine-grained forgery localization. To overcome these limitations, we propose Secure-Pose, which exploits the pervasive coexistence of surveillance and Wi-Fi infrastructures to defend against video forgery attacks in a real-time and fine-grained manner. We observe that coexisting camera and Wi-Fi signals convey common human semantic information and that forgery attacks on video streams decouple this information correspondence. In particular, retrievable human pose features are first extracted from concurrent video and Wi-Fi channel state information (CSI) streams. Then, a lightweight detection network is developed to accurately discover forgery attacks, and an efficient localization algorithm is devised to seamlessly track forgery traces in video streams. We implement Secure-Pose using one Logitech camera and two Intel 5300 NICs and evaluate it in different environments. Secure-Pose achieves a high detection accuracy of 98.7% and localizes abnormal objects under playback and tampering attacks.
Computing the matrix square root or its inverse in a differentiable manner is important in a variety of computer vision tasks. Previous methods either adopt the Singular Value Decomposition (SVD) to explicitly factorize the matrix or use the Newton-Schulz iteration (NS iteration) to derive the approximate solution. However, both methods fall short in computational efficiency in either the forward pass or the backward pass. In this paper, we propose two more efficient variants to compute the differentiable matrix square root. For the forward propagation, one method uses the Matrix Taylor Polynomial (MTP), and the other uses Matrix Pad\'e Approximants (MPA). The backward gradient is computed by iteratively solving the continuous-time Lyapunov equation using the matrix sign function. Both methods yield considerable speed-up compared with the SVD or the NS iteration. Experimental results on de-correlated batch normalization and the second-order vision transformer demonstrate that our methods can also achieve competitive and even slightly better performance. The code is available at \href{https://github.com/KingJamesSong/FastDifferentiableMatSqrt}{https://github.com/KingJamesSong/FastDifferentiableMatSqrt}.
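For reference, the Newton-Schulz baseline mentioned above can be sketched in a few lines. This is the commonly used coupled iteration written in NumPy under the assumption of a symmetric positive definite input normalized by its Frobenius norm; the function name, iteration count, and test matrix are illustrative and not taken from the paper or its repository.

```python
import numpy as np

def sqrt_newton_schulz(A, num_iters=20):
    """Approximate sqrt(A) and its inverse with the coupled NS iteration.

    Assumes A is symmetric positive definite; dividing by the Frobenius
    norm places the normalized matrix inside the convergence region.
    """
    n = A.shape[0]
    identity = np.eye(n)
    norm = np.linalg.norm(A, "fro")
    Y = A / norm          # converges to sqrt(A / norm)
    Z = identity.copy()   # converges to the inverse of sqrt(A / norm)
    for _ in range(num_iters):
        T = 0.5 * (3.0 * identity - Z @ Y)
        Y = Y @ T
        Z = T @ Z
    # Undo the normalization for both outputs
    return np.sqrt(norm) * Y, Z / np.sqrt(norm)

A = np.array([[4.0, 1.0], [1.0, 3.0]])
S, S_inv = sqrt_newton_schulz(A)
```

The iteration uses only matrix products, which is why it is a popular GPU-friendly alternative to the SVD; each step costs three matrix multiplications in this form.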
In this paper, we propose a new GPU implementation of the Katsevich algorithm for helical CT reconstruction. Our implementation divides the sinograms and reconstructs the CT images pitch by pitch. By exploiting the periodic properties of the parameters of the Katsevich algorithm, our method needs to calculate these parameters only once for all the pitches; it therefore has a lower GPU memory burden and is well suited to deep learning. By embedding our implementation into the network, we propose an end-to-end deep network for high-pitch helical CT reconstruction with sparse detectors. Since our network utilizes the features extracted from both sinograms and CT images, it can simultaneously reduce the streak artifacts caused by the sparsity of sinograms and preserve fine details in the CT images. Experiments show that our network outperforms the related methods in both subjective and objective evaluations.