Modern society is interested in capturing high-resolution, high-quality images owing to the surge of sophisticated cameras. However, noise contamination in images not only falls short of people's expectations but also adversely affects subsequent processes when such images are used in computer vision tasks such as remote sensing and object tracking. Even though noise filtration plays an essential role, real-time processing of a high-resolution image is constrained by the hardware limitations of image-capturing instruments. Geodesic Gramian Denoising (GGD) is a manifold-based noise filtering method, introduced in our past research, that uses a few prominent singular vectors of the geodesics' Gramian matrix for noise filtering. The applicability of GGD is limited because it incurs a computational cost of $\mathcal{O}(n^6)$ when denoising an image of size $n\times n$: GGD computes the prominent singular vectors of an $n^2 \times n^2$ data matrix by singular value decomposition (SVD). In this research, we increase the efficiency of our GGD framework by replacing its SVD step with four diverse singular vector approximation techniques. We compare both the computational time and the noise filtering performance of the four techniques integrated into GGD.
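To make the idea of approximating a few prominent singular vectors concrete, the following is a minimal sketch of power iteration with deflation on a small symmetric positive semidefinite matrix (for which singular vectors and eigenvectors coincide). The abstract does not name its four approximation techniques, so this generic method is an illustrative assumption and not necessarily one of them; the function name `top_singular_vectors` and the toy matrix `G` are hypothetical.

```python
import math
import random


def matvec(A, v):
    """Multiply a dense matrix (list of rows) by a vector."""
    return [sum(a * x for a, x in zip(row, v)) for row in A]


def normalize(v):
    """Scale a vector to unit Euclidean norm."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def top_singular_vectors(A, k, iters=500, seed=0):
    """Approximate the k leading singular values/vectors of a symmetric
    PSD matrix A by power iteration with deflation (illustrative only)."""
    rng = random.Random(seed)
    n = len(A)
    values, vectors = [], []
    for _ in range(k):
        v = normalize([rng.gauss(0.0, 1.0) for _ in range(n)])
        for _ in range(iters):
            w = matvec(A, v)
            # Deflation: project out the vectors that were already found.
            for u in vectors:
                dot = sum(a * b for a, b in zip(w, u))
                w = [a - dot * b for a, b in zip(w, u)]
            v = normalize(w)
        # Rayleigh quotient estimates the corresponding singular value.
        values.append(sum(a * b for a, b in zip(v, matvec(A, v))))
        vectors.append(v)
    return values, vectors


# Toy symmetric PSD matrix standing in for a (much larger) Gramian.
G = [[2.0, 1.0, 0.0],
     [1.0, 2.0, 0.0],
     [0.0, 0.0, 0.5]]
values, vectors = top_singular_vectors(G, k=2)
```

Each iteration costs one matrix-vector product, so extracting a handful of vectors this way avoids the full decomposition that a complete SVD performs.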
Confidence intervals are a standard technique for analyzing data. When applied to time series, confidence intervals are computed for each time point separately. Alternatively, we can compute confidence bands, where we are required to find the smallest area enveloping $k$ time series, where $k$ is a user parameter. Confidence bands can then be used to detect abnormal time series, not just individual observations within the time series. We will show that, despite this being an NP-hard problem, it is possible to find an optimal confidence band for some $k$. We do this by considering a different problem: discovering regularized bands, where we minimize the envelope area minus the number of included time series weighted by a parameter $\alpha$. Unlike normal confidence bands, we can solve this problem exactly by using a minimum cut. By varying $\alpha$ we can obtain solutions for various $k$. If we have a constraint $k$ for which we cannot find an appropriate $\alpha$, we demonstrate a simple algorithm that yields an $O(\sqrt{n})$ approximation guarantee by connecting the problem to a minimum $k$-union problem. This connection also implies that we cannot approximate the problem better than $O(n^{1/4})$ under some (mild) assumptions. Finally, we consider a variant where instead of minimizing the area we minimize the maximum width. Here, we demonstrate a simple 2-approximation algorithm and show that we cannot achieve a better approximation guarantee.
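The regularized objective can be illustrated on a toy input. The sketch below evaluates "envelope area minus $\alpha$ times the number of included series" by brute force over all non-empty subsets; it is only meant to show what is being minimized and is not the minimum-cut algorithm the abstract refers to. Measuring the area as the sum of per-time-point widths, and the names `envelope_area` and `best_regularized_band`, are assumptions made for this example.

```python
from itertools import combinations


def envelope_area(series):
    """Sum over time points of (max - min) across the chosen series."""
    return sum(max(col) - min(col) for col in zip(*series))


def best_regularized_band(all_series, alpha):
    """Exhaustively minimize area - alpha * (number of included series).
    Exponential in the number of series; for illustration only."""
    best = None
    for size in range(1, len(all_series) + 1):
        for idx in combinations(range(len(all_series)), size):
            chosen = [all_series[i] for i in idx]
            score = envelope_area(chosen) - alpha * size
            if best is None or score < best[0]:
                best = (score, idx)
    return best


data = [
    [1.0, 2.0, 3.0],
    [1.5, 2.5, 3.5],
    [1.2, 2.1, 3.2],
    [9.0, 0.0, 9.0],  # an outlying series that widens the band a lot
]
score, chosen = best_regularized_band(data, alpha=2.0)
```

With this choice of `alpha`, the three similar series are selected and the outlying one is left out, which is the behavior that makes bands useful for flagging abnormal time series.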
We consider the parametric data model employed in applications such as line spectral estimation and direction-of-arrival estimation. We focus on the stochastic maximum likelihood estimation (MLE) framework and offer approaches to estimate the parameter of interest in a gridless manner, overcoming the model complexities of the past. This progress is enabled by the modern trend of reparameterizing the objective and by exploiting the sparse Bayesian learning (SBL) approach. The latter is shown to be a correlation-aware method, and for the underlying problem it is identified as a grid-based technique for recovering a structured covariance matrix of the measurements. For the case when the structured matrix is expressible as a sampled Toeplitz matrix, such as when measurements are sampled in time or space at regular intervals, additional constraints and reparameterization of the SBL objective lead to the proposed structured matrix recovery technique based on MLE. The proposed optimization problem is non-convex, and we propose a majorization-minimization based iterative procedure to estimate the structured matrix; each iteration solves a semidefinite program. We recover the parameter of interest in a gridless manner by appealing to the Carathéodory-Fejér result on the decomposition of PSD Toeplitz matrices. For the general case of irregularly spaced time or spatial samples, we propose an iterative SBL procedure that refines grid points to increase resolution near potential source locations, while maintaining a low per-iteration complexity. We provide numerical results to evaluate and compare the performance of the proposed techniques with other gridless techniques and with the Cramér-Rao bound (CRB). The proposed correlation-aware approach is more robust to environmental and system effects such as a low number of snapshots, correlated sources, and small separation between source locations, and it improves source identifiability.
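For reference, the decomposition result invoked above can be stated as follows in its standard form. The notation ($N$, $r$, $p_k$, $f_k$, $\mathbf{a}(f)$) is introduced here for illustration and may differ from the paper's. Any positive semidefinite Toeplitz matrix $\mathbf{T} \in \mathbb{C}^{N \times N}$ of rank $r < N$ admits a unique decomposition

\[
\mathbf{T} = \sum_{k=1}^{r} p_k \, \mathbf{a}(f_k)\, \mathbf{a}(f_k)^{H},
\qquad
\mathbf{a}(f) = \bigl[\,1,\; e^{j 2\pi f},\; \dots,\; e^{j 2\pi (N-1) f}\,\bigr]^{T},
\]

with powers $p_k > 0$ and distinct frequencies $f_k \in [0, 1)$. Because the $f_k$ are read off the recovered matrix rather than chosen from a predefined grid, the parameter estimates are gridless.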
Large curated datasets are necessary, but annotating medical images is a time-consuming, laborious, and expensive process. Therefore, recent methods focus on exploiting large amounts of unlabeled data. However, doing so is challenging. To address this problem, we propose a new 3D Cross Pseudo Supervision (3D-CPS) method, a semi-supervised network architecture based on nnU-Net with the Cross Pseudo Supervision method. We design a new nnU-Net based preprocessing method and adopt the forced spacing settings strategy in the inference stage to reduce the inference time. In addition, we set the semi-supervised loss weight to increase linearly with each epoch, which prevents the model from being misled by low-quality pseudo-labels early in training. Our proposed method achieves an average dice similarity coefficient (DSC) of 0.881 and an average normalized surface distance (NSD) of 0.913 on the MICCAI FLARE2022 validation set (20 cases).
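The linearly growing loss weight can be sketched as below. The abstract only states that the weight grows linearly with the epoch; the starting value of 0, the cap `max_weight`, the additive combination of the two losses, and both function names are assumptions made for this illustration.

```python
def semi_supervised_weight(epoch, total_epochs, max_weight=1.0):
    """Pseudo-label loss weight that grows linearly from 0 at the first
    epoch to max_weight at the last epoch (hypothetical schedule)."""
    if total_epochs <= 1:
        return max_weight
    # Clamp the epoch index so out-of-range values stay within [0, max_weight].
    clamped = min(max(epoch, 0), total_epochs - 1)
    return max_weight * clamped / (total_epochs - 1)


def total_loss(supervised_loss, cps_loss, epoch, total_epochs):
    """Combine the supervised loss with the weighted pseudo-supervision loss."""
    return supervised_loss + semi_supervised_weight(epoch, total_epochs) * cps_loss
```

Early in training the pseudo-label term contributes almost nothing, so unreliable pseudo-labels have little influence until the networks have learned from the labeled data.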
In real-world crowdsourcing annotation systems, due to differences in user knowledge and cultural backgrounds, as well as the high cost of acquiring annotation information, the supervision information we obtain might be insufficient and ambiguous. To mitigate the negative impacts, in this paper, we investigate a more general and broadly applicable learning problem, i.e. \emph{semi-supervised partial label learning}, and propose a novel method based on pseudo-labeling and contrastive learning. Following this key design principle, our method facilitates the partial label disambiguation process with unlabeled data and at the same time assigns reliable pseudo-labels to weakly supervised examples. Specifically, our method learns from the ambiguous labeling information via a partial cross-entropy loss. Meanwhile, high-accuracy pseudo-labels are generated for both partial and unlabeled examples through confidence-based thresholding, and contrastive learning is performed in a hybrid unsupervised and supervised manner for more discriminative representations, with its supervision increasing in a curriculum fashion. The two main components work systematically as a whole and reinforce each other. In experiments, our method consistently outperforms all compared methods by a significant margin and sets the first state-of-the-art performance for semi-supervised partial label learning on image benchmarks.
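The two ingredients named above can be sketched in a few lines. The abstract does not give the exact loss, so this uses one common form of partial cross-entropy (the negative log of the probability mass on the candidate label set); that form, the threshold value `0.95`, and the function names are assumptions for illustration rather than the paper's definitions.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def partial_cross_entropy(logits, candidates):
    """Negative log of the total probability assigned to the candidate labels."""
    probs = softmax(logits)
    return -math.log(sum(probs[j] for j in candidates))


def pseudo_label(logits, threshold=0.95, candidates=None):
    """Return the most confident class if its probability reaches the
    threshold, otherwise None. For partially labeled examples, the
    search is restricted to the candidate set."""
    probs = softmax(logits)
    allowed = range(len(probs)) if candidates is None else candidates
    best = max(allowed, key=lambda j: probs[j])
    return best if probs[best] >= threshold else None
```

For example, `pseudo_label([10.0, 0.0, 0.0])` returns class `0`, while `pseudo_label([1.0, 1.0, 1.0])` returns `None` because no class is confident enough.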
In the evolution of agriculture to its next stage, Agriculture 5.0, artificial intelligence will play a central role. Controlled-environment agriculture, or CEA, is a special form of urban and suburban agricultural practice that offers numerous economic, environmental, and social benefits, including shorter transportation routes to population centers, reduced environmental impact, and increased productivity. Due to its ability to control environmental factors, CEA couples well with computer vision (CV) in the adoption of real-time monitoring of plant conditions and autonomous cultivation and harvesting. The objective of this paper is to familiarize CV researchers with agricultural applications and agricultural practitioners with the solutions offered by CV. We identify five major CV applications in CEA, analyze their requirements and motivation, and survey the state of the art as reflected in 68 technical papers using deep learning methods. In addition, we discuss five key subareas of computer vision and how they relate to these CEA problems, as well as nine vision-based CEA datasets. We hope the survey will help researchers quickly gain a bird's-eye view of this thriving research area and will spark inspiration for new research and development.
Using drones for communications and transportation is drawing great attention in many practical scenarios, such as package delivery and providing additional wireless coverage. However, the increasing demand for UAVs from industry and academia will cause aerial traffic conflicts in the future. This, in turn, motivates the idea of this paper: multi-purpose UAVs that act simultaneously as aerial wireless data relays and as a means of aerial transportation, delivering packages and data at the same time. This paper aims to analyze the feasibility of using drones to collect and deliver data from Internet of Things (IoT) devices to terrestrial base stations (TBSs) while delivering packages from warehouses to residential areas. We propose an algorithm to optimize the trajectory of UAVs to maximize the size of collected/delivered data while minimizing the total round trip time subject to the limited onboard battery of UAVs. Specifically, we use tools from stochastic geometry to model the locations of the IoT clusters and the TBSs and study the system performance with respect to energy efficiency, average size of collected/delivered data, and package delivery time. Our numerical results reveal that multi-functional UAVs have great potential to enhance the efficiency of both communication and transportation networks.
In modern industrial systems, diagnosing faults in time and with the best available methods is becoming increasingly crucial. A system may fail or waste resources if faults are not detected or are detected late. Various machine learning and deep learning methods have been proposed for data-based fault diagnosis, and we seek the most reliable and practical ones. This paper aims to develop a framework based on deep learning and reinforcement learning for fault detection. By updating the reinforcement learning policy when new data is received, we can increase accuracy, overcome data imbalance, and better predict future defects. With this method, we observe an increase of $3\%$ in all evaluation metrics, an improvement in prediction speed, and a gain of $3\%$ to $4\%$ in all evaluation metrics compared to a typical backpropagation multi-layer neural network with similar parameters.
Cooperative perception is challenging for connected and automated driving because of real-time requirements and bandwidth limitations, especially when vehicle location and pose information is inaccurate. We propose an efficient object-level cooperative perception framework, in which the 3D bounding boxes, location, and pose data are broadcast and received between the connected vehicles, then fused at the object level. Two matching algorithms, based on Iterative Closest Point (ICP) and Optimal Transport theory, are developed to maximize the total correlations between the 3D bounding boxes jointly detected by the vehicles. Experimental results show that it takes only 5 ms to associate objects from different vehicles for each frame, and robust performance is achieved for different levels of location and heading errors. Moreover, the proposed framework outperforms the state-of-the-art benchmark methods when location or pose errors occur.
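As a rough illustration of optimal-transport-based association, the sketch below matches box centers reported by two vehicles using entropy-regularized optimal transport solved with Sinkhorn iterations. It is not the authors' algorithm: the Euclidean center-distance cost, uniform marginals, the regularization value `reg`, and the function name `sinkhorn_match` are all assumptions chosen to keep the example small.

```python
import math


def sinkhorn_match(centers_a, centers_b, reg=0.5, iters=200):
    """Associate each box center in centers_a with one in centers_b by
    computing an entropy-regularized transport plan (Sinkhorn iterations)
    and taking the largest entry in each row."""
    n, m = len(centers_a), len(centers_b)
    # Cost: Euclidean distance between box centers (an assumed cost).
    cost = [[math.dist(p, q) for q in centers_b] for p in centers_a]
    K = [[math.exp(-c / reg) for c in row] for row in cost]
    u, v = [1.0] * n, [1.0] * m
    for _ in range(iters):
        # Alternately rescale rows and columns toward uniform marginals.
        u = [(1.0 / n) / sum(K[i][j] * v[j] for j in range(m)) for i in range(n)]
        v = [(1.0 / m) / sum(K[i][j] * u[i] for i in range(n)) for j in range(m)]
    plan = [[u[i] * K[i][j] * v[j] for j in range(m)] for i in range(n)]
    return [max(range(m), key=lambda j: plan[i][j]) for i in range(n)]


# Boxes seen by the ego vehicle, and the same objects as reported by another
# vehicle in a different order and with a small localization error.
ego = [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0)]
other = [(0.4, 10.3), (0.3, -0.2), (10.5, 0.1)]
matches = sinkhorn_match(ego, other)
```

Here `matches[i]` is the index in `other` associated with `ego[i]`. A practical system would also need to handle objects seen by only one vehicle, which this sketch ignores.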
Visual reconstruction of fast non-rigid object deformations over time is a challenge for conventional frame-based cameras. In this paper, we propose a novel approach for reconstructing such deformations using measurements from event-based cameras. Under the assumption of a static background, where all events are generated by the motion, our approach estimates the deformation of objects from events generated at the object contour in a probabilistic optimization framework. It associates events with mesh faces on the contour and maximizes the alignment of the line of sight through the event pixel with the associated face. In experiments on synthetic and real data, we demonstrate the advantages of our method over state-of-the-art optimization and learning-based approaches for reconstructing the motion of human hands. A video of the experiments is available at https://youtu.be/gzfw7i5OKjg