Recently, numerous approaches have achieved notable success in compressed video quality enhancement (VQE). However, these methods usually ignore the valuable coding priors inherently embedded in compressed videos, such as motion vectors and residual frames, which carry abundant temporal and spatial information. To remedy this problem, we propose the Coding Priors-Guided Aggregation (CPGA) network, which exploits temporal and spatial information from coding priors. CPGA mainly consists of an inter-frame temporal aggregation (ITA) module and a multi-scale non-local aggregation (MNA) module. Specifically, the ITA module aggregates temporal information from consecutive frames and coding priors, while the MNA module globally captures spatial information guided by residual frames. In addition, to facilitate research on the VQE task, we construct the Video Coding Priors (VCP) dataset, comprising 300 videos with various coding priors extracted from the corresponding bitstreams. It addresses the lack of coding information in previous datasets. Experimental results demonstrate the superiority of our method over existing state-of-the-art methods. The code and dataset will be released at https://github.com/CPGA/CPGA.git.
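To make concrete what motion vectors and residual frames carry, the following is a minimal sketch of classical block-wise motion compensation, not the CPGA network itself. The function names (`motion_compensate`, `reconstruct`), the block size, and the border clamping are all assumptions made for illustration; real codecs use sub-pixel vectors and variable block sizes.

```python
def motion_compensate(ref, mvs, block=2):
    """Predict a frame by copying blocks of `ref` displaced by motion vectors.

    ref   : 2D list of pixel values (reference frame)
    mvs   : 2D list of (dy, dx) integer vectors, one per block (hypothetical layout)
    block : block size in pixels (assumed square)
    """
    h, w = len(ref), len(ref[0])
    pred = [[0] * w for _ in range(h)]
    for by in range(0, h, block):
        for bx in range(0, w, block):
            dy, dx = mvs[by // block][bx // block]
            for y in range(by, min(by + block, h)):
                for x in range(bx, min(bx + block, w)):
                    # Clamp source coordinates so out-of-frame reads reuse border pixels.
                    sy = min(max(y + dy, 0), h - 1)
                    sx = min(max(x + dx, 0), w - 1)
                    pred[y][x] = ref[sy][sx]
    return pred


def reconstruct(pred, residual):
    """Add the residual frame to the motion-compensated prediction."""
    return [[p + r for p, r in zip(prow, rrow)] for prow, rrow in zip(pred, residual)]


ref = [[0, 1, 2, 3],
       [4, 5, 6, 7],
       [8, 9, 10, 11],
       [12, 13, 14, 15]]
mvs = [[(0, 1), (0, 1)],
       [(0, 1), (0, 1)]]          # every block points one pixel to the right
residual = [[0, 0, 0, 1],
            [0, 0, 0, 0],
            [0, 0, 0, 0],
            [0, 0, 0, 0]]

pred = motion_compensate(ref, mvs)
frame = reconstruct(pred, residual)
```

In this toy setting the motion vectors describe where content moved between frames (temporal information), while the residual marks where the prediction fails (spatial information). This is the intuition behind using them as priors.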
Artificial intelligence (AI) systems have achieved expert-level performance in electrocardiogram (ECG) signal analysis. However, in underdeveloped countries or regions where healthcare information systems are inadequate, only paper ECGs are available. Analysis of real-world ECG images (photos or scans of paper ECGs) remains challenging due to complex environments or interference. In this study, we present an AI system developed to detect and screen cardiac abnormalities (CAs) from real-world ECG images. The system was evaluated on a large dataset of 52,357 patients from multiple regions and populations across the world. On the detection task, the AI system obtained an area under the receiver operating characteristic curve (AUC) of 0.996 (hold-out test), 0.994 (external test 1), 0.984 (external test 2), and 0.979 (external test 3). Moreover, the detection results of the AI system correlated strongly with the diagnoses of cardiologists (cardiologist 1: R=0.794, p<1e-3; cardiologist 2: R=0.812, p<1e-3). On the screening task, the AI system achieved AUCs of 0.894 (hold-out test) and 0.850 (external test). The screening performance of the AI system was better than that of the cardiologists (AI system: 0.846 vs. cardiologist 1: 0.520 vs. cardiologist 2: 0.480). Our study demonstrates the feasibility of an accurate, objective, easy-to-use, fast, and low-cost AI system for CA detection and screening. The system has the potential to be used by healthcare professionals, caregivers, and general users to assess CAs based on real-world ECG images.
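For readers unfamiliar with the AUC metric reported above, the following is a minimal sketch of how it can be computed from binary labels and model scores, using the pairwise (Mann-Whitney) formulation. The function name `roc_auc` and the toy data are illustrative assumptions and are unrelated to the study's data or evaluation code.

```python
def roc_auc(labels, scores):
    """AUC as the probability that a random positive outscores a random negative.

    labels : iterable of 0/1 ground-truth labels
    scores : iterable of model scores (higher means more likely positive)
    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative sample")
    wins = 0.0
    for p in pos:
        for q in neg:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy example: 2 negatives, 2 positives; 3 of the 4 positive/negative pairs are ordered correctly.
auc = roc_auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8])
print(auc)  # 0.75
```

The quadratic pairwise loop is chosen for clarity; a rank-based implementation would scale better to datasets of the size described here.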
Image matting is a long-standing problem in computer graphics and vision, typically framed as the accurate estimation of the foreground in input images. We argue that foreground objects can be represented by information at different levels, including the central bodies, coarse boundaries, and refined details. Based on this observation, we propose a multi-scale information assembly framework (MSIA-matte) to pull out high-quality alpha mattes from single RGB images. Technically, given an input image, we extract advanced semantics as the subject content and retain initial CNN features to encode foreground expression at different levels, then combine them with our information assembly strategy. Extensive experiments demonstrate the effectiveness of the proposed MSIA-matte, which achieves state-of-the-art performance compared to most existing matting networks.
The inverse problem of inferring the electrocardiogram (ECG) from the photoplethysmogram (PPG) is an emerging research direction that combines the easy measurability of PPG and the rich clinical knowledge of ECG for long-term continuous cardiac monitoring. Prior work on reconstruction using a universal basis has limited fidelity for uncommon ECG waveform shapes because of its limited representational power. In this paper, we design two dictionary learning frameworks, the cross-domain joint dictionary learning (XDJDL) and the label-consistent XDJDL (LC-XDJDL), to further improve the ECG inference quality and enrich the PPG-based diagnosis knowledge. Building on the K-SVD technique, our proposed joint dictionary learning frameworks aim to maximize the expressive power by simultaneously optimizing a pair of signal dictionaries for PPG and ECG, together with the transforms that relate their sparse codes and disease information. The proposed models are evaluated with 34,000+ ECG/PPG cycle pairs containing a variety of ECG morphologies and cardiovascular diseases. We demonstrate both visually and quantitatively that our proposed frameworks achieve better inference performance than previous methods, suggesting an encouraging potential for ECG screening using PPG based on the proactively learned PPG-ECG relationship.
One fundamental problem in learning treatment effects from observational data is confounder identification and balancing. Most previous methods achieve confounder balancing by treating all observed variables as confounders, without distinguishing confounders from non-confounders. In general, not all observed variables are confounders, that is, common causes of both the treatment and the outcome: some variables contribute only to the treatment and some contribute only to the outcome. Balancing those non-confounders would introduce additional bias into treatment effect estimation. By modeling the different relations among variables, treatment, and outcome, we propose a synergistic learning framework to 1) identify and balance confounders by learning decomposed representations of confounders and non-confounders, and simultaneously 2) estimate the treatment effect in observational studies via counterfactual inference. Our empirical results demonstrate that the proposed method can precisely identify and balance confounders, and that its treatment effect estimates outperform those of state-of-the-art methods on both synthetic and real-world datasets.
Infectious keratitis is the most common entity among corneal diseases, in which pathogens grow in the cornea, leading to inflammation and destruction of the corneal tissues. Infectious keratitis is a medical emergency that requires rapid and accurate diagnosis so that precise treatment can be initiated promptly to halt disease progression and limit the extent of corneal damage; otherwise, it may develop into a sight-threatening or even eye-globe-threatening condition. In this paper, we propose a sequential-level deep learning model to effectively discriminate the subtle distinctions among infectious corneal diseases via the classification of clinical images. In this approach, we devise a mechanism to preserve the spatial structures of clinical images and disentangle the informative features for clinical image classification of infectious keratitis. In a comparison with 421 ophthalmologists over 120 test images, the proposed sequential-level deep model achieved 80.00% diagnostic accuracy, far better than the 49.27% diagnostic accuracy achieved by the ophthalmologists.
Machine Learning as a Service (MLaaS), offered by platforms such as Microsoft Azure and Amazon AWS, provides effective DNN models for completing machine learning tasks to small businesses and individuals who are limited by a lack of data and computing power. However, user privacy is exposed to the MLaaS server, since users need to upload their sensitive data to it. To preserve their privacy, users can encrypt their data before uploading it. This makes it difficult to run the DNN model, because it is not designed to run in the ciphertext domain. In this paper, using the Paillier homomorphic cryptosystem, we present a new Privacy-Preserving Deep Neural Network model that we call 2P-DNN. This model can fulfill the machine learning task in the ciphertext domain. By using 2P-DNN, MLaaS is able to provide a privacy-preserving machine learning service for users. We build our 2P-DNN model based on LeNet-5 and test it with the encrypted MNIST dataset. The classification accuracy is more than 97%, which is close to the accuracy of LeNet-5 running on the MNIST dataset and higher than that of other existing privacy-preserving machine learning models.
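The property that makes the Paillier cryptosystem usable for computing on ciphertexts is its additive homomorphism: multiplying two ciphertexts adds the underlying plaintexts, and raising a ciphertext to a plaintext power scales it. The sketch below is a toy illustration of that property with tiny hard-coded primes. It is insecure, and it is not the 2P-DNN construction, whose details the abstract does not give. All function names (`keygen`, `encrypt`, `decrypt`, `add_encrypted`, `mul_plain`, `encrypted_dot`) are hypothetical.

```python
import math
import random


def keygen(p=293, q=433):
    """Toy Paillier key generation with small fixed primes (illustration only)."""
    n = p * q
    lam = math.lcm(p - 1, q - 1)   # Carmichael function of n
    g = n + 1                      # standard simple choice of generator
    mu = pow(lam, -1, n)           # with g = n + 1, mu is lam^{-1} mod n
    return (n, g), (lam, mu)


def encrypt(pub, m, rng=random):
    """Encrypt integer m (0 <= m < n) as g^m * r^n mod n^2 with random r."""
    n, g = pub
    n2 = n * n
    while True:
        r = rng.randrange(1, n)
        if math.gcd(r, n) == 1:
            break
    return (pow(g, m, n2) * pow(r, n, n2)) % n2


def decrypt(pub, priv, c):
    """Recover m as L(c^lam mod n^2) * mu mod n, where L(u) = (u - 1) // n."""
    n, _ = pub
    lam, mu = priv
    u = pow(c, lam, n * n)
    return ((u - 1) // n * mu) % n


def add_encrypted(pub, c1, c2):
    """Ciphertext of m1 + m2, obtained by multiplying ciphertexts."""
    n, _ = pub
    return (c1 * c2) % (n * n)


def mul_plain(pub, c, k):
    """Ciphertext of k * m for a plaintext non-negative integer k."""
    n, _ = pub
    return pow(c, k, n * n)


def encrypted_dot(pub, weights, ciphertexts):
    """Weighted sum of encrypted inputs with plaintext integer weights,
    the kind of linear step a server could evaluate without decrypting."""
    acc = encrypt(pub, 0)
    for w, c in zip(weights, ciphertexts):
        acc = add_encrypted(pub, acc, mul_plain(pub, c, w))
    return acc


pub, priv = keygen()
c1, c2 = encrypt(pub, 12), encrypt(pub, 30)
total = decrypt(pub, priv, add_encrypted(pub, c1, c2))      # 12 + 30
scaled = decrypt(pub, priv, mul_plain(pub, c1, 5))          # 5 * 12
dot = decrypt(pub, priv, encrypted_dot(pub, [2, 3], [c1, c2]))  # 2*12 + 3*30
```

Only additions and multiplications by plaintext constants are available in this scheme, so non-linear operations such as activations need separate handling. Real deployments would also need large primes and an encoding for negative and fractional values.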