Recent learning-based approaches have achieved significant progress in light field (LF) image super-resolution (SR) by exploring convolution-based or transformer-based network structures. However, LF imaging has many intrinsic physical priors that have not been fully exploited. In this paper, we analyze the coordinate transformation of the LF imaging process to reveal the geometric relationship in the LF images. Based on such geometric priors, we introduce a new LF subspace of virtual-slit images (VSI) that provide sub-pixel information complementary to sub-aperture images. To leverage the abundant correlation across the four-dimensional data with manageable complexity, we propose learning ensemble representation of all $C_4^2$ LF subspaces for more effective feature extraction. To super-resolve image structures from undersampled LF data, we propose a geometry-aware decoder, named EPIXformer, which constrains the transformer's operational searching regions with a LF physical prior. Experimental results on both spatial and angular SR tasks demonstrate that the proposed method outperforms other state-of-the-art schemes, especially in handling various disparities.
Surrogate-assisted evolutionary algorithms have been widely developed to solve complex and computationally expensive multi-objective optimization problems in recent years. However, when dealing with high-dimensional optimization problems, the performance of these surrogate-assisted multi-objective evolutionary algorithms deteriorate drastically. In this work, a novel Classifier-assisted rank-based learning and Local Model based multi-objective Evolutionary Algorithm (CLMEA) is proposed for high-dimensional expensive multi-objective optimization problems. The proposed algorithm consists of three parts: classifier-assisted rank-based learning, hypervolume-based non-dominated search, and local search in the relatively sparse objective space. Specifically, a probabilistic neural network is built as classifier to divide the offspring into a number of ranks. The offspring in different ranks uses rank-based learning strategy to generate more promising and informative candidates for real function evaluations. Then, radial basis function networks are built as surrogates to approximate the objective functions. After searching non-dominated solutions assisted by the surrogate model, the candidates with higher hypervolume improvement are selected for real evaluations. Subsequently, in order to maintain the diversity of solutions, the most uncertain sample point from the non-dominated solutions measured by the crowding distance is selected as the guided parent to further infill in the uncertain region of the front. The experimental results of benchmark problems and a real-world application on geothermal reservoir heat extraction optimization demonstrate that the proposed algorithm shows superior performance compared with the state-of-the-art surrogate-assisted multi-objective evolutionary algorithms. The source code for this work is available at https://github.com/JellyChen7/CLMEA.
Traditional feature selections need to know the feature space before learning, and online streaming feature selection (OSFS) is proposed to process streaming features on the fly. Existing methods divide features into relevance or irrelevance without missing data, and deleting irrelevant features may lead to in-formation loss. Motivated by this, we focus on completing the streaming feature matrix and division of feature correlation and propose online sparse streaming feature selection based on adapted classification (OS2FS-AC). This study uses Latent Factor Analysis (LFA) to pre-estimate missed data. Besides, we use the adaptive method to obtain the threshold, divide the features into strongly relevant, weakly relevant, and irrelevant features, and then divide weak relevance with more information. Experimental results on ten real-world data sets demonstrate that OS2FS-AC performs better than state-of-the-art algo-rithms.
Community is a fundamental and critical characteristic of an undirected social network, making community detection be a vital yet thorny issue in network representation learning. A symmetric and non-negative matrix factorization (SNMF) model is frequently adopted to address this issue owing to its great interpretability and scalability. However, it adopts a single latent factor matrix to represent an undirected network for precisely representing its symmetry, which leads to loss of representation learning ability due to the reduced latent space. Motivated by this discovery, this paper proposes a novel Constraints Fusion-induced Symmetric Nonnegative Matrix Factorization (CFS) model that adopts three-fold ideas: a) Representing a target undirected network with multiple latent factor matrices, thus preserving its representation learning capacity; b) Incorporating a symmetry-regularizer that preserves the symmetry of the learnt low-rank approximation to the adjacency matrix into the loss function, thus making the resultant detector well-aware of the target network's symmetry; and c) Introducing a graph-regularizer that preserves local invariance of the network's intrinsic geometry, thus making the achieved detector well-aware of community structure within the target network. Extensively empirical studies on eight real-world social networks from industrial applications demonstrate that the proposed CFS model significantly outperforms state-of-the-art models in achieving highly-accurate community detection results.
The booming development and huge market of micro-videos bring new e-commerce channels for merchants. Currently, more micro-video publishers prefer to embed relevant ads into their micro-videos, which not only provides them with business income but helps the audiences to discover their interesting products. However, due to the micro-video recording by unprofessional equipment, involving various topics and including multiple modalities, it is challenging to locate the products related to micro-videos efficiently, appropriately, and accurately. We formulate the microvideo-product retrieval task, which is the first attempt to explore the retrieval between the multi-modal and multi-modal instances. A novel approach named Multi-Queue Momentum Contrast (MQMC) network is proposed for bidirectional retrieval, consisting of the uni-modal feature and multi-modal instance representation learning. Moreover, a discriminative selection strategy with a multi-queue is used to distinguish the importance of different negatives based on their categories. We collect two large-scale microvideo-product datasets (MVS and MVS-large) for evaluation and manually construct the hierarchical category ontology, which covers sundry products in daily life. Extensive experiments show that MQMC outperforms the state-of-the-art baselines. Our replication package (including code, dataset, etc.) is publicly available at https://github.com/duyali2000/MQMC.
User behavior data produced during interaction with massive items in the significant data era are generally heterogeneous and sparse, leaving the recommender system (RS) a large diversity of underlying patterns to excavate. Deep neural network-based models have reached the state-of-the-art benchmark of the RS owing to their fitting capabilities. However, prior works mainly focus on designing an intricate architecture with fixed loss function and regulation. These single-metric models provide limited performance when facing heterogeneous and sparse user behavior data. Motivated by this finding, we propose a multi-metric AutoRec (MMA) based on the representative AutoRec. The idea of the proposed MMA is mainly two-fold: 1) apply different $L_p$-norm on loss function and regularization to form different variant models in different metric spaces, and 2) aggregate these variant models. Thus, the proposed MMA enjoys the multi-metric orientation from a set of dispersed metric spaces, achieving a comprehensive representation of user data. Theoretical studies proved that the proposed MMA could attain performance improvement. The extensive experiment on five real-world datasets proves that MMA can outperform seven other state-of-the-art models in predicting unobserved user behavior data.
An enhanced geothermal system is essential to provide sustainable and long-term geothermal energy supplies and reduce carbon emissions. Optimal well-control scheme for effective heat extraction and improved heat sweep efficiency plays a significant role in geothermal development. However, the optimization performance of most existing optimization algorithms deteriorates as dimension increases. To solve this issue, a novel surrogate-assisted level-based learning evolutionary search algorithm (SLLES) is proposed for heat extraction optimization of enhanced geothermal system. SLLES consists of classifier-assisted level-based learning pre-screen part and local evolutionary search part. The cooperation of the two parts has realized the balance between the exploration and exploitation during the optimization process. After iteratively sampling from the design space, the robustness and effectiveness of the algorithm are proven to be improved significantly. To the best of our knowledge, the proposed algorithm holds state-of-the-art simulation-involved optimization framework. Comparative experiments have been conducted on benchmark functions, a two-dimensional fractured reservoir and a three-dimensional enhanced geothermal system. The proposed algorithm outperforms other five state-of-the-art surrogate-assisted algorithms on all selected benchmark functions. The results on the two heat extraction cases also demonstrate that SLLES can achieve superior optimization performance compared with traditional evolutionary algorithm and other surrogate-assisted algorithms. This work lays a solid basis for efficient geothermal extraction of enhanced geothermal system and sheds light on the model management strategies of data-driven optimization in the areas of energy exploitation.
An undirected weighted graph (UWG) is frequently adopted to describe the interactions among a solo set of nodes from real applications, such as the user contact frequency from a social network services system. A graph convolutional network (GCN) is widely adopted to perform representation learning to a UWG for subsequent pattern analysis tasks such as clustering or missing data estimation. However, existing GCNs mostly neglects the latent collaborative information hidden in its connected node pairs. To address this issue, this study proposes to model the node collaborations via a symmetric latent factor analysis model, and then regards it as a node-collaboration module for supplementing the collaboration loss in a GCN. Based on this idea, a Node-collaboration-informed Graph Convolutional Network (NGCN) is proposed with three-fold ideas: a) Learning latent collaborative information from the interaction of node pairs via a node-collaboration module; b) Building the residual connection and weighted representation propagation to obtain high representation capacity; and c) Implementing the model optimization in an end-to-end fashion to achieve precise representation to the target UWG. Empirical studies on UWGs emerging from real applications demonstrate that owing to its efficient incorporation of node-collaborations, the proposed NGCN significantly outperforms state-of-the-art GCNs in addressing the task of missing weight estimation. Meanwhile, its good scalability ensures its compatibility with more advanced GCN extensions, which will be further investigated in our future studies.
Self-supervised pre-training has become the priory choice to establish reliable models for automated recognition of massive medical images, which are routinely annotation-free, without semantics, and without guarantee of quality. Note that this paradigm is still at its infancy and limited by closely related open issues: 1) how to learn robust representations in an unsupervised manner from unlabelled medical images of low diversity in samples? and 2) how to obtain the most significant representations demanded by a high-quality segmentation? Aiming at these issues, this study proposes a knowledge-based learning framework towards enhanced recognition of medical images, which works in three phases by synergizing contrastive learning and generative learning models: 1) Sample Space Diversification: Reconstructive proxy tasks have been enabled to embed a priori knowledge with context highlighted to diversify the expanded sample space; 2) Enhanced Representation Learning: Informative noise-contrastive estimation loss regularizes the encoder to enhance representation learning of annotation-free images; 3) Correlated Optimization: Optimization operations in pre-training the encoder and the decoder have been correlated via image restoration from proxy tasks, targeting the need for semantic segmentation. Extensive experiments have been performed on various public medical image datasets (e.g., CheXpert and DRIVE) against the state-of-the-art counterparts (e.g., SimCLR and MoCo), and results demonstrate that: The proposed framework statistically excels in self-supervised benchmarks, achieving 2.08, 1.23, 1.12, 0.76 and 1.38 percentage points improvements over SimCLR in AUC/Dice. The proposed framework achieves label-efficient semi-supervised learning, e.g., reducing the annotation cost by up to 99% in pathological classification.