Robust learning on noisy-labeled data has been an important task in real applications, because label noise directly leads to the poor generalization of deep learning models. Existing label-noise learning methods usually assume that the ground-truth classes of the training data are balanced. However, the real-world data is often imbalanced, leading to the inconsistency between observed and intrinsic class distribution due to label noises. Distribution inconsistency makes the problem of label-noise learning more challenging because it is hard to distinguish clean samples from noisy samples on the intrinsic tail classes. In this paper, we propose a learning framework for label-noise learning with intrinsically long-tailed data. Specifically, we propose a robust sample selection method called two-stage bi-dimensional sample selection (TBSS) to better separate clean samples from noisy samples, especially for the tail classes. TBSS consists of two new separation metrics to jointly separate samples in each class. Extensive experiments on multiple noisy-labeled datasets with intrinsically long-tailed class distribution demonstrate the effectiveness of our method.
Unstructured pruning has the limitation of dealing with the sparse and irregular weights. By contrast, structured pruning can help eliminate this drawback but it requires complex criterion to determine which components to be pruned. To this end, this paper presents a new method termed TissueNet, which directly constructs compact neural networks with fewer weight parameters by independently stacking designed basic units, without requiring additional judgement criteria anymore. Given the basic units of various architectures, they are combined and stacked in a certain form to build up compact neural networks. We formulate TissueNet in diverse popular backbones for comparison with the state-of-the-art pruning methods on different benchmark datasets. Moreover, two new metrics are proposed to evaluate compression performance. Experiment results show that TissueNet can achieve comparable classification accuracy while saving up to around 80% FLOPs and 89.7% parameters. That is, stacking basic units provides a new promising way for network compression.
Federated learning provides a privacy guarantee for generating good deep learning models on distributed clients with different kinds of data. Nevertheless, dealing with non-IID data is one of the most challenging problems for federated learning. Researchers have proposed a variety of methods to eliminate the negative influence of non-IIDness. However, they only focus on the non-IID data provided that the universal class distribution is balanced. In many real-world applications, the universal class distribution is long-tailed, which causes the model seriously biased. Therefore, this paper studies the joint problem of non-IID and long-tailed data in federated learning and proposes a corresponding solution called Federated Ensemble Distillation with Imbalance Calibration (FEDIC). To deal with non-IID data, FEDIC uses model ensemble to take advantage of the diversity of models trained on non-IID data. Then, a new distillation method with logit adjustment and calibration gating network is proposed to solve the long-tail problem effectively. We evaluate FEDIC on CIFAR-10-LT, CIFAR-100-LT, and ImageNet-LT with a highly non-IID experimental setting, in comparison with the state-of-the-art methods of federated learning and long-tail learning. Our code is available at https://github.com/shangxinyi/FEDIC.
Despite enormous research interest and rapid application of federated learning (FL) to various areas, existing studies mostly focus on supervised federated learning under the horizontally partitioned local dataset setting. This paper will study the unsupervised FL under the vertically partitioned dataset setting. Accordingly, we propose the federated principal component analysis for vertically partitioned dataset (VFedPCA) method, which reduces the dimensionality across the joint datasets over all the clients and extracts the principal component feature information for downstream data analysis. We further take advantage of the nonlinear dimensionality reduction and propose the vertical federated advanced kernel principal component analysis (VFedAKPCA) method, which can effectively and collaboratively model the nonlinear nature existing in many real datasets. In addition, we study two communication topologies. The first is a server-client topology where a semi-trusted server coordinates the federated training, while the second is the fully-decentralized topology which further eliminates the requirement of the server by allowing clients themselves to communicate with their neighbors. Extensive experiments conducted on five types of real-world datasets corroborate the efficacy of VFedPCA and VFedAKPCA under the vertically partitioned FL setting. Code is available at: https://github.com/juyongjiang/VFedPCA-VFedAKPCA
Cross-modal hashing, favored for its effectiveness and efficiency, has received wide attention to facilitating efficient retrieval across different modalities. Nevertheless, most existing methods do not sufficiently exploit the discriminative power of semantic information when learning the hash codes, while often involving time-consuming training procedure for handling the large-scale dataset. To tackle these issues, we formulate the learning of similarity-preserving hash codes in terms of orthogonally rotating the semantic data so as to minimize the quantization loss of mapping such data to hamming space, and propose an efficient Fast Discriminative Discrete Hashing (FDDH) approach for large-scale cross-modal retrieval. More specifically, FDDH introduces an orthogonal basis to regress the targeted hash codes of training examples to their corresponding semantic labels, and utilizes "-dragging technique to provide provable large semantic margins. Accordingly, the discriminative power of semantic information can be explicitly captured and maximized. Moreover, an orthogonal transformation scheme is further proposed to map the nonlinear embedding data into the semantic subspace, which can well guarantee the semantic consistency between the data feature and its semantic representation. Consequently, an efficient closed form solution is derived for discriminative hash code learning, which is very computationally efficient. In addition, an effective and stable online learning strategy is presented for optimizing modality-specific projection functions, featuring adaptivity to different training sizes and streaming data. The proposed FDDH approach theoretically approximates the bi-Lipschitz continuity, runs sufficiently fast, and also significantly improves the retrieval performance over the state-of-the-art methods. The source code is released at: https://github.com/starxliu/FDDH.
The main feature of the Dynamic Multi-objective Optimization Problems (DMOPs) is that optimization objective functions will change with times or environments. One of the promising approaches for solving the DMOPs is reusing the obtained Pareto optimal set (POS) to train prediction models via machine learning approaches. In this paper, we train an Incremental Support Vector Machine (ISVM) classifier with the past POS, and then the solutions of the DMOP we want to solve at the next moment are filtered through the trained ISVM classifier. A high-quality initial population will be generated by the ISVM classifier, and a variety of different types of population-based dynamic multi-objective optimization algorithms can benefit from the population. To verify this idea, we incorporate the proposed approach into three evolutionary algorithms, the multi-objective particle swarm optimization(MOPSO), Nondominated Sorting Genetic Algorithm II (NSGA-II), and the Regularity Model-based multi-objective estimation of distribution algorithm(RE-MEDA). We employ experiments to test these algorithms, and experimental results show the effectiveness.
Recent studies have shown that imbalance ratio is not the only cause of the performance loss of a classifier in imbalanced data classification. In fact, other data factors, such as small disjuncts, noises and overlapping, also play the roles in tandem with imbalance ratio, which makes the problem difficult. Thus far, the empirical studies have demonstrated the relationship between the imbalance ratio and other data factors only. To the best of our knowledge, there is no any measurement about the extent of influence of class imbalance on the classification performance of imbalanced data. Further, it is also unknown for a dataset which data factor is actually the main barrier for classification. In this paper, we focus on Bayes optimal classifier and study the influence of class imbalance from a theoretical perspective. Accordingly, we propose an instance measure called Individual Bayes Imbalance Impact Index ($IBI^3$) and a data measure called Bayes Imbalance Impact Index ($BI^3$). $IBI^3$ and $BI^3$ reflect the extent of influence purely by the factor of imbalance in terms of each minority class sample and the whole dataset, respectively. Therefore, $IBI^3$ can be used as an instance complexity measure of imbalance and $BI^3$ is a criterion to show the degree of how imbalance deteriorates the classification. As a result, we can therefore use $BI^3$ to judge whether it is worth using imbalance recovery methods like sampling or cost-sensitive methods to recover the performance loss of a classifier. The experiments show that $IBI^3$ is highly consistent with the increase of prediction score made by the imbalance recovery methods and $BI^3$ is highly consistent with the improvement of F1 score made by the imbalance recovery methods on both synthetic and real benchmark datasets.
Hashing has recently sparked a great revolution in cross-modal retrieval due to its low storage cost and high query speed. Most existing cross-modal hashing methods learn unified hash codes in a common Hamming space to represent all multi-modal data and make them intuitively comparable. However, such unified hash codes could inherently sacrifice their representation scalability because the data from different modalities may not have one-to-one correspondence and could be stored more efficiently by different hash codes of unequal lengths. To mitigate this problem, this paper proposes a generalized and flexible cross-modal hashing framework, termed Matrix Tri-Factorization Hashing (MTFH), which not only preserves the semantic similarity between the multi-modal data points, but also works seamlessly in various settings including paired or unpaired multi-modal data, and equal or varying hash length encoding scenarios. Specifically, MTFH exploits an efficient objective function to jointly learn the flexible modality-specific hash codes with different length settings, while simultaneously excavating two semantic correlation matrices to ensure heterogeneous data comparable. As a result, the derived hash codes are more semantically meaningful for various challenging cross-modal retrieval tasks. Extensive experiments evaluated on public benchmark datasets highlight the superiority of MTFH under various retrieval scenarios and show its very competitive performance with the state-of-the-arts.