Federated Learning (FL) allows a number of agents to participate in training a global machine learning model without disclosing locally stored data. Compared to traditional distributed learning, the heterogeneity (non-IID) of the agents slows down the convergence in FL. Furthermore, many datasets, being too noisy or too small, are easily overfitted by complex models, such as deep neural networks. Here, we consider the problem of using FL regression on noisy, hierarchical and tabular datasets in which user distributions are significantly different. Inspired by Latent Class Regression (LCR), we propose a novel probabilistic model, Hierarchical Latent Class Regression (HLCR), and its extension to Federated Learning, FEDHLCR. FEDHLCR consists of a mixture of linear regression models, allowing better accuracy than simple linear regression, while at the same time maintaining its analytical properties and avoiding overfitting. Our inference algorithm, being derived from Bayesian theory, provides strong convergence guarantees and good robustness to overfitting. Experimental results show that FEDHLCR offers fast convergence even in non-IID datasets.
We apply the deep learning neural network architecture to the two-level system in quantum optics to solve the time-dependent Schrodinger equation. By carefully designing the network structure and tuning parameters, above 90 percent accuracy in super long-term predictions can be achieved in the case of random electric fields, which indicates a promising new method to solve the time-dependent equation for two-level systems. By slightly modifying this network, we think that this method can solve the two- or three-dimensional time-dependent Schrodinger equation more efficiently than traditional approaches.
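For context, reference trajectories for a driven two-level system can be generated with a conventional step-by-step propagator. The sketch below is only an illustration of such a baseline, not the network described above. It assumes units with hbar = 1 and a hypothetical Hamiltonian H(t) = -(splitting/2) sigma_z - dipole * E(t) * sigma_x, with the field held constant over each time step; the function name and all parameter names are illustrative.

```python
import numpy as np

def propagate_two_level(fields, dt, splitting=0.0, dipole=1.0):
    """Propagate a two-level state through a piecewise-constant field.

    Assumed Hamiltonian (hbar = 1), chosen for illustration only:
        H(t) = -(splitting / 2) * sigma_z - dipole * E(t) * sigma_x
    Index 0 is the ground state, index 1 the excited state.
    Returns the excited-state population after every time step.
    """
    sigma_x = np.array([[0, 1], [1, 0]], dtype=complex)
    sigma_z = np.array([[1, 0], [0, -1]], dtype=complex)
    identity = np.eye(2, dtype=complex)

    psi = np.array([1.0, 0.0], dtype=complex)  # start in the ground state
    populations = []
    for e_field in fields:
        h = -0.5 * splitting * sigma_z - dipole * e_field * sigma_x
        # h is traceless and h @ h = omega**2 * I, so exp(-i h dt) has the
        # closed form cos(omega dt) I - i sin(omega dt) h / omega.
        omega = np.sqrt((0.5 * splitting) ** 2 + (dipole * e_field) ** 2)
        if omega == 0.0:
            step = identity
        else:
            step = (np.cos(omega * dt) * identity
                    - 1j * np.sin(omega * dt) * h / omega)
        psi = step @ psi
        populations.append(abs(psi[1]) ** 2)
    return np.array(populations)
```

For a random field, one could pass, for example, `np.random.default_rng(0).normal(size=200)` as `fields`. With a constant field and zero splitting, the excited-state population follows the familiar Rabi form sin^2(dipole * E * t).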
Nowadays, deep neural networks outperform humans in many tasks. However, if the input distribution drifts away from the one used in training, their performance drops significantly. Recently published research has shown that adapting the model parameters to the test sample can mitigate this performance degradation. In this paper, we therefore propose a novel modification of the self-supervised training algorithm SwAV that adds the ability to adapt to single test samples. Using the provided prototypes of SwAV and our derived test-time loss, we align the representation of unseen test samples with the self-supervised learned prototypes. We show the success of our method on the common benchmark dataset CIFAR10-C.
Since Bustince et al. introduced the concepts of overlap and grouping functions, these two types of aggregation functions have attracted a lot of interest in both theory and applications. In this paper, the depiction of $(O,G)$-granular variable precision fuzzy rough sets ($(O,G)$-GVPFRSs for short) is first given based on overlap and grouping functions. Meanwhile, to work out the approximation operators efficiently, we give another expression of upper and lower approximation operators by means of fuzzy implications and co-implications. Furthermore, starting from the perspective of construction methods, $(O,G)$-GVPFRSs are represented under diverse fuzzy relations. Finally, some conclusions on the granular variable precision fuzzy rough sets (GVPFRSs for short) are extended to $(O,G)$-GVPFRSs under some additional conditions.
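As a small illustration of the two aggregation functions named above, the sketch below uses two standard textbook overlap functions (product and minimum) and the usual dual construction G(x, y) = 1 - O(1 - x, 1 - y) for a grouping function. These examples and the helper names are assumptions made for illustration; they are not taken from the paper's own constructions.

```python
def overlap_product(x, y):
    """Product overlap function on [0, 1]."""
    return x * y

def overlap_min(x, y):
    """Minimum overlap function on [0, 1]."""
    return min(x, y)

def dual_grouping(overlap):
    """Build the grouping function dual to a given overlap function."""
    def grouping(x, y):
        return 1.0 - overlap(1.0 - x, 1.0 - y)
    return grouping

def satisfies_overlap_boundaries(overlap, grid):
    """Check commutativity and the two boundary conditions on a finite grid:
    O(x, y) = 0 iff x * y = 0, and O(x, y) = 1 iff x * y = 1."""
    for x in grid:
        for y in grid:
            value = overlap(x, y)
            if value != overlap(y, x):
                return False
            if (value == 0.0) != (x * y == 0.0):
                return False
            if (value == 1.0) != (x * y == 1.0):
                return False
    return True
```

With these definitions, `dual_grouping(overlap_min)` behaves like the maximum, and `dual_grouping(overlap_product)` like the probabilistic sum x + y - xy.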
Fuzzy rough sets (FRSs) have a great effect on data mining processes, and fuzzy logical operators play a key role in the development of FRS theory. In order to further generalize FRS theory to more complicated data environments, we first propose four types of fuzzy neighborhood operators based on fuzzy coverings by means of overlap functions and their implicators. Meanwhile, the fuzzy coverings derived from an original fuzzy covering are defined, and the equalities among overlap function-based fuzzy neighborhood operators based on a finite fuzzy covering are investigated. Second, we prove that the new operators can be divided into seventeen groups according to equivalence relations, and the partial order relations among these seventeen classes of operators are discussed as well. Moreover, comparisons with the $t$-norm-based fuzzy neighborhood operators given by D'eer et al. are made, and two types of neighborhood-related fuzzy covering-based rough set models, defined via different fuzzy neighborhood operators that are based on diverse kinds of fuzzy logical operators, are proposed. Furthermore, the groupings and partial order relations are also discussed. Finally, a novel fuzzy TOPSIS methodology is put forward to solve a biosynthetic nanomaterial selection problem, and the rationality and enforceability of our new approach are verified by comparing its results with those of nine different methods.
A variety of real-world applications rely on far future information to make decisions, thus calling for efficient and accurate long sequence multivariate time series forecasting. While recent attention-based forecasting models show strong abilities in capturing long-term dependencies, they still suffer from two key limitations. First, canonical self-attention has quadratic complexity w.r.t. the input time series length, thus falling short in efficiency. Second, different variables' time series often have distinct temporal dynamics, which existing studies fail to capture, as they use the same model parameter space, e.g., projection matrices, for all variables' time series, thus falling short in accuracy. To ensure high efficiency and accuracy, we propose Triformer, a triangular, variable-specific attention. (i) Linear complexity: we introduce a novel patch attention with linear complexity. When stacking multiple layers of patch attention, a triangular structure is proposed such that the layer sizes shrink exponentially, thus maintaining linear complexity. (ii) Variable-specific parameters: we propose a light-weight method to enable distinct sets of model parameters for different variables' time series to enhance accuracy without compromising efficiency and memory usage. Strong empirical evidence on four datasets from multiple domains justifies our design choices and demonstrates that Triformer outperforms state-of-the-art methods w.r.t. both accuracy and efficiency. This is an extended version of "Triformer: Triangular, Variable-Specific Attentions for Long Sequence Multivariate Time Series Forecasting", to appear in IJCAI 2022 [Cirstea et al., 2022a], including additional experimental results.
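The claim that exponentially shrinking layers keep the stacked cost linear can be checked with simple arithmetic. The sketch below assumes, as a simplification not stated in the abstract, that each patch attention layer emits one output per patch of `patch_size` inputs and costs time proportional to its input length. The function names and parameters are illustrative.

```python
def triangular_layer_sizes(input_length, patch_size):
    """Output length of each stacked layer when every layer reduces each
    patch of `patch_size` inputs to a single output (assumed behaviour)."""
    if patch_size < 2:
        raise ValueError("patch_size must be at least 2")
    sizes = []
    length = input_length
    while length >= patch_size:
        length //= patch_size
        sizes.append(length)
    return sizes

def total_attention_cost(input_length, patch_size):
    """Sum of per-layer input lengths, a proxy for the total cost when each
    layer is linear in its own input length."""
    sizes = triangular_layer_sizes(input_length, patch_size)
    if not sizes:
        return 0
    layer_inputs = [input_length] + sizes[:-1]
    return sum(layer_inputs)
```

Under these assumptions the per-layer input lengths form a geometric series, so the total is bounded by `input_length * patch_size / (patch_size - 1)`, which is linear in the input length regardless of how many layers are stacked.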
Directional area scattering factor (DASF) is a critical canopy structural parameter for vegetation monitoring. It provides an efficient tool for decoupling canopy structure and leaf optics from canopy reflectance. The current standard approach to estimating DASF from the canopy bidirectional reflectance factor (BRF) is based on the assumption that, in the weakly absorbing 710 to 790 nm spectral interval, leaf scattering does not change much with the concentration of dry matter and thus its variation can be neglected. This results in biased estimates of DASF and consequently leads to uncertainty in DASF-related applications. This study proposes a new approach to account for variations in the concentration of this biochemical constituent, which additionally uses the canopy BRF at 2260 nm. In silico analysis of the proposed approach suggests a significant increase in accuracy over the standard technique, by 49% and 34% in terms of relative root mean square error (rRMSE) for one- and three-dimensional scenes, respectively. When compared with indoor multi-angular hyperspectral measurements reported in the literature, the mean absolute error is reduced by 68% for needle leaf and 20% for broadleaf canopies. Thus, the proposed DASF estimation approach outperforms the current one and can be used more reliably in DASF-related applications, such as vegetation monitoring of functional traits, dynamics, and radiation budget.
In step with the digitalization of transportation, we are witnessing a growing range of path-based smart-city applications, e.g., travel-time estimation and travel path ranking. A temporal path (TP), which incorporates temporal information, e.g., departure time, into a path, is fundamental to enabling such applications. In this setting, it is essential to learn generic temporal path representations (TPRs) that consider spatial and temporal correlations simultaneously and that can be used in different applications, i.e., downstream tasks. Existing methods fail to achieve this goal since (i) supervised methods require large amounts of task-specific labels when training and thus fail to generalize the obtained TPRs to other tasks; (ii) though unsupervised methods can learn generic representations, they disregard the temporal aspect, leading to sub-optimal results. To contend with the limitations of existing solutions, we propose a Weakly-Supervised Contrastive (WSC) learning model. We first propose a temporal path encoder that encodes both the spatial and temporal information of a temporal path into a TPR. To train the encoder, we introduce weak labels that are easy and inexpensive to obtain and are relevant to different tasks, e.g., temporal labels indicating peak vs. off-peak hours from departure times. Based on the weak labels, we construct meaningful positive and negative temporal path samples by considering both spatial and temporal information, which facilitates training the encoder using contrastive learning by pulling the positive samples' representations closer while pushing the negative samples' representations away. To better guide contrastive learning, we propose a learning strategy based on Curriculum Learning such that learning proceeds from easy to hard training instances. Experimental studies verify the effectiveness of the proposed method.
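The idea of deriving weak labels from departure times and using them in a contrastive objective can be sketched as follows. This is a generic InfoNCE-style loss used as a stand-in, not the WSC objective itself. The peak-hour windows in `PEAK_HOURS`, the function names, and the temperature value are all hypothetical choices made for illustration.

```python
import numpy as np

# Hypothetical peak hours (assumption): 07:00-08:59 and 16:00-17:59.
PEAK_HOURS = frozenset({7, 8, 16, 17})

def weak_label(departure_hour):
    """Return 1 for a peak-hour departure, 0 for off-peak."""
    return 1 if departure_hour % 24 in PEAK_HOURS else 0

def cosine_similarities(anchor, others):
    """Cosine similarity between one vector and each row of a matrix."""
    anchor = np.asarray(anchor, dtype=float)
    others = np.atleast_2d(np.asarray(others, dtype=float))
    norms = np.linalg.norm(others, axis=1) * np.linalg.norm(anchor)
    return others @ anchor / norms

def contrastive_loss(anchor, positives, negatives, temperature=0.1):
    """Generic InfoNCE-style loss: small when the anchor is close to its
    positives and far from its negatives."""
    pos = np.exp(cosine_similarities(anchor, positives) / temperature)
    neg = np.exp(cosine_similarities(anchor, negatives) / temperature)
    return float(-np.log(pos.sum() / (pos.sum() + neg.sum())))
```

In such a setup, representations of paths that share a weak label (and, per the abstract, are spatially related) would be passed as `positives`, and the rest as `negatives`. How the samples are actually selected in WSC is not specified here.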
Few-shot object detection (FSOD) has thrived in recent years to learn novel object classes with limited data by transferring knowledge gained on abundant base classes. FSOD approaches commonly assume that both the scarcely provided examples of novel classes and test-time data belong to the same domain. However, this assumption does not hold in various industrial and robotics applications (e.g., object grasping and manipulation), where a model can learn novel classes from a source domain while inferring on classes from a different target domain. In this work, we address the task of zero-shot domain adaptation, also known as domain generalization, for FSOD. Specifically, we assume that neither images nor labels of the novel classes in the target domain are available during training. Our approach for solving the domain gap is two-fold. First, we leverage a meta-training paradigm, where we learn domain-invariant features on the base classes. Second, we propose various data augmentation techniques on the few shots of novel classes to account for all possible domain-specific information. To further constrain the network into encoding domain-agnostic class-specific representations only, a contrastive loss is proposed to maximize the mutual information between foreground proposals and class prototypes, and to reduce the network's bias to the background information. Our experiments on the T-LESS dataset show that the proposed approach succeeds in alleviating the domain gap considerably without utilizing labels or images of novel categories from the target domain.
Time series data occurs widely, and outlier detection is a fundamental problem in data mining, which has numerous applications. Existing autoencoder-based approaches deliver state-of-the-art performance on challenging real-world data but are vulnerable to outliers and exhibit low explainability. To address these two limitations, we propose robust and explainable unsupervised autoencoder frameworks that decompose an input time series into a clean time series and an outlier time series using autoencoders. Improved explainability is achieved because clean time series are better explained with easy-to-understand patterns such as trends and periodicities. We provide insight into this by means of a post-hoc explainability analysis and empirical studies. In addition, since outliers are separated from clean time series iteratively, our approach offers improved robustness to outliers, which in turn improves accuracy. We evaluate our approach on five real-world datasets and report improvements over the state-of-the-art approaches in terms of robustness and explainability. This is an extended version of "Robust and Explainable Autoencoders for Unsupervised Time Series Outlier Detection", to appear in IEEE ICDE 2022.
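The iterative separation of an input series into a clean series and an outlier series can be illustrated with a deliberately simplified stand-in. In the sketch below a moving median replaces the autoencoder, which is an assumption made purely to keep the example short and runnable. The function name, window size, threshold, and iteration count are all hypothetical.

```python
import numpy as np

def decompose_series(series, window=5, threshold=3.0, iterations=5):
    """Split a series into (clean, outlier) parts by alternating two steps:
    1. fit a smooth "clean" series to the input minus the current outliers
       (moving median here, standing in for an autoencoder);
    2. mark residuals that are large relative to a robust scale as outliers.
    """
    series = np.asarray(series, dtype=float)
    outlier = np.zeros_like(series)
    half = window // 2
    clean = series.copy()
    for _ in range(iterations):
        # Step 1: smooth the series with the current outliers removed.
        padded = np.pad(series - outlier, half, mode="edge")
        clean = np.array([np.median(padded[i:i + window])
                          for i in range(len(series))])
        # Step 2: flag residuals beyond `threshold` robust standard deviations.
        residual = series - clean
        mad = np.median(np.abs(residual - np.median(residual)))
        scale = 1.4826 * mad + 1e-12
        outlier = np.where(np.abs(residual) > threshold * scale, residual, 0.0)
    return clean, outlier
```

Because the clean series is refit after the flagged values are removed, a large spike has less influence on later fits. This mirrors, in a very reduced form, the robustness argument made in the abstract.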