The sharp and recent increase in the availability of data captured by different sensors combined with their considerably heterogeneous natures poses a serious challenge for the effective and efficient processing of remotely sensed data. Such an increase in remote sensing and ancillary datasets, however, opens up the possibility of utilizing multimodal datasets in a joint manner to further improve the performance of the processing approaches with respect to the application at hand. Multisource data fusion has, therefore, received enormous attention from researchers worldwide for a wide variety of applications. Moreover, thanks to the revisit capability of several spaceborne sensors, the integration of the temporal information with the spatial and/or spectral/backscattering information of the remotely sensed data is possible and helps to move from a representation of 2D/3D data to 4D data structures, where the time variable adds new information as well as challenges for the information extraction algorithms. There are a huge number of research works dedicated to multisource and multitemporal data fusion, but the methods for the fusion of different modalities have expanded in different paths according to each research community. This paper brings together the advances of multisource and multitemporal data fusion approaches with respect to different research communities and provides a thorough and discipline-specific starting point for researchers at different levels (i.e., students, researchers, and senior researchers) willing to conduct novel investigations on this challenging topic by supplying sufficient detail and references.
The fast-growing amount of information on the Internet makes research in automatic document summarization urgent, as summarization is an effective remedy for information overload. Many approaches have been proposed based on different strategies, such as latent semantic analysis (LSA). However, LSA, when applied to document summarization, has some limitations which diminish its performance. In this work, we try to overcome these limitations by applying statistical and linear algebraic approaches combined with syntactic and semantic processing of text. First, a part-of-speech tagger is utilized to reduce the dimension of LSA. Then, the weight of the term in four adjacent sentences is added to the weighting schemes while calculating the input matrix to take into account the word order and the syntactic relations. In addition, a new LSA-based sentence selection algorithm is proposed, in which the term description is combined with the sentence description for each topic, which in turn makes the generated summary more informative and diverse. To verify the effectiveness of the proposed LSA-based sentence selection algorithm, extensive experiments on Arabic and English are conducted. Four datasets are used to evaluate the new model: the Linguistic Data Consortium (LDC) Arabic Newswire-a corpus, the Essex Arabic Summaries Corpus (EASC), DUC2002, and the Multilingual MSS 2015 dataset. Experimental results on the four datasets show the effectiveness of the proposed model on Arabic and English datasets. It performs comprehensively better than the state-of-the-art methods.
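As a point of reference for the LSA-based selection step, the following is a minimal sketch of the classic LSA baseline, not the sentence selection algorithm proposed in the abstract. It assumes whitespace tokenization and raw term counts; the function name `lsa_summarize` and its parameters are hypothetical. It omits the part-of-speech filtering and the adjacent-sentence weighting described above.

```python
import numpy as np

def lsa_summarize(sentences, num_sentences=2):
    """Pick one sentence per leading latent topic (classic LSA baseline)."""
    # Build a vocabulary and a term-by-sentence count matrix A.
    tokenized = [s.lower().split() for s in sentences]
    vocab = sorted({w for toks in tokenized for w in toks})
    index = {w: i for i, w in enumerate(vocab)}
    A = np.zeros((len(vocab), len(sentences)))
    for j, toks in enumerate(tokenized):
        for w in toks:
            A[index[w], j] += 1.0

    # SVD: each row of Vt scores how strongly every sentence expresses one topic.
    _, _, Vt = np.linalg.svd(A, full_matrices=False)

    chosen = []
    for topic in Vt:
        # Take the not-yet-chosen sentence with the largest weight for this topic.
        for j in np.argsort(-np.abs(topic)):
            if int(j) not in chosen:
                chosen.append(int(j))
                break
        if len(chosen) == num_sentences:
            break
    # Return the selected sentences in their original document order.
    return [sentences[j] for j in sorted(chosen)]
```

Selecting one sentence per topic is what gives this baseline some diversity; the abstract's method additionally combines term and sentence descriptions per topic, which this sketch does not attempt.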
Social and behavioral interventions are a critical tool for governments and communities to tackle deep-rooted societal challenges such as homelessness, disease, and poverty. However, real-world interventions are almost always plagued by limited resources and limited data, which creates a computational challenge: how can we use algorithmic techniques to enhance the targeting and delivery of social and behavioral interventions? The goal of my thesis is to provide a unified study of such questions, collectively considered under the name "algorithmic social intervention". This proposal introduces algorithmic social intervention as a distinct area with characteristic technical challenges, presents my published research in the context of these challenges, and outlines open problems for future work. A common technical theme is decision making under uncertainty: how can we find actions which will impact a social system in desirable ways under limitations of knowledge and resources? The primary application area for my work thus far is public health, e.g. HIV or tuberculosis prevention. For instance, I have developed a series of algorithms which optimize social network interventions for HIV prevention. Two of these algorithms have been pilot-tested in collaboration with LA-area service providers for homeless youth, with preliminary results showing substantial improvement over status-quo approaches. My work also spans other topics in infectious disease prevention and underlying algorithmic questions in robust and risk-aware submodular optimization.
With advances in data collection technologies, tensor data is assuming increasing prominence in many applications, and the problem of supervised tensor learning has emerged as a topic of critical significance in the data mining and machine learning community. Conventional methods for supervised tensor learning mainly focus on learning kernels by flattening the tensor into vectors or matrices; however, structural information within the tensors is then lost. In this paper, we introduce a new scheme to design structure-preserving kernels for supervised tensor learning. Specifically, we demonstrate how to leverage the naturally available structure within the tensorial representation to encode prior knowledge in the kernel. We propose a tensor kernel that can preserve tensor structures based upon dual-tensorial mapping. The dual-tensorial mapping function can map each tensor instance in the input space to another tensor in the feature space while preserving the tensorial structure. Theoretically, our approach is an extension of the conventional kernels in the vector space to tensor space. We apply our novel kernel in conjunction with SVM to real-world tensor classification problems, including brain fMRI classification for three different diseases (i.e., Alzheimer's disease, ADHD, and brain damage by HIV). Extensive empirical studies demonstrate that our proposed approach can effectively boost tensor classification performance, particularly with small sample sizes.
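To make the idea of a structure-preserving kernel concrete, here is a minimal sketch of one common construction: a kernel defined on the CP (rank-one) factors of two tensors rather than on their flattened entries. This is an illustration under assumptions, not necessarily the dual-tensorial mapping of the abstract. It assumes the CP factors are already available, and the names `cp_kernel`, `rbf`, `linear`, and `cp_to_tensor` are hypothetical.

```python
import numpy as np

def rbf(u, v, gamma=0.5):
    """Gaussian RBF kernel between two factor vectors."""
    return float(np.exp(-gamma * np.sum((u - v) ** 2)))

def linear(u, v):
    """Linear kernel between two factor vectors."""
    return float(np.dot(u, v))

def cp_kernel(factors_x, factors_y, base_kernel=rbf):
    """Sum over rank-one component pairs of the product of per-mode kernels.

    factors_x, factors_y: one array per mode, each with shape (dim_n, rank).
    """
    rank_x = factors_x[0].shape[1]
    rank_y = factors_y[0].shape[1]
    total = 0.0
    for r in range(rank_x):
        for s in range(rank_y):
            prod = 1.0
            # Each mode contributes its own kernel value, so mode structure is kept.
            for Fx, Fy in zip(factors_x, factors_y):
                prod *= base_kernel(Fx[:, r], Fy[:, s])
            total += prod
    return total

def cp_to_tensor(factors):
    """Rebuild a 3-mode tensor from its CP factors (used only for checking)."""
    A, B, C = factors
    return np.einsum("ir,jr,kr->ijk", A, B, C)
```

With the linear base kernel, this expression reduces to the ordinary inner product of the two full tensors, which is a convenient sanity check; a nonlinear base kernel such as the RBF is what yields a nonlinear kernel that still respects the per-mode structure.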
The thyroid, an endocrine gland that secretes hormones into the blood, circulates its products to all tissues of the body, where they control vital functions in every cell. Normal levels of thyroid hormone help the brain, heart, intestines, muscles, and reproductive system function normally. Thyroid hormones control the metabolism of the body. Abnormalities of thyroid function are usually related to production of too little thyroid hormone (hypothyroidism) or production of too much thyroid hormone (hyperthyroidism). Therefore, the correct diagnosis of these diseases is a very important topic. In this study, a Linguistic Hedges Neural-Fuzzy Classifier with Selected Features (LHNFCSF) is presented for the diagnosis of thyroid diseases. The performance of this system is evaluated using classification accuracy and k-fold cross-validation. The results indicate that the classification accuracy without feature selection was 98.6047% and 97.6744% during the training and testing phases, respectively, with an RMSE of 0.02335. After applying the feature selection algorithm, LHNFCSF achieved 100% for all cluster sizes during the training phase. However, in the testing phase LHNFCSF achieved 88.3721% using one cluster for each class, 90.6977% using two clusters, 91.8605% using three clusters, and 97.6744% using four clusters for each class and 12 fuzzy rules. The obtained classification accuracy is very promising compared with other classification applications in the literature for this problem.
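The evaluation protocol mentioned above (classification accuracy under k-fold cross-validation) can be sketched as follows. This is a minimal illustration only: the nearest-centroid classifier is a simple stand-in chosen for brevity, not the LHNFCSF model, and the function names, the fold count, and the random seed are assumptions.

```python
import numpy as np

def k_fold_indices(n_samples, k, seed=0):
    """Shuffle sample indices and split them into k nearly equal folds."""
    rng = np.random.default_rng(seed)
    return np.array_split(rng.permutation(n_samples), k)

def nearest_centroid_fit_predict(X_train, y_train, X_test):
    """Stand-in classifier: assign each test point to the closest class mean."""
    classes = np.unique(y_train)
    centroids = np.stack([X_train[y_train == c].mean(axis=0) for c in classes])
    dists = np.linalg.norm(X_test[:, None, :] - centroids[None, :, :], axis=2)
    return classes[np.argmin(dists, axis=1)]

def cross_val_accuracy(X, y, k=5, seed=0):
    """Average test accuracy over k folds, each fold held out once."""
    folds = k_fold_indices(len(X), k, seed)
    scores = []
    for i, test_idx in enumerate(folds):
        train_idx = np.concatenate([f for j, f in enumerate(folds) if j != i])
        pred = nearest_centroid_fit_predict(X[train_idx], y[train_idx], X[test_idx])
        scores.append(np.mean(pred == y[test_idx]))
    return float(np.mean(scores))
```

Reporting the mean over folds, rather than a single train/test split, reduces the dependence of the accuracy figure on one particular partition of a small dataset.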
Attribute-based Controlled Text Generation (CTG) refers to generating sentences that satisfy desirable attributes (e.g., emotions and topics). Existing works often utilize fine-tuning or resort to extra attribute classifiers, yet suffer from increased storage and inference time. To address these concerns, we explore attribute-based CTG in a prompt-based manner. In short, the proposed Tailor represents each attribute as a pre-trained continuous vector (i.e., a single-attribute prompt) and guides a fixed PLM to switch its generation to a pre-specified attribute. We experimentally find that these prompts can simply be concatenated as a whole for multi-attribute CTG without any re-training, yet this raises problems of decreased fluency and position sensitivity. To this end, Tailor provides a multi-attribute prompt mask and a re-indexing position-ids sequence to bridge the gap between the training stage (one prompt for each task) and the testing stage (concatenating more than one prompt). To further enhance such single-attribute prompt combinations, Tailor also introduces a trainable prompt connector, which can be concatenated with any two single-attribute prompts for multi-attribute text generation. Experiments on 11 attribute-specific generation tasks demonstrate the strong performance of Tailor on both single-attribute and multi-attribute CTG, with only 0.08% of the training parameters of GPT-2.
New classes arise frequently in our ever-changing world, e.g., emerging topics in social media and new types of products in e-commerce. A model should recognize new classes while maintaining discriminability over old classes. Under severe circumstances, only limited novel instances are available to incrementally update the model. The task of recognizing few-shot new classes without forgetting old classes is called few-shot class-incremental learning (FSCIL). In this work, we propose a new paradigm for FSCIL based on meta-learning by LearnIng Multi-phase Incremental Tasks (LIMIT), which synthesizes fake FSCIL tasks from the base dataset. The data format of the fake tasks is consistent with that of the 'real' incremental tasks, and we can build a generalizable feature space for the unseen tasks through meta-learning. In addition, LIMIT constructs a calibration module based on a transformer, which calibrates the old class classifiers and new class prototypes into the same scale and fills in the semantic gap. The calibration module also adaptively contextualizes the instance-specific embedding with a set-to-set function. LIMIT efficiently adapts to new classes while resisting forgetting of old classes. Experiments on three benchmark datasets (CIFAR100, miniImageNet, and CUB200) and a large-scale dataset (ImageNet ILSVRC2012) validate that LIMIT achieves state-of-the-art performance.
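The interplay between old class classifiers and new class prototypes can be illustrated with a minimal sketch. It assumes embeddings have already been extracted by some fixed feature extractor, builds each new class prototype as the mean of its few-shot embeddings, and compares queries to all classes by cosine similarity so that both kinds of weights live on the same scale. This is a plain prototype baseline, not LIMIT's transformer-based calibration module; the function names are hypothetical.

```python
import numpy as np

def class_prototypes(embeddings, labels):
    """Average the few-shot embeddings of each new class into one prototype."""
    classes = np.unique(labels)
    protos = np.stack([embeddings[labels == c].mean(axis=0) for c in classes])
    return classes, protos

def cosine_predict(queries, weights, class_ids):
    """Classify by cosine similarity, so vector norms do not favor any class."""
    q = queries / np.linalg.norm(queries, axis=1, keepdims=True)
    w = weights / np.linalg.norm(weights, axis=1, keepdims=True)
    return class_ids[np.argmax(q @ w.T, axis=1)]

# Old classes 0 and 1 keep their trained classifier weights (toy values here).
old_ids = np.array([0, 1])
old_weights = np.array([[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]])

# New class 2 arrives with only two labeled examples.
support = np.array([[0.1, 0.0, 0.9], [0.0, 0.1, 1.1]])
new_ids, new_protos = class_prototypes(support, np.array([2, 2]))

# The joint classifier stacks old weights and new prototypes.
all_ids = np.concatenate([old_ids, new_ids])
all_weights = np.vstack([old_weights, new_protos])
```

Normalizing both queries and weights is a simple way to put classifiers and prototypes on a common scale; the abstract's calibration module addresses the same mismatch with a learned set-to-set function instead.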
Salient object detection on RGB-D images is an active topic in computer vision. Although existing methods have achieved appreciable performance, some challenges remain. The locality of convolutional neural networks means that a model must be sufficiently deep to obtain a global receptive field, which always leads to the loss of local details. To address this challenge, we propose a novel Group Transformer Network (GroupTransNet) for RGB-D salient object detection. This method is good at learning the long-range dependencies of cross-layer features to promote better feature expression. First, the features of the slightly higher classes of the middle three levels and the latter three levels are soft-grouped to absorb the advantages of the high-level features. The input features are repeatedly purified and enhanced by the attention mechanism to refine the cross-modal features of the color and depth modalities. The intermediate features are first fused across different layers and then processed by several transformers in multiple groups, which not only makes the features of each scale unified in size and interrelated, but also shares weights among the features within a group. The output features in different groups complete the clustering staggered by two owing to the level difference, and are combined with the low-level features. Extensive experiments demonstrate that GroupTransNet outperforms the comparison models and achieves new state-of-the-art performance.
Semantic segmentation of fine-resolution urban scene images plays a vital role in extensive practical applications, such as land cover mapping, urban change detection, environmental protection, and economic assessment. Driven by rapid developments in deep learning technologies, convolutional neural networks (CNNs) have dominated the semantic segmentation task for many years. CNNs adopt hierarchical feature representation and have strong local context extraction. However, the local property of the convolution layer limits the network from capturing global information that is crucial for improving fine-resolution image segmentation. Recently, Transformers have become a hot topic in the computer vision domain. The Vision Transformer demonstrates a great capability for global information modelling, boosting many vision tasks, such as image classification, object detection, and especially semantic segmentation. In this paper, we propose an efficient hybrid Transformer (EHT) for semantic segmentation of urban scene images. EHT takes advantage of both CNNs and Transformers, learning global-local context to strengthen the feature representation. Extensive experiments demonstrate that EHT has higher efficiency with competitive accuracy compared with state-of-the-art benchmark methods. Specifically, the proposed EHT achieves 67.0% mIoU on the UAVid test set and significantly outperforms other lightweight models. The code will be available soon.
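For readers unfamiliar with the reported metric, the following is a minimal sketch of how mean intersection over union (mIoU) is commonly computed from a confusion matrix. The function name `mean_iou` is hypothetical, and averaging only over classes that appear in the labels or predictions is an assumption; the exact averaging protocol of the UAVid benchmark may differ.

```python
import numpy as np

def mean_iou(y_true, y_pred, num_classes):
    """Mean IoU over classes that occur in the labels or the predictions."""
    y_true = np.asarray(y_true).ravel()
    y_pred = np.asarray(y_pred).ravel()
    # Confusion matrix: rows are true classes, columns are predicted classes.
    conf = np.bincount(
        num_classes * y_true + y_pred, minlength=num_classes ** 2
    ).reshape(num_classes, num_classes)
    inter = np.diag(conf)
    # Union = true pixels + predicted pixels - pixels counted in both.
    union = conf.sum(axis=1) + conf.sum(axis=0) - inter
    present = union > 0
    return float((inter[present] / union[present]).mean())
```

For example, with true labels `[0, 0, 1, 1]` and predictions `[0, 1, 1, 1]`, class 0 has IoU 1/2 and class 1 has IoU 2/3, so the mIoU is 7/12.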
Online learning is becoming increasingly popular, whether for convenience, to accommodate work hours, or simply to have the freedom to study from anywhere. During the COVID-19 pandemic in particular, it became the only viable option for learning. The effectiveness of teaching various hard-core programming courses with a mix of theoretical content is determined by student interaction and responses. In contrast to a digital lecture through Zoom or Teams, in a physical session a lecturer may rapidly acquire such responses from students' facial expressions, behavior, and attitude, even if the listener is largely idle and non-interactive. Student assessment in virtual learning, however, is a challenging task. Despite the challenges, different technologies are progressively being integrated into teaching environments to boost student engagement and motivation. In this paper, we evaluate the effectiveness of various in-class feedback assessment methods such as Kahoot!, Mentimeter, Padlet, and polling to assist a lecturer in obtaining real-time feedback from students throughout a session and adapting the teaching style accordingly. Furthermore, the topics covered by student suggestions include tutor-related suggestions, enhancements to teaching style, course content, and other subjects. Any input gives the instructor valuable insight into how to improve the students' learning experience; however, manually going through all of the qualitative comments and extracting the ideas is tedious. Thus, in this paper, we propose a sentiment analysis model for extracting the explicit suggestions from the students' qualitative feedback comments.