Abstract: Continual learning is a challenging problem in machine learning, especially for image classification tasks with imbalanced datasets, and it becomes harder still when new classes must be learned incrementally. One approach to class-incremental learning that also addresses dataset imbalance is rehearsal using previously stored data. Rehearsal-based methods require access to previous data to train either the classifier or the generator, which may not be feasible due to storage, legal, or data access constraints. Many rehearsal-free alternatives exist, such as parameter or loss regularization, knowledge distillation, and dynamic architectures, but they do not consistently achieve good results, especially on imbalanced data. This paper proposes Data-Free Generative Replay (DFGR), a new approach to class-incremental learning in which the generator is trained without access to real data. DFGR also addresses dataset imbalance in continual learning of an image classifier. Instead of using training data, DFGR trains a generator using mean and variance statistics of batch-norm and feature maps derived from a pre-trained classification model. Our experiments demonstrate that DFGR performs significantly better than other data-free methods and reveal the performance impact of specific parameter settings. DFGR achieves up to 88.5% and 46.6% accuracy on the MNIST and FashionMNIST datasets, respectively. Our code is available at https://github.com/2younis/DFGR
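To make the idea of training from batch-norm statistics more concrete, the following is a minimal sketch of a statistic-matching loss. It is not taken from the DFGR code: the function names `channel_stats` and `bn_matching_loss`, the nested-list data layout, and the unweighted sum of the two terms are assumptions chosen for illustration. The loss penalizes the squared distance between the per-channel mean and variance of generated features and the running statistics that a pre-trained model stores in a batch-norm layer.

```python
from statistics import fmean, pvariance


def channel_stats(batch):
    """Return per-channel means and (biased) variances of a batch.

    Assumed layout: `batch` is a list of samples, each sample is a list
    of channels, and each channel is a flat list of activations.
    """
    num_channels = len(batch[0])
    means, variances = [], []
    for c in range(num_channels):
        # Pool the activations of channel c over all samples and positions.
        values = [v for sample in batch for v in sample[c]]
        means.append(fmean(values))
        variances.append(pvariance(values))
    return means, variances


def bn_matching_loss(batch, running_mean, running_var):
    """Squared distance between batch statistics and stored statistics.

    `running_mean` and `running_var` stand in for the values a
    pre-trained batch-norm layer would hold (hypothetical inputs here).
    """
    means, variances = channel_stats(batch)
    mean_term = sum((m - rm) ** 2 for m, rm in zip(means, running_mean))
    var_term = sum((v - rv) ** 2 for v, rv in zip(variances, running_var))
    return mean_term + var_term


if __name__ == "__main__":
    # Two samples, one channel, two activations each (toy values).
    toy_batch = [[[1.0, 3.0]], [[5.0, 7.0]]]
    print(bn_matching_loss(toy_batch, running_mean=[3.0], running_var=[5.0]))
```

In an actual training setup, a loss of this kind would presumably be summed over several layers and minimized with respect to the generator's parameters using an automatic differentiation framework; the pure-Python version here only illustrates the quantity being matched.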




Abstract: As herbarium specimens are increasingly digitized and made accessible in online repositories, advanced computer vision techniques are being used to extract information from them. The presence of certain plant organs on herbarium sheets is useful information in various scientific contexts, and automatic recognition of these organs will help mobilize such information. In our study, we use deep learning to detect plant organs on digitized herbarium specimens with Faster R-CNN. For our experiment, we manually annotated hundreds of herbarium scans with thousands of bounding boxes for six types of plant organs and used them for training and evaluating the plant organ detection model. The model performed particularly well on leaves and stems; flowers, although also present in large numbers on the sheets, were not recognized equally well.