Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Vincent Christlein

Pattern Recognition Lab, FAU Erlangen-Nürnberg

Gesture Classification in Artworks Using Contextual Image Features

Dec 04, 2024

Azhar Hussian, Mathias Zinnen, Thi My Hang Tran, Andreas Maier, Vincent Christlein

Figure 1 for Gesture Classification in Artworks Using Contextual Image Features

Figure 2 for Gesture Classification in Artworks Using Contextual Image Features

Figure 3 for Gesture Classification in Artworks Using Contextual Image Features

Figure 4 for Gesture Classification in Artworks Using Contextual Image Features

Abstract:Recognizing gestures in artworks can add a valuable dimension to art understanding and help to acknowledge the role of the sense of smell in cultural heritage. We propose a method to recognize smell gestures in historical artworks. We show that combining local features with global image context improves classification performance notably on different backbones.

* Digital Humanities Conference, Arlington, USA, 2024, pp.287-290

Via

Access Paper or Ask Questions

Nuremberg Letterbooks: A Multi-Transcriptional Dataset of Early 15th Century Manuscripts for Document Analysis

Nov 11, 2024

Martin Mayr, Julian Krenz, Katharina Neumeier, Anna Bub, Simon Bürcky, Nina Brolich, Klaus Herbers, Mechthild Habermann, Peter Fleischmann, Andreas Maier(+1 more)

Figure 1 for Nuremberg Letterbooks: A Multi-Transcriptional Dataset of Early 15th Century Manuscripts for Document Analysis

Figure 2 for Nuremberg Letterbooks: A Multi-Transcriptional Dataset of Early 15th Century Manuscripts for Document Analysis

Figure 3 for Nuremberg Letterbooks: A Multi-Transcriptional Dataset of Early 15th Century Manuscripts for Document Analysis

Figure 4 for Nuremberg Letterbooks: A Multi-Transcriptional Dataset of Early 15th Century Manuscripts for Document Analysis

Abstract:Most datasets in the field of document analysis utilize highly standardized labels, which, while simplifying specific tasks, often produce outputs that are not directly applicable to humanities research. In contrast, the Nuremberg Letterbooks dataset, which comprises historical documents from the early 15th century, addresses this gap by providing multiple types of transcriptions and accompanying metadata. This approach allows for developing methods that are more closely aligned with the needs of the humanities. The dataset includes 4 books containing 1711 labeled pages written by 10 scribes. Three types of transcriptions are provided for handwritten text recognition: Basic, diplomatic, and regularized. For the latter two, versions with and without expanded abbreviations are also available. A combination of letter ID and writer ID supports writer identification due to changing writers within pages. In the technical validation, we established baselines for various tasks, demonstrating data consistency and providing benchmarks for future research to build upon.

Via

Access Paper or Ask Questions

Zero-Shot Paragraph-level Handwriting Imitation with Latent Diffusion Models

Sep 01, 2024

Martin Mayr, Marcel Dreier, Florian Kordon, Mathias Seuret, Jochen Zöllner, Fei Wu, Andreas Maier, Vincent Christlein

Figure 1 for Zero-Shot Paragraph-level Handwriting Imitation with Latent Diffusion Models

Figure 2 for Zero-Shot Paragraph-level Handwriting Imitation with Latent Diffusion Models

Figure 3 for Zero-Shot Paragraph-level Handwriting Imitation with Latent Diffusion Models

Figure 4 for Zero-Shot Paragraph-level Handwriting Imitation with Latent Diffusion Models

Abstract:The imitation of cursive handwriting is mainly limited to generating handwritten words or lines. Multiple synthetic outputs must be stitched together to create paragraphs or whole pages, whereby consistency and layout information are lost. To close this gap, we propose a method for imitating handwriting at the paragraph level that also works for unseen writing styles. Therefore, we introduce a modified latent diffusion model that enriches the encoder-decoder mechanism with specialized loss functions that explicitly preserve the style and content. We enhance the attention mechanism of the diffusion model with adaptive 2D positional encoding and the conditioning mechanism to work with two modalities simultaneously: a style image and the target text. This significantly improves the realism of the generated handwriting. Our approach sets a new benchmark in our comprehensive evaluation. It outperforms all existing imitation methods at both line and paragraph levels, considering combined style and content preservation.

Via

Access Paper or Ask Questions

Novel Artistic Scene-Centric Datasets for Effective Transfer Learning in Fragrant Spaces

Jul 16, 2024

Shumei Liu, Haiting Huang, Mathias Zinnen, Andreas Maier, Vincent Christlein

Figure 1 for Novel Artistic Scene-Centric Datasets for Effective Transfer Learning in Fragrant Spaces

Figure 2 for Novel Artistic Scene-Centric Datasets for Effective Transfer Learning in Fragrant Spaces

Figure 3 for Novel Artistic Scene-Centric Datasets for Effective Transfer Learning in Fragrant Spaces

Figure 4 for Novel Artistic Scene-Centric Datasets for Effective Transfer Learning in Fragrant Spaces

Abstract:Olfaction, often overlooked in cultural heritage studies, holds profound significance in shaping human experiences and identities. Examining historical depictions of olfactory scenes can offer valuable insights into the role of smells in history. We show that a transfer-learning approach using weakly labeled training data can remarkably improve the classification of fragrant spaces and, more generally, artistic scene depictions. We fine-tune Places365-pre-trained models by querying two cultural heritage data sources and using the search terms as supervision signal. The models are evaluated on two manually corrected test splits. This work lays a foundation for further exploration of fragrant spaces recognition and artistic scene classification. All images and labels are released as the ArtPlaces dataset at https://zenodo.org/doi/10.5281/zenodo.11584328.

Via

Access Paper or Ask Questions

Smell and Emotion: Recognising emotions in smell-related artworks

Jul 05, 2024

Vishal Patoliya, Mathias Zinnen, Andreas Maier, Vincent Christlein

Figure 1 for Smell and Emotion: Recognising emotions in smell-related artworks

Figure 2 for Smell and Emotion: Recognising emotions in smell-related artworks

Figure 3 for Smell and Emotion: Recognising emotions in smell-related artworks

Figure 4 for Smell and Emotion: Recognising emotions in smell-related artworks

Abstract:Emotions and smell are underrepresented in digital art history. In this exploratory work, we show that recognising emotions from smell-related artworks is technically feasible but has room for improvement. Using style transfer and hyperparameter optimization we achieve a minor performance boost and open up the field for future extensions.

* 5 pages, 3 figures

Via

Access Paper or Ask Questions

Offline Writer Identification Using Convolutional Neural Network Activation Features

Feb 26, 2024

Vincent Christlein, David Bernecker, Andreas Maier, Elli Angelopoulou

Abstract:Convolutional neural networks (CNNs) have recently become the state-of-the-art tool for large-scale image classification. In this work we propose the use of activation features from CNNs as local descriptors for writer identification. A global descriptor is then formed by means of GMM supervector encoding, which is further improved by normalization with the KL-Kernel. We evaluate our method on two publicly available datasets: the ICDAR 2013 benchmark database and the CVL dataset. While we perform comparably to the state of the art on CVL, our proposed method yields about 0.21 absolute improvement in terms of mAP on the challenging bilingual ICDAR dataset.

* Pattern Recognition. DAGM 2015. Lecture Notes in Computer Science(), vol 9358. Springer, Cham
* fixed tab 1b

Via

Access Paper or Ask Questions

ARIN: Adaptive Resampling and Instance Normalization for Robust Blind Inpainting of Dunhuang Cave Paintings

Feb 25, 2024

Alexander Schmidt, Prathmesh Madhu, Andreas Maier, Vincent Christlein, Ronak Kosti

Abstract:Image enhancement algorithms are very useful for real world computer vision tasks where image resolution is often physically limited by the sensor size. While state-of-the-art deep neural networks show impressive results for image enhancement, they often struggle to enhance real-world images. In this work, we tackle a real-world setting: inpainting of images from Dunhuang caves. The Dunhuang dataset consists of murals, half of which suffer from corrosion and aging. These murals feature a range of rich content, such as Buddha statues, bodhisattvas, sponsors, architecture, dance, music, and decorative patterns designed by different artists spanning ten centuries, which makes manual restoration challenging. We modify two different existing methods (CAR, HINet) that are based upon state-of-the-art (SOTA) super resolution and deblurring networks. We show that those can successfully inpaint and enhance these deteriorated cave paintings. We further show that a novel combination of CAR and HINet, resulting in our proposed inpainting network (ARIN), is very robust to external noise, especially Gaussian noise. To this end, we present a quantitative and qualitative comparison of our proposed approach with existing SOTA networks and winners of the Dunhuang challenge. One of the proposed methods HINet) represents the new state of the art and outperforms the 1st place of the Dunhuang Challenge, while our combination ARIN, which is robust to noise, is comparable to the 1st place. We also present and discuss qualitative results showing the impact of our method for inpainting on Dunhuang cave images.

* 2022 Eleventh International Conference on Image Processing Theory, Tools and Applications (IPTA), Salzburg, Austria, 2022, pp. 1-6

Via

Access Paper or Ask Questions

A Fair Evaluation of Various Deep Learning-Based Document Image Binarization Approaches

Jan 22, 2024

Richin Sukesh, Mathias Seuret, Anguelos Nicolaou, Martin Mayr, Vincent Christlein

Abstract:Binarization of document images is an important pre-processing step in the field of document analysis. Traditional image binarization techniques usually rely on histograms or local statistics to identify a valid threshold to differentiate between different aspects of the image. Deep learning techniques are able to generate binarized versions of the images by learning context-dependent features that are less error-prone to degradation typically occurring in document images. In recent years, many deep learning-based methods have been developed for document binarization. But which one to choose? There have been no studies that compare these methods rigorously. Therefore, this work focuses on the evaluation of different deep learning-based methods under the same evaluation protocol. We evaluate them on different Document Image Binarization Contest (DIBCO) datasets and obtain very heterogeneous results. We show that the DE-GAN model was able to perform better compared to other models when evaluated on the DIBCO2013 dataset while DP-LinkNet performed best on the DIBCO2017 dataset. The 2-StageGAN performed best on the DIBCO2018 dataset while SauvolaNet outperformed the others on the DIBCO2019 challenge. Finally, we make the code, all models and evaluation publicly available (https://github.com/RichSu95/Document_Binarization_Collection) to ensure reproducibility and simplify future binarization evaluations.

* Document Analysis Systems. DAS 2022. Lecture Notes in Computer Science, vol 13237. Springer, Cham
* DAS 2022

Via

Access Paper or Ask Questions

SniffyArt: The Dataset of Smelling Persons

Nov 20, 2023

Mathias Zinnen, Azhar Hussian, Hang Tran, Prathmesh Madhu, Andreas Maier, Vincent Christlein

Figure 1 for SniffyArt: The Dataset of Smelling Persons

Figure 2 for SniffyArt: The Dataset of Smelling Persons

Figure 3 for SniffyArt: The Dataset of Smelling Persons

Figure 4 for SniffyArt: The Dataset of Smelling Persons

Abstract:Smell gestures play a crucial role in the investigation of past smells in the visual arts yet their automated recognition poses significant challenges. This paper introduces the SniffyArt dataset, consisting of 1941 individuals represented in 441 historical artworks. Each person is annotated with a tightly fitting bounding box, 17 pose keypoints, and a gesture label. By integrating these annotations, the dataset enables the development of hybrid classification approaches for smell gesture recognition. The datasets high-quality human pose estimation keypoints are achieved through the merging of five separate sets of keypoint annotations per person. The paper also presents a baseline analysis, evaluating the performance of representative algorithms for detection, keypoint estimation, and classification tasks, showcasing the potential of combining keypoint estimation with smell gesture classification. The SniffyArt dataset lays a solid foundation for future research and the exploration of multi-task approaches leveraging pose keypoints and person boxes to advance human gesture and olfactory dimension analysis in historical artworks.

* Proceedings of the 5th Workshop on analySis, Understanding and proMotion of heritAge Contents. 2023. S. 49-58
* 10 pages, 8 figures

Via

Access Paper or Ask Questions

Efficient Annotation of Medieval Charters

Jun 24, 2023

Anguelos Nicolaou, Daniel Luger, Franziska Decker, Nicolas Renet, Vincent Christlein, Georg Vogeler

Abstract:Diplomatics, the analysis of medieval charters, is a major field of research in which paleography is applied. Annotating data, if performed by laymen, needs validation and correction by experts. In this paper, we propose an effective and efficient annotation approach for charter segmentation, essentially reducing it to object detection. This approach allows for a much more efficient use of the paleographer's time and produces results that can compete and even outperform pixel-level segmentation in some use cases. Further experiments shed light on how to design a class ontology in order to make the best use of annotators' time and effort. Exploiting the presence of calibration cards in the image, we further annotate the data with the physical length in pixels and train regression neural networks to predict it from image patches.

Via

Access Paper or Ask Questions