
"photo": models, code, and papers

Automatic Photo Adjustment Using Deep Neural Networks

May 16, 2015
Zhicheng Yan, Hao Zhang, Baoyuan Wang, Sylvain Paris, Yizhou Yu

Photo retouching enables photographers to create dramatic visual impressions by artistically enhancing their photos through stylistic color and tone adjustments. However, it is also a time-consuming and challenging task that requires advanced skills beyond the abilities of casual photographers. Using an automated algorithm is an appealing alternative to manual work, but such an algorithm faces many hurdles. Many photographic styles rely on subtle adjustments that depend on the image content and even its semantics. Further, these adjustments are often spatially varying. Because of these characteristics, existing automatic algorithms are still limited and cover only a subset of these challenges. Recently, deep machine learning has shown unique abilities to address hard problems that have long resisted machine algorithms. This motivated us to explore the use of deep learning in the context of photo editing. In this paper, we explain how to formulate the automatic photo adjustment problem in a way suitable for this approach. We also introduce an image descriptor that accounts for the local semantics of an image. Our experiments demonstrate that our deep learning formulation, applied using these descriptors, successfully captures sophisticated photographic styles. In particular, and unlike previous techniques, it can model local adjustments that depend on the image semantics. We show on several examples that this yields results that are qualitatively and quantitatively better than previous work.
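
A minimal sketch of this kind of formulation, as we read the abstract: a per-pixel regressor maps a feature descriptor (color plus contextual and semantic features) to a local color transform. The descriptor size, layer widths, and the quadratic-basis parameterization below are our assumptions, not the paper's exact architecture.

```python
import torch
import torch.nn as nn

DESCRIPTOR_DIM = 207   # hypothetical: color + contextual + semantic features

class AdjustmentMLP(nn.Module):
    """Regresses a per-pixel color transform from a local descriptor."""
    def __init__(self, in_dim=DESCRIPTOR_DIM, hidden=192, out_dim=30):
        super().__init__()
        # out_dim=30: one 3x10 matrix per pixel, acting on a quadratic
        # color basis -- one plausible parameterization of the adjustment.
        self.net = nn.Sequential(
            nn.Linear(in_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, out_dim),
        )

    def forward(self, descriptors):        # (N, in_dim) per-pixel features
        coeffs = self.net(descriptors)     # (N, 30)
        return coeffs.view(-1, 3, 10)      # per-pixel 3x10 color transform

def apply_transform(coeffs, quad_basis):
    # quad_basis: (N, 10) expansion [r, g, b, r^2, g^2, b^2, rg, rb, gb, 1]
    return torch.bmm(coeffs, quad_basis.unsqueeze(-1)).squeeze(-1)  # (N, 3)
```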

* TOG minor revision 
  

Attribute-Centered Loss for Soft-Biometrics Guided Face Sketch-Photo Recognition

Apr 09, 2018
Hadi Kazemi, Sobhan Soleymani, Ali Dabouei, Mehdi Iranmanesh, Nasser M. Nasrabadi

Face sketches capture the spatial topology of a face while lacking facial attributes such as race, skin, or hair color. Existing sketch-photo recognition approaches have mostly ignored the importance of facial attributes. In this paper, we propose a new loss function, called attribute-centered loss, to train a Deep Coupled Convolutional Neural Network (DCCNN) for facial-attribute-guided sketch-to-photo matching. Specifically, the attribute-centered loss learns several distinct centers, in a shared embedding space, for photos and sketches with different combinations of attributes. The DCCNN is simultaneously trained to map photos, and pairs of testified attributes and corresponding forensic sketches, around their associated centers while preserving the spatial topology information. Importantly, the centers learn to keep a relative distance from each other that is related to the number of contradictory attributes between them. Extensive experiments are performed on composite (E-PRIP) and semi-forensic (IIIT-D Semi-forensic) databases. The proposed method significantly outperforms the state-of-the-art.
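
A minimal sketch of an attribute-centered loss as the abstract describes it. The margin, distance measures, and center parameterization are our assumptions, not the paper's exact formulation: embeddings are pulled toward the center of their attribute combination, and centers are pushed apart in proportion to how many attributes contradict between two combinations.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class AttributeCenteredLoss(nn.Module):
    def __init__(self, attr_combos, embed_dim, margin=1.0):
        super().__init__()
        # one learnable center per attribute combination, shared by the
        # photo and sketch branches of the coupled network
        self.centers = nn.Parameter(torch.randn(attr_combos.shape[0], embed_dim))
        self.register_buffer("attr_combos", attr_combos)  # (C, num_attrs), 0/1
        self.margin = margin

    def forward(self, embeddings, combo_ids):
        # pull each embedding toward its attribute-combination center
        pull = F.mse_loss(embeddings, self.centers[combo_ids])
        # push centers apart; the target distance grows with the number of
        # contradictory attributes between the two combinations
        diff = (self.attr_combos.unsqueeze(1) != self.attr_combos.unsqueeze(0))
        target = self.margin * diff.float().sum(-1)       # (C, C)
        dist = torch.cdist(self.centers, self.centers)    # (C, C)
        push = F.relu(target - dist).mean()
        return pull + push
```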

* Accepted as a conference paper at CVPRW 2018 
  

Moiré Photo Restoration Using Multiresolution Convolutional Neural Networks

May 08, 2018
Yujing Sun, Yizhou Yu, Wenping Wang

Digital cameras and mobile phones enable us to conveniently record precious moments. While digital image quality is constantly being improved, taking high-quality photos of digital screens remains challenging because the photos are often contaminated with moiré patterns, a result of the interference between the pixel grids of the camera sensor and the device screen. Moiré patterns can severely damage the visual quality of photos. However, few studies have aimed to solve this problem. In this paper, we introduce a novel multiresolution fully convolutional network for automatically removing moiré patterns from photos. Since a moiré pattern spans a wide range of frequencies, our proposed network performs a nonlinear multiresolution analysis of the input image before computing how to cancel moiré artefacts within every frequency band. We also create a large-scale benchmark dataset with 100,000+ image pairs for investigating and evaluating moiré pattern removal algorithms. Our network achieves state-of-the-art performance on this dataset in comparison to existing learning architectures for image restoration problems.
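
A minimal sketch of a multiresolution fully convolutional restoration network in the spirit of the abstract (not the authors' exact architecture): each branch sees the image at a coarser scale, i.e., a different frequency band, predicts a residual correction, and the upsampled residuals are summed. Depths and widths below are our assumptions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class MultiresDemoire(nn.Module):
    def __init__(self, scales=3, width=32):
        super().__init__()
        def branch():
            return nn.Sequential(
                nn.Conv2d(3, width, 3, padding=1), nn.ReLU(),
                nn.Conv2d(width, width, 3, padding=1), nn.ReLU(),
                nn.Conv2d(width, 3, 3, padding=1),
            )
        self.branches = nn.ModuleList(branch() for _ in range(scales))

    def forward(self, x):
        out = 0
        for s, branch in enumerate(self.branches):
            xs = F.avg_pool2d(x, 2 ** s) if s > 0 else x   # coarser scale
            r = branch(xs)                                  # per-band residual
            if s > 0:                                       # back to full size
                r = F.interpolate(r, size=x.shape[-2:],
                                  mode="bilinear", align_corners=False)
            out = out + r
        return x + out   # moiré-free estimate = input + summed corrections
```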

* 13 pages, 19 figures, accepted to appear in IEEE Transactions on Image Processing 
  

Photo Aesthetics Ranking Network with Attributes and Content Adaptation

Jul 27, 2016
Shu Kong, Xiaohui Shen, Zhe Lin, Radomir Mech, Charless Fowlkes

Real-world applications could benefit from the ability to automatically generate a fine-grained ranking of photo aesthetics. However, previous methods for image aesthetics analysis have primarily focused on the coarse, binary categorization of images into high- or low-aesthetic categories. In this work, we propose to learn a deep convolutional neural network to rank photo aesthetics, in which the relative ranking of photo aesthetics is directly modeled in the loss function. Our model incorporates joint learning of meaningful photographic attributes and image content information, which helps regularize the complicated photo aesthetics rating problem. To train and analyze this model, we have assembled a new aesthetics and attributes database (AADB) which contains aesthetic scores and meaningful attributes assigned to each image by multiple human raters. Anonymized rater identities are recorded across images, allowing us to exploit intra-rater consistency using a novel sampling strategy when computing the ranking loss of training image pairs. We show the proposed sampling strategy is effective and robust in the face of the subjective aesthetic judgments of individuals with different tastes. Experiments demonstrate that our unified model can generate aesthetic rankings that are more consistent with human ratings. To further validate our model, we show that by simply thresholding the estimated aesthetic scores, we achieve state-of-the-art classification performance on the existing AVA dataset benchmark.
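
A minimal sketch of the pairwise ranking idea described above: the network predicts a scalar aesthetic score per image, and a margin ranking loss enforces that the higher-rated image of each sampled pair scores higher. The margin value is our assumption, and the paper's additional regression and attribute terms are omitted here.

```python
import torch
import torch.nn.functional as F

def aesthetic_ranking_loss(score_hi, score_lo, margin=0.1):
    # score_hi / score_lo: (B,) predicted scores for the preferred and
    # less-preferred image of each training pair (e.g., pairs sampled
    # from the same rater to exploit intra-rater consistency)
    target = torch.ones_like(score_hi)   # "first argument should rank higher"
    return F.margin_ranking_loss(score_hi, score_lo, target, margin=margin)
```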

  

User Preferences Modeling and Learning for Pleasing Photo Collage Generation

Jun 01, 2015
Simone Bianco, Gianluigi Ciocca

In this paper we consider how to automatically create pleasing photo collages by placing a set of images on a limited canvas area. The task is formulated as an optimization problem. Unlike existing state-of-the-art approaches, we exploit subjective experiments to model and learn pleasantness from user preferences. To this end, we design an experimental framework for identifying the criteria that need to be taken into account to generate a pleasing photo collage. Five thematic photo datasets are used to create collages using state-of-the-art criteria. A first subjective experiment, in which several subjects evaluated the collages, emphasizes that different criteria are involved in the subjective definition of pleasantness. We then identify new global and local criteria and design algorithms to quantify them. The relative importance of these criteria is automatically learned by exploiting the user preferences, and new collages are generated. To validate our framework, we performed several psycho-visual experiments involving different users. The results show that the proposed framework learns a novel computational model which effectively encodes an inter-user definition of pleasantness. The learned definition of pleasantness generalizes well to new photo datasets of different themes and sizes not used during learning. Moreover, compared with two state-of-the-art approaches, the collages created using our framework are preferred by the majority of the users.
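
A minimal sketch of learning the relative importance of collage criteria from pairwise user preferences, here via a Bradley-Terry-style logistic model; the paper's actual learning scheme may differ, and the function and variable names below are illustrative.

```python
import numpy as np

def learn_criteria_weights(features_a, features_b, prefers_a,
                           lr=0.1, epochs=200):
    # features_a / features_b: (N, K) criterion scores of the two collages
    # in each compared pair; prefers_a: (N,) 1.0 if A was preferred, else 0.0
    w = np.zeros(features_a.shape[1])
    for _ in range(epochs):
        diff = features_a - features_b              # (N, K)
        p = 1.0 / (1.0 + np.exp(-diff @ w))         # P(A preferred | w)
        grad = diff.T @ (p - prefers_a) / len(p)    # logistic-loss gradient
        w -= lr * grad
    return w   # higher weight = criterion matters more for pleasantness
```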

* To be published in ACM Transactions on Multimedia Computing, Communications, and Applications (TOMM) 
  

IMAGO: A family photo album dataset for a socio-historical analysis of the twentieth century

Dec 03, 2020
Lorenzo Stacchio, Alessia Angeli, Giuseppe Lisanti, Daniela Calanca, Gustavo Marfia

Although family photo albums have been one of the most popular practices in photography since the end of the 19th century, scholarly interest in them dates back only to the early 1980s. Such collections of photos may reveal sociological and historical insights regarding specific cultures and times. They are, however, in most cases scattered among private homes and only available on paper or photographic film, making their analysis by academics such as historians, social-cultural anthropologists, and cultural theorists very cumbersome. In this paper, we analyze the IMAGO dataset, which includes photos belonging to family albums assembled at the University of Bologna's Rimini campus since 2004. Following a deep learning-based approach, the IMAGO dataset has offered the opportunity to experiment with photos taken between 1845 and 2009, with the goal of assessing the dates and socio-historical contexts of the images without the use of any other sources of information. Exceeding our initial expectations, this analysis has revealed its merit not only in terms of the performance of the adopted approach, but also in terms of the foreseeable implications and uses for the benefit of socio-historical research. To the best of our knowledge, this is the first work in the literature that moves along this path.

  

Face sketch to photo translation using generative adversarial networks

Oct 23, 2021
Nastaran Moradzadeh Farid, Maryam Saeedi Fard, Ahmad Nickabadi

Translating face sketches to photo-realistic faces is an interesting and essential task in many applications such as law enforcement and the digital entertainment industry. One of the most important challenges of this task is the inherent difference between the sketch and the real image, such as the lack of color and skin-tissue detail in the sketch. With the advent of adversarial generative models, an increasing number of methods have been proposed for sketch-to-image synthesis. However, these models still suffer from limitations such as the large amount of paired data required for training, the low resolution of the produced images, or the unrealistic appearance of the generated images. In this paper, we propose a method for converting an input facial sketch to a colorful photo without the need for any paired dataset. To do so, we use a pre-trained face photo generating model to synthesize high-quality natural face photos and employ an optimization procedure to maintain high fidelity to the input sketch. We train a network to map the facial features extracted from the input sketch to a vector in the latent space of the face generating model. We also study different optimization criteria and compare the results of the proposed model with those of state-of-the-art models quantitatively and qualitatively. The proposed model achieves an SSIM of 0.655 and a 97.59% rank-1 face recognition rate, with higher-quality produced images.
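
A minimal sketch of the fidelity-preserving optimization the abstract describes: starting from a latent code predicted from the sketch, the code is refined so that a frozen, pre-trained face generator reproduces the sketch's structure. Here `generator` is a stand-in for any pre-trained face model, and `sketch_loss` (comparing, say, edges of the generated photo against the sketch) is an assumed helper.

```python
import torch

def refine_latent(generator, z_init, sketch, sketch_loss,
                  steps=200, lr=0.01):
    # optimize only the latent code; the generator stays frozen
    z = z_init.clone().requires_grad_(True)
    opt = torch.optim.Adam([z], lr=lr)
    for _ in range(steps):
        opt.zero_grad()
        photo = generator(z)                 # pre-trained face generator
        loss = sketch_loss(photo, sketch)    # fidelity to the input sketch
        loss.backward()
        opt.step()
    return z.detach()
```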

  

PETA: Photo Albums Event Recognition using Transformers Attention

Sep 26, 2021
Tamar Glaser, Emanuel Ben-Baruch, Gilad Sharir, Nadav Zamir, Asaf Noy, Lihi Zelnik-Manor

In recent years, the number of personal photos captured has increased significantly, giving rise to new challenges in multi-image and high-level image understanding. Event recognition in personal photo albums presents one challenging scenario, where life events must be recognized from a disordered collection of images that includes both relevant and irrelevant ones. Event recognition in images also presents the challenge of high-level image understanding, as opposed to low-level object classification. In the absence of methods for analyzing multiple inputs, previous approaches adopted temporal mechanisms, including various forms of recurrent neural networks. However, their effective temporal window is local, and they are not a natural choice given the disordered nature of photo albums. We address this gap with a tailor-made solution, combining the power of CNNs for image representation with transformers for album representation to perform global reasoning on the image collection, offering a practical and efficient solution for photo album event recognition. Our solution reaches state-of-the-art results on 3 prominent benchmarks, achieving above 90% mAP on all datasets. We further explore the related image-importance task in event recognition, demonstrating how the learned attention correlates with human-annotated importance for this subjective task, thus opening the door to new applications.
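
A minimal sketch of the CNN-plus-transformer album pipeline the abstract outlines: per-image CNN embeddings are aggregated order-agnostically by a transformer encoder (no positional encoding, matching the disordered nature of albums) and pooled for album-level event classification. The backbone choice, sizes, and the number of event classes below are our assumptions.

```python
import torch
import torch.nn as nn

class AlbumEventModel(nn.Module):
    def __init__(self, img_dim=512, num_events=23, heads=8, layers=2):
        super().__init__()
        enc = nn.TransformerEncoderLayer(d_model=img_dim, nhead=heads,
                                         batch_first=True)
        self.encoder = nn.TransformerEncoder(enc, num_layers=layers)
        self.head = nn.Linear(img_dim, num_events)

    def forward(self, img_feats):      # (B, num_imgs, img_dim) CNN features
        ctx = self.encoder(img_feats)  # global reasoning across the album
        album = ctx.mean(dim=1)        # permutation-invariant pooling
        return self.head(album)        # event logits per album
```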

* 8 pages (10 including references), 3 figures, submitted to WACV 2022 
  

Skyline variations allow estimating distance to trees on landscape photos using semantic segmentation

Jan 14, 2022
Laura Martinez-Sanchez, Daniele Borio, Raphaël d'Andrimont, Marijn van der Velde

Approximate distance estimation can be used to determine fundamental landscape properties, including complexity and openness. We show that variations in the skyline of landscape photos can be used to estimate distances to trees on the horizon. A methodology based on skyline variations was developed and used to investigate potential relationships with the distance to skyline objects. The skyline signal, defined by the skyline height expressed in pixels, was extracted for several Land Use/Cover Area frame Survey (LUCAS) landscape photos. Photos were semantically segmented with DeepLabV3+ trained on the Common Objects in Context (COCO) dataset, providing pixel-level classification of the objects forming the skyline. A Conditional Random Fields (CRF) algorithm was also applied to increase the detail of the skyline signal. Three metrics able to capture the skyline signal variations were then considered for the analysis. These metrics show a functional relationship with distance for the class of trees, whose contours have a fractal nature. In particular, regression analysis was performed against 475 ortho-photo-based distance measurements, and, in the best case, an R2 score of 0.47 was achieved. This is an encouraging result that shows the potential of skyline variation metrics for inferring distance-related information.
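
A minimal sketch of extracting the skyline signal from a semantic segmentation map: for each image column, find the first non-sky pixel from the top and record the resulting height in pixels. The roughness metric below is one simple example of a variation metric; the paper's three metrics may differ in detail.

```python
import numpy as np

def skyline_signal(seg_map, sky_label):
    # seg_map: (H, W) integer class labels; sky is assumed to be at the top
    H, W = seg_map.shape
    non_sky = seg_map != sky_label           # (H, W) boolean mask
    first_obj = np.argmax(non_sky, axis=0)   # first non-sky row per column
    all_sky = ~non_sky.any(axis=0)
    first_obj[all_sky] = H                   # column contains only sky
    return H - first_obj                     # skyline height in pixels

def skyline_roughness(signal):
    # one simple variation metric on the 1-D skyline signal
    return np.std(np.diff(signal.astype(float)))
```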

  

Photo Filter Recommendation by Category-Aware Aesthetic Learning

Mar 27, 2017
Wei-Tse Sun, Ting-Hsuan Chao, Yin-Hsi Kuo, Winston H. Hsu

Nowadays, social media has become a popular platform for the public to share photos. To make photos more visually appealing, users usually apply filters to their photos without domain knowledge. However, due to the growing number of filter types, choosing the best filter has become a major issue for users. For this purpose, filter recommendation for photo aesthetics plays an important role in image quality ranking problems. In recent years, several works have shown that Convolutional Neural Networks (CNNs) outperform traditional methods in image aesthetic categorization, which classifies images into high or low quality. Most of them do not consider the effect of filters on images; hence, we propose a novel image aesthetic learning method for filter recommendation. Instead of binarizing image quality, we adapt state-of-the-art CNN architectures and design a pairwise loss function to learn the embedded aesthetic responses in hidden layers for filtered images. Based on our pilot study, we observe that image categories (e.g., portrait, landscape, food) affect user preference in filter selection. We further integrate category classification into our proposed aesthetic-oriented models. To the best of our knowledge, there is no public dataset for aesthetic judgment of filtered images. We therefore create a new dataset called the Filter Aesthetic Comparison Dataset (FACD). It contains 28,160 filtered images based on the AVA dataset and 42,240 reliable image pairs with aesthetic annotations collected using Amazon Mechanical Turk. It is the first dataset containing filtered images and user preference labels. We conduct experiments on the collected FACD for filter recommendation, and the results show that our proposed category-aware aesthetic learning outperforms aesthetic classification methods (e.g., by a 12% relative improvement).
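
A minimal sketch of the joint objective this abstract suggests: a pairwise hinge on the aesthetic scores of two filtered versions of the same photo, plus a category classification term. The margin, the weighting, and the exact loss forms are our assumptions, not the paper's formulation.

```python
import torch
import torch.nn.functional as F

def filter_pairwise_loss(score_better, score_worse,
                         cat_logits, cat_labels,
                         margin=0.2, cat_weight=0.5):
    # score_better / score_worse: (B,) aesthetic scores of the preferred
    # and less-preferred filtered version of each photo pair
    rank = F.relu(margin - (score_better - score_worse)).mean()
    # category-aware term: classify the photo's content category
    cat = F.cross_entropy(cat_logits, cat_labels)
    return rank + cat_weight * cat
```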

* 11 pages, 7 figures 
  