"photo": models, code, and papers

TrueImage: A Machine Learning Algorithm to Improve the Quality of Telehealth Photos

Oct 01, 2020
Kailas Vodrahalli, Roxana Daneshjou, Roberto A Novoa, Albert Chiou, Justin M Ko, James Zou

Telehealth is an increasingly critical component of the health care ecosystem, especially due to the COVID-19 pandemic. Rapid adoption of telehealth has exposed limitations in the existing infrastructure. In this paper, we study and highlight photo quality as a major challenge in the telehealth workflow. We focus on teledermatology, where photo quality is particularly important; the framework proposed here can be generalized to other health domains. For telemedicine, dermatologists request that patients submit images of their lesions for assessment. However, these images are often of insufficient quality to make a clinical diagnosis since patients do not have experience taking clinical photos. A clinician has to manually triage poor quality images and request new images to be submitted, leading to wasted time for both the clinician and the patient. We propose an automated image assessment machine learning pipeline, TrueImage, to detect poor quality dermatology photos and to guide patients in taking better photos. Our experiments indicate that TrueImage can reject 50% of the sub-par quality images, while retaining 80% of good quality images patients send in, despite heterogeneity and limitations in the training data. These promising results suggest that our solution is feasible and can improve the quality of teledermatology care.

* 12 pages, 5 figures. Preprint of an article published in Pacific Symposium on Biocomputing © 2020 World Scientific Publishing Co., Singapore
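The paper's learned quality classifier is not reproduced here, but the triage idea it describes can be sketched with a classical stand-in: score sharpness as the variance of a discrete Laplacian response and flag low-scoring photos for retake. All function names and the threshold below are hypothetical, not from the paper.

```python
import numpy as np

def laplacian_variance(image: np.ndarray) -> float:
    """Score sharpness as the variance of a 4-neighbour discrete Laplacian.

    Low variance suggests a blurry photo. This heuristic is only a
    stand-in for the learned quality classifier described in the paper.
    """
    lap = (-4.0 * image[1:-1, 1:-1]
           + image[:-2, 1:-1] + image[2:, 1:-1]
           + image[1:-1, :-2] + image[1:-1, 2:])
    return float(lap.var())

def triage(image: np.ndarray, threshold: float = 0.01) -> str:
    """Flag photos whose sharpness score falls below a threshold."""
    return "ok" if laplacian_variance(image) >= threshold else "retake"
```

A flat (featureless, likely blurry) grayscale array is rejected, while a high-contrast one passes; a real system would combine several such cues (blur, lighting, framing) as the paper's pipeline does.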

Aesthetic Features for Personalized Photo Recommendation

Aug 31, 2018
Yu Qing Zhou, Ga Wu, Scott Sanner, Putra Manggala

Many photography websites such as Flickr, 500px, Unsplash, and Adobe Behance are used by amateur and professional photography enthusiasts. Unlike content-based image search, such users of photography websites are not just looking for photos with certain content, but more generally for photos with a certain photographic "aesthetic". In this context, we explore personalized photo recommendation and propose two aesthetic feature extraction methods based on (i) color space and (ii) deep style transfer embeddings. Using a dataset from 500px, we evaluate how these features can be best leveraged by collaborative filtering methods and show that (ii) provides a significant boost in photo recommendation performance.

* In Proceedings of the Late-Breaking Results track part of the Twelfth ACM Conference on Recommender Systems, Vancouver, BC, Canada, October 6, 2018, 2 pages 
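Feature (i) above is color-space based. As a rough illustrative sketch (not the paper's exact feature), one could build a per-photo descriptor from per-channel color histograms and compare photos by cosine similarity, which item-based collaborative filtering can then consume. Names and bin counts are assumptions.

```python
import numpy as np

def color_histogram_feature(image: np.ndarray, bins: int = 8) -> np.ndarray:
    """Per-channel intensity histograms, concatenated and L2-normalised.

    A crude stand-in for colour-space aesthetic features; the paper's
    stronger variant uses deep style-transfer embeddings, not shown here.
    """
    feats = [np.histogram(image[..., c], bins=bins, range=(0.0, 1.0))[0]
             for c in range(image.shape[-1])]
    v = np.concatenate(feats).astype(float)
    return v / (np.linalg.norm(v) + 1e-12)

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    """Cosine similarity of two already-normalised feature vectors."""
    return float(a @ b)
```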

Unmanned Aerial Vehicle Instrumentation for Rapid Aerial Photo System

Apr 24, 2008
Widyawardana Adiprawita, Adang Suwandi Ahmad, Jaka Semibiring

This research proposes a new kind of relatively low-cost autonomous UAV that enables farmers to produce just-in-time mosaics of aerial photos of their crops. These mosaics should be producible at relatively low cost and within a 24-hour acquisition constraint. The autonomous UAV is equipped with a payload management system developed specifically for rapid aerial mapping. As turnaround time is the key factor, accuracy is not the main focus (the aerial mapping is not orthorectified). The system is also equipped with special software to post-process the aerial photos into a mosaic aerial photo map.

* Proceedings of the International Conference on Intelligent Unmanned System (ICIUS 2007), Bali, Indonesia, October 24-25, 2007, Paper No. ICIUS2007-A020-P 
* Uploaded by ICIUS2007 Conference Organizer on behalf of the author(s). 8 pages, 9 figures 

AutoPhoto: Aesthetic Photo Capture using Reinforcement Learning

Sep 21, 2021
Hadi AlZayer, Hubert Lin, Kavita Bala

The process of capturing a well-composed photo is difficult and it takes years of experience to master. We propose a novel pipeline for an autonomous agent to automatically capture an aesthetic photograph by navigating within a local region in a scene. Instead of classical optimization over heuristics such as the rule-of-thirds, we adopt a data-driven aesthetics estimator to assess photo quality. A reinforcement learning framework is used to optimize the model with respect to the learned aesthetics metric. We train our model in simulation with indoor scenes, and we demonstrate that our system can capture aesthetic photos in both simulation and real world environments on a ground robot. To our knowledge, this is the first system that can automatically explore an environment to capture an aesthetic photo with respect to a learned aesthetic estimator.

* Accepted to IROS 2021 
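The reinforcement-learning framing above can be illustrated with a toy version: an agent navigates a grid of candidate viewpoints and learns, via tabular Q-learning, to seek cells with high scores from an aesthetics estimator. The random score table below is a hypothetical stand-in for the paper's learned aesthetics metric, and the whole setup is a sketch, not the paper's method (which operates in simulated indoor scenes and on a real robot).

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical stand-in for a learned aesthetics estimator: each grid
# cell is a candidate viewpoint with a fixed aesthetic score in (0, 1).
SCORES = rng.random((5, 5))
ACTIONS = [(-1, 0), (1, 0), (0, -1), (0, 1)]  # up, down, left, right

def step(pos, a):
    """Move within the 5x5 grid; reward is the new cell's aesthetic score."""
    r = min(max(pos[0] + ACTIONS[a][0], 0), 4)
    c = min(max(pos[1] + ACTIONS[a][1], 0), 4)
    return (r, c), SCORES[r, c]

def train(episodes=500, alpha=0.5, gamma=0.9, eps=0.2):
    """Tabular Q-learning with epsilon-greedy exploration."""
    q = np.zeros((5, 5, 4))
    for _ in range(episodes):
        pos = (0, 0)
        for _ in range(20):
            a = int(rng.integers(4)) if rng.random() < eps else int(q[pos].argmax())
            nxt, reward = step(pos, a)
            q[pos][a] += alpha * (reward + gamma * q[nxt].max() - q[pos][a])
            pos = nxt
    return q
```

The greedy policy extracted from `q` then drifts toward high-scoring viewpoints, mirroring (at toy scale) optimizing navigation against a learned aesthetics signal.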

Automatic Synchronization of Multi-User Photo Galleries

Jan 16, 2017
E. Sansone, K. Apostolidis, N. Conci, G. Boato, V. Mezaris, F. G. B. De Natale

In this paper we address the issue of photo gallery synchronization, where pictures related to the same event are collected by different users. Existing solutions to address the problem are usually based on unrealistic assumptions, like time consistency across photo galleries, and often heavily rely on heuristics, limiting therefore the applicability to real-world scenarios. We propose a solution that achieves better generalization performance for the synchronization task compared to the available literature. The method is characterized by three stages: at first, deep convolutional neural network features are used to assess the visual similarity among the photos; then, pairs of similar photos are detected across different galleries and used to construct a graph; eventually, a probabilistic graphical model is used to estimate the temporal offset of each pair of galleries, by traversing the minimum spanning tree extracted from this graph. The experimental evaluation is conducted on four publicly available datasets covering different types of events, demonstrating the strength of our proposed method. A thorough discussion of the obtained results is provided for a critical assessment of synchronization quality.

* ACCEPTED to IEEE Transactions on Multimedia 
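The final stage above, propagating pairwise temporal offsets over a minimum spanning tree, can be sketched in isolation. The snippet below assumes each gallery pair comes with a cost (lower = more reliable) and an estimated clock offset, runs Prim's MST algorithm, and accumulates absolute offsets relative to a reference gallery; the paper instead derives the offsets with a probabilistic graphical model over matched photo pairs, so this is illustrative only.

```python
import heapq

def synchronize(n, pairwise):
    """Propagate per-gallery time offsets over a minimum spanning tree.

    `pairwise` maps (i, j) -> (cost, offset_ij), where offset_ij is the
    estimated shift of gallery j's clock relative to gallery i's.
    Gallery 0 is taken as the reference clock.
    """
    adj = {i: [] for i in range(n)}
    for (i, j), (cost, off) in pairwise.items():
        adj[i].append((cost, j, off))
        adj[j].append((cost, i, -off))  # reverse edge flips the offset
    offsets = {0: 0.0}
    heap = [(c, v, o) for c, v, o in adj[0]]  # (cost, gallery, abs offset)
    heapq.heapify(heap)
    while heap and len(offsets) < n:  # Prim's algorithm, cheapest edge first
        cost, v, abs_off = heapq.heappop(heap)
        if v in offsets:
            continue
        offsets[v] = abs_off
        for c, w, o in adj[v]:
            if w not in offsets:
                heapq.heappush(heap, (c, w, abs_off + o))
    return offsets
```

With three galleries where the direct 0-2 estimate is noisy but costly, the tree routes 0→1→2 through the two reliable edges and composes their offsets.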

Sequential Person Recognition in Photo Albums with a Recurrent Network

Nov 30, 2016
Yao Li, Guosheng Lin, Bohan Zhuang, Lingqiao Liu, Chunhua Shen, Anton van den Hengel

Recognizing the identities of people in everyday photos is still a very challenging problem for machine vision, due to non-frontal faces, changes in clothing, location, lighting, and similar factors. Recent studies have shown that rich relational information between people in the same photo can help in recognizing their identities. In this work, we propose to model the relational information between people as a sequence prediction task. At the core of our work is a novel recurrent network architecture, in which relational information between instances' labels and appearance are modeled jointly. In addition to relational cues, scene context is incorporated in our sequence prediction model with no additional cost. In this sense, our approach is a unified framework for modeling both contextual cues and visual appearance of person instances. Our model is trained end-to-end with a sequence of annotated instances in a photo as inputs, and a sequence of corresponding labels as targets. We demonstrate that this simple but elegant formulation achieves state-of-the-art performance on the newly released People In Photo Albums (PIPA) dataset.
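The sequence-prediction view can be made concrete with a toy greedy decoder: each person instance is labeled in turn, and the previous label is fed back as a one-hot input alongside the next instance's appearance, so relational cues condition later predictions. The randomly initialised weights below stand in for a trained recurrent model; all dimensions and names are assumptions, not the paper's architecture.

```python
import numpy as np

rng = np.random.default_rng(0)
N_IDS, APP_DIM, HID = 4, 8, 16  # identities, appearance dim, hidden size

# Randomly initialised weights stand in for a trained recurrent model.
W_h = rng.normal(0, 0.1, (HID, HID))
W_x = rng.normal(0, 0.1, (HID, APP_DIM + N_IDS))
W_o = rng.normal(0, 0.1, (N_IDS, HID))

def label_photo(appearances):
    """Greedy sequential decoding over the people in one photo.

    Each step sees the current instance's appearance plus a one-hot
    encoding of the previous prediction, so earlier labels influence
    later ones through both the feedback input and the hidden state.
    """
    h = np.zeros(HID)
    prev = np.zeros(N_IDS)
    labels = []
    for a in appearances:
        x = np.concatenate([a, prev])
        h = np.tanh(W_h @ h + W_x @ x)
        y = int((W_o @ h).argmax())
        labels.append(y)
        prev = np.eye(N_IDS)[y]
    return labels
```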


An Unpaired Sketch-to-Photo Translation Model

Sep 20, 2019
Runtao Liu, Qian Yu, Stella Yu

Sketch-based image synthesis aims to generate a photo image given a sketch. It is a challenging task; because sketches are drawn by non-professionals and only consist of strokes, they usually exhibit shape deformation and lack visual cues, i.e., colors and textures. Thus translation from sketch to photo involves two aspects: shape and color (texture). Existing methods cannot handle this task well, as they mostly focus on solving only one of these translations. In this work, we show that the key to this task lies in decomposing the translation into two sub-tasks, shape translation and colorization. Correspondingly, we propose a model consisting of two sub-networks, with each one tackling one sub-task. We also find that, when translating shapes, specific drawing styles affect the generated results significantly and may even lead to failure. To make our model more robust to drawing style variations, we design a data augmentation strategy and re-purpose an attention module, aiming to make our model pay less attention to distracting regions of a sketch. Additionally, a conditional module is adapted for color translation to improve diversity and increase users' control over the generated results. Both quantitative and qualitative comparisons are presented to show the superiority of our approach. In addition, as a side benefit, our model can synthesize high-quality sketches from photos inversely. We also demonstrate how these generated photos and sketches can benefit other applications, such as sketch-based image retrieval.

* Added more qualitative results in the appendix. 13 pages

Legacy Photo Editing with Learned Noise Prior

Nov 23, 2020
Zhao Yuzhi, Po Lai-Man, Wang Xuehui, Liu Kangcheng, Zhang Yujia, Yu Wing-Yin, Xian Pengfei, Xiong Jingjing

A large number of photographs from the last century were captured under undesirable conditions; they are thus often noisy, regionally incomplete, and grayscale. Conventional approaches mainly focus on a single degradation, so their restoration results are not perceptually sharp or clean enough. To solve these problems, we propose a noise prior learner, NEGAN, to simulate the noise distribution of real legacy photos using unpaired images. It focuses on matching the high-frequency parts of noisy images through the discrete wavelet transform (DWT), since these contain most of the noise statistics. We also create a large legacy photo dataset for learning the noise prior. Using the learned noise prior, we can easily build valid training pairs by degrading clean images. We then propose an IEGAN framework that performs image editing, including joint denoising, inpainting, and colorization, based on the estimated noise prior. We evaluate the proposed system and compare it with state-of-the-art image enhancement methods. The experimental results demonstrate that it achieves the best perceptual quality. Please see the project webpage for the code and the proposed LP dataset.

* accepted by WACV 2021, 2nd round 
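The DWT step the abstract relies on can be shown directly: a one-level 2-D Haar transform splits an image into a low-frequency approximation (LL) and three high-frequency detail sub-bands (LH, HL, HH), and additive noise shows up almost entirely in the detail bands. This is just the transform itself, written out with array slicing; the paper's distribution-matching on those bands is not reproduced.

```python
import numpy as np

def haar_dwt2(img):
    """One-level 2-D Haar transform: returns (LL, LH, HL, HH) sub-bands.

    Averages and differences adjacent column pairs, then adjacent row
    pairs, so each sub-band has half the height and width of the input
    (which must have even dimensions).
    """
    a, b = img[:, 0::2], img[:, 1::2]          # horizontal pass
    lo, hi = (a + b) / 2.0, (a - b) / 2.0
    def vpass(x):                               # vertical pass
        return (x[0::2] + x[1::2]) / 2.0, (x[0::2] - x[1::2]) / 2.0
    LL, LH = vpass(lo)
    HL, HH = vpass(hi)
    return LL, LH, HL, HH
```

On a smooth image the detail bands are zero; adding noise injects energy into HH, which is why matching high-frequency sub-bands captures most of the noise statistics.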

Sample-Efficient Generation of Novel Photo-acid Generator Molecules using a Deep Generative Model

Dec 02, 2021
Samuel C. Hoffman, Vijil Chenthamarakshan, Dmitry Yu. Zubarev, Daniel P. Sanders, Payel Das

Photo-acid generators (PAGs) are compounds that release acids ($H^+$ ions) when exposed to light. These compounds are critical components of the photolithography processes that are used in the manufacture of semiconductor logic and memory chips. The exponential increase in the demand for semiconductors has highlighted the need for discovering novel photo-acid generators. While de novo molecule design using deep generative models has been widely employed for drug discovery and material design, its application to the creation of novel photo-acid generators poses several unique challenges, such as lack of property labels. In this paper, we highlight these challenges and propose a generative modeling approach that utilizes conditional generation from a pre-trained deep autoencoder and expert-in-the-loop techniques. The validity of the proposed approach was evaluated with the help of subject matter experts, indicating the promise of such an approach for applications beyond the creation of novel photo-acid generators.