"photo": models, code, and papers

Fine-to-coarse Knowledge Transfer For Low-Res Image Classification

May 21, 2016
Xingchao Peng, Judy Hoffman, Stella X. Yu, Kate Saenko

We address the difficult problem of distinguishing fine-grained object categories in low-resolution images. We propose a simple and effective deep learning approach that transfers fine-grained knowledge gained from high-resolution training data to the coarse low-resolution test scenario. Such fine-to-coarse knowledge transfer has many real-world applications, such as identifying objects in surveillance photos or satellite images, where the image resolution at test time is very low but plenty of high-resolution photos of similar objects are available. Our extensive experiments on two standard benchmark datasets containing fine-grained car models and bird species demonstrate that our approach can effectively transfer fine-detail knowledge to coarse-detail imagery.

* 5 pages, accepted by ICIP 2016 
  

AQPDBJUT Dataset: Picture-Based PM2.5 Monitoring in the Campus of BJUT

Mar 19, 2020
Yonghui Zhang, Ke Gu, Zhifang Xia, Junfei Qiao

Keeping students in good physical condition is imperative for their future health. In recent years, the steadily growing concentration of Particulate Matter (PM) has done increasingly serious harm to student health. Hence, preventing and controlling PM concentrations on campus is essential. Since monitoring underpins PM prevention and control, developing a good model for PM monitoring is urgent and poses a significant challenge. Prior work has found that photo-based methods can be used for PM monitoring. To verify the effectiveness of existing PM monitoring methods on campus, we establish a new dataset of 1,500 photos collected at the Beijing University of Technology. Experiments show that state-of-the-art methods are far from ideal for PM2.5 monitoring on campus.

  

WarpGAN: Automatic Caricature Generation

Nov 28, 2018
Yichun Shi, Debayan Deb, Anil K. Jain

We propose WarpGAN, a fully automatic network that can generate caricatures given an input face photo. Besides transferring rich texture styles, WarpGAN learns to automatically predict a set of control points that can warp the photo into a caricature while preserving identity. We introduce an identity-preserving adversarial loss that helps the discriminator distinguish between different subjects. Moreover, WarpGAN allows customization of the generated caricatures by controlling the extent of exaggeration and the visual styles. Experimental results on a public domain dataset, WebCaricature, show that WarpGAN is capable of generating a diverse set of caricatures while preserving the identities. Five caricature experts suggest that caricatures generated by WarpGAN are visually similar to hand-drawn ones and that only prominent facial features are exaggerated.

  

What Looks Good with my Sofa: Multimodal Search Engine for Interior Design

Jan 08, 2018
Ivona Tautkute, Aleksandra Możejko, Wojciech Stokowiec, Tomasz Trzciński, Łukasz Brocki, Krzysztof Marasek

In this paper, we propose a multi-modal search engine for interior design that combines visual and textual queries. The goal of our engine is to retrieve interior objects, e.g. furniture or wall clocks, that share visual and aesthetic similarities with the query. Our search engine allows the user to take a photo of a room and retrieve, with high recall, a list of items identical or visually similar to those present in the photo. Additionally, it can return other items that fit well together aesthetically and stylistically. To achieve this goal, our system blends the results obtained using textual and visual modalities. Thanks to this blending strategy, we increase the average style similarity score of the retrieved items by 11%. Our work is implemented as a Web-based application, and we plan to open it to the public.

* Proceedings of the 2017 Federated Conference on Computer Science and Information Systems 
* FEDCSIS 5th Conference on Multimedia, Interaction, Design and Innovation (MIDI), 2017 
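
The abstract above does not say how the textual and visual results are blended. As a rough illustration only, the sketch below assumes each modality yields a per-item relevance score and combines them with a weighted sum; the function name `blend_scores`, the weight `alpha`, and the sample items are hypothetical and are not taken from the paper.

```python
def blend_scores(visual, textual, alpha=0.5):
    """Rank items by a weighted sum of two score dictionaries.

    Assumption: `visual` and `textual` map item ids to relevance
    scores on a comparable scale. An item missing from one modality
    contributes 0.0 for that modality. `alpha` weights the visual
    score and (1 - alpha) weights the textual score.
    """
    items = set(visual) | set(textual)
    blended = {
        item: alpha * visual.get(item, 0.0)
        + (1.0 - alpha) * textual.get(item, 0.0)
        for item in items
    }
    # Highest blended score first; ties broken by item id for determinism.
    return sorted(blended, key=lambda item: (-blended[item], item))


# Hypothetical scores for three catalogue items.
visual_scores = {"lamp": 0.9, "rug": 0.2}
textual_scores = {"rug": 0.9, "clock": 0.4}
ranking = blend_scores(visual_scores, textual_scores, alpha=0.5)
```

A weighted sum is only one possible choice; rank-based fusion would be another, and the source gives no indication of which the authors used.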
  

SwiDeN : Convolutional Neural Networks For Depiction Invariant Object Recognition

Jul 29, 2016
Ravi Kiran Sarvadevabhatla, Shiv Surya, Srinivas S S Kruthiventi, Venkatesh Babu R

Current state-of-the-art object recognition architectures achieve impressive performance but are typically specialized for a single depictive style (e.g. photos only, sketches only). In this paper, we present SwiDeN, our Convolutional Neural Network (CNN) architecture, which recognizes objects regardless of how they are visually depicted (line drawing, realistic shaded drawing, photograph, etc.). In SwiDeN, we utilize a novel 'deep' depictive style-based switching mechanism which appropriately addresses the depiction-specific and depiction-invariant aspects of the problem. We compare SwiDeN with alternative architectures and prior work on a 50-category Photo-Art dataset containing objects depicted in multiple styles. Experimental results show that SwiDeN outperforms other approaches for the depiction-invariant object recognition problem.

* Accepted at ACMMM 2016. The first two authors contributed equally. Code and models at https://github.com/val-iisc/swiden 
  

Improved Wavelets for Image Compression from Unitary Circuits

Mar 04, 2022
James C. McCord, Glen Evenbly

We benchmark the efficacy of several novel orthogonal, symmetric, dilation-3 wavelets, derived from a unitary circuit based construction, towards image compression. The performance of these wavelets is compared across several photo databases against the CDF-9/7 wavelets in terms of the minimum number of non-zero wavelet coefficients needed to obtain a specified image quality, as measured by the multi-scale structural similarity index (MS-SSIM). The new wavelets are found to consistently offer better compression efficiency than the CDF-9/7 wavelets across a broad range of image resolutions and quality requirements, averaging 7-8% improved compression efficiency on high-resolution photo images when high-quality (MS-SSIM = 0.99) is required.

* 10 pages, 6 figures 
  

An Automatic Reader of Identity Documents

Jun 26, 2020
Filippo Attivissimo, Nicola Giaquinto, Marco Scarpetta, Maurizio Spadavecchia

Automatic reading and verification of identity documents is an appealing technology for today's service industry, since this task is still mostly performed manually, wasting money and time. This paper presents the prototype of a novel automatic reading system for identity documents. The system is designed to extract data from the main Italian identity documents using photographs of acceptable quality, like those usually required of online subscribers to various services. The document is first localized inside the photo and then classified; finally, text recognition is performed. A synthetic dataset was used both for training the neural networks and for evaluating the system's performance. The synthetic dataset avoided the privacy issues linked to using real photos of real documents, which will instead be used in future developments of the system.

* 6 pages, 9 figures 
  

Enhancing temporal segmentation by nonlocal self-similarity

Jun 14, 2019
Mariella Dimiccoli, Herwig Wendt

Temporal segmentation of untrimmed videos and photo-streams is currently an active area of research in computer vision and image processing. This paper proposes a new approach to improve the temporal segmentation of photo-streams. The method enhances image representations by encoding long-range temporal dependencies. Our key contribution is to exploit the temporal stationarity assumption of photo-streams to model each frame by its nonlocal self-similarity function. The proposed approach is tested on the EDUB-Seg dataset, a standard benchmark for egocentric photo-stream temporal segmentation. Starting from seven different (CNN-based) image features, the method yields consistent improvements in event segmentation quality, leading to an average F-measure increase of 3.71% with respect to the state of the art.

* Accepted to ICIP 2019 
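
To make the idea of describing a frame by its similarity to every other frame more concrete, here is a minimal sketch. It assumes cosine similarity between per-frame feature vectors, which the abstract does not specify; the function names and the toy features are hypothetical, and this is not the authors' implementation.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def self_similarity_rows(features):
    """Return one row per frame: its similarity to every frame in the stream.

    Row i can serve as a new representation of frame i that encodes
    long-range relations to all other frames, not only its neighbours.
    """
    return [[cosine(f, g) for g in features] for f in features]


# Toy stream: frames 0 and 1 point the same way, frame 2 is orthogonal.
frames = [[1.0, 0.0], [2.0, 0.0], [0.0, 3.0]]
rows = self_similarity_rows(frames)
```

In this toy stream, the rows for frames 0 and 1 are nearly identical while the row for frame 2 differs, which is the kind of signal a segmentation step could use to place an event boundary.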
  

AQPDCITY Dataset: Picture-Based PM Monitoring in the Urban Area of Big Cities

Apr 06, 2020
Yonghui Zhang, Ke Gu

Since Particulate Matter (PM) is closely related to people's lives and health, it has become one of the most important indicators in air quality monitoring around the world. However, existing sensor-based methods for PM monitoring have notable disadvantages, such as a low density of monitoring stations and demanding monitoring conditions. A method that can obtain the PM concentration at any location is highly desirable for timely air quality control. Prior work indicates that PM concentration can be monitored using ubiquitous photos. To investigate this further, we gathered 1,500 photos in big cities to establish a new AQPDCITY dataset. Experiments evaluating nine state-of-the-art methods on this dataset show that they perform poorly on AQPDCITY.

  

AQPDCITY Dataset: Picture-Based PM2.5 Monitoring in the Urban Area of Big Cities

Mar 22, 2020
Yonghui Zhang, Ke Gu

Since Particulate Matter (PM) is closely related to people's lives and health, it has become one of the most important indicators in air quality monitoring around the world. However, existing sensor-based methods for PM monitoring have notable disadvantages, such as a low density of monitoring stations and demanding monitoring conditions. A method that can obtain the PM concentration at any location is highly desirable for timely air quality control. Prior work indicates that PM concentration can be monitored using ubiquitous photos. To investigate this further, we gathered 1,500 photos in big cities to establish a new AQPDCITY dataset. Experiments evaluating nine state-of-the-art methods on this dataset show that they perform poorly on AQPDCITY.

  