"photo": models, code, and papers

AQPDCITY Dataset: Picture-Based PM2.5 Monitoring in the Urban Area of Big Cities

Mar 22, 2020
Yonghui Zhang, Ke Gu

Since particulate matter (PM) is closely related to people's lives and health, it has become one of the most important indicators in air quality monitoring around the world. However, existing sensor-based methods for PM monitoring have notable disadvantages, such as sparse monitoring stations and demanding monitoring conditions. A method that can obtain the PM concentration at any location is highly desirable for timely air quality control. Prior work indicates that the PM concentration can be monitored using ubiquitous photos. To investigate this further, we gathered 1,500 photos in big cities to establish a new AQPDCITY dataset. Experiments with nine state-of-the-art methods show that they perform poorly on the AQPDCITY dataset.

  

Recent Progress of Face Image Synthesis

Jun 15, 2017
Zhihe Lu, Zhihang Li, Jie Cao, Ran He, Zhenan Sun

Face synthesis has been a fascinating yet challenging problem in computer vision and machine learning. Its main research effort is to design algorithms that generate photo-realistic face images from a given semantic domain. It has been a crucial preprocessing step of mainstream face recognition approaches and an excellent test of AI's ability to use complicated probability distributions. In this paper, we provide a comprehensive review of typical face synthesis works that involve traditional methods as well as advanced deep learning approaches. In particular, the Generative Adversarial Net (GAN) is highlighted for generating photo-realistic and identity-preserving results. Furthermore, the publicly available databases and evaluation metrics are introduced in detail. We end the review by discussing unsolved difficulties and promising directions for future research.

* 17 pages, 10 figures 
  

Multi-View Image-to-Image Translation Supervised by 3D Pose

Apr 12, 2021
Idit Diamant, Oranit Dror, Hai Victor Habi, Arnon Netzer

We address the task of multi-view image-to-image translation for person image generation. The goal is to synthesize photo-realistic multi-view images with pose-consistency across all views. Our proposed end-to-end framework is based on a joint learning of multiple unpaired image-to-image translation models, one per camera viewpoint. The joint learning is imposed by constraints on the shared 3D human pose in order to encourage the 2D pose projections in all views to be consistent. Experimental results on the CMU-Panoptic dataset demonstrate the effectiveness of the suggested framework in generating photo-realistic images of persons with new poses that are more consistent across all views in comparison to a standard Image-to-Image baseline. The code is available at: https://github.com/sony-si/MultiView-Img2Img

* *equal contribution 
  

A Hybrid Model for Identity Obfuscation by Face Replacement

Jul 24, 2018
Qianru Sun, Ayush Tewari, Weipeng Xu, Mario Fritz, Christian Theobalt, Bernt Schiele

As more and more personal photos are shared and tagged in social media, avoiding privacy risks such as unintended recognition becomes increasingly challenging. We propose a new hybrid approach to obfuscate identities in photos by head replacement. Our approach combines state-of-the-art parametric face synthesis with the latest advances in Generative Adversarial Networks (GANs) for data-driven image synthesis. On the one hand, the parametric part of our method gives us control over the facial parameters and allows for explicit manipulation of the identity. On the other hand, the data-driven aspects allow for adding fine details and overall realism as well as seamless blending into the scene context. In our experiments, we show highly realistic output of our system that improves over the previous state of the art in obfuscation rate while preserving a higher similarity to the original image content.

* ECCV'18, camera-ready version 
  

Improving Raw Image Storage Efficiency by Exploiting Similarity

Apr 19, 2016
Binqi Zhang, Chen Wang, Bing Bing Zhou, Albert Y. Zomaya

To improve temporal and spatial storage efficiency, researchers have intensively studied various techniques, including compression and deduplication. Through our evaluation, we find that methods such as photo tags or local features help to identify the content-based similarity between raw images. The images can then be compressed more efficiently to achieve better storage space savings. Furthermore, storing similar raw images together reduces fragmentation, which enables rapid data sorting, searching, and retrieval when the images are stored in a large-scale distributed environment. In this paper, we evaluate compressibility by designing experiments and observing the results. We find that, on a statistical basis, the more similar the photos are, the better the compression results. This research provides a clue for future large-scale storage system design.
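
The relationship between similarity and compressibility described above can be illustrated with a toy sketch. This is not the authors' method: it assumes random byte buffers as stand-ins for raw images and uses the general-purpose `zlib` compressor, and the names `compressed_size`, `base`, `similar`, and `unrelated` are hypothetical. It only shows that data stored next to a near-duplicate compresses better than data stored next to unrelated content.

```python
import random
import zlib


def compressed_size(*blobs: bytes) -> int:
    """Return the zlib-compressed size of the blobs stored back to back."""
    return len(zlib.compress(b"".join(blobs), 9))


# Hypothetical stand-ins for raw images: 4 KiB of seeded pseudo-random bytes.
rng = random.Random(0)
base = bytes(rng.randrange(256) for _ in range(4096))

# A "similar image": the same bytes with one byte flipped every 512 bytes.
edited = bytearray(base)
for i in range(0, len(edited), 512):
    edited[i] ^= 0xFF
similar = bytes(edited)

# An "unrelated image": fresh pseudo-random bytes.
unrelated = bytes(rng.randrange(256) for _ in range(4096))

size_similar = compressed_size(base, similar)
size_unrelated = compressed_size(base, unrelated)

print("base + similar  :", size_similar, "bytes")
print("base + unrelated:", size_unrelated, "bytes")
```

Grouping the near-duplicate with its original lets the compressor encode the second buffer mostly as back-references to the first, so the first total is smaller than the second. Real raw images would need a similarity measure (such as the tags or local features mentioned in the abstract) to decide which files to group.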

  

toon2real: Translating Cartoon Images to Realistic Images

Feb 01, 2021
K. M. Arefeen Sultan, Mohammad Imrul Jubair, MD. Nahidul Islam, Sayed Hossain Khan

In image-to-image translation, Generative Adversarial Networks (GANs) have achieved great success even when used on unsupervised datasets. In this work, we aim to translate cartoon images to photo-realistic images using GANs. We apply several state-of-the-art models to this task; however, they fail to produce good-quality translations. We observe that the shallow difference between these two domains causes this issue. Based on this idea, we propose a method based on the CycleGAN model for image translation from the cartoon domain to the photo-realistic domain. To make our model efficient, we implement Spectral Normalization, which adds stability to our model. We present our experimental results and show that our proposed model achieves the lowest Frechet Inception Distance score and better results compared to another state-of-the-art technique, UNIT.

* Accepted as a short paper at ICTAI 2020 
  

3D Reconstruction of Temples in the Special Region of Yogyakarta By Using Close-Range Photogrammetry

Feb 22, 2017
Adityo Priyandito Utomo, Canggih Puspo Wibowo

Object reconstruction is one of the main problems in cultural heritage preservation. This problem is due to a lack of data in documentation. In this research, we therefore present a method of 3D reconstruction using close-range photogrammetry. We collected 1319 photos from five temples in Yogyakarta. Using the A-KAZE algorithm, keypoints of each image were obtained. We then employed LIOP to create feature descriptors from them. After performing feature matching, L1RA was utilized to create sparse point clouds. To generate the geometric shape, MVS was used. Finally, FSSR and Large Scale Texturing were employed to deal with the surface and texture of the object. The quality of the reconstructed 3D model was measured by comparing the 3D images of the model with the original photos using SSIM. The results showed that, in terms of quality, our method was on par with commercial software such as PhotoModeler and PhotoScan.
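
The SSIM comparison mentioned above can be sketched as follows. This is a simplified illustration, not the authors' evaluation code: standard SSIM averages the index over local windows, whereas this sketch computes a single global value over flattened grayscale pixel lists. The function name `global_ssim` and the sample pixel values are hypothetical; the constants `k1=0.01` and `k2=0.03` are the commonly used defaults.

```python
def global_ssim(x, y, data_range=255.0, k1=0.01, k2=0.03):
    """Single-window SSIM between two equally sized grayscale pixel lists."""
    if len(x) != len(y) or not x:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(x)
    # Stabilizing constants that avoid division by near-zero values.
    c1 = (k1 * data_range) ** 2
    c2 = (k2 * data_range) ** 2
    # Means, variances, and covariance of the two signals.
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    var_x = sum((a - mean_x) ** 2 for a in x) / n
    var_y = sum((b - mean_y) ** 2 for b in y) / n
    cov_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / n
    numerator = (2 * mean_x * mean_y + c1) * (2 * cov_xy + c2)
    denominator = (mean_x ** 2 + mean_y ** 2 + c1) * (var_x + var_y + c2)
    return numerator / denominator


# Hypothetical pixel values: a "photo" and a "render" with inverted contrast.
photo = [0, 50, 100, 150, 200, 250]
render = list(reversed(photo))

print("photo vs itself:", global_ssim(photo, photo))
print("photo vs render:", global_ssim(photo, render))
```

An image compared with itself scores 1 (up to floating-point rounding), and the score drops as structure, contrast, or brightness diverge. For real images, a windowed implementation from an image-processing library would be the more faithful choice.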

* Semnasteknomedia 2017, 5 pages 
  

SRGAN: Training Dataset Matters

Mar 24, 2019
Nao Takano, Gita Alaghband

Generative Adversarial Networks (GANs) in supervised settings can generate photo-realistic output from corresponding low-definition input (SRGAN). Using the architecture presented in the original SRGAN paper [2], we explore how dataset selection affects the outcome. Experiments with three different datasets show that SRGAN fundamentally learns objects, with their shape, color, and texture, and redraws them in the output rather than merely attempting to sharpen edges. This is further underscored by our demonstration that once the network learns the images of the dataset, it can generate a photo-like image from a very blurry-edged sketch that gives even a slight hint of what the original might look like. Given a set of inference images, a network trained on the same dataset produces a better outcome than one trained on an arbitrary set of images, and we report the significance numerically with the Frechet Inception Distance score [22].

  

Disaster Monitoring using Unmanned Aerial Vehicles and Deep Learning

Aug 08, 2018
Andreas Kamilaris, Francesc X. Prenafeta-Boldú

Monitoring of disasters is crucial for mitigating their effects on the environment and human population, and can be facilitated by the use of unmanned aerial vehicles (UAVs) equipped with camera sensors that produce aerial photos of the areas of interest. A modern technique for recognizing events in aerial photos is deep learning. In this paper, we present the state-of-the-art work related to the use of deep learning techniques for disaster identification. We demonstrate the potential of this technique for identifying disasters with high accuracy by means of a relatively simple deep learning model. Based on a dataset of 544 images (containing disaster images such as fires, earthquakes, collapsed buildings, tsunami and flooding, as well as non-disaster scenes), our results show an accuracy of 91%, indicating that deep learning, combined with UAVs equipped with camera sensors, has the potential to predict disasters with high accuracy.

* Disaster Management for Resilience and Public Safety Workshop, Proc. of EnviroInfo 2017 
  

Reconstructing NBA Players

Jul 27, 2020
Luyang Zhu, Konstantinos Rematas, Brian Curless, Steve Seitz, Ira Kemelmacher-Shlizerman

Great progress has been made in 3D body pose and shape estimation from a single photo. Yet, state-of-the-art results still suffer from errors due to challenging body poses, clothing modeling, and self-occlusions. The domain of basketball games is particularly difficult, as it exhibits all of these challenges. In this paper, we introduce a new approach for reconstruction of basketball players that outperforms the state of the art. Key to our approach is a new method for creating poseable, skinned models of NBA players, and a large database of meshes (derived from the NBA2K19 video game) that we are releasing to the research community. Based on these models, we introduce a new method that takes as input a single photo of a clothed player in any basketball pose and outputs a high-resolution mesh and 3D pose for that player. We demonstrate substantial improvement over state-of-the-art, single-image methods for body shape reconstruction.

* ECCV 2020 
  