Recent studies have shown that the environment in which people eat can affect their nutritional behaviour. In this work, we provide automatic tools for a personalised analysis of a person's health habits through the examination of daily recorded egocentric photo-streams. Specifically, we propose a new automatic approach for the classification of food-related environments that is able to distinguish up to 15 such scenes. In this way, people can monitor the context around their food intake and gain an objective insight into their daily eating routine. We propose a model that classifies food-related scenes organized in a semantic hierarchy. Additionally, we present and make available a new egocentric dataset composed of more than 33000 images recorded by a wearable camera, on which our proposed model has been tested. Our approach obtains an accuracy and F-score of 56\% and 65\%, respectively, clearly outperforming the baseline methods.
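To illustrate how predictions over fine-grained scene classes can be rolled up through a semantic hierarchy, the following is a minimal PyTorch sketch. The two-level taxonomy and class names are hypothetical placeholders; the paper's actual 15-class hierarchy is not reproduced here.

\begin{verbatim}
import torch

# Hypothetical two-level taxonomy (meta-category -> fine scene classes);
# the paper's real hierarchy and class list are assumptions here.
TAXONOMY = {
    "eating":    ["restaurant", "picnic_area", "bar"],
    "cooking":   ["kitchen"],
    "acquiring": ["supermarket", "bakery"],
}
FINE = [c for subs in TAXONOMY.values() for c in subs]

def hierarchical_probs(fine_logits):
    """Turn fine-class logits into probabilities and aggregate them
    into meta-class scores by summing over each meta-class's children."""
    p = torch.softmax(fine_logits, dim=-1)          # [B, len(FINE)]
    meta = {m: sum(p[..., FINE.index(c)] for c in subs)
            for m, subs in TAXONOMY.items()}
    return p, meta
\end{verbatim}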
Mammogram inspection in search of breast tumors is a tough assignment that radiologists must carry out frequently. Therefore, image analysis methods are needed for the detection and delineation of breast masses, which portray crucial morphological information that supports reliable diagnosis. In this paper, we propose a conditional Generative Adversarial Network (cGAN) devised to segment a breast mass within a region of interest (ROI) in a mammogram. The generative network learns to recognize the breast mass area and to create the binary mask that outlines it. In turn, the adversarial network learns to distinguish between real (ground truth) and synthetic segmentations, thus forcing the generative network to create binary masks as realistic as possible. The cGAN works well even when the number of training samples is limited, allowing the proposed method to outperform several state-of-the-art approaches. This hypothesis is corroborated by diverse experiments performed on two datasets, the public INbreast and a private in-house dataset. The proposed segmentation model achieves a high Dice coefficient and Intersection over Union (IoU) of 94\% and 87\%, respectively. In addition, a shape descriptor based on a Convolutional Neural Network (CNN) is proposed to classify the generated masks into four mass shapes: irregular, lobular, oval and round. The proposed shape descriptor was trained on the Digital Database for Screening Mammography (DDSM), yielding an overall accuracy of 80\%, which outperforms the current state-of-the-art.
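As a sketch of how such an adversarial segmentation objective can be set up in PyTorch, consider the losses below. The networks G and D and the weighting lambda_seg are hypothetical placeholders, since the abstract does not specify the exact loss composition.

\begin{verbatim}
import torch
import torch.nn.functional as F

def cgan_seg_losses(G, D, image, gt_mask, lambda_seg=10.0):
    """One training step's losses: D scores (image, mask) pairs,
    G predicts a mask from the ROI image. lambda_seg is an assumption."""
    pred = torch.sigmoid(G(image))            # predicted mask probabilities
    # Discriminator: real pairs vs. detached fake pairs
    d_real = D(image, gt_mask)
    d_fake = D(image, pred.detach())
    loss_D = (F.binary_cross_entropy_with_logits(
                  d_real, torch.ones_like(d_real))
              + F.binary_cross_entropy_with_logits(
                  d_fake, torch.zeros_like(d_fake)))
    # Generator: fool D, plus a pixel-wise segmentation term
    d_fake_g = D(image, pred)
    loss_G = (F.binary_cross_entropy_with_logits(
                  d_fake_g, torch.ones_like(d_fake_g))
              + lambda_seg * F.binary_cross_entropy(pred, gt_mask))
    return loss_G, loss_D
\end{verbatim}

The pixel-wise term keeps the predicted mask close to the ground truth, while the adversarial term pushes it to look like a plausible mass outline even with few training samples.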
Interpretability is a key factor in the design of automatic classifiers for medical diagnosis. Deep learning models have been proven to be very effective classifiers when trained in a supervised way with enough data. The main concern is the difficulty of inferring rational interpretations from them. Several attempts have been made in recent years to convert deep learning classifiers from high-confidence statistical black-box machines into self-explanatory models. In this paper we go a step further in the generation of explanations by identifying the independent causes that a deep learning model uses to classify an image into a certain class. We use a combination of Independent Component Analysis and a score visualization technique. We study the medical problem of classifying an eye fundus image into 5 levels of Diabetic Retinopathy. We conclude that only 3 independent components are enough to differentiate and correctly classify among the 5 standard disease classes. We propose a method for visualizing these components and detecting lesions from the generated visual maps.
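A minimal sketch of the ICA step, assuming deep features have already been extracted from the trained classifier's penultimate layer (the choice of layer and the use of scikit-learn's FastICA are assumptions; the abstract only fixes the number of components at 3):

\begin{verbatim}
import numpy as np
from sklearn.decomposition import FastICA

def independent_causes(feats, n_components=3):
    """Decompose deep features into statistically independent components.
    feats: [N, D] penultimate-layer activations for N fundus images."""
    ica = FastICA(n_components=n_components, random_state=0)
    sources = ica.fit_transform(feats)   # [N, 3] per-image component scores
    mixing = ica.mixing_                 # [D, 3] feature-space directions
    return sources, mixing
\end{verbatim}

The mixing directions can then be projected back onto the image via a score visualization technique to localize the lesions each component responds to.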
First-person (wearable) cameras continually capture unscripted interactions of the camera user with objects, people, and scenes, reflecting the user's personal and relational tendencies. Among these interactions, food-related events are of particular interest, since the regulation of food intake and its duration is of great importance in protecting against disease. Consequently, this work aims to develop a smart model that is able to determine a person's recurrence at food places during the day. The model is based on a deep end-to-end architecture for automatic food-place recognition from egocentric photo-streams. We apply multi-scale Atrous convolution networks to extract the key features related to food places from the input images. The proposed model is evaluated on an in-house private dataset called "EgoFoodPlaces". Experimental results show promising performance on food-place classification in egocentric photo-streams.
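The following is a minimal PyTorch sketch of a multi-scale Atrous (dilated) convolution block of the kind the abstract describes: parallel dilated convolutions at several rates, concatenated and projected. The dilation rates and channel sizes are illustrative assumptions, not the paper's configuration.

\begin{verbatim}
import torch
import torch.nn as nn

class MultiScaleAtrous(nn.Module):
    """Parallel atrous convolutions at several rates, concatenated;
    rates and channel widths here are illustrative placeholders."""
    def __init__(self, in_ch, out_ch, rates=(1, 6, 12, 18)):
        super().__init__()
        self.branches = nn.ModuleList([
            nn.Conv2d(in_ch, out_ch, kernel_size=3, padding=r, dilation=r)
            for r in rates
        ])
        self.project = nn.Conv2d(out_ch * len(rates), out_ch, kernel_size=1)

    def forward(self, x):
        feats = [b(x) for b in self.branches]   # one feature map per rate
        return self.project(torch.cat(feats, dim=1))
\end{verbatim}

Larger dilation rates widen the receptive field without losing resolution, which is what lets the block capture scene context at multiple scales.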
This paper proposes a retinal image segmentation method based on a conditional Generative Adversarial Network (cGAN) to segment the optic disc. The proposed model consists of two successive networks: a generator and a discriminator. The generator learns to map information from the observed input (i.e., a retinal fundus color image) to the output (i.e., a binary mask). The discriminator then learns, acting as a loss function for this mapping, to compare the ground truth and the predicted output while observing the input image as a condition. Experiments were performed on two publicly available datasets, DRISHTI GS1 and RIM-ONE. The proposed model outperformed state-of-the-art methods, achieving Jaccard and Dice coefficients of around 0.96 and 0.98, respectively. Moreover, segmentation of an image is performed in less than a second on a recent GPU.
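Formally, the conditioning the abstract describes corresponds to the standard conditional-GAN objective, where $x$ is the fundus image and $y$ the ground-truth disc mask (the paper may add task-specific pixel-wise terms not stated in the abstract):

\begin{equation*}
\min_G \max_D \;\; \mathbb{E}_{x,y}\big[\log D(x,y)\big] \;+\; \mathbb{E}_{x}\Big[\log\big(1 - D\big(x, G(x)\big)\big)\Big].
\end{equation*}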
This paper proposes a novel approach based on conditional Generative Adversarial Networks (cGAN) for breast mass segmentation in mammography. We hypothesize that the cGAN structure is well-suited to accurately outline the mass area, especially when the training data is limited. The generative network learns intrinsic features of tumors while the adversarial network enforces segmentations to be similar to the ground truth. Experiments performed on dozens of malignant tumors extracted from the public DDSM dataset and from our in-house private dataset confirm our hypothesis with very high Dice coefficient and Jaccard index ($>$94\% and $>$89\%, respectively), outperforming the scores obtained by other state-of-the-art approaches. Furthermore, in order to portray significant morphological features of the segmented tumor, a specific Convolutional Neural Network (CNN) has also been designed to classify the segmented tumor areas into four types (irregular, lobular, oval and round), which provides an overall accuracy of about 72\% on the DDSM dataset.
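A small CNN operating directly on the binary masks is enough to illustrate the shape-classification stage; the layer sizes below are illustrative assumptions, since the abstract does not describe the exact architecture.

\begin{verbatim}
import torch.nn as nn

class MaskShapeCNN(nn.Module):
    """Tiny CNN classifying a binary mass mask into four shape classes
    (irregular, lobular, oval, round). Layer sizes are placeholders."""
    def __init__(self, n_classes=4):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.AdaptiveAvgPool2d(4),
        )
        self.classifier = nn.Linear(32 * 4 * 4, n_classes)

    def forward(self, mask):                  # mask: [B,1,H,W] in {0,1}
        f = self.features(mask)
        return self.classifier(f.flatten(1))  # shape logits
\end{verbatim}

Feeding the classifier the mask rather than the raw image forces it to rely purely on morphology, which is the point of a shape descriptor.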
The diversity of food and its attributes represents the culinary habits of peoples from different countries. This paper addresses the problem of identifying the food culture of people around the world by classifying two main food attributes, cuisine and flavor. A deep learning model based on multi-scale convolutional networks is proposed for extracting more accurate features from the input images. The aggregation of multi-scale convolution layers with different kernel sizes is used for weighting the features resulting from the different scales. In addition, a joint loss function based on Negative Log Likelihood (NLL) is used to fit the model probabilities to the multi-labeled classes of this multi-modal classification task. Furthermore, this work provides a new dataset for food attributes, called Yummly48K, extracted from the popular food website Yummly. Our model is assessed on the constructed Yummly48K dataset. The experimental results show that our proposed method yields average F1 scores of 65\% and 62\% on the validation and test sets, respectively, outperforming the state-of-the-art models.
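A minimal sketch of such a joint NLL loss over two attribute heads, assuming one label per attribute and an equal weighting (both assumptions; the abstract does not give the exact formulation):

\begin{verbatim}
import torch.nn.functional as F

def joint_nll(cuisine_logits, flavor_logits, cuisine_y, flavor_y, w=0.5):
    """Joint loss over the two food attributes. The weight w is assumed."""
    loss_c = F.cross_entropy(cuisine_logits, cuisine_y)  # NLL over cuisines
    loss_f = F.cross_entropy(flavor_logits, flavor_y)    # NLL over flavors
    return w * loss_c + (1.0 - w) * loss_f
\end{verbatim}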
Skin lesion segmentation (SLS) in dermoscopic images is a crucial task for the automated diagnosis of melanoma. In this paper, we present a robust deep learning SLS model, called SLSDeep, structured as an encoder-decoder network. The encoder network is constructed from dilated residual layers, while the decoder consists of a pyramid pooling network followed by three convolution layers. Unlike traditional methods employing a cross-entropy loss, we investigate a loss function combining both Negative Log Likelihood (NLL) and End Point Error (EPE) to accurately segment the melanoma regions with sharp boundaries. The robustness of the proposed model was evaluated on two public databases, ISBI 2016 and 2017, for the skin lesion analysis towards melanoma detection challenge. The proposed model outperforms the state-of-the-art methods in terms of segmentation accuracy. Moreover, it is capable of segmenting more than $100$ images of size $384\times384$ per second on a recent GPU.
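One plausible reading of the combined loss is sketched below in PyTorch: a pixel-wise NLL term plus an L2-type End Point Error between predicted and target maps. The weighting alpha and the exact EPE formulation are assumptions, as the abstract does not specify them.

\begin{verbatim}
import torch
import torch.nn.functional as F

def nll_epe_loss(pred_logits, target, alpha=1.0):
    """Combined loss: pixel-wise NLL plus an End Point Error term,
    here taken as the per-image L2 distance between the predicted
    and target masks. alpha and the EPE definition are assumptions."""
    nll = F.binary_cross_entropy_with_logits(pred_logits, target)
    pred = torch.sigmoid(pred_logits)
    epe = torch.norm(pred - target, p=2, dim=(-2, -1)).mean()
    return nll + alpha * epe
\end{verbatim}

Intuitively, the NLL term drives per-pixel correctness while the EPE term penalizes large aggregate deviations, encouraging sharper boundaries.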
Deep neural network models have been proven very successful in image classification tasks, including medical diagnosis, but their main drawback is their lack of interpretability. They tend to work as intuition machines with high statistical confidence that are unable to give interpretable explanations of the reported results. The vast number of parameters in these models makes it difficult to infer a rational interpretation from them. In this paper we present an interpretable diabetic retinopathy classifier that is able to classify retina images into the different levels of disease severity and to explain its results by assigning a score to every point in the hidden and input spaces, evaluating its contribution to the final classification in a linear way. The generated visual maps can be interpreted by an expert in order to compare their own knowledge with the interpretation given by the model.
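As one simple instance of a linear contribution score over the input space, the gradient-times-input decomposition can be sketched as follows; this is an illustrative stand-in, and the paper's exact scoring rule may differ.

\begin{verbatim}
import torch

def linear_scores(model, image, target_class):
    """Per-pixel linear contribution scores to one class logit,
    via gradient x input (an illustrative scoring rule)."""
    image = image.clone().requires_grad_(True)
    logit = model(image.unsqueeze(0))[0, target_class]
    logit.backward()
    return (image.grad * image).detach()   # per-pixel contribution map
\end{verbatim}

The resulting map sums (to first order) to the class score, which is what makes a per-point, linearly attributed explanation possible.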
Understanding the internal workings of ConvNets is commonly done using visualization techniques. However, these techniques do not usually provide a way to estimate the stability of a ConvNet against noise. In this paper, we show how to analyze a ConvNet in the frequency domain using a 4-dimensional visualization technique. Using this frequency domain analysis, we show why a ConvNet might be sensitive to very low magnitude additive noise. Our experiments on a few ConvNets trained on different datasets reveal that the convolution kernels of a trained ConvNet usually pass most frequencies and are not able to effectively eliminate the effect of high frequencies. Further experiments show that a convolution kernel with a more concentrated frequency response can be more stable. Finally, we show that fine-tuning a ConvNet using a training set augmented with noisy images produces more stable ConvNets.
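The core measurement behind this analysis can be sketched in a few lines of PyTorch: zero-pad each trained kernel and take its 2-D Fourier transform; a flat magnitude spectrum means the kernel passes most frequencies, while a concentrated one suppresses high-frequency noise. The padded size is an arbitrary choice for visualization.

\begin{verbatim}
import torch

def kernel_frequency_response(conv_weight, size=64):
    """2-D frequency response of each conv kernel, zero-padded to `size`.
    conv_weight: [out_ch, in_ch, k, k] tensor from a trained ConvNet."""
    freq = torch.fft.fft2(conv_weight, s=(size, size))
    return torch.fft.fftshift(freq.abs(), dim=(-2, -1))  # magnitude spectra
\end{verbatim}

Inspecting whether the magnitudes stay high far from the spectrum's center gives a quick estimate of how much high-frequency (noise-like) content a kernel lets through.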