Stephen Baek

On the role of depth predictions for 3D human pose estimation

Mar 03, 2021

Alec Diaz-Arias, Mitchell Messmore, Dmitriy Shin, Stephen Baek

Abstract: Following the successful application of deep convolutional neural networks to 2d human pose estimation, the next logical problem to solve is 3d human pose estimation from monocular images. While previous solutions have shown some success, they do not fully utilize the depth information from the 2d inputs. With the goal of addressing this depth ambiguity, we build a system that takes 2d joint locations as input along with their estimated depth values and predicts their 3d positions in camera coordinates. Given the inherent noise and inaccuracy of depth maps estimated from monocular images, we perform an extensive statistical analysis showing that, despite this noise, there is still a statistically significant correlation between the predicted depth values and the third coordinate of camera coordinates. We further explain how the state-of-the-art results we achieve on the H3.6M validation set are due to the additional input of depth. Notably, our results are produced by a neural network that accepts a low-dimensional input and can be integrated into a real-time system. Furthermore, our system can be combined with an off-the-shelf 2d pose detector and a depth map predictor to perform 3d pose estimation in the wild.
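
The kind of correlation check the abstract describes can be illustrated with a minimal sketch. It computes the Pearson correlation between predicted depth values and the camera-space z coordinate, plus the usual t statistic for testing that correlation. The sample numbers and the function names `pearson_r` and `t_statistic` are made up for illustration; the abstract does not say which test or data layout the authors used.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

def t_statistic(r, n):
    """t statistic (n - 2 degrees of freedom) for testing that r differs from 0."""
    return r * math.sqrt((n - 2) / (1.0 - r * r))

# Hypothetical values: noisy predicted depth per joint vs. true camera z (metres).
predicted_depth = [2.1, 2.4, 2.0, 3.1, 2.9, 3.6, 3.3, 4.2]
camera_z = [2.0, 2.2, 2.3, 2.8, 3.1, 3.3, 3.6, 4.0]

r = pearson_r(predicted_depth, camera_z)
t = t_statistic(r, len(camera_z))
print(f"r = {r:.3f}, t = {t:.2f}")
```

A large t value relative to the critical value for n - 2 degrees of freedom would indicate that the noisy depth estimate still carries usable information about the z coordinate.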

* 13 pages, 6 figures, and 8 tables

Incremental ELMVIS for unsupervised learning

Dec 18, 2019

Anton Akusok, Emil Eirola, Yoan Miche, Ian Oliver, Kaj-Mikael Björk, Andrey Gritsenko, Stephen Baek, Amaury Lendasse

Abstract: An incremental version of the ELMVIS+ method is proposed in this paper. It iteratively selects a few best-fitting data samples from a large pool and adds them to the model. The method retains the high speed of ELMVIS+ while allowing for much larger sample pools due to lower memory requirements. The extension is useful for reaching a better local optimum with greedy optimization of ELMVIS, and the data structure can be specified in semi-supervised optimization. The major new application of incremental ELMVIS is not visualization but general dataset processing. The method is capable of learning dependencies from non-organized unsupervised data, either reconstructing a shuffled dataset or learning dependencies in a complex high-dimensional space. The results are interesting and promising, although there is room for improvement.

* Proceedings of ELM-2016 (pp. 183-193). Springer, Cham

Multi-scale Embedded CNN for Music Tagging

Jun 16, 2019

Nima Hamidi, Mohsen Vahidzadeh, Stephen Baek

Abstract: Convolutional neural networks (CNNs) have recently gained notable attention in a variety of machine learning tasks, including music classification and style tagging. In this work, we propose adding intermediate connections to the CNN architecture to facilitate the transfer of multi-scale/level knowledge between different layers. Our novel model for music tagging shows significant improvement over approaches previously proposed in the literature, due to its ability to carry low-level timbral features to the last layer.

* Proceedings of the 36th International Conference on Machine Learning (ICML)

What does AI see? Deep segmentation networks discover biomarkers for lung cancer survival

Mar 26, 2019

Stephen Baek, Yusen He, Bryan G. Allen, John M. Buatti, Brian J. Smith, Kristin A. Plichta, Steven N. Seyedin, Maggie Gannon, Katherine R. Cabel, Yusung Kim (+1 more)

Abstract: Non-small-cell lung cancer (NSCLC) represents approximately 80-85% of lung cancer diagnoses and is the leading cause of cancer-related death worldwide. Recent studies indicate that image-based radiomics features from positron emission tomography-computed tomography (PET/CT) images have predictive power on NSCLC outcomes. To this end, easily calculated functional features such as the maximum and the mean of standard uptake value (SUV) and total lesion glycolysis (TLG) are most commonly used for NSCLC prognostication, but their prognostic value remains controversial. Meanwhile, convolutional neural networks (CNNs) are rapidly emerging as a new approach to cancer image analysis, with significantly enhanced predictive power compared to hand-crafted radiomics features. Here we show that a CNN trained to perform the tumor segmentation task, with no information other than physician contours, identifies a rich set of survival-related image features with remarkable prognostic value. In a retrospective study on 96 NSCLC patients before stereotactic-body radiotherapy (SBRT), we found that the CNN segmentation algorithm (U-Net) trained for tumor segmentation in PET/CT images contained features strongly correlated with 2- and 5-year overall and disease-specific survival. The U-Net algorithm has not seen any clinical information (e.g. survival, age, smoking history) other than the images and the corresponding tumor contours provided by physicians. Furthermore, through visualization of the U-Net, we also found convincing evidence that the regions of progression appear to match the regions where the U-Net features identified patterns that predicted a higher likelihood of death. We anticipate our findings will be a starting point for more sophisticated, non-intrusive, patient-specific cancer prognosis determination.

ZerNet: Convolutional Neural Networks on Arbitrary Surfaces via Zernike Local Tangent Space Estimation

Dec 03, 2018

Zhiyu Sun, Jia Lu, Stephen Baek

Abstract: The research community has observed the massive success of convolutional neural networks (CNNs) in visual recognition tasks. Such powerful CNNs, however, do not generalize well to arbitrary-shaped manifold domains. Thus, many visual recognition problems defined on arbitrary manifolds still cannot benefit much from the success of CNNs, if at all. Technical difficulties hindering the generalization of CNNs are rooted in the lack of a canonical grid-like representation, of a notion of consistent orientation, and of a compatible local topology across the domain. Unfortunately, except for a few pioneering works, very little has been studied in this regard. To this end, in this paper, we propose a novel mathematical formulation to extend CNNs onto two-dimensional (2D) manifold domains. More specifically, we approximate a tensor field defined over a manifold using orthogonal basis functions, called Zernike polynomials, on local tangent spaces. We prove that the convolution of two functions can be represented as a simple dot product between Zernike polynomial coefficients. We also prove that a rotation of a convolution kernel equates to a 2 by 2 rotation matrix applied to Zernike polynomial coefficients, which can be critical in manifold domains. As such, the key contribution of this work resides in a concise but rigorous mathematical generalization of the CNN building blocks. Furthermore, compared to the other state-of-the-art methods, our method demonstrates substantially better performance on both classification and regression tasks.
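
The two properties stated in the abstract can be checked numerically on a flat unit disk with a small sketch. It uses four low-order Zernike polynomials in a normalization that makes them orthonormal under (1/pi) times the disk integral, so the integral of the product of two functions reduces to the dot product of their coefficient vectors, and rotating a function only mixes the sine/cosine coefficient pair through a 2 by 2 rotation matrix. The truncation to four basis functions, the flat disk in place of a local tangent-space patch, and all function names are assumptions made for illustration, not the paper's implementation.

```python
import math

# Low-order Zernike polynomials in polar coordinates (r, t), normalized so that
# (1/pi) * integral over the unit disk of Z_i * Z_j is 1 if i == j and 0 otherwise.
ZERNIKE = [
    lambda r, t: 1.0,                                   # piston
    lambda r, t: 2.0 * r * math.sin(t),                 # tilt (sine term)
    lambda r, t: 2.0 * r * math.cos(t),                 # tilt (cosine term)
    lambda r, t: math.sqrt(3.0) * (2.0 * r * r - 1.0),  # defocus
]

def evaluate(coeffs, r, t):
    """Value at (r, t) of the function with the given Zernike coefficients."""
    return sum(c * z(r, t) for c, z in zip(coeffs, ZERNIKE))

def disk_inner_product(a, b, n_r=100, n_t=100):
    """Midpoint-rule estimate of (1/pi) * integral of f_a * f_b over the unit disk."""
    dr, dt = 1.0 / n_r, 2.0 * math.pi / n_t
    total = 0.0
    for i in range(n_r):
        r = (i + 0.5) * dr
        for j in range(n_t):
            t = (j + 0.5) * dt
            total += evaluate(a, r, t) * evaluate(b, r, t) * r
    return total * dr * dt / math.pi

def rotate_coeffs(c, alpha):
    """Coefficients of the function rotated by alpha, i.e. f(r, t - alpha)."""
    cos_a, sin_a = math.cos(alpha), math.sin(alpha)
    return [
        c[0],                         # rotationally symmetric, unchanged
        cos_a * c[1] + sin_a * c[2],  # 2x2 rotation acting on the
        -sin_a * c[1] + cos_a * c[2], # (sine, cosine) coefficient pair
        c[3],                         # rotationally symmetric, unchanged
    ]

a = [0.5, -1.0, 2.0, 0.25]   # hypothetical coefficients of a signal patch
b = [1.0, 0.5, -0.5, 2.0]    # hypothetical coefficients of a kernel
dot = sum(x * y for x, y in zip(a, b))
print(dot, disk_inner_product(a, b))
```

The printed values should agree up to quadrature error, which is why a response at one point can be evaluated from coefficients alone without resampling the patch.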

Deep Spectral Descriptors: Learning the point-wise correspondence metric via Siamese deep neural networks

Jun 25, 2018

Zhiyu Sun, Yusen He, Andrey Gritsenko, Amaury Lendasse, Stephen Baek

Abstract: A robust and informative local shape descriptor plays an important role in mesh registration. In this regard, spectral descriptors based on the spectrum of the Laplace-Beltrami operator have gained considerable attention among researchers over the last decade due to their desirable properties, such as isometry invariance. Despite this, spectral descriptors often fail to give a correct similarity measure for non-isometric cases where the metric distortion between the models is large. Hence, they are in general not suitable for registration problems, except for the special cases in which the models are nearly isometric. In this paper, we investigate a way to develop shape descriptors for non-isometric registration tasks by embedding the spectral shape descriptors into a different metric space where the Euclidean distance between the elements directly indicates the geometric dissimilarity. We design and train a Siamese deep neural network to find such an embedding, where the embedded descriptors are encouraged to rearrange based on geometric similarity. We found that our approach can significantly enhance the performance of conventional spectral descriptors for non-isometric registration tasks and outperforms recent state-of-the-art methods reported in the literature.
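
The training objective of such an embedding can be sketched as follows. A Siamese network applies the same embedding to two descriptors and is trained so that Euclidean distance is small for corresponding points and large otherwise. The abstract does not state which loss the authors use; the sketch below assumes the common contrastive loss, and the names `contrastive_loss` and `margin` are illustrative.

```python
import math

def euclidean(u, v):
    """Euclidean distance between two embedded descriptors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def contrastive_loss(u, v, is_match, margin=1.0):
    """Contrastive loss for one pair of embeddings (assumed loss, see lead-in).

    Matching pairs are penalized by their squared distance; non-matching pairs
    are penalized only while they are closer than `margin`.
    """
    d = euclidean(u, v)
    if is_match:
        return 0.5 * d * d
    return 0.5 * max(0.0, margin - d) ** 2

# Hypothetical 2-D embeddings of three surface points.
p = [0.0, 0.0]
q_near = [0.5, 0.0]
q_far = [3.0, 4.0]

print(contrastive_loss(p, q_near, is_match=True))   # small: the pair is already close
print(contrastive_loss(p, q_near, is_match=False))  # positive: pushed apart up to the margin
print(contrastive_loss(p, q_far, is_match=False))   # zero: already beyond the margin
```

In a full system, the two embeddings would be the outputs of one shared network evaluated on the spectral descriptors of two mesh points, and the loss would be averaged over many sampled pairs.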

* Submitted to Computer-Aided Design

Wall Stress Estimation of Cerebral Aneurysm based on Zernike Convolutional Neural Networks

Jun 20, 2018

Zhiyu Sun, Jia Lu, Stephen Baek

Abstract: Convolutional neural networks (ConvNets) have demonstrated an exceptional capacity to discern visual patterns from digital images and signals. Unfortunately, such powerful ConvNets do not generalize well to arbitrary-shaped manifolds, where the data representation does not fit into a tensor-like grid. Hence, many fields of science and engineering, where data points possess some manifold structure, cannot enjoy the full benefits of the recent advances in ConvNets. The aneurysm wall stress estimation problem introduced in this paper is one of many such problems. The problem is well known to be of paramount clinical importance, yet traditional ConvNets cannot be applied due to the manifold structure of the data, nor do state-of-the-art geometric ConvNets perform well. Motivated by this, we propose a new geometric ConvNet method named ZerNet, which builds upon our novel mathematical generalization of convolution and pooling operations on manifolds. Our study shows that ZerNet outperforms the other state-of-the-art geometric ConvNets in terms of accuracy.

* 10 pages
