Models, code, and papers for "deep learning":

Deep learning observables in computational fluid dynamics

Mar 07, 2019
Kjetil O. Lye, Siddhartha Mishra, Deep Ray

Many large-scale problems in computational fluid dynamics such as uncertainty quantification, Bayesian inversion, data assimilation and PDE-constrained optimization are considered very challenging computationally as they require a large number of expensive (forward) numerical solutions of the corresponding PDEs. We propose a machine learning algorithm, based on deep artificial neural networks, that learns the underlying input parameters to observable map from a few training samples (computed realizations of this map). By a judicious combination of theoretical arguments and empirical observations, we find suitable network architectures and training hyperparameters that result in robust and efficient neural network approximations of the parameters to observable map. Numerical experiments for realistic high-dimensional test problems demonstrate that even with approximately 100 training samples, the resulting neural networks have a prediction error of less than one to two percent, at a computational cost which is several orders of magnitude lower than the cost of the underlying PDE solver. Moreover, we combine the proposed deep learning algorithm with Monte Carlo (MC) and Quasi-Monte Carlo (QMC) methods to efficiently compute uncertainty propagation for nonlinear PDEs. Under the assumption that the underlying neural networks generalize well, we prove that the deep learning MC and QMC algorithms are guaranteed to be faster than the baseline (quasi-) Monte Carlo methods. Numerical experiments demonstrating one to two orders of magnitude speed up over baseline QMC and MC algorithms, for the intricate problem of computing probability distributions of the observable, are also presented.
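
The surrogate-plus-Monte-Carlo idea in this abstract can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: `expensive_observable` is a hypothetical one-dimensional stand-in for a costly PDE solve, and a piecewise-linear interpolant is swapped in for the deep neural network so that the example stays dependency-free. It is not the authors' method, only the general pattern of training on about 100 samples and then sampling the cheap surrogate many times.

```python
import math
import random


def expensive_observable(y):
    # Hypothetical stand-in for one expensive PDE solve (parameter y in [0, 1]).
    return math.sin(2.0 * math.pi * y) + y * y


def build_surrogate(n_train=100):
    # "Training": evaluate the expensive map at n_train points only.
    xs = [i / (n_train - 1) for i in range(n_train)]
    ys = [expensive_observable(x) for x in xs]

    def surrogate(y):
        # Piecewise-linear interpolation between the stored training samples
        # (a simple replacement for the neural network in the abstract).
        pos = min(max(y, 0.0), 1.0) * (n_train - 1)
        i = min(int(pos), n_train - 2)
        t = pos - i
        return (1.0 - t) * ys[i] + t * ys[i + 1]

    return surrogate


def monte_carlo_mean(f, n_samples, rng):
    # Plain Monte Carlo estimate of E[f(Y)] for Y uniform on [0, 1].
    return sum(f(rng.random()) for _ in range(n_samples)) / n_samples


rng = random.Random(0)
surrogate = build_surrogate(100)
estimate = monte_carlo_mean(surrogate, 20000, rng)
exact_mean = 1.0 / 3.0  # integral of sin(2*pi*y) + y^2 over [0, 1]
```

The 20,000 Monte Carlo evaluations touch only the surrogate, so the expensive map is called 100 times in total; whether that trade pays off depends on how well the surrogate generalizes, which is the assumption the abstract makes explicit.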


DCIL: Deep Contextual Internal Learning for Image Restoration and Image Retargeting

Dec 09, 2019
Indra Deep Mastan, Shanmuganathan Raman

Recently, there has been vast interest in developing methods which are independent of the training samples such as deep image prior, zero-shot learning, and internal learning. The methods above are based on the common goal of maximizing image feature learning from a single image despite inherent technical diversity. In this work, we bridge the gap between the various unsupervised approaches above and propose a general framework for image restoration and image retargeting. We use contextual feature learning and internal learning to improve the structure similarity between the source and the target images. We perform image resize application in the following setups: classical image resize using super-resolution, a challenging image resize where the low-resolution image contains noise, and content-aware image resize using image retargeting. We also provide comparisons to the relevant state-of-the-art methods.


Multi-level Encoder-Decoder Architectures for Image Restoration

May 06, 2019
Indra Deep Mastan, Shanmuganathan Raman

Many real-world solutions for image restoration are learning-free and based on handcrafted image priors such as self-similarity. Recently, deep-learning methods that use training data have achieved state-of-the-art results in various image restoration tasks (e.g., super-resolution and inpainting). Ulyanov et al. bridge the gap between these two families of methods (CVPR 18). They have shown that learning-free methods perform close to the state-of-the-art learning-based methods (approximately 1 PSNR). Their approach benefits from the encoder-decoder network. In this paper, we propose a framework based on the multi-level extensions of the encoder-decoder network, to investigate interesting aspects of the relationship between image restoration and network construction independent of learning. Our framework allows various network structures by modifying the following network components: skip links, cascading of the network input into intermediate layers, a composition of the encoder-decoder subnetworks, and network depth. These handcrafted network structures illustrate how the construction of untrained networks influences the following image restoration tasks: denoising, super-resolution, and inpainting. We also demonstrate image reconstruction using flash and no-flash image pairs. We provide performance comparisons with the state-of-the-art methods for all the restoration tasks above.

* Accepted in the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) Workshop: "New Trends in Image Restoration and Enhancement workshop (NTIRE) 2019" 

Self-Supervised Relative Depth Learning for Urban Scene Understanding

Apr 02, 2018
Huaizu Jiang, Erik Learned-Miller, Gustav Larsson, Michael Maire, Greg Shakhnarovich

As an agent moves through the world, the apparent motion of scene elements is (usually) inversely proportional to their depth. It is natural for a learning agent to associate image patterns with the magnitude of their displacement over time: as the agent moves, faraway mountains don't move much; nearby trees move a lot. This natural relationship between the appearance of objects and their motion is a rich source of information about the world. In this work, we start by training a deep network, using fully automatic supervision, to predict relative scene depth from single images. The relative depth training images are automatically derived from simple videos of cars moving through a scene, using recent motion segmentation techniques, and no human-provided labels. This proxy task of predicting relative depth from a single image induces features in the network that result in large improvements in a set of downstream tasks including semantic segmentation, joint road segmentation and car detection, and monocular (absolute) depth estimation, over a network trained from scratch. The improvement on the semantic segmentation task is greater than those produced by any other automatically supervised methods. Moreover, for monocular depth estimation, our unsupervised pre-training method even outperforms supervised pre-training with ImageNet. In addition, we demonstrate benefits from learning to predict (unsupervised) relative depth in the specific videos associated with various downstream tasks. We adapt to the specific scenes in those tasks in an unsupervised manner to improve performance. In summary, for semantic segmentation, we present state-of-the-art results among methods that do not use supervised pre-training, and we even exceed the performance of supervised ImageNet pre-trained models for monocular depth estimation, achieving results that are comparable with state-of-the-art methods.
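
The opening observation, that apparent motion is (usually) inversely proportional to depth, can be made concrete with a small worked sketch. It assumes a pinhole camera undergoing pure sideways translation, which is only one simple case; the function names and numbers are hypothetical, and the paper itself derives its training signal from motion segmentation rather than from this formula.

```python
def apparent_shift(focal_px, translation_m, depth_m):
    # Image-plane displacement (in pixels) of a static point at depth_m when a
    # pinhole camera translates sideways by translation_m: shift = f * T / Z.
    return focal_px * translation_m / depth_m


def relative_depth_from_shift(shifts):
    # Depth is proportional to 1 / shift; normalize so the nearest point is 1.
    inverse = [1.0 / s for s in shifts]
    nearest = min(inverse)
    return [v / nearest for v in inverse]


# Hypothetical scene: a tree 5 m away and a mountain 5 km away,
# seen by a camera with a 700 px focal length that moves 0.5 m.
tree_shift = apparent_shift(700.0, 0.5, 5.0)
mountain_shift = apparent_shift(700.0, 0.5, 5000.0)
relative = relative_depth_from_shift([tree_shift, mountain_shift])
```

Under these assumptions the tree moves by tens of pixels while the mountain moves by a small fraction of a pixel, and only the ratio of depths is recovered, which is why the proxy task is relative rather than absolute depth.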


One-to-many face recognition with bilinear CNNs

Mar 28, 2016
Aruni RoyChowdhury, Tsung-Yu Lin, Subhransu Maji, Erik Learned-Miller

The recent explosive growth in convolutional neural network (CNN) research has produced a variety of new architectures for deep learning. One intriguing new architecture is the bilinear CNN (B-CNN), which has shown dramatic performance gains on certain fine-grained recognition problems [15]. We apply this new CNN to the challenging new face recognition benchmark, the IARPA Janus Benchmark A (IJB-A) [12]. It features faces from a large number of identities in challenging real-world conditions. Because the face images were not identified automatically using a computerized face detection system, it does not have the bias inherent in such a database. We demonstrate the performance of the B-CNN model beginning from an AlexNet-style network pre-trained on ImageNet. We then show results for fine-tuning using a moderate-sized and public external database, FaceScrub [17]. We also present results with additional fine-tuning on the limited training data provided by the protocol. In each case, the fine-tuned bilinear model shows substantial improvements over the standard CNN. Finally, we demonstrate how a standard CNN pre-trained on a large face database, the recently released VGG-Face model [20], can be converted into a B-CNN without any additional feature training. This B-CNN improves upon the CNN performance on the IJB-A benchmark, achieving 89.5% rank-1 recall.

* Published version at WACV 2016 

Why & When Deep Learning Works: Looking Inside Deep Learnings

May 10, 2017
Ronny Ronen

The Intel Collaborative Research Institute for Computational Intelligence (ICRI-CI) has been heavily supporting Machine Learning and Deep Learning research since its foundation in 2012. We have asked six leading ICRI-CI Deep Learning researchers to address the challenge of "Why & When Deep Learning works", with the goal of looking inside Deep Learning, providing insights on how deep networks function, and uncovering key observations on their expressiveness, limitations, and potential. This challenge resulted in five papers that address different facets of deep learning. These different facets include a high-level understanding of why and when deep networks work (and do not work), the impact of geometry on the expressiveness of deep networks, and making deep networks interpretable.

* This paper is the preface part of the "Why & When Deep Learning works looking inside Deep Learning" ICRI-CI paper bundle 

EVDodge: Embodied AI For High-Speed Dodging On A Quadrotor Using Event Cameras

Jun 07, 2019
Nitin J. Sanket, Chethan M. Parameshwara, Chahat Deep Singh, Ashwin V. Kuruttukulam, Cornelia Fermüller, Davide Scaramuzza, Yiannis Aloimonos

The human fascination with understanding ultra-efficient agile flying beings like birds and bees has propelled decades of research on trying to solve the problem of obstacle avoidance on micro aerial robots. However, most of the prior research has focused on static obstacle avoidance. This is due to the lack of high-speed visual sensors and scalable visual algorithms. The last decade has seen an exponential growth of neuromorphic sensors which are inspired by nature and have the potential to be the de facto standard for visual motion estimation problems. After re-imagining the navigation stack of a micro air vehicle as a series of hierarchical competences, we develop a purposive artificial intelligence based formulation for the problem of general navigation. We call this AI framework "Embodied AI" - AI design based on the knowledge of agent's hardware limitations and timing/computation constraints. Following this design philosophy we develop a complete AI navigation stack for dodging multiple dynamic obstacles on a quadrotor with a monocular event camera and computation. We also present an approach to directly transfer the shallow neural networks trained in simulation to the real world by subsuming pre-processing using a neural network into the pipeline. We successfully evaluate and demonstrate the proposed approach in many real-world experiments with obstacles of different shapes and sizes, achieving an overall success rate of 70% including objects of unknown shape and a low light testing scenario. To our knowledge, this is the first deep learning based solution to the problem of dynamic obstacle avoidance using event cameras on a quadrotor. Finally, we also extend our work to the pursuit task by merely reversing the control policy, proving that our navigation stack can cater to different scenarios.

* 10 pages, 11 figures, Supplementary can be found at: https://prg.cs.umd.edu/EVDodge 

Related Tasks can Share! A Multi-task Framework for Affective language

Feb 06, 2020
Kumar Shikhar Deep, Md Shad Akhtar, Asif Ekbal, Pushpak Bhattacharyya

Expressing the polarity of sentiment as 'positive' and 'negative' usually has limited scope compared with the intensity/degree of polarity. These two tasks (i.e. sentiment classification and sentiment intensity prediction) are closely related and may offer assistance to each other during the learning process. In this paper, we propose to leverage the relatedness of multiple tasks in a multi-task learning framework. Our multi-task model is based on a convolutional-Gated Recurrent Unit (GRU) framework, which is further assisted by a diverse hand-crafted feature set. Evaluation and analysis suggest that joint-learning of the related tasks in a multi-task framework can outperform each of the individual tasks in the single-task frameworks.

* 12 pages, 3 figures and 3 tables. Accepted in 20th International Conference on Intelligent Text Processing and Computational Linguistics, CICLing 2019. To be published in Springer LNCS volume 

Deep Meta-Learning: Learning to Learn in the Concept Space

Feb 10, 2018
Fengwei Zhou, Bin Wu, Zhenguo Li

Few-shot learning remains challenging for meta-learning that learns a learning algorithm (meta-learner) from many related tasks. In this work, we argue that this is due to the lack of a good representation for meta-learning, and propose deep meta-learning to integrate the representation power of deep learning into meta-learning. The framework is composed of three modules, a concept generator, a meta-learner, and a concept discriminator, which are learned jointly. The concept generator, e.g. a deep residual net, extracts a representation for each instance that captures its high-level concept, on which the meta-learner performs few-shot learning, and the concept discriminator recognizes the concepts. By learning to learn in the concept space rather than in the complicated instance space, deep meta-learning can substantially improve vanilla meta-learning, which is demonstrated on various few-shot image recognition problems. For example, on 5-way-1-shot image recognition on CIFAR-100 and CUB-200, it improves Matching Nets from 50.53% and 56.53% to 58.18% and 63.47%, improves MAML from 49.28% and 50.45% to 56.65% and 64.63%, and improves Meta-SGD from 53.83% and 53.34% to 61.62% and 66.95%, respectively.


What Really is Deep Learning Doing?

Nov 06, 2017
Chuyu Xiong

Deep learning has achieved great success in many areas, from computer vision to natural language processing, to game playing, and much more. Yet, what deep learning is really doing is still an open question. There is a lot of work in this direction. For example, [5] tried to explain deep learning by group renormalization, and [6] tried to explain deep learning from the view of functional approximation. In order to address this very crucial question, here we see deep learning from the perspective of mechanical learning and learning machine (see [1], [2]). From this particular angle, we can see deep learning much better and answer with confidence: what is deep learning really doing, why does it work well, how does it work, and how much data is necessary for learning? We will also discuss advantages and disadvantages of deep learning at the end of this work.


Deep learning research landscape & roadmap in a nutshell: past, present and future -- Towards deep cortical learning

Jul 30, 2019
Aras R. Dargazany

The past, present and future of deep learning are presented in this work. Given this landscape & roadmap, we predict that deep cortical learning will be the convergence of deep learning & cortical learning which builds an artificial cortical column ultimately.


How deep learning works -- The geometry of deep learning

Oct 30, 2017
Xiao Dong, Jiasong Wu, Ling Zhou

Why and how deep learning works well on different tasks remains a mystery from a theoretical perspective. In this paper we draw a geometric picture of the deep learning system by finding its analogies with two existing geometric structures, the geometry of quantum computations and the geometry of the diffeomorphic template matching. In this framework, we give the geometric structures of different deep learning systems including convolutional neural networks, residual networks, recursive neural networks, recurrent neural networks and the equilibrium propagation framework. We can also analyze the relationship between the geometrical structures of different networks and their performance at an algorithmic level, so that the geometric framework may guide the design of the structures and algorithms of deep learning systems.

* 16 pages, 13 figures 

Simultaneous Detection of Multiple Appliances from Smart-meter Measurements via Multi-Label Consistent Deep Dictionary Learning and Deep Transform Learning

Dec 11, 2019
Vanika Singhal, Jyoti Maggu, Angshul Majumdar

Currently there are several well-known approaches to non-intrusive appliance load monitoring: rule based, stochastic finite state machines, neural networks and sparse coding. Recently several studies have proposed a new approach based on multi label classification. Different appliances are treated as separate classes, and the task is to identify the classes given the aggregate smart-meter reading. Prior studies in this area have used off the shelf algorithms like MLKNN and RAKEL to address this problem. In this work, we propose a deep learning based technique. There are hardly any studies in deep learning based multi label classification; two new deep learning techniques to solve the said problem are fundamental contributions of this work. These are deep dictionary learning and deep transform learning. Thorough experimental results on benchmark datasets show marked improvement over existing studies.

* Final paper accepted at IEEE Transactions on Smart Grid 

Online Deep Learning: Learning Deep Neural Networks on the Fly

Nov 10, 2017
Doyen Sahoo, Quang Pham, Jing Lu, Steven C. H. Hoi

Deep Neural Networks (DNNs) are typically trained by backpropagation in a batch learning setting, which requires the entire training data to be made available prior to the learning task. This is not scalable for many real-world scenarios where new data arrives sequentially in a stream form. We aim to address an open challenge of "Online Deep Learning" (ODL) for learning DNNs on the fly in an online setting. Unlike traditional online learning that often optimizes some convex objective function with respect to a shallow model (e.g., a linear/kernel-based hypothesis), ODL is significantly more challenging since the optimization of the DNN objective function is non-convex, and regular backpropagation does not work well in practice, especially for online learning settings. In this paper, we present a new online deep learning framework that attempts to tackle the challenges by learning DNN models of adaptive depth from a sequence of training data in an online learning setting. In particular, we propose a novel Hedge Backpropagation (HBP) method for online updating the parameters of DNN effectively, and validate the efficacy of our method on large-scale data sets, including both stationary and concept drifting scenarios.
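
The abstract does not spell out Hedge Backpropagation, so the sketch below shows only the classic Hedge (multiplicative weights) update that the name refers to, applied to a hypothetical set of three predictors standing in for classifiers attached at different network depths. The function names, the loss values, and the discount `beta` are assumptions for illustration and should not be read as the paper's exact procedure.

```python
def hedge_update(weights, losses, beta=0.9):
    # Classic Hedge step: discount each weight by beta ** loss, then renormalize.
    # Predictors that incur a larger loss lose weight faster.
    scaled = [w * (beta ** loss) for w, loss in zip(weights, losses)]
    total = sum(scaled)
    return [s / total for s in scaled]


def combined_prediction(weights, predictions):
    # The ensemble output is the weight-averaged prediction of all predictors.
    return sum(w * p for w, p in zip(weights, predictions))


# Hypothetical stream: three predictors (say, shallow, medium, deep) whose
# per-round losses in [0, 1] stay fixed for simplicity.
weights = [1.0 / 3.0] * 3
round_losses = [0.9, 0.5, 0.1]
for _ in range(20):
    weights = hedge_update(weights, round_losses)

blended = combined_prediction(weights, [0.2, 0.6, 0.9])
```

Because the weights are updated one example at a time, the ensemble can shift toward whichever depth currently performs best, which is one way to read the abstract's phrase "DNN models of adaptive depth".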


Opening the black box of deep learning

May 22, 2018
Dian Lei, Xiaoxiao Chen, Jianfei Zhao

The great success of deep learning shows that its technology contains profound truth, and understanding its internal mechanism not only has important implications for the development of its technology and effective application in various fields, but also provides meaningful insights into the understanding of human brain mechanism. At present, most of the theoretical research on deep learning is based on mathematics. This dissertation proposes that the neural network of deep learning is a physical system, examines deep learning from three different perspectives (microscopic, macroscopic, and physical world views), and answers multiple theoretical puzzles in deep learning by using physics principles. For example, from the perspective of quantum mechanics and statistical physics, this dissertation presents the calculation methods for convolution calculation, pooling, normalization, and Restricted Boltzmann Machine, as well as the selection of cost functions, explains why deep learning must be deep, what characteristics are learned in deep learning, why Convolutional Neural Networks do not have to be trained layer by layer, and the limitations of deep learning, etc., and proposes the theoretical direction and basis for the further development of deep learning now and in the future. The brilliance of physics flashes in deep learning, and we try to establish deep learning technology on the scientific theory of physics.


In-Machine-Learning Database: Reimagining Deep Learning with Old-School SQL

Apr 14, 2020
Len Du

In-database machine learning has been very popular, almost to the point of being a cliche. However, can we do it the other way around? In this work, we say "yes" by applying plain old SQL to deep learning, in a sense implementing deep learning algorithms with SQL. Most deep learning frameworks, as well as generic machine learning ones, share a de facto standard of multidimensional array operations, underneath fancier infrastructure such as automatic differentiation. As SQL tables can be regarded as generalisations of (multi-dimensional) arrays, we have found a way to express common deep learning operations in SQL, encouraging a different way of thinking and thus potentially novel models. In particular, one of the latest trends in deep learning was the introduction of sparsity in the name of graph convolutional networks, whereas we take sparsity almost for granted in the database world. As both databases and machine learning involve transformation of datasets, we hope this work can inspire further works utilizing the large body of existing wisdom, algorithms and technologies in the database field to advance the state of the art in machine learning, rather than merely integrating machine learning into databases.
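
As a rough illustration of how an array operation can be phrased in SQL, the sketch below stores two small matrices as coordinate tables and computes a ReLU-activated matrix product with a join and a `GROUP BY`. The table layout (`a`, `b` with index columns and a value column `v`) is an assumption chosen for this example and is not taken from the paper; Python's standard `sqlite3` module is used only as a convenient way to run the query.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE a (i INTEGER, k INTEGER, v REAL);  -- entry A[i][k]
    CREATE TABLE b (k INTEGER, j INTEGER, v REAL);  -- entry B[k][j]
    """
)

# A = [[1, 2], [0, -3]] and B = [[4, 0], [5, 6]]; zero entries are simply
# not stored, so sparsity comes for free in this representation.
conn.executemany("INSERT INTO a VALUES (?, ?, ?)",
                 [(0, 0, 1.0), (0, 1, 2.0), (1, 1, -3.0)])
conn.executemany("INSERT INTO b VALUES (?, ?, ?)",
                 [(0, 0, 4.0), (1, 0, 5.0), (1, 1, 6.0)])

# Matrix product as a join on the shared index k plus a sum per (i, j);
# the two-argument scalar MAX clamps negatives to zero, i.e. a ReLU.
rows = conn.execute(
    """
    SELECT a.i, b.j, MAX(SUM(a.v * b.v), 0.0) AS v
    FROM a JOIN b ON a.k = b.k
    GROUP BY a.i, b.j
    ORDER BY a.i, b.j
    """
).fetchall()
conn.close()
```

Here `rows` holds one `(i, j, value)` tuple per output entry that received at least one contribution, so the result is again a coordinate table that could feed the next layer.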


Discriminative Learning via Adaptive Questioning

Apr 11, 2020
Achal Bassamboo, Vikas Deep, Sandeep Juneja, Assaf Zeevi

We consider the problem of designing an adaptive sequence of questions that optimally classify a candidate's ability into one of several categories or discriminative grades. A candidate's ability is modeled as an unknown parameter, which, together with the difficulty of the question asked, determines the likelihood with which s/he is able to answer a question correctly. The learning algorithm is only able to observe these noisy responses to its queries. We consider this problem from a fixed confidence-based $\delta$-correct framework, that in our setting seeks to arrive at the correct ability discrimination at the fastest possible rate while guaranteeing that the probability of error is less than a pre-specified and small $\delta$. In this setting we develop lower bounds on any sequential questioning strategy and develop geometrical insights into the problem structure both from primal and dual formulation. In addition, we arrive at algorithms that essentially match these lower bounds. Our key conclusions are that, asymptotically, any candidate needs to be asked questions at most at two (candidate ability-specific) levels, although, in a reasonably general framework, questions need to be asked only at a single level. Further, and interestingly, the problem structure facilitates endogenous exploration, so there is no need for a separately designed exploration stage in the algorithm.

* 3 figures 

Deep Reinforcement Learning for Conversational AI

Sep 15, 2017
Mahipal Jadeja, Neelanshi Varia, Agam Shah

Deep reinforcement learning is revolutionizing the artificial intelligence field. Currently, it serves as a good starting point for constructing intelligent autonomous systems which offer a better knowledge of the visual world. It is possible to scale deep reinforcement learning with the use of deep learning and do amazing tasks such as use of pixels in playing video games. In this paper, key concepts of deep reinforcement learning including reward function, differences between reinforcement learning and supervised learning and models for implementation of reinforcement are discussed. Key challenges related to the implementation of reinforcement learning in conversational AI domain are identified as well as discussed in detail. Various conversational models which are based on deep reinforcement learning (as well as deep learning) are also discussed. In summary, this paper discusses key aspects of deep reinforcement learning which are crucial for designing an efficient conversational AI.

* SCAI'17-Search-Oriented Conversational AI (@ICTIR) 

Recent Advances in Deep Learning: An Overview

Jul 21, 2018
Matiur Rahman Minar, Jibon Naher

Deep Learning is one of the newest trends in Machine Learning and Artificial Intelligence research. It is also one of the most popular scientific research trends nowadays. Deep learning methods have brought revolutionary advances in computer vision and machine learning. New deep learning techniques are regularly being introduced, outperforming state-of-the-art machine learning and even existing deep learning techniques. In recent years, the world has seen many major breakthroughs in this field. Since deep learning is evolving at great speed, it is hard to keep track of the regular advances, especially for new researchers. In this paper, we briefly discuss recent advances in Deep Learning from the past few years.

* 31 pages including bibliography 

Deep Models Under the GAN: Information Leakage from Collaborative Deep Learning

Sep 14, 2017
Briland Hitaj, Giuseppe Ateniese, Fernando Perez-Cruz

Deep Learning has recently become hugely popular in machine learning, providing significant improvements in classification accuracy in the presence of highly-structured and large databases. Researchers have also considered privacy implications of deep learning. Models are typically trained in a centralized manner with all the data being processed by the same training algorithm. If the data is a collection of users' private data, including habits, personal pictures, geographical positions, interests, and more, the centralized server will have access to sensitive information that could potentially be mishandled. To tackle this problem, collaborative deep learning models have recently been proposed where parties locally train their deep learning structures and only share a subset of the parameters in the attempt to keep their respective training sets private. Parameters can also be obfuscated via differential privacy (DP) to make information extraction even more challenging, as proposed by Shokri and Shmatikov at CCS'15. Unfortunately, we show that any privacy-preserving collaborative deep learning is susceptible to a powerful attack that we devise in this paper. In particular, we show that a distributed, federated, or decentralized deep learning approach is fundamentally broken and does not protect the training sets of honest participants. The attack we developed exploits the real-time nature of the learning process that allows the adversary to train a Generative Adversarial Network (GAN) that generates prototypical samples of the targeted training set that was meant to be private (the samples generated by the GAN are intended to come from the same distribution as the training data). Interestingly, we show that record-level DP applied to the shared parameters of the model, as suggested in previous work, is ineffective (i.e., record-level DP is not designed to address our attack).

* ACM CCS'17, 16 pages, 18 figures 
