We suggest to employ techniques from Natural Language Processing (NLP) and Knowledge Representation (KR) to transform existing documents into documents amenable for the Semantic Web. Semantic Web documents have at least part of their semantics and pragmatics marked up explicitly in both a machine processable as well as human readable manner. XML and its related standards (XSLT, RDF, Topic Maps etc.) are the unifying platform for the tools and methodologies developed for different application scenarios.
Adversarial robustness studies the worst-case performance of a machine learning model to ensure safety and reliability. With the proliferation of deep-learning based technology, the potential risks associated with model development and deployment can be amplified and become dreadful vulnerabilities. This paper provides a comprehensive overview of research topics and foundational principles of research methods for adversarial robustness of deep learning models, including attacks, defenses, verification, and novel applications.
We present a survey covering the state of the art in low-resource machine translation. There are currently around 7000 languages spoken in the world and almost all language pairs lack significant resources for training machine translation models. There has been increasing interest in research addressing the challenge of producing useful translation models when very little translated training data is available. We present a high level summary of this topical field and provide an overview of best practices.
Density estimation is an interdisciplinary topic at the intersection of statistics, theoretical computer science and machine learning. We review some old and new techniques for bounding the sample complexity of estimating densities of continuous distributions, focusing on the class of mixtures of Gaussians and its subclasses. In particular, we review the main techniques used to prove the new sample complexity bounds for mixtures of Gaussians by Ashtiani, Ben-David, Harvey, Liaw, Mehrabian, and Plan arXiv:1710.05209.
Compression of Neural Networks (NN) has become a highly studied topic in recent years. The main reason for this is the demand for industrial scale usage of NNs such as deploying them on mobile devices, storing them efficiently, transmitting them via band-limited channels and most importantly doing inference at scale. In this work, we propose to join the Soft-Weight Sharing and Variational Dropout approaches that show strong results to define a new state-of-the-art in terms of model compression.
Scientific publishing conveys the outputs of an academic or research activity, in this sense; it also reflects the efforts and issues in which people engage. To identify potential collaborative networks one of the simplest approaches is to leverage the co-authorship relations. In this approach, semantic and hierarchic relationships defined by a Knowledge Organization System are used in order to improve the system's ability to recommend potential networks beyond the lexical or syntactic analysis of the topics or concepts that are of interest to academics.
In this paper, we demonstrate the importance of coreference resolution for natural language processing on the example of the TAC Slot Filling shared task. We illustrate the strengths and weaknesses of automatic coreference resolution systems and provide experimental results to show that they improve performance in the slot filling end-to-end setting. Finally, we publish KBPchains, a resource containing automatically extracted coreference chains from the TAC source corpus in order to support other researchers working on this topic.
This paper presents a system which learns to answer questions on a broad range of topics from a knowledge base using few hand-crafted features. Our model learns low-dimensional embeddings of words and knowledge base constituents; these representations are used to score natural language questions against candidate answers. Training our system using pairs of questions and structured representations of their answers, and pairs of question paraphrases, yields competitive results on a competitive benchmark of the literature.
Evidential reasoning is now a leading topic in Artificial Intelligence. Evidence is represented by a variety of evidential functions. Evidential reasoning is carried out by certain kinds of fundamental operation on these functions. This paper discusses two of the basic operations on evidential functions, the discount operation and the well-known orthogonal sum operation. We show that the discount operation is not commutative with the orthogonal sum operation, and derive expressions for the two operations applied to the various evidential function.
A review of Bayesian restoration of digital images based on Monte Carlo techniques is presented. The topics covered include Likelihood, Prior and Posterior distributions, Poisson, Binay symmetric channel, and Gaussian channel models of Likelihood distribution,Ising and Potts spin models of Prior distribution, restoration of an image through Posterior maximization, statistical estimation of a true image from Posterior ensembles, Markov Chain Monte Carlo methods and cluster algorithms.