"Topic": models, code, and papers

Convolutional Neural Networks over Tree Structures for Programming Language Processing

Dec 08, 2015
Lili Mou, Ge Li, Lu Zhang, Tao Wang, Zhi Jin

Programming language processing (similar to natural language processing) is a hot research topic in the field of software engineering; it has also aroused growing interest in the artificial intelligence community. However, different from a natural language sentence, a program contains rich, explicit, and complicated structural information. Hence, traditional NLP models may be inappropriate for programs. In this paper, we propose a novel tree-based convolutional neural network (TBCNN) for programming language processing, in which a convolution kernel is designed over programs' abstract syntax trees to capture structural information. TBCNN is a generic architecture for programming language processing; our experiments show its effectiveness in two different program analysis tasks: classifying programs according to functionality, and detecting code snippets of certain patterns. TBCNN outperforms baseline methods, including several neural models for NLP.

* Accepted at AAAI-16 
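
As a rough illustration of what "a convolution kernel over abstract syntax trees" slides across, the sketch below enumerates parent-plus-children windows of an AST. It is not the TBCNN model: it uses Python's standard `ast` module as a stand-in parser, has no learned weights or pooling, and the helper name `child_windows` is hypothetical.

```python
import ast


def child_windows(source: str):
    """Return one (parent_type, child_types) window per AST node that has children.

    Each window is the local tree neighbourhood that a tree-based kernel
    would look at; here we only record node type names, with no weights.
    """
    tree = ast.parse(source)
    windows = []
    for node in ast.walk(tree):
        # Direct children only, in field order (a depth-2 window).
        children = tuple(type(c).__name__ for c in ast.iter_child_nodes(node))
        if children:
            windows.append((type(node).__name__, children))
    return windows


if __name__ == "__main__":
    for parent, children in child_windows("x = 1"):
        print(parent, "->", children)
```

In a real model each node type would map to a learned vector and each window would be combined by trainable weights; the enumeration above only shows where such a kernel would be applied.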


From End-User's Requirements to Web Services Retrieval: A Semantic and Intention-Driven Approach

Apr 23, 2015
Isabelle Mirbel, Pierre Crescenzo

In this paper, we present SATIS, a framework to derive Web Service specifications from end-user's requirements in order to operationalise business processes in the context of a specific application domain. The aim of SATIS is to provide neuroscientists, who are not familiar with computer science, with a complete solution to easily find a set of Web Services to implement an image processing pipeline. More precisely, our framework offers the capability to capture high-level end-user's requirements in an iterative and incremental way and to turn them into queries to retrieve Web Service descriptions. The whole framework relies on reusable and combinable elements which can be shared by a community of users with common interests or problems on a given topic. In our approach, we adopt Semantic Web languages and models as a unified framework to deal with end-user's requirements and Web Service descriptions in order to take advantage of their reasoning and traceability capabilities.

* Also research report I3S/RR--2010-03--FR, in Computational Materials Science (2015). arXiv admin note: substantial text overlap with arXiv:1502.06735


Sentiment Analysis based on User Tag for Traditional Chinese Medicine in Weibo

Oct 13, 2014
Junhui Shen, Peiyan Zhu, Rui Fan, Wei Tan

With the acceptance of Western culture and science, Traditional Chinese Medicine (TCM) has become a controversial issue in China. It is therefore important to study the public's sentiment and opinion on TCM. The rapid development of online social networks, such as Twitter, makes it convenient and efficient to sample hundreds of millions of people for such a sentiment study. To the best of our knowledge, the present work is the first attempt to apply sentiment analysis to the domain of TCM on Sina Weibo (a Twitter-like microblogging service in China). First, we collect tweets about TCM from Sina Weibo and automatically label them as supporting or opposing TCM based on user tags. Then, a support vector machine classifier is built to predict the sentiment of unlabeled TCM tweets. Finally, we present a method to adjust the classifier result. Our method attains an F-measure of 97%.

* 7 pages, 8 figures, 3 tables


MANCaLog: A Logic for Multi-Attribute Network Cascades (Technical Report)

Jan 19, 2013
Paulo Shakarian, Gerardo I. Simari, Robert Schroeder

The modeling of cascade processes in multi-agent systems in the form of complex networks has in recent years become an important topic of study due to its many applications: the adoption of commercial products, the spread of disease, the diffusion of an idea, etc. In this paper, we begin by identifying seven desiderata that a framework for modeling such processes should satisfy: the ability to represent attributes of both nodes and edges, an explicit representation of time, the ability to represent non-Markovian temporal relationships, representation of uncertain information, the ability to represent competing cascades, allowance of non-monotonic diffusion, and computational tractability. We then present the MANCaLog language, a formalism based on logic programming that satisfies all these desiderata, and focus on algorithms for finding minimal models (from which the outcome of cascades can be obtained) as well as how this formalism can be applied in real world scenarios. We are not aware of any other formalism in the literature that meets all of the above requirements.


Collaborative Filtering and the Missing at Random Assumption

Jun 20, 2012
Benjamin Marlin, Richard S. Zemel, Sam Roweis, Malcolm Slaney

Rating prediction is an important application, and a popular research topic in collaborative filtering. However, both the validity of learning algorithms, and the validity of standard testing procedures rest on the assumption that missing ratings are missing at random (MAR). In this paper we present the results of a user study in which we collect a random sample of ratings from current users of an online radio service. An analysis of the rating data collected in the study shows that the sample of random ratings has markedly different properties than ratings of user-selected songs. When asked to report on their own rating behaviour, a large number of users indicate they believe their opinion of a song does affect whether they choose to rate that song, a violation of the MAR condition. Finally, we present experimental results showing that incorporating an explicit model of the missing data mechanism can lead to significant improvements in prediction performance on the random sample of ratings.

* Appears in Proceedings of the Twenty-Third Conference on Uncertainty in Artificial Intelligence (UAI2007) 
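
To make the missing-at-random (MAR) condition concrete, the toy simulation below compares a uniformly sampled set of ratings with a set where the chance of rating a song grows with how much the user likes it. All numbers here (the uniform 1 to 5 ratings, the observation probabilities, the function name `simulate_ratings`) are illustrative assumptions and are not taken from the study.

```python
import random


def simulate_ratings(n_ratings=20000, seed=0):
    """Return (true_mean, random_sample_mean, self_selected_mean) for toy ratings."""
    rng = random.Random(seed)
    # Hypothetical "true" opinions: uniform ratings on a 1-5 scale.
    true_ratings = [rng.randint(1, 5) for _ in range(n_ratings)]

    # Random sample: every rating is observed with the same probability,
    # so missingness does not depend on the rating value.
    random_sample = [r for r in true_ratings if rng.random() < 0.3]

    # Self-selected sample: assumed probabilities that rise with the rating,
    # so missingness depends on the value itself (a violation of MAR).
    p_observe = {1: 0.05, 2: 0.10, 3: 0.20, 4: 0.40, 5: 0.60}
    self_selected = [r for r in true_ratings if rng.random() < p_observe[r]]

    def mean(xs):
        return sum(xs) / len(xs)

    return mean(true_ratings), mean(random_sample), mean(self_selected)


if __name__ == "__main__":
    true_mean, random_mean, selected_mean = simulate_ratings()
    print(f"true={true_mean:.2f} random={random_mean:.2f} self-selected={selected_mean:.2f}")
```

Under these assumptions the random sample tracks the true mean while the self-selected sample is skewed upward, which is the kind of gap that makes evaluation on user-selected ratings unreliable.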


Métodos para la Selección y el Ajuste de Características en el Problema de la Detección de Spam

Oct 14, 2010
Carlos M. Lorenzetti, Rocío L. Cecchini, Ana G. Maguitman, András A. Benczúr

Email is used daily by millions of people to communicate around the globe and it is a mission-critical application for many businesses. Over the last decade, unsolicited bulk email has become a major problem for email users. An overwhelming amount of spam is flowing into users' mailboxes daily. In 2004, an estimated 62% of all email was attributed to spam. Spam is not only frustrating for most email users, it also strains the IT infrastructure of organizations and costs businesses billions of dollars in lost productivity. In recent years, spam has evolved from an annoyance into a serious security threat, and is now a prime medium for phishing of sensitive information, as well as the spread of malicious software. This work presents a first approach to attack the spam problem. We propose an algorithm that improves a classifier's results by adjusting its training set data. It improves the document's vocabulary representation by detecting good topic descriptors and discriminators.

* Workshop de Investigadores en Ciencias de la Computación, WICC 2010, El Calafate, Santa Cruz, Argentina
* 5 pages, 1 figure, Workshop de Investigadores en Ciencias de la Computación, WICC 2010, pp 48-52


Modeling the Dynamics of Social Networks

May 24, 2006
Victor V. Kryssanov, Frank J. Rinaldo, Evgeny L. Kuleshov, Hitoshi Ogawa

Modeling the human dynamics responsible for the formation and evolution of so-called social networks (structures comprised of individuals or organizations that indicate the connectivities existing in a community) is a topic that has recently attracted significant research interest. It has been claimed that these dynamics are scale-free in many practically important cases, such as impersonal and personal communication, auctioning in a market, accessing sites on the WWW, etc., and that human response times thus conform to the power law. While a certain amount of progress has recently been achieved in predicting the general response rate of a human population, existing formal theories of human behavior can hardly accommodate and comprehensively explain the scaling observed in social networks. In the presented study, a novel system-theoretic modeling approach is proposed and successfully applied to determine important characteristics of a communication network and to analyze consumer behavior on the WWW.

* 8 pages, 3 figures. Preprint (as of May 24, 2006) 


ACM -- Attribute Conditioning for Abstractive Multi Document Summarization

May 09, 2022
Aiswarya Sankar, Ankit Chadha

Abstractive multi document summarization has evolved as a task through the basic sequence to sequence approaches to transformer and graph based techniques. Each of these approaches has primarily focused on the issues of multi document information synthesis and attention based approaches to extract salient information. A challenge that arises with multi document summarization which is not prevalent in single document summarization is the need to effectively summarize multiple documents that might have conflicting polarity, sentiment or subjective information about a given topic. In this paper we propose ACM, attribute conditioned multi document summarization, a model that incorporates attribute conditioning modules in order to decouple conflicting information by conditioning for a certain attribute in the output summary. This approach shows strong gains in ROUGE score over baseline multi document summarization approaches and shows gains in fluency, informativeness and reduction in repetitiveness as shown through a human annotation analysis study.


Action Languages Based Actual Causality in Ethical Decision Making Contexts

May 05, 2022
Camilo Sarmiento, Gauvain Bourgne, Daniele Cavalli, Katsumi Inoue, Jean-Gabriel Ganascia

Moral responsibility is closely intermixed with causality, even if it cannot be reduced to it. Besides, rationally understanding the evolution of the physical world is inherently linked with the idea of causality. It follows that decision making applications based on automated planning, especially if they integrate references to ethical norms, inevitably have to deal with causality. Despite these considerations, much of the work in computational ethics relegates causality to the background, if not ignores it completely. This paper's contribution is twofold. The first is to link up two research topics (automated planning and causality) by proposing an actual causation definition suitable for action languages. This definition is a formalisation of Wright's NESS test of causation. The second is to link up computational ethics and causality by showing the importance of causality in the simulation of ethical reasoning and by enabling the domain to deal with situations that were previously out of reach, thanks to the actual causation definition proposed.

* 19 pages, 5 figures 


Leveraging Unlabeled Data for Sketch-based Understanding

Apr 26, 2022
Javier Morales, Nils Murrugarra-Llerena, Jose M. Saavedra

Sketch-based understanding is a critical component of human cognitive learning and is a primitive communication means between humans. This topic has recently attracted the interest of the computer vision community as sketching represents a powerful tool to express static objects and dynamic scenes. Unfortunately, despite its broad application domains, the current sketch-based models strongly rely on labels for supervised training, ignoring knowledge from unlabeled data, thus limiting the underlying generalization and the applicability. Therefore, we present a study about the use of unlabeled data to improve a sketch-based model. To this end, we evaluate variations of VAE and semi-supervised VAE, and present an extension of BYOL to deal with sketches. Our results show the superiority of sketch-BYOL, which outperforms other self-supervised approaches increasing the retrieval performance for known and unknown categories. Furthermore, we show how other tasks can benefit from our proposal.

* SketchDL at CVPR 2022 
