Get our free extension to see links to code for papers anywhere online!

Chrome logo Add to Chrome

Firefox logo Add to Firefox

"Topic": models, code, and papers

Optimal client recommendation for market makers in illiquid financial products

Apr 27, 2017
Dieter Hendricks, Stephen J. Roberts

The process of liquidity provision in financial markets can result in prolonged exposure to illiquid instruments for market makers. In this case, where a proprietary position is not desired, pro-actively targeting the right client who is likely to be interested can be an effective means to offset this position, rather than relying on commensurate interest arising through natural demand. In this paper, we consider the inference of a client profile for the purpose of corporate bond recommendation, based on typical recorded information available to the market maker. Given a historical record of corporate bond transactions and bond meta-data, we use a topic-modelling analogy to develop a probabilistic technique for compiling a curated list of client recommendations for a particular bond that needs to be traded, ranked by probability of interest. We show that a model based on Latent Dirichlet Allocation offers promising performance to deliver relevant recommendations for sales traders.

* 12 pages, 3 figures, 1 table 

  Access Paper or Ask Questions

Detecting English Writing Styles For Non Native Speakers

Apr 24, 2017
Yanging Chen, Rami Al-Rfou', Yejin Choi

This paper presents the first attempt, up to our knowledge, to classify English writing styles on this scale with the challenge of classifying day to day language written by writers with different backgrounds covering various areas of topics.The paper proposes simple machine learning algorithms and simple to generate features to solve hard problems. Relying on the scale of the data available from large sources of knowledge like Wikipedia. We believe such sources of data are crucial to generate robust solutions for the web with high accuracy and easy to deploy in practice. The paper achieves 74\% accuracy classifying native versus non native speakers writing styles. Moreover, the paper shows some interesting observations on the similarity between different languages measured by the similarity of their users English writing styles. This technique could be used to show some well known facts about languages as in grouping them into families, which our experiments support.

* 9 figures, 5 tables, 9 pages 

  Access Paper or Ask Questions

Learning Contextualized Music Semantics from Tags via a Siamese Network

Jun 07, 2016
Ubai Sandouk, Ke Chen

Music information retrieval faces a challenge in modeling contextualized musical concepts formulated by a set of co-occurring tags. In this paper, we investigate the suitability of our recently proposed approach based on a Siamese neural network in fighting off this challenge. By means of tag features and probabilistic topic models, the network captures contextualized semantics from tags via unsupervised learning. This leads to a distributed semantics space and a potential solution to the out of vocabulary problem which has yet to be sufficiently addressed. We explore the nature of the resultant music-based semantics and address computational needs. We conduct experiments on three public music tag collections -namely, CAL500, MagTag5K and Million Song Dataset- and compare our approach to a number of state-of-the-art semantics learning approaches. Comparative results suggest that this approach outperforms previous approaches in terms of semantic priming and music tag completion.

* 20 pages. To appear in ACM TIST: Intelligent Music Systems and Applications 

  Access Paper or Ask Questions

Robust Kernel Density Estimation by Scaling and Projection in Hilbert Space

Nov 17, 2014
Robert A. Vandermeulen, Clayton D. Scott

While robust parameter estimation has been well studied in parametric density estimation, there has been little investigation into robust density estimation in the nonparametric setting. We present a robust version of the popular kernel density estimator (KDE). As with other estimators, a robust version of the KDE is useful since sample contamination is a common issue with datasets. What "robustness" means for a nonparametric density estimate is not straightforward and is a topic we explore in this paper. To construct a robust KDE we scale the traditional KDE and project it to its nearest weighted KDE in the $L^2$ norm. This yields a scaled and projected KDE (SPKDE). Because the squared $L^2$ norm penalizes point-wise errors superlinearly this causes the weighted KDE to allocate more weight to high density regions. We demonstrate the robustness of the SPKDE with numerical experiments and a consistency result which shows that asymptotically the SPKDE recovers the uncontaminated density under sufficient conditions on the contamination.

* Extended version of NIPS 2014 paper 

  Access Paper or Ask Questions

Appearance Descriptors for Person Re-identification: a Comprehensive Review

Jul 22, 2013
Riccardo Satta

In video-surveillance, person re-identification is the task of recognising whether an individual has already been observed over a network of cameras. Typically, this is achieved by exploiting the clothing appearance, as classical biometric traits like the face are impractical in real-world video surveillance scenarios. Clothing appearance is represented by means of low-level \textit{local} and/or \textit{global} features of the image, usually extracted according to some part-based body model to treat different body parts (e.g. torso and legs) independently. This paper provides a comprehensive review of current approaches to build appearance descriptors for person re-identification. The most relevant techniques are described in detail, and categorised according to the body models and features used. The aim of this work is to provide a structured body of knowledge and a starting point for researchers willing to conduct novel investigations on this challenging topic.

  Access Paper or Ask Questions

HybriDialogue: An Information-Seeking Dialogue Dataset Grounded on Tabular and Textual Data

Apr 28, 2022
Kai Nakamura, Sharon Levy, Yi-Lin Tuan, Wenhu Chen, William Yang Wang

A pressing challenge in current dialogue systems is to successfully converse with users on topics with information distributed across different modalities. Previous work in multiturn dialogue systems has primarily focused on either text or table information. In more realistic scenarios, having a joint understanding of both is critical as knowledge is typically distributed over both unstructured and structured forms. We present a new dialogue dataset, HybriDialogue, which consists of crowdsourced natural conversations grounded on both Wikipedia text and tables. The conversations are created through the decomposition of complex multihop questions into simple, realistic multiturn dialogue interactions. We propose retrieval, system state tracking, and dialogue response generation tasks for our dataset and conduct baseline experiments for each. Our results show that there is still ample opportunity for improvement, demonstrating the importance of building stronger dialogue systems that can reason over the complex setting of information-seeking dialogue grounded on tables and text.

* Findings of ACL 2022 

  Access Paper or Ask Questions

A Method to Predict Semantic Relations on Artificial Intelligence Papers

Jan 24, 2022
Francisco Andrades, Ricardo Ñanculef

Predicting the emergence of links in large evolving networks is a difficult task with many practical applications. Recently, the Science4cast competition has illustrated this challenge presenting a network of 64.000 AI concepts and asking the participants to predict which topics are going to be researched together in the future. In this paper, we present a solution to this problem based on a new family of deep learning approaches, namely Graph Neural Networks. The results of the challenge show that our solution is competitive even if we had to impose severe restrictions to obtain a computationally efficient and parsimonious model: ignoring the intrinsic dynamics of the graph and using only a small subset of the nodes surrounding a target link. Preliminary experiments presented in this paper suggest the model is learning two related, but different patterns: the absorption of a node by a sub-graph and union of more dense sub-graphs. The model seems to excel at recognizing the first type of pattern.

* 2021 IEEE International Conference on Big Data (Big Data) 

  Access Paper or Ask Questions

Revelation of Task Difficulty in AI-aided Education

Jan 12, 2022
Yitzhak Spielberg, Amos Azaria

When a student is asked to perform a given task, her subjective estimate of the difficulty of that task has a strong influence on her performance. There exists a rich literature on the impact of perceived task difficulty on performance and motivation. Yet, there is another topic that is closely related to the subject of the influence of perceived task difficulty that did not receive any attention in previous research - the influence of revealing the true difficulty of a task to the student. This paper investigates the impact of revealing the task difficulty on the student's performance, motivation, self-efficacy and subjective task value via an experiment in which workers are asked to solve matchstick riddles. Furthermore, we discuss how the experiment results might be relevant for AI-aided education. Specifically, we elaborate on the question of how a student's learning experience might be improved by supporting her with two types of AI systems: an AI system that predicts task difficulty and an AI system that determines when task difficulty should be revealed and when not.

  Access Paper or Ask Questions

RGRecSys: A Toolkit for Robustness Evaluation of Recommender Systems

Jan 12, 2022
Zohreh Ovaisi, Shelby Heinecke, Jia Li, Yongfeng Zhang, Elena Zheleva, Caiming Xiong

Robust machine learning is an increasingly important topic that focuses on developing models resilient to various forms of imperfect data. Due to the pervasiveness of recommender systems in online technologies, researchers have carried out several robustness studies focusing on data sparsity and profile injection attacks. Instead, we propose a more holistic view of robustness for recommender systems that encompasses multiple dimensions - robustness with respect to sub-populations, transformations, distributional disparity, attack, and data sparsity. While there are several libraries that allow users to compare different recommender system models, there is no software library for comprehensive robustness evaluation of recommender system models under different scenarios. As our main contribution, we present a robustness evaluation toolkit, Robustness Gym for RecSys (RGRecSys --, that allows us to quickly and uniformly evaluate the robustness of recommender system models.

* In Proceedings of the Fifteenth ACM International Conference on Web Search and Data Mining (WSDM 22), February 2022, ACM, 4 pages 

  Access Paper or Ask Questions

Revisiting Contextual Toxicity Detection in Conversations

Nov 30, 2021
Julia Ive, Atijit Anuchitanukul, Lucia Specia

Understanding toxicity in user conversations is undoubtedly an important problem. As it has been argued in previous work, addressing "covert" or implicit cases of toxicity is particularly hard and requires context. Very few previous studies have analysed the influence of conversational context in human perception or in automated detection models. We dive deeper into both these directions. We start by analysing existing contextual datasets and come to the conclusion that toxicity labelling by humans is in general influenced by the conversational structure, polarity and topic of the context. We then propose to bring these findings into computational detection models by introducing (a) neural architectures for contextual toxicity detection that are aware of the conversational structure, and (b) data augmentation strategies that can help model contextual toxicity detection. Our results have shown the encouraging potential of neural architectures that are aware of the conversation structure. We have also demonstrated that such models can benefit from synthetic data, especially in the social media domain.

  Access Paper or Ask Questions