"Topic": models, code, and papers

DCDLearn: Multi-order Deep Cross-distance Learning for Vehicle Re-Identification

Mar 25, 2020
Rixing Zhu, Jianwu Fang, Hongke Xu, Hongkai Yu, Jianru Xue

Vehicle re-identification (Re-ID) has become a popular research topic owing to its practicability in intelligent transportation systems. Vehicle Re-ID suffers from numerous challenges caused by drastic variations in illumination, occlusion, background, resolution, viewing angle, and so on. To address them, this paper formulates a multi-order deep cross-distance learning (DCDLearn) model for vehicle re-identification, in which an efficient one-view CycleGAN model is developed to alleviate the exhaustive and enumerative cross-camera matching problem of previous works and to smooth the domain discrepancy across cameras. Specifically, we treat the transferred images and the reconstructed images generated by the one-view CycleGAN as multi-order augmented data for deep cross-distance learning, where the cross distances of the multi-order image sets with distinct identities are learned by optimizing an objective function with a multi-order augmented triplet loss and a center loss to achieve camera invariance and identity consistency. Extensive experiments on three vehicle Re-ID datasets demonstrate that the proposed method achieves significant improvement over the state of the art, especially on the small-scale dataset.
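
The abstract names a triplet loss and a center loss but does not give their exact form. The following is a minimal sketch of the generic versions of both losses on plain feature vectors, not the paper's multi-order formulation; the function names, the margin of 0.3, and the weight `lam` are assumptions chosen for illustration.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def triplet_loss(anchor, positive, negative, margin=0.3):
    """Hinge on (anchor-positive distance) minus (anchor-negative distance)."""
    d_pos = sq_dist(anchor, positive) ** 0.5
    d_neg = sq_dist(anchor, negative) ** 0.5
    return max(0.0, d_pos - d_neg + margin)


def center_loss(features, labels, centers):
    """Mean half squared distance of each feature to its identity center."""
    total = sum(sq_dist(f, centers[y]) for f, y in zip(features, labels))
    return 0.5 * total / len(features)


def combined_loss(triplets, features, labels, centers, margin=0.3, lam=0.01):
    """Average triplet loss plus a weighted center loss (weight is assumed)."""
    trip = sum(triplet_loss(a, p, n, margin) for a, p, n in triplets) / len(triplets)
    return trip + lam * center_loss(features, labels, centers)
```

In a setup like the one described, the anchor, positive, and negative could be drawn from original, transferred, and reconstructed images of the same or different vehicle identities; how the paper actually forms these sets is not stated in the abstract.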

A Comprehensive Study on Temporal Modeling for Online Action Detection

Jan 21, 2020
Wen Wang, Xiaojiang Peng, Yu Qiao, Jian Cheng

Online action detection (OAD) is a practical yet challenging task that has attracted increasing attention in recent years. A typical OAD system mainly consists of three modules: a frame-level feature extractor, which is usually based on pre-trained deep Convolutional Neural Networks (CNNs), a temporal modeling module, and an action classifier. Among them, the temporal modeling module, which aggregates discriminative information from historical and current features, is crucial. Though many temporal modeling methods have been developed for OAD and other topics, their effects on OAD have not been fairly investigated. This paper aims to provide a comprehensive study on temporal modeling for OAD, covering four meta types of temporal modeling methods, i.e., temporal pooling, temporal convolution, recurrent neural networks, and temporal attention, and to uncover some good practices for producing a state-of-the-art OAD system. Many of these methods are explored in OAD for the first time and are extensively evaluated with various hyperparameters. Furthermore, based on our comprehensive study, we present several hybrid temporal modeling methods, which outperform the recent state-of-the-art methods by sizable margins on THUMOS-14 and TVSeries.
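
Two of the four meta types named above, temporal pooling and temporal attention, can be illustrated on a short window of per-frame feature vectors. This is a rough sketch under stated assumptions, not the configuration studied in the paper: the function names are hypothetical, and using the current (last) frame as the attention query with plain dot-product scores is one simple choice among many.

```python
import math


def temporal_mean_pool(frames):
    """Average each feature dimension over the frames in the window."""
    n = len(frames)
    return [sum(col) / n for col in zip(*frames)]


def temporal_attention(frames):
    """Weight past and current frames by similarity to the current frame."""
    query = frames[-1]  # online setting: only past and current frames exist
    scores = [sum(q * k for q, k in zip(query, f)) for f in frames]
    top = max(scores)  # subtract the max for a numerically stable softmax
    exps = [math.exp(s - top) for s in scores]
    norm = sum(exps)
    weights = [e / norm for e in exps]
    pooled = [sum(w * f[d] for w, f in zip(weights, frames))
              for d in range(len(query))]
    return pooled, weights
```

Either aggregated vector would then be passed to the action classifier. Mean pooling treats all frames equally, while attention lets frames that resemble the current one contribute more.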

Automatic Spanish Translation of the SQuAD Dataset for Multilingual Question Answering

Dec 12, 2019
Casimiro Pio Carrino, Marta R. Costa-jussà, José A. R. Fonollosa

Recently, multilingual question answering has become a crucial research topic, and it is receiving increased interest in the NLP community. However, the unavailability of large-scale datasets makes it challenging to train multilingual QA systems with performance comparable to the English ones. In this work, we develop the Translate Align Retrieve (TAR) method to automatically translate the Stanford Question Answering Dataset (SQuAD) v1.1 to Spanish. We then use this dataset to train Spanish QA systems by fine-tuning a Multilingual-BERT model. Finally, we evaluate our QA models on the recently proposed MLQA and XQuAD benchmarks for cross-lingual extractive QA. Experimental results show that our models outperform the previous Multilingual-BERT baselines, achieving a new state-of-the-art value of 68.1 F1 points on the Spanish MLQA corpus and 77.6 F1 and 61.8 Exact Match points on the Spanish XQuAD corpus. The resulting synthetically generated SQuAD-es v1.1 corpus, which retains almost 100% of the data contained in the original English version, is, to the best of our knowledge, the first large-scale QA training resource for Spanish.

* Submitted to LREC 2020 

Context-Dependent Models for Predicting and Characterizing Facial Expressiveness

Dec 10, 2019
Victoria Lin, Jeffrey M. Girard, Louis-Philippe Morency

In recent years, extensive research has emerged in affective computing on topics like automatic emotion recognition and determining the signals that characterize individual emotions. Much less studied, however, is expressiveness, or the extent to which someone shows any feeling or emotion. Expressiveness is related to personality and mental health and plays a crucial role in social interaction. As such, the ability to automatically detect or predict expressiveness can facilitate significant advancements in areas ranging from psychiatric care to artificial social intelligence. Motivated by these potential applications, we present an extension of the BP4D+ dataset with human ratings of expressiveness and develop methods for (1) automatically predicting expressiveness from visual data and (2) defining relationships between interpretable visual signals and expressiveness. In addition, we study the emotional context in which expressiveness occurs and hypothesize that different sets of signals are indicative of expressiveness in different contexts (e.g., in response to surprise or in response to pain). Analysis of our statistical models confirms our hypothesis. Consequently, by looking at expressiveness separately in distinct emotional contexts, our predictive models show significant improvements over baselines and achieve comparable results to human performance in terms of correlation with the ground truth.

Traffic signal control optimization under severe incident conditions using Genetic Algorithm

Jun 11, 2019
Tuo Mao, Adriana-Simona Mihaita, Chen Cai

Traffic control optimization is a challenging task for traffic centres around the world, and the majority of approaches focus only on applying adaptive methods under normal (recurrent) traffic conditions. However, optimizing the control plans when severe incidents occur remains a hard topic to address, especially if a high number of lanes or entire intersections are affected. This paper aims at tackling this problem and presents a novel methodology for optimizing the traffic signal timings in signalized urban intersections under non-recurrent traffic incidents. The approach relies on deploying genetic algorithms (GA), with the phase durations as decision variables and the total travel time in the network as the objective function to minimize. Firstly, we develop the GA on a signalized testbed network under recurrent traffic conditions, with the purpose of fine-tuning the algorithm for crossover, mutation, and fitness calculation and obtaining the optimal phase durations. Secondly, we apply the optimal signal timings previously found under severe incidents affecting the traffic flow in the network, but without any further optimization. Lastly, we further apply the GA optimization under incident conditions and show that our approach improved the total travel time by almost 40.76%.

* 14 pages, 15 figures, preprint for the 26th ITS World Congress 21-25 Oct 2019, Singapore 
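
The idea of a GA with phase durations as decision variables and travel time as the objective can be sketched on a toy problem. Everything specific here is an assumption made for illustration: `toy_travel_time` is a hypothetical stand-in for the traffic simulation the paper relies on, and the bounds, population size, uniform crossover, and Gaussian mutation are arbitrary choices rather than the authors' settings.

```python
import random


def toy_travel_time(durations, demand=(30.0, 10.0), lost_time=8.0):
    """Hypothetical cost: approaches with high demand and a small green
    share are penalized, and long cycles are penalized overall."""
    cycle = sum(durations) + lost_time
    return sum(flow * cycle / green for green, flow in zip(durations, demand)) + cycle


def optimize_phases(cost, n_phases=2, bounds=(5.0, 60.0), pop_size=20,
                    generations=40, mutation_rate=0.2, seed=0):
    """Minimize cost over phase durations with a small elitist GA."""
    rng = random.Random(seed)
    lo, hi = bounds
    pop = [[rng.uniform(lo, hi) for _ in range(n_phases)] for _ in range(pop_size)]
    history = []  # best cost seen at the start of each generation
    for _ in range(generations):
        pop.sort(key=cost)
        history.append(cost(pop[0]))
        elite = pop[: pop_size // 2]  # the better half survives unchanged
        children = []
        while len(elite) + len(children) < pop_size:
            a, b = rng.sample(elite, 2)
            child = [rng.choice(pair) for pair in zip(a, b)]  # uniform crossover
            child = [min(hi, max(lo, g + rng.gauss(0.0, 3.0)))
                     if rng.random() < mutation_rate else g
                     for g in child]  # clipped Gaussian mutation
            children.append(child)
        pop = elite + children
    best = min(pop, key=cost)
    return best, cost(best), history
```

Calling `optimize_phases(toy_travel_time)` returns the best durations found, their cost, and the per-generation best cost. Because the better half of the population is carried over unchanged, the best cost never increases from one generation to the next.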

Learning Robust 3D Face Reconstruction and Discriminative Identity Representation

May 16, 2019
Yao Luo, Xiaoguang Tu, Mei Xie

3D face reconstruction from a single 2D image is a very important topic in computer vision. However, current reconstruction methods are usually insensitive to face identity and over-sensitive to facial pose, which may result in similar 3D geometries for faces of different identities, or in different shapes for the same identity under different poses. When such methods are applied in practice, their 3D estimates either change across different photos of the same subject or are over-regularized and too generic to distinguish face identities. In this paper, we propose a robust solution to this problem by carefully designing a novel Siamese Convolutional Neural Network (SCNN). Specifically, regarding the 3D Morphable face Model (3DMM) parameters of the same individual as the same class, we employ the contrastive loss to enlarge the inter-class distance and meanwhile reduce the intra-class distance for the output 3DMM parameters. We also propose an identity loss to preserve the identity information for the same individual in the feature space. Trained with these two losses, our SCNN can learn representations that are more discriminative for face identity and generalizable across pose variations. Experiments on the challenging databases 300W-LP and AFLW2000-3D show the effectiveness of our method in comparison with state-of-the-art methods.

* 5 pages, 6 figures, IEEE International Conference on Information Communication and Signal Processing 
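
The contrastive loss mentioned above pulls together the outputs for the same identity and pushes apart the outputs for different identities. Below is a minimal sketch of the common margin-based form applied to two parameter vectors; the abstract does not give the exact variant used, so the function name and the margin value are assumptions.

```python
import math


def contrastive_loss(x1, x2, same_identity, margin=1.0):
    """Margin-based contrastive loss for one pair of output vectors."""
    d = math.sqrt(sum((a - b) ** 2 for a, b in zip(x1, x2)))
    if same_identity:
        return 0.5 * d * d  # shrink intra-class distance
    return 0.5 * max(0.0, margin - d) ** 2  # enlarge inter-class distance up to the margin
```

Pairs from different identities that are already farther apart than the margin contribute zero loss, so only confusable pairs drive the update.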

Sameness Attracts, Novelty Disturbs, but Outliers Flourish in Fanfiction Online

Apr 16, 2019
Elise Jing, Simon DeDeo, Yong-Yeol Ahn

The nature of what people enjoy is not just a central question for the creative industry, it is a driving force of cultural evolution. It is widely believed that successful cultural products balance novelty and conventionality: they provide something familiar but at least somewhat divergent from what has come before, and occupy a satisfying middle ground between "more of the same" and "too strange". We test this belief using a large dataset of over half a million works of fanfiction from the website Archive of Our Own (AO3), looking at how the recognition a work receives varies with its novelty. We quantify novelty through a term-based language model and a topic model, in the context of existing works within the same fandom. Contrary to the balance theory, we find that the lowest-novelty works are the most popular and that popularity declines monotonically with novelty. A few exceptions can be found: extremely popular works that are among the highest in novelty within the fandom. Taken together, our findings not only challenge the traditional theory of the hedonic value of novelty, they invert it: people prefer the least novel things, are repelled by the middle ground, and have an occasional enthusiasm for extreme outliers. This suggests that cultural evolution must work against inertia, that is, the appetite people have to continually reconsume the familiar, and that it may resemble a punctuated equilibrium rather than a smooth evolution.

Expanding the Text Classification Toolbox with Cross-Lingual Embeddings

Mar 26, 2019
Meryem M'hamdi, Robert West, Andreea Hossmann, Michael Baeriswyl, Claudiu Musat

Most work in text classification and Natural Language Processing (NLP) focuses on English or a handful of other languages that have text corpora of hundreds of millions of words. This is creating a new version of the digital divide: the artificial intelligence (AI) divide. Transfer-based approaches, such as Cross-Lingual Text Classification (CLTC), the task of categorizing texts written in different languages into a common taxonomy, are a promising solution to the emerging AI divide. Recent work on CLTC has focused on demonstrating the benefits of using bilingual word embeddings as features, relegating the CLTC problem to a mere benchmark based on a simple averaged perceptron. In this paper, we explore more extensively and systematically two flavors of the CLTC problem: news topic classification and textual churn intent detection (TCID) in social media. In particular, we test the hypothesis that embeddings with context are more effective, by multi-tasking the learning of multilingual word embeddings and text classification; we explore neural architectures for CLTC; and we move from bi- to multi-lingual word embeddings. Across all architectures, types of word embeddings, and datasets, we observe a consistent gain in favor of multilingual joint training, especially for low-resource languages.

Interpretation of Natural Language Rules in Conversational Machine Reading

Aug 28, 2018
Marzieh Saeidi, Max Bartolo, Patrick Lewis, Sameer Singh, Tim Rocktäschel, Mike Sheldon, Guillaume Bouchard, Sebastian Riedel

Most work in machine reading focuses on question answering problems where the answer is directly expressed in the text to read. However, many real-world question answering problems require the reading of text not because it contains the literal answer, but because it contains a recipe to derive an answer together with the reader's background knowledge. One example is the task of interpreting regulations to answer "Can I...?" or "Do I have to...?" questions such as "I am working in Canada. Do I have to carry on paying UK National Insurance?" after reading a UK government website about this topic. This task requires both the interpretation of rules and the application of background knowledge. It is further complicated due to the fact that, in practice, most questions are underspecified, and a human assistant will regularly have to ask clarification questions such as "How long have you been working abroad?" when the answer cannot be directly derived from the question and text. In this paper, we formalise this task and develop a crowd-sourcing strategy to collect 32k task instances based on real-world rules and crowd-generated questions and scenarios. We analyse the challenges of this task and assess its difficulty by evaluating the performance of rule-based and machine-learning baselines. We observe promising results when no background knowledge is necessary, and substantial room for improvement whenever background knowledge is needed.

* EMNLP 2018 

A Survey on Hardware Implementations of Visual Object Trackers

Nov 07, 2017
Al-Hussein A. El-Shafie, S. E. D. Habib

Visual object tracking is an active topic in the computer vision domain with applications extending over numerous fields. The main sub-tasks required to build an object tracker (e.g. object detection, feature extraction and object tracking) are computation-intensive. In addition, real-time operation of the tracker is indispensable for almost all of its applications. Therefore, complete hardware or hardware/software co-design approaches are pursued for better tracker implementations. This paper presents a literature survey of the hardware implementations of object trackers over the last two decades. Although several tracking surveys exist in the literature, a survey addressing the hardware implementations of the different trackers is missing. We believe this survey fills that gap, complements the existing surveys in showing how to design an efficient tracker, and points out future directions that researchers can follow in this field. We highlight the lack of hardware implementations for state-of-the-art tracking algorithms as well as for enhanced classical algorithms. We also stress the need for measuring the tracking performance of hardware-based trackers. Additionally, enough details of the hardware-based trackers need to be provided to allow reasonable comparison between the different implementations.

* 17 pages, 14 Figures, 6 tables, 84 references 
