We present a simple unsupervised approach for answer identification in organizational group chat. In recent years, organizational group chat is on the rise enabling asynchronous text-based collaboration between co-workers in different locations and time zones. Finding answers to questions is often critical for work efficiency. However, group chat is characterized by intertwined conversations and 'always on' availability, making it hard for users to pinpoint answers to questions they care about in real-time or search for answers in retrospective. In addition, structural and lexical characteristics differ between chat groups, making it hard to find a 'one model fits all' approach. Our Kernel Density Estimation (KDE) based clustering approach termed Ans-Chat implicitly learns discussion patterns as a means for answer identification, thus eliminating the need to channel-specific tagging. Empirical evaluation shows that this solution outperforms other approached.
The exponential increase in the use of the Internet and social media over the last two decades has changed human interaction. This has led to many positive outcomes, but at the same time it has brought risks and harms. While the volume of harmful content online, such as hate speech, is not manageable by humans, interest in the academic community to investigate automated means for hate speech detection has increased. In this study, we analyse six publicly available datasets by combining them into a single homogeneous dataset and classify them into three classes, abusive, hateful or neither. We create a baseline model and we improve model performance scores using various optimisation techniques. After attaining a competitive performance score, we create a tool which identifies and scores a page with effective metric in near-real time and uses the same as feedback to re-train our model. We prove the competitive performance of our multilingual model on two langauges, English and Hindi, leading to comparable or superior performance to most monolingual models.
Full truckload transportation (FTL) in the form of freight containers represents one of the most important transportation modes in international trade. Due to large volume and scale, in FTL, delivery time is often less critical but cost and service quality are crucial. Therefore, efficiently solving large scale multiple shift FTL problems is becoming more and more important and requires further research. In one of our earlier studies, a set covering model and a three-stage solution method were developed for a multi-shift FTL problem. This paper extends the previous work and presents a significantly more efficient approach by hybridising pricing and cutting strategies with metaheuristics (a variable neighbourhood search and a genetic algorithm). The metaheuristics were adopted to find promising columns (vehicle routes) guided by pricing and cuts are dynamically generated to eliminate infeasible flow assignments caused by incompatible commodities. Computational experiments on real-life and artificial benchmark FTL problems showed superior performance both in terms of computational time and solution quality, when compared with previous MIP based three-stage methods and two existing metaheuristics. The proposed cutting and heuristic pricing approach can efficiently solve large scale real-life FTL problems.
We investigate an evolutionary multi-objective approach to good micro for real-time strategy games. Good micro helps a player win skirmishes and is one of the keys to developing better real-time strategy game play. In prior work, the same multi-objective approach of maximizing damage done while minimizing damage received was used to evolve micro for a group of ranged units versus a group of melee units. We extend this work to consider groups composed from two types of units. Specifically, this paper uses evolutionary multi-objective optimization to generate micro for one group composed from both ranged and melee units versus another group of ranged and melee units. Our micro behavior representation uses influence maps to represent enemy spatial information and potential fields generated from distance, health, and weapons cool down to guide unit movement. Experimental results indicate that our multi-objective approach leads to a Pareto front of diverse high-quality micro encapsulating multiple possible tactics. This range of micro provided by the Pareto front enables a human or AI player to trade-off among short term tactics that better suit the player's longer term strategy - for example, choosing to minimize friendly unit damage at the cost of only lightly damaging the enemy versus maximizing damage to the enemy units at the cost of increased damage to friendly units. We believe that our results indicate the usefulness of potential fields as a representation, and of evolutionary multi-objective optimization as an approach, for generating good micro.
To this date, CAPTCHAs have served as the first line of defense preventing unauthorized access by (malicious) bots to web-based services, while at the same time maintaining a trouble-free experience for human visitors. However, recent work in the literature has provided evidence of sophisticated bots that make use of advancements in machine learning (ML) to easily bypass existing CAPTCHA-based defenses. In this work, we take the first step to address this problem. We introduce CAPTURE, a novel CAPTCHA scheme based on adversarial examples. While typically adversarial examples are used to lead an ML model astray, with CAPTURE, we attempt to make a "good use" of such mechanisms. Our empirical evaluations show that CAPTURE can produce CAPTCHAs that are easy to solve by humans while at the same time, effectively thwarting ML-based bot solvers.
Large scale machine learning and deep models are extremely data-hungry. Unfortunately, obtaining large amounts of labeled data is expensive, and training state-of-the-art models (with hyperparameter tuning) requires significant computing resources and time. Secondly, real-world data is noisy and imbalanced. As a result, several recent papers try to make the training process more efficient and robust. However, most existing work either focuses on robustness or efficiency, but not both. In this work, we introduce Glister, a GeneraLIzation based data Subset selecTion for Efficient and Robust learning framework. We formulate Glister as a mixed discrete-continuous bi-level optimization problem to select a subset of the training data, which maximizes the log-likelihood on a held-out validation set. Next, we propose an iterative online algorithm Glister-Online, which performs data selection iteratively along with the parameter updates and can be applied to any loss-based learning algorithm. We then show that for a rich class of loss functions including cross-entropy, hinge-loss, squared-loss, and logistic-loss, the inner discrete data selection is an instance of (weakly) submodular optimization, and we analyze conditions for which Glister-Online reduces the validation loss and converges. Finally, we propose Glister-Active, an extension to batch active learning, and we empirically demonstrate the performance of Glister on a wide range of tasks including, (a) data selection to reduce training time, (b) robust learning under label noise and imbalance settings, and (c) batch-active learning with several deep and shallow models. We show that our framework improves upon state of the art both in efficiency and accuracy (in cases (a) and (c)) and is more efficient compared to other state-of-the-art robust learning algorithms in case (b).
Pointwise matches between two time series are of great importance in time series analysis, and dynamic time warping (DTW) is known to provide generally reasonable matches. There are situations where time series alignment should be invariant to scaling and offset in amplitude or where local regions of the considered time series should be strongly reflected in pointwise matches. Two different variants of DTW, affine DTW (ADTW) and regional DTW (RDTW), are proposed to handle scaling and offset in amplitude and provide regional emphasis respectively. Furthermore, ADTW and RDTW can be combined in two different ways to generate alignments that incorporate advantages from both methods, where the affine model can be applied either globally to the entire time series or locally to each region. The proposed alignment methods outperform DTW on specific simulated datasets, and one-nearest-neighbor classifiers using their associated difference measures are competitive with the difference measures associated with state-of-the-art alignment methods on real datasets.
Mispronunciation detection and diagnosis (MDD) is designed to identify pronunciation errors and provide instructive feedback to guide non-native language learners, which is a core component in computer-assisted pronunciation training (CAPT) systems. However, MDD often suffers from the data-sparsity problem due to that collecting non-native data and the associated annotations is time-consuming and labor-intensive. To address this issue, we explore a fully end-to-end (E2E) neural model for MDD, which processes learners' speech directly based on raw waveforms. Compared to conventional hand-crafted acoustic features, raw waveforms retain more acoustic phenomena and potentially can help neural networks discover better and more customized representations. To this end, our MDD model adopts a co-called SincNet module to take input a raw waveform and covert it to a suitable vector representation sequence. SincNet employs the cardinal sine (sinc) function to implement learnable bandpass filters, drawing inspiration from the convolutional neural network (CNN). By comparison to CNN, SincNet has fewer parameters and is more amenable to human interpretation. Extensive experiments are conducted on the L2-ARCTIC dataset, which is a publicly-available non-native English speech corpus compiled for research on CAPT. We find that the sinc filters of SincNet can be adapted quickly for non-native language learners of different nationalities. Furthermore, our model can achieve comparable mispronunciation detection performance in relation to state-of-the-art E2E MDD models that take input the standard handcrafted acoustic features. Besides that, our model also provides considerable improvements on phone error rate (PER) and diagnosis accuracy.
Unmanned surface vehicles (USVs) have great value with their ability to execute hazardous and time-consuming missions over water surfaces. Recently, USVs for inland waterways have attracted increasing attention for their potential application in autonomous monitoring, transportation, and cleaning. However, unlike sailing in open water, the challenges posed by scenes of inland waterways, such as the complex distribution of obstacles, the global positioning system (GPS) signal denial environment, the reflection of bank-side structures, and the fog over the water surface, all impede USV application in inland waterways. To address these problems and stimulate relevant research, we introduce USVInland, a multisensor dataset for USVs in inland waterways. The collection of USVInland spans a trajectory of more than 26 km in diverse real-world scenes of inland waterways using various modalities, including lidar, stereo cameras, millimeter-wave radar, GPS, and inertial measurement units (IMUs). Based on the requirements and challenges in the perception and navigation of USVs for inland waterways, we build benchmarks for simultaneous localization and mapping (SLAM), stereo matching, and water segmentation. We evaluate common algorithms for the above tasks to determine the influence of unique inland waterway scenes on algorithm performance. Our dataset and the development tools are available online at https://www.orca-tech.cn/datasets.html.
Recent advances in sensing technologies require the design and development of pattern recognition models capable of processing spatiotemporal data efficiently. In this work, we propose a spatially and temporally aware tensor-based neural network for human pose recognition using three-dimensional skeleton data. Our model employs three novel components. First, an input layer capable of constructing highly discriminative spatiotemporal features. Second, a tensor fusion operation that produces compact yet rich representations of the data, and third, a tensor-based neural network that processes data representations in their original tensor form. Our model is end-to-end trainable and characterized by a small number of trainable parameters making it suitable for problems where the annotated data is limited. Experimental validation of the proposed model indicates that it can achieve state-of-the-art performance. Although in this study, we consider the problem of human pose recognition, our methodology is general enough to be applied to any pattern recognition problem spatiotemporal data from sensor networks.