"Topic": models, code, and papers

Repurposing of Resources: from Everyday Problem Solving through to Crisis Management

Sep 17, 2021
Antonis Bikakis, Luke Dickens, Anthony Hunter, Rob Miller

The human ability to repurpose objects and processes is universal, but it is not a well-understood aspect of human intelligence. Repurposing arises in everyday situations such as finding substitutes for missing ingredients when cooking, or for unavailable tools when doing DIY. It also arises in critical, unprecedented situations needing crisis management. After natural disasters and during wartime, people must repurpose the materials and processes available to make shelter, distribute food, etc. Repurposing is equally important in professional life (e.g. clinicians often repurpose medicines off-license) and in addressing societal challenges (e.g. finding new roles for waste products). Despite the importance of repurposing, the topic has received little academic attention. By considering examples from a variety of domains such as everyday activities, drug repurposing and natural disasters, we identify some principal characteristics of the process and describe some technical challenges that would be involved in modelling and simulating it. We consider cases of both substitution, i.e. finding an alternative for a missing resource, and exploitation, i.e. identifying a new role for an existing resource. We argue that these ideas could be developed into a general formal theory of repurposing, and that this could then lead to the development of AI methods based on commonsense reasoning, argumentation, ontological reasoning, and various machine learning methods, to develop tools to support repurposing in practice.

* 16 pages 


ART-SLAM: Accurate Real-Time 6DoF LiDAR SLAM

Sep 12, 2021
Matteo Frosi, Matteo Matteucci

Real-time six degree-of-freedom pose estimation with ground vehicles is a relevant and well-studied topic in robotics, due to its many applications, such as autonomous driving and 3D mapping. Although some systems already exist, they are either not accurate or they struggle in real-time settings. In this paper, we propose a fast, accurate and modular LiDAR SLAM system for both batch and online estimation. We first apply downsampling and outlier removal, to filter out noise and reduce the size of the input point clouds. Filtered clouds are then used for pose tracking and floor detection, to ground-optimize the estimated trajectory. A pre-tracker, working in parallel with the filtering process, provides pre-computed odometries that are used as aids when performing tracking. Efficient loop closure and pose optimization, achieved through a g2o pose graph, are the last steps of the proposed SLAM pipeline. We compare the performance of our system with state-of-the-art point cloud based methods, LOAM, LeGO-LOAM, A-LOAM, LeGO-LOAM-BOR and HDL, and show that the proposed system achieves equal or better accuracy and can easily handle even cases without loops. The comparison is done by evaluating the estimated trajectory displacement using the KITTI and RADIATE datasets.

* This paper is currently under review 
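
The filtering step described in the abstract above (downsampling followed by outlier removal) can be illustrated with a minimal sketch. The abstract does not say which filters the authors use, so the voxel-grid centroid downsampling and the radius-based outlier test below are assumptions chosen because they are common in point cloud processing; the function names, the `voxel_size`, `radius` and `min_neighbors` parameters, and the toy cloud are all hypothetical.

```python
from collections import defaultdict
from math import dist, floor


def voxel_downsample(points, voxel_size):
    """Replace all points that fall in the same cubic voxel by their centroid."""
    if voxel_size <= 0:
        raise ValueError("voxel_size must be positive")
    buckets = defaultdict(list)
    for p in points:
        # Integer voxel coordinates of the point.
        key = tuple(floor(c / voxel_size) for c in p)
        buckets[key].append(p)
    return [
        tuple(sum(axis) / len(pts) for axis in zip(*pts))
        for pts in buckets.values()
    ]


def remove_radius_outliers(points, radius, min_neighbors):
    """Keep only points with at least min_neighbors other points within radius.

    Brute-force O(n^2) search, fine for a sketch but not for real LiDAR scans.
    """
    kept = []
    for i, p in enumerate(points):
        neighbors = sum(
            1 for j, q in enumerate(points) if j != i and dist(p, q) <= radius
        )
        if neighbors >= min_neighbors:
            kept.append(p)
    return kept


# Toy cloud (assumed data): two small clusters and one far-away stray point.
cloud = [
    (0.0, 0.0, 0.0), (0.1, 0.1, 0.0), (0.2, 0.0, 0.1),
    (1.1, 1.0, 1.0), (1.2, 1.1, 1.0),
    (50.0, 50.0, 50.0),
]
downsampled = voxel_downsample(cloud, voxel_size=1.0)
filtered = remove_radius_outliers(downsampled, radius=3.0, min_neighbors=1)
```

On this toy input the six points collapse to three voxel centroids, and the isolated point is then discarded because it has no neighbor within the radius. A real-time system would use a spatial index (for example a k-d tree) instead of the brute-force neighbor search.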


Design and Experimental Evaluation of a Hierarchical Controller for an Autonomous Ground Vehicle with Large Uncertainties

Aug 09, 2021
Juncheng Li, Maopeng Ran, Lihua Xie

Autonomous ground vehicles (AGVs) are receiving increasing attention, and the motion planning and control problem for these vehicles has become a hot research topic. In real applications such as material handling, an AGV is subject to large uncertainties and its motion planning and control become challenging. In this paper, we investigate this problem by proposing a hierarchical control scheme, which integrates model predictive control (MPC) based path planning and trajectory tracking control at the high level with reduced-order extended state observer (RESO) based dynamic control at the low level. The control at the high level consists of an MPC-based improved path planner, a velocity planner, and an MPC-based tracking controller. Both the path planning and trajectory tracking control problems are formulated under an MPC framework. The control at the low level employs the idea of active disturbance rejection control (ADRC). The uncertainties are estimated via a RESO and then compensated for in the control in real time. We show that, for the first-order uncertain AGV dynamic model, the RESO-based control only needs to know the control direction. Finally, simulations and experiments on an AGV with different payloads are conducted. The results illustrate that the proposed hierarchical control scheme achieves satisfactory motion planning and control performance with large uncertainties.

* Accepted for publication in IEEE Transactions on Control Systems Technology 
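
The low-level idea in the abstract above (estimate the lumped uncertainty with a reduced-order extended state observer, then cancel it in the control law) can be sketched for a scalar first-order plant. This is a generic textbook-style ADRC construction, not the authors' controller: the plant model `x' = disturbance + true_gain * u`, the observer form, the function name, and every numeric value are assumptions made for illustration.

```python
def simulate_reso_control(
    reference=1.0,           # setpoint for the state x (assumed)
    true_gain=1.5,           # actual input gain of the plant, unknown to the controller
    nominal_gain=1.0,        # controller's guess; only its sign matches true_gain
    disturbance=-2.0,        # constant unknown disturbance acting on the plant
    observer_bandwidth=20.0,
    feedback_gain=5.0,
    dt=0.001,
    steps=5000,
):
    """Forward-Euler simulation of x' = disturbance + true_gain * u under RESO-based control."""
    x = 0.0  # measured plant state
    z = 0.0  # single internal state of the reduced-order observer
    for _ in range(steps):
        # Estimate of the total disturbance (everything except nominal_gain * u).
        f_hat = z + observer_bandwidth * x
        # Cancel the estimated disturbance and add proportional feedback.
        u = (feedback_gain * (reference - x) - f_hat) / nominal_gain
        x_dot = disturbance + true_gain * u
        # Observer dynamics chosen so that f_hat' = bandwidth * (total disturbance - f_hat).
        z_dot = (
            -observer_bandwidth * (z + observer_bandwidth * x)
            - observer_bandwidth * nominal_gain * u
        )
        x += dt * x_dot
        z += dt * z_dot
    return x, z + observer_bandwidth * x


final_state, final_estimate = simulate_reso_control()
```

In this sketch the controller is deliberately given a wrong input gain with the correct sign, and the state still settles at the reference because the gain mismatch is absorbed into the estimated total disturbance. That behavior is in the spirit of the abstract's remark about the control direction, but the sketch does not reproduce the paper's analysis.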


Electrical peak demand forecasting- A review

Aug 03, 2021
Shuang Dai, Fanlin Meng, Hongsheng Dai, Qian Wang, Xizhong Chen

The power system is undergoing rapid evolution with the roll-out of advanced metering infrastructure and local energy applications (e.g. electric vehicles) as well as the increasing penetration of intermittent renewable energy at both transmission and distribution level, which gives the peak load demand stronger randomness and less predictability and therefore poses a threat to power grid security. Since storing large quantities of electricity to satisfy load demand is neither economical nor environmentally friendly, effective peak demand management strategies and reliable peak load forecast methods become essential for optimizing power system operations. To this end, this paper provides a timely and comprehensive overview of peak load demand forecast methods in the literature. To the best of our knowledge, this is the first comprehensive review on this topic. In this paper we first give a precise and unified problem definition of peak load demand forecast. Second, 139 papers on peak load forecast methods were systematically reviewed, with methods classified into different stages based on the timeline. Third, a comparative analysis of peak load forecast methods is summarized and different optimizing methods to improve the forecast performance are discussed. The paper ends with a comprehensive summary of the reviewed papers and a discussion of potential future research directions.


Generalizing Fairness: Discovery and Mitigation of Unknown Sensitive Attributes

Jul 28, 2021
William Paul, Philippe Burlina

When deploying artificial intelligence (AI) in the real world, being able to trust the operation of the AI by characterizing how it performs is an ever-present and important topic. An important and still largely unexplored task in this characterization is determining major factors within the real world that affect the AI's behavior, such as weather conditions or lighting, and either a) being able to give justification for why it may have failed or b) eliminating the influence the factor has. Determining these sensitive factors relies heavily on collected data that is diverse enough to cover numerous combinations of these factors, which becomes more onerous when there are many potential sensitive factors or when operating in complex environments. This paper investigates methods that discover and separate out individual semantic sensitive factors from a given dataset to conduct this characterization, as well as addressing mitigation of these factors' sensitivity. We also broaden remediation of fairness, which normally only addresses socially relevant factors, and widen it to deal with the desensitization of AI with regard to all possible aspects of variation in the domain. The proposed methods, which discover these major factors, reduce the potentially onerous demands of collecting a sufficiently diverse dataset. In experiments using the road sign (GTSRB) and facial imagery (CelebA) datasets, we show the promise of using this scheme to perform this characterization and remediation and demonstrate that our approach outperforms state-of-the-art approaches.


An Initial Investigation of Non-Native Spoken Question-Answering

Jul 09, 2021
Vatsal Raina, Mark J. F. Gales

Text-based machine comprehension (MC) systems have a wide range of applications, and standard corpora exist for developing and evaluating approaches. There has been far less research on spoken question answering (SQA) systems. The SQA task considered in this paper is to extract the answer from a candidate's spoken response to a question in a prompt-response style language assessment test. Applying these MC approaches to this SQA task rather than, for example, off-topic response detection provides far more detailed information that can be used for further downstream processing. One significant challenge is the lack of appropriately annotated speech corpora to train systems for this task. Hence, a transfer-learning style approach is adopted where a system trained on text-based MC is evaluated on an SQA task with non-native speakers. Mismatches must be considered between text documents and spoken responses, and between non-native spoken grammar and written grammar. In practical SQA, ASR systems are used, necessitating an investigation of the impact of ASR errors. We show that a simple text-based ELECTRA MC model trained on SQuAD2.0 transfers well for SQA. It is found that there is an approximately linear relationship between ASR errors and the SQA assessment scores but grammar mismatches have minimal impact.

* 5 pages, 1 figure 


Automated Timeline Length Selection for Flexible Timeline Summarization

May 29, 2021
Xi Li, Qianren Mao, Hao Peng, Hongdong Zhu, Jianxin Li, Zheng Wang

By producing summaries for long-running events, timeline summarization (TLS) underpins many information retrieval tasks. Successful TLS requires identifying an appropriate set of key dates (the timeline length) to cover. However, doing so is challenging as the right length can change from one topic to another. Existing TLS solutions either rely on an event-agnostic fixed length or an expert-supplied setting. Neither strategy is desirable for real-life TLS scenarios. A fixed, event-agnostic setting ignores the diversity of events and their development and hence can lead to low-quality TLS. Relying on expert-crafted settings is neither scalable nor sustainable for processing many dynamically changing events. This paper presents a better TLS approach for automatically and dynamically determining the TLS timeline length. We achieve this by employing the established elbow method from the machine learning community to automatically find the minimum number of dates within the time series to generate concise and informative summaries. We applied our approach to four TLS datasets of English and Chinese and compared it against three prior methods. Experimental results show that our approach delivers comparable or even better summaries than state-of-the-art TLS methods, but it achieves this without expert involvement.
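
The elbow method mentioned in the abstract above can be illustrated with a minimal sketch. The abstract does not state which quantity is plotted against the number of dates or which elbow criterion is used, so the version below is an assumption: it takes a decreasing cost curve and picks the point farthest from the straight line joining the first and last points, which is one common way to locate an elbow. The function name and the `costs` values are hypothetical.

```python
from math import hypot


def elbow_index(values):
    """Return the index of the point farthest from the chord joining the endpoints."""
    n = len(values)
    if n < 3:
        raise ValueError("need at least three points to locate an elbow")
    # Chord from (0, values[0]) to (n - 1, values[-1]).
    dx = float(n - 1)
    dy = float(values[-1]) - float(values[0])
    chord_length = hypot(dx, dy)
    best_index, best_distance = 0, -1.0
    for i, v in enumerate(values):
        # Perpendicular distance from the point (i, v) to the chord.
        distance = abs(dy * i - dx * (v - values[0])) / chord_length
        if distance > best_distance:
            best_index, best_distance = i, distance
    return best_index


# Hypothetical cost for timelines with 1, 2, ..., 8 dates (made-up numbers).
costs = [100.0, 60.0, 35.0, 20.0, 17.0, 15.0, 14.0, 13.5]
candidate_lengths = list(range(1, len(costs) + 1))
chosen_length = candidate_lengths[elbow_index(costs)]
```

With these made-up costs the selected timeline length is 4, the point after which adding more dates yields only small improvements. The result depends on how the two axes are scaled, so in practice the curve is often normalized before the distance is computed.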


Explaining Clinical Decision Support Systems in Medical Imaging using Cycle-Consistent Activation Maximization

Oct 13, 2020
Alexander Katzmann, Oliver Taubmann, Stephen Ahmad, Alexander Mühlberg, Michael Sühling, Horst-Michael Groß

Clinical decision support using deep neural networks has become a topic of steadily growing interest. While recent work has repeatedly demonstrated that deep learning offers major advantages for medical image classification over traditional methods, clinicians are often hesitant to adopt the technology because its underlying decision-making process is considered to be intransparent and difficult to comprehend. In recent years, this has been addressed by a variety of approaches that have successfully contributed to providing deeper insight. Most notably, additive feature attribution methods are able to propagate decisions back into the input space by creating a saliency map which allows the practitioner to "see what the network sees." However, the quality of the generated maps can become poor and the images noisy if only limited data is available - a typical scenario in clinical contexts. We propose a novel decision explanation scheme based on CycleGAN activation maximization which generates high-quality visualizations of classifier decisions even in smaller data sets. We conducted a user study in which these visualizations significantly outperformed existing methods on the LIDC dataset for lung lesion malignancy classification. With our approach we make a significant contribution to a better understanding of clinical decision support systems based on deep neural networks and thus aim to foster overall clinical acceptance.

* 32 pages, 9 figures, 18 pages appendix, metadata typo corrected 


PIE: Portrait Image Embedding for Semantic Control

Sep 20, 2020
Ayush Tewari, Mohamed Elgharib, Mallikarjun B R., Florian Bernard, Hans-Peter Seidel, Patrick Pérez, Michael Zollhöfer, Christian Theobalt

Editing of portrait images is a very popular and important research topic with a large variety of applications. For ease of use, control should be provided via a semantically meaningful parameterization that is akin to computer animation controls. The vast majority of existing techniques do not provide such intuitive and fine-grained control, or only enable coarse editing of a single isolated control parameter. Very recently, high-quality semantically controlled editing has been demonstrated, however only on synthetically created StyleGAN images. We present the first approach for embedding real portrait images in the latent space of StyleGAN, which allows for intuitive editing of the head pose, facial expression, and scene illumination in the image. Semantic editing in parameter space is achieved based on StyleRig, a pretrained neural network that maps the control space of a 3D morphable face model to the latent space of the GAN. We design a novel hierarchical non-linear optimization problem to obtain the embedding. An identity preservation energy term allows spatially coherent edits while maintaining facial integrity. Our approach runs at interactive frame rates and thus allows the user to explore the space of possible edits. We evaluate our approach on a wide set of portrait photos, compare it to the current state of the art, and validate the effectiveness of its components in an ablation study.

* To appear in SIGGRAPH Asia 2020. Project webpage: 


Personalized Speech2Video with 3D Skeleton Regularization and Expressive Body Poses

Jul 17, 2020
Miao Liao, Sibo Zhang, Peng Wang, Hao Zhu, Ruigang Yang

In this paper, we propose a novel approach to convert given speech audio to a photo-realistic speaking video of a specific person, where the output video has synchronized, realistic, and richly expressive body dynamics. We achieve this by first generating 3D skeleton movements from the audio sequence using a recurrent neural network (RNN), and then synthesizing the output video via a conditional generative adversarial network (GAN). To make the skeleton movement realistic and expressive, we embed the knowledge of an articulated 3D human skeleton and a learned dictionary of personal speech iconic gestures into the generation process in both learning and testing pipelines. The former prevents the generation of unreasonable body distortion, while the latter helps our model quickly learn meaningful body movement through a few recorded videos. To produce photo-realistic and high-resolution video with motion details, we propose to insert part attention mechanisms in the conditional GAN, where each detailed part, e.g. head and hand, is automatically zoomed in on and given its own discriminator. To validate our approach, we collect a dataset with 20 high-quality videos from 1 male and 1 female model reading various documents on different topics. Compared with previous SoTA pipelines handling similar tasks, our approach achieves better results in a user study.
