Get our free extension to see links to code for papers anywhere online!

Chrome logo Add to Chrome

Firefox logo Add to Firefox

"Topic": models, code, and papers

General Robot Dynamics Learning and Gen2Real

Apr 06, 2021
Dengpeng Xing, Jiale Li, Yiming Yang, Bo Xu

Acquiring dynamics is an essential topic in robot learning, but up-to-date methods, such as dynamics randomization, need to restart to check nominal parameters, generate simulation data, and train networks whenever they face different robots. To improve it, we novelly investigate general robot dynamics, its inverse models, and Gen2Real, which means transferring to reality. Our motivations are to build a model that learns the intrinsic dynamics of various robots and lower the threshold of dynamics learning by enabling an amateur to obtain robot models without being trapped in details. This paper achieves the "generality" by randomizing dynamics parameters, topology configurations, and model dimensions, which in sequence cover the property, the connection, and the number of robot links. A structure modified from GPT is applied to access the pre-training model of general dynamics. We also study various inverse models of dynamics to facilitate different applications. We step further to investigate a new concept, "Gen2Real", to transfer simulated, general models to physical, specific robots. Simulation and experiment results demonstrate the validity of the proposed models and method.\footnote{ These authors contribute equally.

* On posting to RSS 

  Access Paper or Ask Questions

Understanding and Creating Art with AI: Review and Outlook

Feb 18, 2021
Eva Cetinic, James She

Technologies related to artificial intelligence (AI) have a strong impact on the changes of research and creative practices in visual arts. The growing number of research initiatives and creative applications that emerge in the intersection of AI and art, motivates us to examine and discuss the creative and explorative potentials of AI technologies in the context of art. This paper provides an integrated review of two facets of AI and art: 1) AI is used for art analysis and employed on digitized artwork collections; 2) AI is used for creative purposes and generating novel artworks. In the context of AI-related research for art understanding, we present a comprehensive overview of artwork datasets and recent works that address a variety of tasks such as classification, object detection, similarity retrieval, multimodal representations, computational aesthetics, etc. In relation to the role of AI in creating art, we address various practical and theoretical aspects of AI Art and consolidate related works that deal with those topics in detail. Finally, we provide a concise outlook on the future progression and potential impact of AI technologies on our understanding and creation of art.

* 17 pages, 3 figures 

  Access Paper or Ask Questions

Evolutionary Multitask Optimization: a Methodological Overview, Challenges and Future Research Directions

Feb 04, 2021
Eneko Osaba, Aritz D. Martinez, Javier Del Ser

In this work we consider multitasking in the context of solving multiple optimization problems simultaneously by conducting a single search process. The principal goal when dealing with this scenario is to dynamically exploit the existing complementarities among the problems (tasks) being optimized, helping each other through the exchange of valuable knowledge. Additionally, the emerging paradigm of Evolutionary Multitasking tackles multitask optimization scenarios by using as inspiration concepts drawn from Evolutionary Computation. The main purpose of this survey is to collect, organize and critically examine the abundant literature published so far in Evolutionary Multitasking, with an emphasis on the methodological patterns followed when designing new algorithmic proposals in this area (namely, multifactorial optimization and multipopulation-based multitasking). We complement our critical analysis with an identification of challenges that remain open to date, along with promising research directions that can stimulate future efforts in this topic. Our discussions held throughout this manuscript are offered to the audience as a reference of the general trajectory followed by the community working in this field in recent times, as well as a self-contained entry point for newcomers and researchers interested to join this exciting research avenue.

* 29 pages, 4 figures, under review for its consideration in Applied Soft Computing journal 

  Access Paper or Ask Questions

Enabling Robots to Draw and Tell: Towards Visually Grounded Multimodal Description Generation

Jan 14, 2021
Ting Han, Sina Zarrieß

Socially competent robots should be equipped with the ability to perceive the world that surrounds them and communicate about it in a human-like manner. Representative skills that exhibit such ability include generating image descriptions and visually grounded referring expressions. In the NLG community, these generation tasks are largely investigated in non-interactive and language-only settings. However, in face-to-face interaction, humans often deploy multiple modalities to communicate, forming seamless integration of natural language, hand gestures and other modalities like sketches. To enable robots to describe what they perceive with speech and sketches/gestures, we propose to model the task of generating natural language together with free-hand sketches/hand gestures to describe visual scenes and real life objects, namely, visually-grounded multimodal description generation. In this paper, we discuss the challenges and evaluation metrics of the task, and how the task can benefit from progress recently made in the natural language processing and computer vision realms, where related topics such as visually grounded NLG, distributional semantics, and photo-based sketch generation have been extensively studied.

* The 2nd Workshop on NLG for HRI colocated with The 13th International Conference on Natural Language Generation 

  Access Paper or Ask Questions

Unsupervised Image-to-Image Translation via Pre-trained StyleGAN2 Network

Oct 27, 2020
Jialu Huang, Jing Liao, Sam Kwong

Image-to-Image (I2I) translation is a heated topic in academia, and it also has been applied in real-world industry for tasks like image synthesis, super-resolution, and colorization. However, traditional I2I translation methods train data in two or more domains together. This requires lots of computation resources. Moreover, the results are of lower quality, and they contain many more artifacts. The training process could be unstable when the data in different domains are not balanced, and modal collapse is more likely to happen. We proposed a new I2I translation method that generates a new model in the target domain via a series of model transformations on a pre-trained StyleGAN2 model in the source domain. After that, we proposed an inversion method to achieve the conversion between an image and its latent vector. By feeding the latent vector into the generated model, we can perform I2I translation between the source domain and target domain. Both qualitative and quantitative evaluations were conducted to prove that the proposed method can achieve outstanding performance in terms of image quality, diversity and semantic similarity to the input and reference images compared to state-of-the-art works.

* 2020 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works 

  Access Paper or Ask Questions

An Approach to Evaluating Learning Algorithms for Decision Trees

Oct 26, 2020
Tianqi Xiao, Omer Nguena Timo, Florent Avellaneda, Yasir Malik, Stefan Bruda

Learning algorithms produce software models for realising critical classification tasks. Decision trees models are simpler than other models such as neural network and they are used in various critical domains such as the medical and the aeronautics. Low or unknown learning ability algorithms does not permit us to trust the produced software models, which lead to costly test activities for validating the models and to the waste of learning time in case the models are likely to be faulty due to the learning inability. Methods for evaluating the decision trees learning ability, as well as that for the other models, are needed especially since the testing of the learned models is still a hot topic. We propose a novel oracle-centered approach to evaluate (the learning ability of) learning algorithms for decision trees. It consists of generating data from reference trees playing the role of oracles, producing learned trees with existing learning algorithms, and determining the degree of correctness (DOE) of the learned trees by comparing them with the oracles. The average DOE is used to estimate the quality of the learning algorithm. the We assess five decision tree learning algorithms based on the proposed approach.

  Access Paper or Ask Questions

Hierarchical Metadata-Aware Document Categorization under Weak Supervision

Oct 26, 2020
Yu Zhang, Xiusi Chen, Yu Meng, Jiawei Han

Categorizing documents into a given label hierarchy is intuitively appealing due to the ubiquity of hierarchical topic structures in massive text corpora. Although related studies have achieved satisfying performance in fully supervised hierarchical document classification, they usually require massive human-annotated training data and only utilize text information. However, in many domains, (1) annotations are quite expensive where very few training samples can be acquired; (2) documents are accompanied by metadata information. Hence, this paper studies how to integrate the label hierarchy, metadata, and text signals for document categorization under weak supervision. We develop HiMeCat, an embedding-based generative framework for our task. Specifically, we propose a novel joint representation learning module that allows simultaneous modeling of category dependencies, metadata information and textual semantics, and we introduce a data augmentation module that hierarchically synthesizes training documents to complement the original, small-scale training set. Our experiments demonstrate a consistent improvement of HiMeCat over competitive baselines and validate the contribution of our representation learning and data augmentation modules.

* 9 pages; Accepted to WSDM 2021 

  Access Paper or Ask Questions

We Don't Speak the Same Language: Interpreting Polarization through Machine Translation

Oct 18, 2020
Ashiqur R. KhudaBukhsh, Rupak Sarkar, Mark S. Kamlet, Tom M. Mitchell

Polarization among US political parties, media and elites is a widely studied topic. Prominent lines of prior research across multiple disciplines have observed and analyzed growing polarization in social media. In this paper, we present a new methodology that offers a fresh perspective on interpreting polarization through the lens of machine translation. With a novel proposition that two sub-communities are speaking in two different \emph{languages}, we demonstrate that modern machine translation methods can provide a simple yet powerful and interpretable framework to understand the differences between two (or more) large-scale social media discussion data sets at the granularity of words. Via a substantial corpus of 86.6 million comments by 6.5 million users on over 200,000 news videos hosted by YouTube channels of four prominent US news networks, we demonstrate that simple word-level and phrase-level translation pairs can reveal deep insights into the current political divide -- what is \emph{black lives matter} to one can be \emph{all lives matter} to the other.

  Access Paper or Ask Questions