In this work, we address the problem of audio-based near-duplicate video retrieval. We propose the Audio Similarity Learning (AuSiL) approach, which effectively captures temporal patterns of audio similarity between video pairs. To compute a robust similarity between two videos, we first extract representative audio-based video descriptors by leveraging transfer learning based on a Convolutional Neural Network (CNN) trained on a large-scale dataset of audio events, and then calculate the similarity matrix derived from the pairwise similarity of these descriptors. The similarity matrix is subsequently fed to a CNN that captures the temporal structures existing within its content. We train our network following a triplet generation process and optimizing the triplet loss function. To evaluate the effectiveness of the proposed approach, we have manually annotated two publicly available video datasets based on the audio duplication between their videos. The proposed approach achieves very competitive results compared to three state-of-the-art methods. Also, unlike the competing methods, it remains robust when retrieving audio duplicates generated with speed transformations.
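The two computations named in the abstract above, the pairwise similarity matrix and the triplet loss, can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration: cosine similarity as the pairwise measure, a margin of 0.5, and the helper names `similarity_matrix`, `video_similarity`, and `triplet_loss`. In particular, `video_similarity` is a hand-written stand-in (mean of row-wise maxima) for the learned CNN that the abstract describes, not the network itself.

```python
import math


def cosine(u, v):
    """Cosine similarity of two descriptor vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(u, v))
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(y * y for y in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def similarity_matrix(a, b):
    """Pairwise similarity between the descriptors of two videos.

    a and b are lists of descriptor vectors, one per audio segment.
    Returns a len(a) x len(b) matrix as a list of rows.
    """
    return [[cosine(u, v) for v in b] for u in a]


def video_similarity(sim):
    """Assumed stand-in for the learned CNN: mean of the row-wise maxima."""
    return sum(max(row) for row in sim) / len(sim)


def triplet_loss(s_pos, s_neg, margin=0.5):
    """Hinge-style triplet loss on similarities; the margin value is an assumption."""
    return max(0.0, s_neg - s_pos + margin)


# Toy descriptors (hypothetical values, not real audio features).
anchor = [[1.0, 0.0], [0.0, 1.0]]
positive = [[2.0, 0.0], [0.0, 3.0]]   # same directions as the anchor
negative = [[1.0, 1.0], [1.0, 1.0]]   # different directions

s_pos = video_similarity(similarity_matrix(anchor, positive))
s_neg = video_similarity(similarity_matrix(anchor, negative))
loss = triplet_loss(s_pos, s_neg)
```

The loss is zero once the anchor-positive similarity exceeds the anchor-negative similarity by at least the margin, which is the condition training on triplets pushes towards.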
In an effort to penetrate the market at an affordable cost, consumer robots tend to provide limited processing capabilities, just enough to serve the purpose they have been designed for. However, a robot should, in principle, be able to interact with and process the constantly increasing information streams generated by sensors or other devices. This would require implementing algorithms and mathematical models for the accurate processing of large data volumes, as well as significant computational resources. As the data deluge continues to grow exponentially, deploying such algorithms on consumer robots will not be easy. This work presents a cloud-based architecture that aims to offload computation from robots to a remote infrastructure by utilizing cloud technologies. In this way, robots can benefit from the functionality offered by complex algorithms that are executed in the cloud. The proposed system architecture allows developers and engineers who are not specialised in robotic implementation environments to use generic robotic algorithms and services off the shelf.
The communication and collaboration of Cyber-Physical Systems, including machines and robots, among themselves and with humans, is expected to attract researchers' interest for years to come. A key element of the new revolution is the Internet of Things (IoT). IoT infrastructures enable communication between different connected devices using internet protocols. The integration of robots in an IoT platform can improve robot capabilities by providing access to other devices and resources. In this paper, we present an IoT-enabled application including a NAO robot that communicates through an IoT platform with a reflex measurement system and a hardware node that provides robotics-oriented services in the form of RESTful web services. An activity reminder application is also included, illustrating the extension capabilities of the system.
FPGAs are commonly used to accelerate domain-specific algorithmic implementations, as they can achieve impressive performance boosts, are reprogrammable, and exhibit low power consumption. In this work, the SqueezeNet DCNN is accelerated using an SoC FPGA so that the object recognition resource it offers can be employed in a robotic application. Experiments are conducted to investigate the performance and power consumption of the implementation in comparison to deployment on other widely used computational systems.