The Capacitated Vehicle Routing Problem is a well-known NP-hard problem that poses the challenge of finding the optimal route of a vehicle delivering products to multiple locations. Recently, new efforts have emerged to create constructive and perturbative heuristics to tackle this problem using Deep Learning. In this paper, we join these efforts to develop the Combined Deep Constructor and Perturbator, which combines two powerful constructive and perturbative Deep Learning-based heuristics that use attention mechanisms at their core. Furthermore, we improve the Attention Model-Dynamic for the Capacitated Vehicle Routing Problem by proposing a memory-efficient algorithm that reduces its memory complexity by a factor of the number of nodes. Our method shows promising results: it improves cost on common datasets when compared against multiple other Deep Learning methods, and it obtains results close to those of the state-of-the-art heuristics from the Operations Research field. Additionally, the proposed memory-efficient algorithm enables the use of the Attention Model-Dynamic in problem instances with more than 100 nodes.
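To make the constructive-then-perturbative pattern concrete, the following minimal sketch builds routes and then refines them. It is not the method of the paper: it assumes classical stand-ins (a greedy nearest-neighbour constructor and a within-route 2-opt perturbator) in place of the learned attention-based heuristics, and the names `construct`, `perturb`, and `route_cost` are hypothetical.

```python
import math


def route_cost(route, coords):
    """Length of one depot-to-depot tour; node 0 is assumed to be the depot."""
    path = [0] + route + [0]
    return sum(math.dist(coords[a], coords[b]) for a, b in zip(path, path[1:]))


def construct(coords, demand, capacity):
    """Constructive step (classical stand-in): greedy nearest neighbour.

    A new route is opened whenever no remaining customer fits in the vehicle.
    """
    unvisited = set(range(1, len(coords)))
    routes = []
    while unvisited:
        route, load, current = [], 0, 0
        while True:
            feasible = [n for n in sorted(unvisited) if load + demand[n] <= capacity]
            if not feasible:
                break
            nxt = min(feasible, key=lambda n: math.dist(coords[current], coords[n]))
            route.append(nxt)
            load += demand[nxt]
            unvisited.remove(nxt)
            current = nxt
        if not route:
            raise ValueError("a customer demand exceeds the vehicle capacity")
        routes.append(route)
    return routes


def perturb(routes, coords):
    """Perturbative step (classical stand-in): 2-opt reversals inside each route.

    Reversing a segment never changes a route's load, so capacity stays satisfied.
    """
    improved = []
    for route in routes:
        best = route[:]
        changed = True
        while changed:
            changed = False
            for i in range(len(best) - 1):
                for j in range(i + 1, len(best)):
                    cand = best[:i] + best[i:j + 1][::-1] + best[j + 1:]
                    if route_cost(cand, coords) < route_cost(best, coords) - 1e-12:
                        best, changed = cand, True
        improved.append(best)
    return improved
```

In this sketch the constructor guarantees feasibility and the perturbator only ever lowers the total cost; a learned constructor and perturbator would fill the same two roles.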
Video Retrieval is a challenging task where a text query is matched to a video or vice versa. Most existing approaches to this problem rely on annotations made by users. Although simple, this approach is not always feasible in practice. In this work, we explore the application of the language-image model CLIP to obtain video representations without the need for such annotations. This model was explicitly trained to learn a common space where images and text can be compared. Using various techniques described in this document, we extend its application to videos, obtaining state-of-the-art results on the MSR-VTT and MSVD benchmarks.
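One simple way to carry an image-text embedding space over to videos is to aggregate per-frame embeddings into a single video vector and rank videos by cosine similarity to the text embedding. The sketch below assumes mean pooling as the aggregation, which is only one possible choice and is not confirmed here as the technique used in this work. It also assumes the frame and text embeddings have already been computed by the model; the function names and toy vectors are hypothetical.

```python
import math


def l2_normalize(v):
    """Scale a vector to unit length (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v] if norm > 0 else list(v)


def video_embedding(frame_embeddings):
    """Assumed aggregation: average the per-frame embeddings, then normalize."""
    dim = len(frame_embeddings[0])
    count = len(frame_embeddings)
    mean = [sum(frame[d] for frame in frame_embeddings) / count for d in range(dim)]
    return l2_normalize(mean)


def rank_videos(text_embedding, videos):
    """Return video ids ordered by cosine similarity to the text query, best first."""
    query = l2_normalize(text_embedding)
    scores = {
        video_id: sum(a * b for a, b in zip(query, video_embedding(frames)))
        for video_id, frames in videos.items()
    }
    return sorted(scores, key=scores.get, reverse=True)
```

Because both sides are normalized, the dot product equals the cosine similarity, so text-to-video and video-to-text retrieval can share the same scoring.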
Suspicious behavior is likely to threaten security, assets, life, or freedom. This behavior has no particular pattern, which complicates the tasks of detecting and defining it. Even for human observers, it is difficult to spot suspicious behavior in surveillance videos. Some proposals to tackle abnormal and suspicious behavior-related problems are available in the literature. However, they usually suffer from high false-positive rates because different classes have high visual similarity. The Pre-Crime Behavior method removes information related to the commission of a crime to focus on suspicious behavior before the crime happens. The resulting samples from different types of crime have a high visual similarity with normal-behavior samples. To address this problem, we implemented 3D Convolutional Neural Networks and trained them under different approaches. We also tested different values of the number-of-filters parameter to optimize computational resources. Finally, a comparison of the performance obtained with the different training approaches shows the best option for improving suspicious behavior detection in surveillance videos.
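To illustrate why the number of filters matters for computational resources, the sketch below counts the trainable parameters of a 3D convolutional layer using the standard formula (kernel volume times input channels, plus one bias, per filter). The kernel size, channel counts, and two-layer stack are hypothetical and are not the architecture evaluated in this work.

```python
def conv3d_params(in_channels, filters, kernel=(3, 3, 3), bias=True):
    """Trainable parameters of one 3D convolutional layer."""
    k_t, k_h, k_w = kernel
    per_filter = k_t * k_h * k_w * in_channels + (1 if bias else 0)
    return per_filter * filters


def stack_params(in_channels, filters_per_layer):
    """Total parameters of consecutive 3D convolutional layers (hypothetical stack)."""
    total = 0
    for filters in filters_per_layer:
        total += conv3d_params(in_channels, filters)
        in_channels = filters  # the next layer consumes this layer's feature maps
    return total


# Doubling the filters doubles the first layer but roughly quadruples the second,
# because the second layer's input channels grow as well.
small = stack_params(3, [32, 32])  # 2624 + 27680 = 30304
large = stack_params(3, [64, 64])  # 5248 + 110656 = 115904
```

Under these assumptions, halving the filter count cuts the parameters of deeper layers by about a factor of four, which is the kind of trade-off a search over this parameter explores.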
Crime generates significant losses, both human and economic. Every year, billions of dollars are lost due to attacks, crimes, and scams. Surveillance video camera networks generate vast amounts of data, and the surveillance staff cannot process all the information in real time. Human sight has its limitations, and visual focus is among the most critical ones when dealing with surveillance. A crime can occur in a different screen segment or on a distinct monitor, and the staff may not notice it. Our proposal focuses on shoplifting crimes by analyzing situations that an average person would consider typical but that may lead to a crime. While other approaches identify the crime itself, we instead model suspicious behavior, that is, the behavior that may occur before a person commits a crime, by detecting precise segments of a video with a high probability of containing a shoplifting crime. By doing so, we provide the staff with more opportunities to act and prevent crime. We implemented a 3DCNN model as a video feature extractor and tested its performance on a dataset composed of daily-action and shoplifting samples. The results are encouraging: the model correctly identifies 75% of the cases where a crime is about to happen.
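As an illustration of how per-clip predictions could be turned into precise video segments for the staff, the sketch below merges consecutive clips whose suspicion score reaches a threshold. The scores, the clip length, the threshold, and the function name `flag_segments` are all hypothetical; the sketch assumes some classifier has already produced one score per fixed-length clip and does not reproduce the model described here.

```python
def flag_segments(clip_scores, clip_len, threshold=0.5):
    """Merge consecutive clips scoring >= threshold into frame segments.

    Returns a list of (start_frame, end_frame) pairs, with end_frame exclusive.
    """
    segments = []
    start = None
    for i, score in enumerate(clip_scores):
        if score >= threshold and start is None:
            start = i * clip_len  # a suspicious run begins at this clip
        elif score < threshold and start is not None:
            segments.append((start, i * clip_len))  # the run ended before this clip
            start = None
    if start is not None:
        # The video ended while a suspicious run was still open.
        segments.append((start, len(clip_scores) * clip_len))
    return segments


# Five 16-frame clips with hypothetical scores.
segments = flag_segments([0.1, 0.7, 0.8, 0.2, 0.9], clip_len=16)
# segments == [(16, 48), (64, 80)]
```

Reporting merged segments rather than individual clips gives an operator a small number of time ranges to review.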