Document similarity is an important part of Natural Language Processing and is most commonly used for plagiarism detection and text summarization. Finding the most effective document similarity algorithm could therefore have a major positive impact on the field. This report examines the many document similarity algorithms and determines which are the most useful. It groups them into three categories: statistical algorithms, neural networks, and corpus/knowledge-based algorithms. The most effective algorithms in each category are then compared using a series of benchmark datasets and evaluations chosen to cover the areas in which each algorithm could be used.
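As an illustration of the statistical category, the following is a minimal sketch of TF-IDF weighting with cosine similarity. It is a common baseline and an assumption on our part, not necessarily one of the algorithms evaluated in the report. The function names, the whitespace tokenizer, and the smoothed IDF formula are all illustrative choices.

```python
import math
from collections import Counter

PUNCTUATION = ".,;:!?()\"'"


def tokenize(text):
    """Lowercase, split on whitespace, and strip surrounding punctuation."""
    words = (w.strip(PUNCTUATION).lower() for w in text.split())
    return [w for w in words if w]


def tfidf_vectors(docs):
    """Return one sparse TF-IDF vector (dict of term -> weight) per document."""
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(tokenized)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        tf = Counter(tokens)
        # Smoothed IDF (an assumed variant) keeps every weight positive.
        vectors.append({
            term: (count / len(tokens)) * (math.log((1 + n_docs) / (1 + df[term])) + 1)
            for term, count in tf.items()
        })
    return vectors


def cosine_similarity(a, b):
    """Cosine of the angle between two sparse vectors; 0.0 if either is empty."""
    dot = sum(weight * b.get(term, 0.0) for term, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)
```

Two documents that share most of their terms score close to 1, while documents with no terms in common score exactly 0. This is also the main weakness of purely statistical methods: paraphrases with different vocabulary look unrelated.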
Discovery and recognition of Group Activities (GA) from imagery data have significant applications in persistent surveillance systems, which play an important role in some Internet services. The process involves analyzing sequential imagery data with spatiotemporal associations. Interpreting video imagery requires an inference system capable of discriminating between cohesive observations and linking them to known ontologies. We propose an ontology-based Group Activity Recognition (GAR) approach with an inference model that can identify and classify a sequence of events in group activities. A multi-layered Hidden Markov Model (HMM) is proposed to recognize GA at different levels of abstraction. It consists of N layers, each comprising M HMMs running in parallel. The number of layers depends on the order of information to be extracted. At each layer, by matching and correlating attributes of detected group events, the model attempts to associate sensory observations with known ontology perceptions. This paper demonstrates and compares the performance of three implementations of the multi-layered HMM: concatenated N-HMM, cascaded C-HMM, and hybrid H-HMM.
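The layered arrangement can be sketched as follows, under stated assumptions. Each layer holds several discrete HMMs that score the same window of observations in parallel, and the index of the best-scoring HMM becomes an observation symbol for the next layer. This is a generic layered-HMM scheme, not the N-HMM, C-HMM, or H-HMM variants from the paper. The dictionary layout of an HMM (`start`, `trans`, `emit`) and all function names are hypothetical.

```python
def forward_likelihood(hmm, observations):
    """Probability of a symbol sequence under a discrete HMM (forward algorithm)."""
    start, trans, emit = hmm["start"], hmm["trans"], hmm["emit"]
    n_states = len(start)
    alpha = [start[s] * emit[s][observations[0]] for s in range(n_states)]
    for obs in observations[1:]:
        alpha = [
            sum(alpha[p] * trans[p][s] for p in range(n_states)) * emit[s][obs]
            for s in range(n_states)
        ]
    return sum(alpha)


def classify_window(layer, window):
    """Score the window with every HMM in the layer; return the best model's index."""
    scores = [forward_likelihood(hmm, window) for hmm in layer]
    return max(range(len(layer)), key=lambda i: scores[i])


def run_layers(layers, observations, window_size):
    """Pass each layer's per-window labels to the next layer as its observations."""
    sequence = list(observations)
    for layer in layers:
        # Non-overlapping windows; a trailing partial window is dropped.
        sequence = [
            classify_window(layer, sequence[i:i + window_size])
            for i in range(0, len(sequence) - window_size + 1, window_size)
        ]
    return sequence
```

The sketch multiplies raw probabilities, which underflows on long windows; a practical version would work in log space or rescale `alpha` at each step.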
On a factory production line, industrial parts need to be quickly differentiated and sorted for further processing. Parts can differ in color and shape, and it is tedious for humans to sort them into the appropriate categories. Automating this process would save time and cost. A key challenge in automation is choosing an appropriate model to detect and classify objects based on specific features. In this paper, three neural network models are compared for use in an object sorting system: CNN, Fast R-CNN, and Faster R-CNN. The models are tested and their performance is analyzed. In addition, an Arduino-controlled 5 DoF (degrees of freedom) robot arm is programmed to grab symmetrical objects and drop them in the target zone. Objects are categorized into classes based on color and on whether they are defective or non-defective.
This paper investigates different methods for detecting obstacles ahead of a robot using a camera on the robot, an aerial camera, and an ultrasound sensor. We also explore efficient path-finding methods for navigating the robot to the target. Single- and multi-iteration angle-based (theta-based) navigation algorithms were developed. These path-finding algorithms were compared with Dijkstra's algorithm, and their performance was analyzed.
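For reference, the Dijkstra baseline can be sketched on an occupancy grid as below. The grid encoding (1 marks an obstacle), the 4-connected moves with unit cost, and the function name are assumptions made for illustration. The paper's angle-based algorithms are not reproduced here.

```python
import heapq


def dijkstra_grid(grid, start, goal):
    """Shortest 4-connected path on a grid where 1 marks an obstacle.

    Returns the path as a list of (row, col) cells from start to goal,
    or None if the goal cannot be reached.
    """
    rows, cols = len(grid), len(grid[0])
    dist = {start: 0}
    prev = {}
    queue = [(0, start)]
    while queue:
        d, cell = heapq.heappop(queue)
        if cell == goal:
            # Walk the predecessor links back to the start.
            path = [cell]
            while cell in prev:
                cell = prev[cell]
                path.append(cell)
            return path[::-1]
        if d > dist[cell]:
            continue  # stale queue entry; a shorter route was already found
        r, c = cell
        for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            if 0 <= nr < rows and 0 <= nc < cols and grid[nr][nc] == 0:
                nd = d + 1  # every move costs one step
                if nd < dist.get((nr, nc), float("inf")):
                    dist[(nr, nc)] = nd
                    prev[(nr, nc)] = cell
                    heapq.heappush(queue, (nd, (nr, nc)))
    return None
```

Dijkstra's algorithm guarantees the shortest path on the grid but needs a map of the obstacles in advance, which is the kind of information an aerial camera could supply.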
Face recognition is most often used for biometric user authentication, identifying a user based on his or her facial features. It is in high demand, used by many businesses and built into devices such as smartphones and surveillance cameras. However, accuracy remains a frequent problem with this user-verification method. Numerous approaches and algorithms have been tried to improve it. This research develops one such algorithm by combining two approaches. Using concepts from linear algebra and computational geometry, it examines the integration of Principal Component Analysis (PCA) with Delaunay triangulation: the method triangulates a set of face landmark points and obtains eigenfaces of the provided images. The algorithm is compared with traditional PCA, and the choice of face landmark points is discussed with a view to achieving an effective recognition rate.
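The PCA half of the method can be illustrated with a deliberately small sketch. It assumes each face has already been reduced to a numeric feature vector (for example flattened pixels, or measurements derived from the landmark triangles), finds only the single leading principal component by power iteration, and matches a probe to the nearest gallery sample along that component. The Delaunay triangulation step is not shown, and all names are hypothetical.

```python
import math


def leading_component(rows, iterations=200):
    """Mean and leading principal component of the rows, via power iteration."""
    n = len(rows)
    mean = [sum(col) / n for col in zip(*rows)]
    centered = [[x - m for x, m in zip(row, mean)] for row in rows]
    dim = len(mean)
    # Deterministic non-uniform start vector; power iteration can stall in the
    # rare case that the start is orthogonal to the leading component.
    v = [float(j + 1) for j in range(dim)]
    norm = math.sqrt(sum(x * x for x in v))
    v = [x / norm for x in v]
    for _ in range(iterations):
        # Apply X^T X to v without building the covariance matrix explicitly.
        proj = [sum(a * b for a, b in zip(row, v)) for row in centered]
        w = [sum(p * row[j] for p, row in zip(proj, centered)) for j in range(dim)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            break
        v = [x / norm for x in w]
    return mean, v


def project(sample, mean, component):
    """Coordinate of a sample along the component, after mean-centering."""
    return sum((x - m) * c for x, m, c in zip(sample, mean, component))


def nearest_label(sample, gallery, labels, mean, component):
    """Label of the gallery sample whose projection is closest to the probe's."""
    target = project(sample, mean, component)
    gaps = [abs(project(g, mean, component) - target) for g in gallery]
    return labels[gaps.index(min(gaps))]
```

A real eigenface system keeps many components rather than one and would normally rely on a linear algebra library for the eigen decomposition; the single-component version only shows the projection and matching idea.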