We present an online system for real time recognition of actions involving objects working in online mode. The system merges two streams of information processing running in parallel. One is carried out by a hierarchical self-organizing map (SOM) system that recognizes the performed actions by analysing the spatial trajectories of the agent's movements. It consists of two layers of SOMs and a custom made supervised neural network. The activation sequences in the first layer SOM represent the sequences of significant postures of the agent during the performance of actions. These activation sequences are subsequently recoded and clustered in the second layer SOM, and then labeled by the activity in the third layer custom made supervised neural network. The second information processing stream is carried out by a second system that determines which object among several in the agent's vicinity the action is applied to. This is achieved by applying a proximity measure. The presented method combines the two information processing streams to determine what action the agent performed and on what object. The action recognition system has been tested with excellent performance.
Human recognition of the actions of other humans is very efficient and is based on patterns of movements. Our theoretical starting point is that the dynamics of the joint movements is important to action categorization. On the basis of this theory, we present a novel action recognition system that employs a hierarchy of Self-Organizing Maps together with a custom supervised neural network that learns to categorize actions. The system preprocesses the input from a Kinect like 3D camera to exploit the information not only about joint positions, but also their first and second order dynamics. We evaluate our system in two experiments with publicly available data sets, and compare its performance to the performance with less sophisticated preprocessing of the input. The results show that including the dynamics of the actions improves the performance. We also apply an attention mechanism that focuses on the parts of the body that are the most involved in performing the actions.