Humans perceive an action stream as a sequence of clearly segmented ``action units''. This suggests that action recognition can be treated as interpreting continuous human behavior as a sequence of action primitives, such as ``picking up a coffee pot''. The novel aspect of our work is to segment continuous actions in natural tasks by detecting agent-centered switches of attention. Because eye and head movements are closely linked to attention, we develop a method that detects attention by integrating eye gaze and head position information. Attention switches are then computed and used to segment the action sequence into action units, which are recognized by hidden Markov models. An experimental system built for recognizing actions in the natural task of ``stapling a letter'' demonstrates the effectiveness of the approach.

We also observe that, when asked to describe others' activities, an observer usually produces verbal descriptions that correspond to subtasks rather than action units. That is, the observer conceptualizes the sensory input at an abstract level corresponding to tasks or subtasks and then verbalizes the perceptual results as utterances. In light of this, this work concentrates on recognizing tasks rather than individual action primitives. By tracking the course of gaze and head movements, our approach uses gaze and head cues to detect agent-centered attention switches, which are then used to segment an action sequence into action units. Once these action primitives are recognized, parallel hidden Markov models are applied to model and integrate the probabilistic sequences of action units from different body parts.
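The pipeline described above (detect attention switches from gaze and head cues, cut the stream into action units, then score each unit with hidden Markov models) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the original system: the speed thresholds, the rule that a switch occurs when gaze and head become stable again after a movement, the discrete observation symbols, and all function names are hypothetical. The sketch also uses a single HMM per action rather than the parallel HMMs mentioned above.

```python
from math import hypot


def speeds(points):
    """Frame-to-frame displacement magnitude; the first frame gets 0.0."""
    out = [0.0]
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        out.append(hypot(x1 - x0, y1 - y0))
    return out


def attention_switches(gaze, head, gaze_thresh=5.0, head_thresh=2.0):
    """Return frame indices where attention is assumed to switch.

    Assumption: a frame is 'stable' when both gaze and head move less
    than their (hypothetical) thresholds, and a switch is marked at the
    first stable frame that follows an unstable one.
    """
    stable = [
        g < gaze_thresh and h < head_thresh
        for g, h in zip(speeds(gaze), speeds(head))
    ]
    return [i for i in range(1, len(stable)) if stable[i] and not stable[i - 1]]


def segment(n_frames, switches):
    """Turn switch indices into half-open (start, end) action-unit spans."""
    bounds = [0] + list(switches) + [n_frames]
    return [(a, b) for a, b in zip(bounds, bounds[1:]) if b > a]


def forward_likelihood(obs, start, trans, emit):
    """Likelihood of a discrete observation sequence under one HMM,
    computed with the standard forward algorithm (no scaling, so it is
    only suitable for short sequences)."""
    n = len(start)
    alpha = [start[s] * emit[s][obs[0]] for s in range(n)]
    for o in obs[1:]:
        alpha = [
            sum(alpha[p] * trans[p][s] for p in range(n)) * emit[s][o]
            for s in range(n)
        ]
    return sum(alpha)


def classify(obs, models):
    """Pick the action label whose HMM gives the highest likelihood."""
    return max(models, key=lambda name: forward_likelihood(obs, *models[name]))


if __name__ == "__main__":
    # Toy data: gaze rests near one point, jumps, then rests near another.
    gaze = [(0, 0), (1, 0), (1, 1), (50, 50), (51, 50), (51, 51)]
    head = [(0, 0)] * len(gaze)
    units = segment(len(gaze), attention_switches(gaze, head))
    print(units)

    # Hypothetical one-state models over two observation symbols (0 and 1).
    models = {
        "reach": ([1.0], [[1.0]], [[0.9, 0.1]]),
        "hold": ([1.0], [[1.0]], [[0.1, 0.9]]),
    }
    print(classify([0, 0, 1], models))
```

In this sketch the frames in which the eyes are in flight are attached to the preceding unit; a real system would need to choose how to treat those transition frames and how to quantize motion features into observation symbols.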
Last modified on Sept 13, 2008