Social perception is the main channel through which human beings access the social world, much as vision and hearing are the channels through which people access the physical world. Researchers have long addressed the computational implementation of vision and hearing in domains such as computer vision and speech recognition, but only early attempts have been made to do the same for social perception. We believe social perception to be one of the missing links in the communication between humans and computers, and so in this presentation I will describe our recent research in social signal analysis (e.g., head pose, camera-based heart rate estimation, eye gaze). I will concentrate on behavior modeling and recognition, with an emphasis on sensing and understanding users' interactive actions and intentions for achieving multimodal human-computer interaction in natural settings, in particular with respect to dynamic human face and body behavior in context-dependent situations (task, mood/affect). Perspectives on multisensory observation will also be addressed.
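To make the camera-based heart rate estimation mentioned above concrete, the sketch below illustrates the core idea behind remote photoplethysmography (rPPG): subtle color changes in facial skin, driven by blood volume, produce a periodic signal whose dominant frequency is the pulse rate. This is an illustrative sketch, not the speaker's actual pipeline; it assumes a pre-extracted mean green-channel trace from a face region (the hypothetical `green_trace` input), skipping face detection and skin segmentation.

```python
import numpy as np

def estimate_heart_rate(green_trace, fps, lo_bpm=40, hi_bpm=180):
    """Estimate heart rate (BPM) from a mean green-channel trace of a face ROI.

    Minimal rPPG sketch: remove the DC component, then pick the dominant
    spectral peak inside the physiologically plausible band.
    """
    x = np.asarray(green_trace, dtype=float)
    x = x - x.mean()                                # remove DC offset
    freqs = np.fft.rfftfreq(len(x), d=1.0 / fps)    # frequency axis in Hz
    power = np.abs(np.fft.rfft(x)) ** 2             # power spectrum
    band = (freqs >= lo_bpm / 60.0) & (freqs <= hi_bpm / 60.0)
    peak_hz = freqs[band][np.argmax(power[band])]   # strongest in-band peak
    return peak_hz * 60.0                           # convert Hz to BPM

# Synthetic check: a 72 BPM pulse sampled at 30 fps for 10 seconds
rng = np.random.default_rng(0)
fps, duration, bpm = 30, 10, 72
t = np.arange(fps * duration) / fps
trace = 0.5 * np.sin(2 * np.pi * (bpm / 60.0) * t) + 0.05 * rng.standard_normal(len(t))
print(estimate_heart_rate(trace, fps))  # close to 72 BPM
```

Real systems add skin-region tracking, detrending, and bandpass filtering to cope with motion and illumination changes, but the frequency-domain peak picking shown here is the common core.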