Category Archives: Lectures

Introduction to Hadoop

Eng. Niccolò Becchi, founder of the Wikido events portal, will hold a technical seminar on Apache Hadoop on Monday, June 17, 2013 at the Media Integration and Communication Center.

Data mining representation


  • what it is and how it came about;
  • who uses it and what for;
  • map-reduce: an application in many small pieces;
  • and, especially, when it may make sense to use it even if you do not work at Facebook.

Hadoop is a tool that allows you to run scalable applications on clusters consisting of tens, hundreds or even thousands of servers. It is currently used by Facebook, Yahoo, Last.fm and many other organizations that need to work on gigabytes or even petabytes of data.

At the core of the framework is the Map-Reduce paradigm. Developed internally at Google on top of its distributed filesystem, it was created to meet the company's need for parallel processing of large amounts of data. Hadoop is an open source implementation of the ideas behind Google’s software, which anyone can use to process data on their own servers or, possibly, on the Amazon cloud (at the cost of some credit card charges!).

During the meeting, you will take your first steps with the map-reduce paradigm. Many (but not all) algorithms can be rewritten in this style of programming. We will also look at some tools that increase productivity in Map-Reduce application development.
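To give a feel for the paradigm before the seminar, here is a minimal sketch of the classic word-count example in plain Python. It only simulates the three phases (map, shuffle, reduce) in a single process; it does not use the Hadoop API, and the function names `map_phase`, `shuffle_phase` and `reduce_phase` are illustrative assumptions rather than part of any framework.

```python
from collections import defaultdict
from itertools import chain


def map_phase(line):
    """Map: emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle_phase(pairs):
    """Shuffle: group all emitted values by key, as a framework would do."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce: combine all values collected for one key into one result."""
    return key, sum(values)


def word_count(lines):
    """Run the three phases in sequence on an in-memory list of lines."""
    mapped = chain.from_iterable(map_phase(line) for line in lines)
    grouped = shuffle_phase(mapped)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


if __name__ == "__main__":
    print(word_count(["big data big clusters", "data on clusters"]))
```

Each call to `map_phase` and `reduce_phase` depends only on its own input, which is what lets a real cluster spread the "many small pieces" across many servers.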

Material: participants are encouraged to bring a PC with the Java Development Environment installed (JDK >= 1.6) and the Hadoop package (downloadable from: (#Download) > then choose 1.1.X > current stable version, 1.1 release).

Social Media Annotation

The great success of online social platforms for the creation, sharing and tagging of user-generated media has led to strong interest in the multimedia and computer vision communities in research on methods and techniques for annotating and searching social media.

Social Media

Visual content similarity, geo-tags and tag co-occurrence, together with social connections and comments, can be exploited to perform tag suggestion as well as to perform content classification and clustering and enable more effective semantic indexing and retrieval of visual data.
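As a rough illustration of how tag co-occurrence alone can drive tag suggestion, the sketch below counts how often pairs of tags appear on the same image and proposes the tags that most often accompany the ones already present. The toy data and the function names `cooccurrence_counts` and `suggest_tags` are assumptions made for this example; real systems would combine this signal with visual similarity, geo-tags and social cues.

```python
from collections import Counter
from itertools import permutations


def cooccurrence_counts(tagged_images):
    """Count ordered tag pairs (a, b) that appear together on one image."""
    counts = Counter()
    for tags in tagged_images:
        for a, b in permutations(set(tags), 2):
            counts[(a, b)] += 1
    return counts


def suggest_tags(existing, tagged_images, top_k=2):
    """Suggest the top_k tags that co-occur most often with the existing ones."""
    counts = cooccurrence_counts(tagged_images)
    scores = Counter()
    for (a, b), n in counts.items():
        if a in existing and b not in existing:
            scores[b] += n
    # Sort by descending score, then alphabetically for a stable result.
    ranked = sorted(scores.items(), key=lambda item: (-item[1], item[0]))
    return [tag for tag, _ in ranked[:top_k]]


if __name__ == "__main__":
    images = [
        ["beach", "sea", "sun"],
        ["beach", "sea"],
        ["beach", "sun", "sand"],
        ["city", "night"],
    ]
    print(suggest_tags({"beach"}, images))
```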

However, there is a need to compensate for the relatively low quality of these metadata (user-produced tags and annotations are known to be ambiguous, imprecise and/or incomplete, overly personalized and limited) and, at the same time, to take into account the ‘web-scale’ quantity of media and the fact that social network users continuously add new images and create new terms.

We will review the state-of-the-art approaches to automatic annotation and tag refinement for social images and discuss extensions to tag suggestion and localization in web video sequences.

Lecturer: Marco Bertini

Game Theory in Pattern Recognition and Machine Learning

The development of game theory in the early 1940s by John von Neumann was a reaction against the then dominant view that problems in economic theory can be formulated using standard methods from optimization theory. Indeed, most real-world economic problems typically involve conflicting interactions among decision-making agents that cannot be adequately captured by a single (global) objective function, thereby requiring a different, more sophisticated treatment. Accordingly, the main point made by game theorists is to shift the emphasis from optimality criteria to equilibrium conditions.
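The difference between an optimality criterion and an equilibrium condition can be seen in a small two-player game. The sketch below uses the textbook Prisoner's Dilemma, with payoff values chosen purely for illustration, and compares the outcome that maximizes a single global objective (the sum of payoffs) with the pure-strategy Nash equilibrium, where no player gains by deviating alone. All names here are assumptions for the example, not material from the lectures.

```python
from itertools import product

# Payoffs as (row player, column player); action 0 = cooperate, 1 = defect.
# The numbers are illustrative Prisoner's Dilemma values.
PAYOFFS = {
    (0, 0): (-1, -1),
    (0, 1): (-3, 0),
    (1, 0): (0, -3),
    (1, 1): (-2, -2),
}
ACTIONS = (0, 1)


def is_pure_nash(profile, payoffs=PAYOFFS, actions=ACTIONS):
    """True if neither player can improve by changing only their own action."""
    a, b = profile
    u_row, u_col = payoffs[profile]
    row_ok = all(payoffs[(alt, b)][0] <= u_row for alt in actions)
    col_ok = all(payoffs[(a, alt)][1] <= u_col for alt in actions)
    return row_ok and col_ok


def pure_nash_equilibria(payoffs=PAYOFFS, actions=ACTIONS):
    """Enumerate all pure-strategy Nash equilibria of a two-player game."""
    return [p for p in product(actions, repeat=2)
            if is_pure_nash(p, payoffs, actions)]


def global_optimum(payoffs=PAYOFFS):
    """Profile that maximizes one global objective: the sum of both payoffs."""
    return max(payoffs, key=lambda p: sum(payoffs[p]))


if __name__ == "__main__":
    print("equilibria:", pure_nash_equilibria())
    print("global optimum:", global_optimum())
```

In this game the only equilibrium is mutual defection, while the global objective is maximized by mutual cooperation, so the two criteria select different outcomes.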

Game Theory in Pattern Recognition and Machine Learning: graph transduction

As it provides an abstract theoretically-founded framework to elegantly model complex scenarios, game theory has found a variety of applications not only in economics and, more generally, social sciences but also in different fields of engineering and information technologies. In particular, in the past there have been various attempts aimed at formulating problems in computer vision, pattern recognition and machine learning from a game-theoretic perspective and, with the recent development of algorithmic game theory, the interest in these communities around game-theoretic models and algorithms is growing at a fast pace.

The goal of these three lectures is to offer an introduction to the basic concepts of game theory and to provide an overview of the work we’re currently doing in my group on the use of game-theoretic models in pattern recognition, computer vision, and machine learning.

I shall assume no pre-existing knowledge of game theory by the audience, thereby making the lectures self-contained and understandable by a non-expert.

The three lectures will be structured as follows:

  • Lecture 1: Introduction to the basic concepts of game theory
  • Lecture 2: Evolutionary games and data clustering
  • Lecture 3: Contextual pattern recognition and graph transduction

The lectures are based on two (broader) tutorials I gave at ICPR 2010 and CVPR 2011 (with A. Torsello).

Stereo vision algorithms for dense 3D reconstruction: introduction and recent developments

Stefano Mattoccia, of DEIS, will hold a lecture at MICC entitled “Stereo vision algorithms for dense 3D reconstruction: introduction and recent developments”.

The lecturer Stefano Mattoccia, DEIS, University of Bologna

Stereo vision enables the 3D reconstruction of scenes observed by two or more cameras. This seminar, which focuses on dense 3D reconstruction, will introduce the main problems of stereo vision and examine recent developments in the area, with particular reference to algorithms that lend themselves to being mapped onto devices with parallel processing capabilities (e.g. FPGA, GPU).
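For readers new to the topic, the following is a minimal sketch of the simplest dense matching scheme, local block matching with a sum-of-absolute-differences (SAD) cost, applied to a single rectified scanline. It is not taken from the seminar; the function names and the toy intensity values are assumptions. The disparity of each pixel is computed independently of the others, which is the property that makes this family of algorithms easy to map onto parallel hardware.

```python
def sad(left, right, x, d, half):
    """SAD between the window centred at x (left) and at x - d (right)."""
    return sum(abs(left[x + k] - right[x - d + k])
               for k in range(-half, half + 1))


def scanline_disparity(left, right, max_disp, half=1):
    """Winner-takes-all disparity for each pixel of one rectified scanline.

    Pixels too close to the border to hold a full window keep disparity 0.
    """
    width = len(left)
    disparities = [0] * width
    for x in range(half, width - half):
        best_d, best_cost = 0, float("inf")
        for d in range(max_disp + 1):
            if x - d - half < 0:
                break  # the right window would fall outside the image
            cost = sad(left, right, x, d, half)
            if cost < best_cost:
                best_d, best_cost = d, cost
        disparities[x] = best_d
    return disparities


if __name__ == "__main__":
    right = [10, 20, 80, 30, 60, 40, 90, 50, 70, 15]
    # The left view sees the same pattern shifted by two pixels.
    left = [0, 0] + right[:-2]
    print(scanline_disparity(left, right, max_disp=3))
```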

A coarse-to-fine approach for fast deformable object detection

Marco Pedersoli will present a method that can dramatically accelerate object detection with part-based models. The method is based on the observation that the cost of detection is likely to be dominated by the cost of matching each part to the image, and not by the cost of computing the optimal configuration of the parts, as is commonly assumed. Accelerating detection therefore requires minimizing the number of part-to-image comparisons.

Coarse-to-fine inference

A method for the fast inference of multi-resolution part-based models. (a) example detections; (b) scores obtained by matching the lowest resolution part (root filter) at all image locations; (c) scores obtained by matching the intermediate resolution parts, only at locations selected based on the response of the root part; (d) scores obtained by matching the high resolution parts, only at locations selected based on the intermediate resolution scores.

To this end we propose a multi-resolution hierarchical part-based model and a corresponding coarse-to-fine inference procedure that recursively eliminates unpromising part placements from the search space. The method yields a ten-fold speedup over the standard dynamic programming approach and is complementary to the cascade-of-parts approach. Compared to the latter, our method does not have parameters to be determined empirically, which simplifies its use during the training of the model. Most importantly, the two techniques can be combined to obtain a very significant speedup, of two orders of magnitude in some cases.
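The general idea of pruning a search space from coarse to fine can be illustrated with a toy one-dimensional search. This is not the presented model: the scoring functions are synthetic, each coarse position simply has two children at the next resolution, and the names `make_level_scores` and `coarse_to_fine_search` are assumptions for this sketch. The point is only to show how evaluating finer levels at surviving locations reduces the number of comparisons.

```python
def make_level_scores(target, n_levels):
    """Synthetic scores: higher when a cell's centre is close to the target.

    Level 0 is the coarsest; the finest level has one cell per position.
    """
    def level_score(level):
        scale = 2 ** (n_levels - 1 - level)
        return lambda x: -abs((x + 0.5) * scale - (target + 0.5))
    return [level_score(level) for level in range(n_levels)]


def coarse_to_fine_search(score_fns, n_coarse, keep=1):
    """Score every coarse cell, then refine only the best `keep` cells.

    Returns (best fine position, its score, number of score evaluations).
    """
    candidates = list(range(n_coarse))
    evaluations = 0
    best = None
    for level, score in enumerate(score_fns):
        scored = sorted(((score(x), x) for x in candidates), reverse=True)
        evaluations += len(scored)
        best = scored[0]
        if level + 1 < len(score_fns):
            survivors = [x for _, x in scored[:keep]]
            # Each surviving cell splits into two cells at the next level.
            candidates = [2 * x + c for x in survivors for c in (0, 1)]
    return best[1], best[0], evaluations


if __name__ == "__main__":
    fns = make_level_scores(target=13, n_levels=3)
    position, score, evaluations = coarse_to_fine_search(fns, n_coarse=4)
    print(position, evaluations)
```

With 16 fine positions, an exhaustive search needs 16 evaluations, while this toy run needs 4 + 2 + 2 = 8. The saving grows with the number of levels and the size of the search space.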

We evaluate our method extensively on the PASCAL VOC and INRIA datasets, demonstrating a very high increase in the detection speed with little degradation of the accuracy.

On automatic reading of handwriting (What the writing hand could tell to the reading eye)

Prof. Angelo Marcelli will present a principled approach to the automatic reading of cursive handwriting. The approach is based upon handwriting generation models and, in accordance with them, assumes that handwriting is a learned complex motor task accomplished by sequencing simpler movements called strokes.

Automatic reading of cursive handwriting

As learning proceeds in humans, so does fluency, which results in producing similar sequences of strokes in correspondence with sequences of letters. Such invariants therefore represent the basic drawing units to which an interpretation can be associated. Reading is then achieved by detecting the invariants used to produce the word to be recognized, associating them with their interpretations, and finally concatenating the interpretations along the ink of the word, without explicit segmentation into characters, as in analytical approaches, or the use of a dictionary of possible words, as in holistic methods.

He will conclude with a list of key issues to be addressed while pursuing the proposed approach and the steps we have undertaken along this path.

Vehicles that Learn by Observing How We Drive: Multidisciplinary Explorations in Human-Centered Driver Assistance

Understanding driver behavior and the ethnography surrounding the task of driving is essential in the development of human-centric driver assistance systems.

Laboratory for Intelligent and Safe Automobiles University of California at San Diego

Novel instrumented vehicles are used for conducting experiments in which rich contextual information about vehicle dynamics, surround and driver state is captured, both for careful, detailed ethnographic studies and as realistic data for developing algorithms that analyze multisensory signals for active safety.

In this presentation, Prof. Mohan M. Trivedi will provide a systems-oriented framework for developing multimodal sensing, inferencing algorithms and human-vehicle interfaces for safer automobiles.

He will consider the three main components of the system: driver, vehicle, and vehicle surround. He will discuss various issues and ideas for developing models of these main components as well as of the activities associated with the complex task of safe driving.

The presentation will include discussion of novel sensory systems and learning algorithms for capturing not only the dynamic surround information of the vehicle but also the state, intent and activity patterns of drivers.

He will also introduce a new type of visual display called “dynamic active display”. These displays present the driving view and safety-critical visual icons to the driver in a manner that minimizes deviation of her gaze direction without adding unnecessary visual clutter.

These contributions support the practical promise of “human-centric active safety” (HCAS) systems in enhancing safety, comfort, and convenience.

For videos and a publication list, visit LISA.

Developmental Agents for Vision

In this talk, Marco Gori introduces the notion of developmental agents, which are based on the theory of “learning from constraints” (see e.g.

Perceptual and logic constraints

It is claimed that in most interesting tasks, learning from constraints naturally leads to “deep architectures”, that emerge when following the developmental principle of focusing attention on “easy constraints”, at each stage. Interestingly, this suggests that stage-based learning, as discussed in developmental psychology, might not be primarily the outcome of biology, but it could be instead the consequence of optimization principles and complexity issues that hold regardless of the “body”.

In the second part of the talk, he gives insights into the adoption of the proposed framework in computer vision. The proposed functional approach leads naturally to different notions of features, the lowest level of which are somehow related to classical SIFT. It is pointed out that the adoption of information-theoretic principles is at the basis of feature generation at both the low and the high levels of the computer vision hierarchy. The functions that are developed are inherently independent of roto-translations and acquire scale invariance through the minimization of an appropriate entropy-based measure, which is also at the basis of the focus of attention.

Finally, he gives an overview of the different constraints emerging at different layers of the hierarchy, and claims that the overall system is expected to work in any visual environment by acting continuously, with no separation between learning and scene interpretation.

Computer Recognition of Human Activities, Objects and their Interactions

Computer Vision has graduated from a research tool in the early 1960s to a mature discipline today. Developments in cameras, computers and memory have contributed in part to this maturing of computer vision. Namely, there has been explosive growth in the number of cameras in public places, the speed of computers has increased significantly and the price of memory has decreased spectacularly. The word camera may be used in a very broad sense, since the imaging modalities range from the usual cameras capturing a visual intensity image to thermal and laser range imaging. In addition, several applications of computer vision technology are contributing to the solution of a diverse set of societal problems.

Human activities

At The University of Texas at Austin, we are pursuing a number of projects on human activity understanding and face/emotion recognition. Professor Aggarwal will present his research on modeling and recognition of actions and interactions, and of human and object interactions. The object may be a piece of luggage, a car or an unmovable object like a fence. The applications considered include monitoring of human activities in public places, identification of abandoned baggage, and face and emotion recognition. The issues considered in these problems will illustrate the richness of the ideas involved and the difficulties associated with understanding human activities. Application of the above research to monitoring and surveillance will be discussed together with actual examples and their solutions.

Reading of electronic health records through the electronic health card

Roberto Caldelli will present an application for digital terrestrial television that allows a user in possession of the electronic health card (CSE) of the Region of Tuscany to consult their Electronic Health Record (ESF) on their TV at home. The identity of the user, the owner of the CSE, is ensured through a client authentication process, defined by the Tuscany Region, based on a protocol that relies on asymmetric encryption, an HTTPS connection and X.509 certificates.

Roberto Caldelli

The application allows users to view personal health information simply and immediately, without needing a smart-card reader or having to install special libraries.

This application is a sample implementation of the delivery of value-added services through a strong authentication technique, which can be used in other application scenarios that need to provide personal information securely.