MICC Research

The MICC is headed by Prof. Alberto Del Bimbo. Its research directions are automatic video annotation, content based retrieval, intelligent videosurveillance, internet applications and natural interaction.

Automatic video annotation

VidiVideo: improving accessibility of videos

May 18, 2010

The VidiVideo project takes on the challenge of substantially enhancing semantic access to video, implemented in a search engine. The outcome of the project is an audio-visual search engine composed of two parts: an automatic annotation part that runs off-line, in which detectors for more than 1000 semantic concepts are collected in a thesaurus to process and automatically annotate the video, and an interactive part that provides a video search engine for both technical and non-technical users.

Automatic trademark detection and recognition in sports videos

April 7, 2010

Measures of the appearance of trademarks and logos in a video are important in the fields of marketing and sponsorship. Sponsors can use these statistics to estimate the number of TV viewers who noticed them and then evaluate the effects of the sponsorship. The goal of the ongoing project is to create a semi-automatic system for the detection, tracking and recognition of pre-defined brands and trademarks in broadcast television. The number of appearances of a logo, its position, size and duration will be recorded to derive indexes and statistics that can be used for marketing analysis.
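As a rough illustration of how recorded detections could be turned into such statistics, the sketch below aggregates per-frame logo detections into a number of appearances, the time on screen and the mean relative size. The `Detection` record, the `brand_statistics` function and the `max_gap` parameter are hypothetical names introduced here; they are not taken from the project.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Detection:
    """One logo detection in one frame (hypothetical record layout)."""
    frame: int      # frame index in the video
    brand: str      # recognised brand name
    x: float        # top-left corner, in pixels
    y: float
    width: float    # bounding box size, in pixels
    height: float


def brand_statistics(detections: List[Detection], fps: float,
                     frame_width: int, frame_height: int,
                     max_gap: int = 1) -> Dict[str, Dict[str, float]]:
    """Summarise detections per brand.

    Frames whose indices differ by at most `max_gap` are assumed to
    belong to the same appearance of the logo.
    """
    by_brand: Dict[str, List[Detection]] = {}
    for det in detections:
        by_brand.setdefault(det.brand, []).append(det)

    stats: Dict[str, Dict[str, float]] = {}
    for brand, dets in by_brand.items():
        frames = sorted({d.frame for d in dets})
        # A new appearance starts whenever the gap between frames is too large.
        appearances = 1 + sum(
            1 for a, b in zip(frames, frames[1:]) if b - a > max_gap
        )
        mean_area = sum(d.width * d.height for d in dets) / len(dets)
        stats[brand] = {
            "appearances": appearances,
            "seconds_on_screen": len(frames) / fps,
            "mean_area_fraction": mean_area / (frame_width * frame_height),
        }
    return stats
```

Position statistics (for example the mean box centre) could be added in the same way from the `x` and `y` fields.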

Video event classification using bag-of-words and string kernels

April 6, 2010

The recognition of events in videos is a relevant and challenging task of automatic semantic video analysis. At present, one of the most successful frameworks used for object recognition tasks is the bag-of-words (BoW) approach. However, it does not model the temporal information of the video stream. We are working on a novel method to introduce temporal information within the BoW approach by modeling a video clip as a sequence of histograms of visual features, computed from each frame using the traditional BoW model.
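The idea of a clip as a sequence of per-frame BoW histograms can be sketched as follows. This is a minimal illustration under stated assumptions, not the project's method: the codebook is taken as given, all function names are hypothetical, and the edit distance with a histogram-based substitution cost is only one possible way to compare sequences, used here as a stand-in for a string kernel.

```python
from typing import List, Sequence

Vector = Sequence[float]


def nearest_word(descriptor: Vector, codebook: List[Vector]) -> int:
    """Index of the codebook entry closest to the descriptor."""
    return min(
        range(len(codebook)),
        key=lambda k: sum((d - c) ** 2 for d, c in zip(descriptor, codebook[k])),
    )


def frame_histogram(descriptors: List[Vector], codebook: List[Vector]) -> List[float]:
    """L1-normalised bag-of-words histogram for a single frame."""
    hist = [0.0] * len(codebook)
    for descriptor in descriptors:
        hist[nearest_word(descriptor, codebook)] += 1.0
    total = sum(hist)
    return [h / total for h in hist] if total else hist


def clip_to_sequence(frames: List[List[Vector]], codebook: List[Vector]) -> List[List[float]]:
    """Represent a clip as the ordered list of its frame histograms."""
    return [frame_histogram(frame, codebook) for frame in frames]


def histogram_distance(a: List[float], b: List[float]) -> float:
    """Half the L1 distance: 0 for identical, 1 for disjoint normalised histograms."""
    return 0.5 * sum(abs(x - y) for x, y in zip(a, b))


def sequence_edit_distance(s: List[List[float]], t: List[List[float]],
                           gap_cost: float = 1.0) -> float:
    """Edit distance in which substituting a frame costs the histogram distance."""
    rows, cols = len(s) + 1, len(t) + 1
    dist = [[0.0] * cols for _ in range(rows)]
    for i in range(1, rows):
        dist[i][0] = i * gap_cost
    for j in range(1, cols):
        dist[0][j] = j * gap_cost
    for i in range(1, rows):
        for j in range(1, cols):
            dist[i][j] = min(
                dist[i - 1][j] + gap_cost,          # drop a frame of s
                dist[i][j - 1] + gap_cost,          # drop a frame of t
                dist[i - 1][j - 1] + histogram_distance(s[i - 1], t[j - 1]),
            )
    return dist[-1][-1]
```

Two clips that contain the same frames in a different order produce identical pooled histograms but different sequences, which is the temporal information a single BoW histogram discards.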

Human action categorization in unconstrained videos

April 6, 2010

Building a general human activity recognition and classification system is a challenging problem because of variations in environment, people and actions. Environmental variation can be caused by cluttered or moving backgrounds, camera motion and illumination changes, while people may differ in size, shape and posture. Recently, interest-point-based models have been successfully applied to the human action classification problem, because they overcome some limitations of holistic models, such as the need to perform background subtraction and tracking. We are working on a novel method based on the visual bag-of-words model and on a new spatio-temporal descriptor.


Content based retrieval

Accurate Evaluation of HER-2 Amplification in FISH Images

May 17, 2010

In this research we present a system that supports accurate estimation of the ratio of HER-2 over CEP-17 dots in FISH images of breast tissue samples. Compared to previous work, the system incorporates a model that associates each segmented nucleus with a reliability score estimating the confidence of the HER-2 over CEP-17 ratio measured within that nucleus.
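To make the role of a per-nucleus reliability score concrete, the sketch below computes a HER-2 over CEP-17 ratio in which each nucleus contributes in proportion to its score. The text does not say how the score enters the final estimate, so the weighting scheme, the `Nucleus` record and the `min_reliability` cut-off are assumptions made purely for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Nucleus:
    """Dot counts for one segmented nucleus (hypothetical record layout)."""
    her2_dots: int
    cep17_dots: int
    reliability: float  # assumed to lie in [0, 1]


def weighted_ratio(nuclei: List[Nucleus],
                   min_reliability: float = 0.0) -> Optional[float]:
    """Reliability-weighted ratio of HER-2 dots over CEP-17 dots.

    Nuclei scoring below `min_reliability` are ignored. Returns None
    when no CEP-17 dots remain, since the ratio is then undefined.
    """
    her2 = 0.0
    cep17 = 0.0
    for nucleus in nuclei:
        if nucleus.reliability < min_reliability:
            continue
        her2 += nucleus.reliability * nucleus.her2_dots
        cep17 += nucleus.reliability * nucleus.cep17_dots
    if cep17 == 0.0:
        return None
    return her2 / cep17
```

With this weighting, a badly segmented nucleus with an implausible dot count and a score near zero has almost no influence on the overall ratio.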

Image forensics using SIFT features

April 6, 2010

In many application scenarios digital images play a basic role, and it is often important to assess whether their content is realistic or has been manipulated to mislead the viewer. Image forensics tools provide answers to such questions. We are working on a novel method that focuses in particular on detecting whether a forged image has been created by cloning an area of the image onto another zone, either to duplicate something or to conceal something undesirable.
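A common way to look for cloned regions with local features is to match keypoints of an image against the other keypoints of the same image. The sketch below assumes that keypoint positions and descriptors have already been extracted (for example with SIFT) and applies a nearest-neighbour ratio test plus a minimum spatial offset. The function name and the thresholds are hypothetical, and the project's actual method may differ.

```python
import math
from typing import List, Sequence, Tuple

Keypoint = Tuple[Tuple[float, float], Sequence[float]]  # ((x, y), descriptor)


def find_cloned_pairs(keypoints: List[Keypoint], ratio: float = 0.6,
                      min_offset: float = 10.0) -> List[Tuple[int, int]]:
    """Return index pairs of keypoints that look like copies of each other.

    A pair is kept when the best match is clearly better than the second
    best (ratio test) and the two keypoints are at least `min_offset`
    pixels apart, so that neighbouring points on the same structure are
    not reported as clones.
    """
    pairs = set()
    for i, (pos_i, desc_i) in enumerate(keypoints):
        # Distances from keypoint i to every other keypoint, closest first.
        candidates = sorted(
            (math.dist(desc_i, desc_j), j)
            for j, (_, desc_j) in enumerate(keypoints)
            if j != i
        )
        if len(candidates) < 2:
            continue
        (best, j), (second, _) = candidates[0], candidates[1]
        far_enough = math.dist(pos_i, keypoints[j][0]) >= min_offset
        if best < ratio * second and far_enough:
            pairs.add((min(i, j), max(i, j)))
    return sorted(pairs)
```

In practice the matched pairs would then be clustered and checked for a consistent geometric transformation before a region is declared cloned.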

SIFTPose: local pose estimation from a single scale invariant keypoint

March 29, 2010

The aim of this project is to develop a new method for estimating the poses of imaged scene surfaces, provided that they can be locally approximated by their tangent planes. Our approach performs an accurate direct estimation by exploiting the robustness of the scale invariant feature transform (SIFT). The results are representative of the state of the art for this challenging task.


Intelligent videosurveillance

Scale Invariant 3D Multi-Person Tracking with a PTZ camera

July 14, 2010

This research aims to build a videosurveillance system for real-time 3D tracking of multiple people moving over an extended area, as seen from a rotating and zooming camera. The proposed method exploits multi-view image matching techniques to obtain dynamic calibration of the camera and to track many ground targets simultaneously, by slewing the video sensor from target to target and zooming in and out as necessary.

Optimal face detection and tracking

June 18, 2010

The project’s goal is to develop a reliable face detector and tracker for indoor video surveillance. The problem we have been asked to address is to provide good quality face images of people entering restricted areas. Those images will be used for face recognition, and the face recognition system will provide feedback indicating whether the person has been recognized.

Particle filter-based visual tracking

June 11, 2010

The project’s goal is to develop a computationally efficient, robust, real-time particle filter-based visual tracker. In particular, we aim to increase the robustness of the tracker when it is used in conjunction with a weak (but computationally efficient) appearance model, such as color histograms. To achieve this goal, we have proposed an adaptive parameter estimation method that estimates the statistical parameters of the particle filter on-line, so that the uncertainty in the filter can be increased or reduced depending on a measure of its performance (tracking quality).
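The idea of raising or lowering the filter's uncertainty according to tracking quality can be illustrated with a one-dimensional bootstrap particle filter. This is only a sketch under stated assumptions: the thresholds in `adapt_noise`, the use of the best particle likelihood as the quality measure, and the Gaussian likelihood standing in for an appearance model are all hypothetical choices, not the estimation method proposed in the project.

```python
import math
import random
from typing import List, Tuple


def adapt_noise(sigma: float, quality: float, low: float = 0.3,
                high: float = 0.7, factor: float = 1.5,
                sigma_min: float = 0.5, sigma_max: float = 20.0) -> float:
    """Widen the process noise when tracking is poor, narrow it when good."""
    if quality < low:
        sigma *= factor
    elif quality > high:
        sigma /= factor
    return max(sigma_min, min(sigma_max, sigma))


def filter_step(particles: List[float], observation: float, sigma: float,
                rng: random.Random,
                obs_sigma: float = 2.0) -> Tuple[List[float], float]:
    """One predict / weight / resample step; returns particles and quality."""
    # Predict: random-walk motion model with process noise sigma.
    predicted = [p + rng.gauss(0.0, sigma) for p in particles]
    # Weight: a Gaussian likelihood stands in for an appearance score.
    weights = [
        math.exp(-0.5 * ((p - observation) / obs_sigma) ** 2)
        for p in predicted
    ]
    quality = max(weights)  # likelihood of the best particle, in [0, 1]
    if sum(weights) == 0.0:
        # No particle explains the observation: keep the spread, report failure.
        return predicted, 0.0
    resampled = rng.choices(predicted, weights=weights, k=len(predicted))
    return resampled, quality
```

A tracking loop would call `filter_step` for each frame and then feed the returned quality into `adapt_noise` to obtain the noise level for the next frame.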

Mobile Robot Path Tracking with uncalibrated cameras

April 29, 2010

This transfer project addresses the motion control problem of a wheeled mobile robot (WMR) as observed from uncalibrated ceiling cameras. We develop a method that localizes the robot in real time and drives it along a path in a large environment with a pure pursuit controller, achieving a cross-track error of less than 5 pixels. Experiments are reported for Ambrogio, a two-wheel differentially-driven mobile robot provided by Zucchetti Centro Sistemi.
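For readers unfamiliar with pure pursuit, the sketch below shows the standard geometric rule for a differential-drive robot: steer along the circular arc that passes through a look-ahead point on the path. It works in a generic planar frame and does not model the uncalibrated camera setup described above; the function names and parameters are hypothetical.

```python
import math
from typing import Tuple


def pure_pursuit_curvature(x: float, y: float, theta: float,
                           goal_x: float, goal_y: float) -> float:
    """Curvature of the arc from pose (x, y, theta) to the look-ahead point.

    Uses the classic relation curvature = 2 * lateral_offset / distance^2,
    where the lateral offset is measured in the robot frame (y to the left).
    """
    dx, dy = goal_x - x, goal_y - y
    dist_sq = dx * dx + dy * dy
    if dist_sq == 0.0:
        return 0.0
    lateral = -math.sin(theta) * dx + math.cos(theta) * dy
    return 2.0 * lateral / dist_sq


def wheel_speeds(speed: float, curvature: float,
                 track_width: float) -> Tuple[float, float]:
    """Left and right wheel speeds of a differential drive for a given arc."""
    omega = speed * curvature  # angular velocity
    return (speed - omega * track_width / 2.0,
            speed + omega * track_width / 2.0)


def cross_track_error(px: float, py: float, ax: float, ay: float,
                      bx: float, by: float) -> float:
    """Distance from point (px, py) to the path segment from A to B."""
    abx, aby = bx - ax, by - ay
    length_sq = abx * abx + aby * aby
    if length_sq == 0.0:
        return math.hypot(px - ax, py - ay)
    # Project the point onto the segment and clamp to its end points.
    t = ((px - ax) * abx + (py - ay) * aby) / length_sq
    t = max(0.0, min(1.0, t))
    return math.hypot(px - (ax + t * abx), py - (ay + t * aby))
```

The same `cross_track_error` computation applies whether coordinates are expressed in metres or, as in the experiments above, in image pixels.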

3D Mesh Partitioning

March 29, 2010

In this research, a model is proposed for the decomposition of 3D objects based on Reeb graphs. The model is motivated by perceptual principles and supports the identification of salient object protrusions. Experimental results demonstrate the effectiveness of the proposed approach with respect to alternative solutions that have appeared in the literature, and with reference to ground-truth data obtained by manually decomposing 3D objects.

3D Face Recognition

March 29, 2010

In this research, we present a novel approach to 3D face matching that shows high effectiveness in distinguishing facial differences between distinct individuals from differences induced by non-neutral expressions within the same individual. We present an extensive comparative evaluation of performance with the FRGC v2.0 dataset and the SHREC08 dataset.


Internet applications

LIT: Lexicon of the Italian Television

July 20, 2010

LIT (Lexicon of the Italian Television) is a project conceived by the Accademia della Crusca, the leading research institution on the Italian language, in collaboration with CLIEO (Center for theoretical and historical Linguistics: Italian, European and Oriental languages). Its aim is to study the frequencies of the Italian lexicon used in television content, and it targets the specific sector of web applications for linguistic research. The corpus of transcriptions consists of approximately 170 hours of random television recordings broadcast by the national broadcaster RAI (Italian Radio Television) during 2006.

IM3I: immersive multimedia interfaces

May 19, 2010

The IM3I project addresses the needs of a new generation of the media and communication industry, which has to contend not only with changing technologies but also with a radical change in media consumption behaviour. IM3I will enable new ways of accessing and presenting media content to users, and new ways for users to interact with services, offering a natural and transparent way to deal with the complexities of interaction while hiding them from the user.

Mediateca di Palazzo Medici Riccardi

March 29, 2010

The Mediateca Medicea is a digital archive relating to Palazzo Medici Riccardi, one of the most important buildings in Florence, which now belongs to the Provincial Authority and houses the administrative offices. The Mediateca Medicea is designed in particular for academics and experts in the fields of art, history, the humanities, photography and the conservation of the cultural heritage, but also for students or scholars following up specific strands of research.


Natural interaction

Multi-user interactive table for neurocognitive and neuromotor rehabilitation

March 29, 2010

This project concerns the design and development of a multi-touch system that provides innovative tools for the neurocognitive and neuromotor rehabilitation of patients with senile diseases. The project stems from a collaboration between MICC, the Faculty of Psychology (University of Florence) and Montedomini A.S.P., a public agency that offers welfare and health care services to self-sufficient and disabled elders.

TANGerINE Grape

March 26, 2010

TANGerINE Grape is a collaborative knowledge sharing system that can be used through natural and tangible interfaces. The final goal is to enable users to enrich their knowledge by acquiring information both from digital libraries and from the knowledge shared by other users involved in the same interaction session.

Multi-user environment for semantic search of multimedia contents

March 26, 2010

This research project exploits new technologies (multi-touch table and iPhone) to develop a multi-user, multi-role and multi-modal system for multimedia content search, annotation and organization. As a use case, we considered the field of broadcast journalism, where editors and archivists work together to create a film report using archive footage.

CocoNUIT

March 26, 2010

This project aims to build a lightweight, flexible and extensible Cocoa framework for creating multi-touch and, more generally, tangible apps. It implements basic gesture recognition and lets each user easily define and set up their own gestures. Given its design, we hope the framework will work well with Quartz and Core Animation to build fun and useful apps. It also offers many off-the-shelf widgets for quickly building your own NUI app.

TANGerINE Cities

March 25, 2010

TANGerINE Cities is a research project that investigates collaborative tangible applications. It was developed within the TANGerINE research project. It is an ongoing investigation of TUIs (tangible user interfaces) that combines previous experience with natural vision-based gestural interaction on augmented surfaces and tabletops with the introduction of smart wireless objects and sensor fusion techniques.

Shawbak

March 25, 2010

Shawbak is a technology transfer project realized for the international exhibition From Petra to Shawbak: archeology of a frontier. A multi-touch tabletop was built for this exhibition, which presents the results of the latest international archaeological investigations and of the research conducted over the past twenty years by the archaeological mission of the University of Florence in Jordan at the sites of Petra and Shawbak, one of the most important historical areas in the world.
