MICC Research
The MICC is headed by Prof. Alberto Del Bimbo. Its research directions are automatic video annotation, content based retrieval, intelligent videosurveillance, internet applications and natural interaction.
Automatic video annotation
VidiVideo: improving accessibility of videos
May 18, 2010
The VidiVideo project takes on the challenge of creating substantially enhanced semantic access to video, implemented in a search engine. The outcome of the project is an audio-visual search engine composed of two parts: an automatic annotation part that runs off-line, in which detectors for more than 1000 semantic concepts are collected in a thesaurus to process and automatically annotate the video, and an interactive part that provides a video search engine for both technical and non-technical users.
Automatic trademark detection and recognition in sports videos
April 7, 2010
Measuring how often trademarks and logos appear in a video is important in marketing and sponsoring. Sponsors can use these statistics to estimate the number of TV viewers who noticed them and then evaluate the effects of the sponsorship. The goal of the ongoing project is to create a semi-automatic system for detection, tracking and recognition of pre-defined brands and trademarks in broadcast television. The number of appearances of a logo, its position, size and duration will be recorded to derive indexes and statistics that can be used for marketing analysis.
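As an illustration of how recorded appearances could be turned into statistics, here is a minimal sketch. It assumes the detector already outputs one bounding box per frame in which a logo is visible. The `Detection` type, the `logo_statistics` function, the `max_gap` rule for splitting appearances and the toy data are all hypothetical and are not taken from the project.

```python
from dataclasses import dataclass

@dataclass
class Detection:
    """One logo detection (hypothetical format): frame index and bounding box."""
    frame: int
    x: float
    y: float
    w: float
    h: float

def logo_statistics(detections, fps, max_gap=1):
    """Aggregate per-frame detections of a single logo into simple statistics.

    Frames separated by more than max_gap frames are counted as
    separate appearances (an assumed rule, not the project's).
    """
    frames = sorted({d.frame for d in detections})
    appearances = 0
    previous = None
    for f in frames:
        if previous is None or f - previous > max_gap:
            appearances += 1
        previous = f
    n = len(detections)
    return {
        "appearances": appearances,
        "seconds_on_screen": len(frames) / fps,
        "mean_area": sum(d.w * d.h for d in detections) / n if n else 0.0,
        "mean_center": (
            sum(d.x + d.w / 2 for d in detections) / n,
            sum(d.y + d.h / 2 for d in detections) / n,
        ) if n else None,
    }

# Toy data: the logo is visible in frames 0-2 and again in frames 10-11.
detections = [Detection(f, 100.0, 50.0, 10.0, 20.0) for f in (0, 1, 2, 10, 11)]
stats = logo_statistics(detections, fps=25.0)
```

With this toy input the logo is counted as two appearances, because the gap between frames 2 and 10 exceeds `max_gap`.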
Video event classification using bag-of-words and string kernels
April 6, 2010
The recognition of events in videos is a relevant and challenging task of automatic semantic video analysis. At present, one of the most successful frameworks used for object recognition tasks is the bag-of-words (BoW) approach. However, it does not model the temporal information of the video stream. We are working on a novel method to introduce temporal information within the BoW approach by modeling a video clip as a sequence of histograms of visual features, computed from each frame using the traditional BoW model.
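The representation described above, a clip as an ordered sequence of per-frame BoW histograms, can be sketched as follows. This is only an illustration under simplifying assumptions: descriptors are plain tuples, the codebook is given rather than learned, and the function names and toy data are hypothetical. How the resulting sequences are compared (the string kernel in the project title) is not shown.

```python
from math import dist

def bow_histogram(descriptors, codebook):
    """Normalized histogram of nearest-codeword assignments for one frame."""
    counts = [0] * len(codebook)
    for d in descriptors:
        # Assign the descriptor to its closest visual word (Euclidean distance).
        nearest = min(range(len(codebook)), key=lambda k: dist(d, codebook[k]))
        counts[nearest] += 1
    total = sum(counts)
    return [c / total for c in counts] if total else [0.0] * len(codebook)

def clip_to_sequence(frames, codebook):
    """Represent a clip as an ordered list of per-frame BoW histograms."""
    return [bow_histogram(frame, codebook) for frame in frames]

# Toy data (hypothetical): 2-D descriptors and a 3-word codebook.
codebook = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
frames = [
    [(0.1, 0.0), (0.9, 0.1)],              # frame 1: two descriptors
    [(0.0, 0.9), (0.1, 1.1), (1.0, 0.1)],  # frame 2: three descriptors
]
sequence = clip_to_sequence(frames, codebook)
```

Unlike a single histogram pooled over the whole clip, the list keeps the frame order, which is the temporal information a plain BoW model discards.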
Human action categorization in unconstrained videos
April 6, 2010
Building a general human activity recognition and classification system is a challenging problem because of the variations in environment, people and actions. Environment variation can be caused by a cluttered or moving background, camera motion and illumination changes, while people may differ in size, shape and posture. Recently, interest-point-based models have been successfully applied to the human action classification problem because they overcome some limitations of holistic models, such as the need to perform background subtraction and tracking. We are working on a novel method based on the visual bag-of-words model and on a new spatio-temporal descriptor.
Content based retrieval
Accurate Evaluation of HER-2 Amplification in FISH Images
May 17, 2010
In this research we present a system that supports accurate estimation of the ratio of HER-2 over CEP-17 dots in FISH images of breast tissue samples. Compared to previous work, the system incorporates a model to associate with each segmented nucleus a reliability score that estimates the confidence of the measure of the ratio of HER-2 over CEP-17 dots within the nucleus.
Image forensics using SIFT features
April 6, 2010
In many application scenarios digital images play a basic role, and it is often important to assess whether their content is realistic or has been manipulated to mislead the viewer. Image forensics tools provide answers to such questions. We are working on a novel method that focuses in particular on detecting whether a forged image has been created by cloning an area of the image onto another zone, either to duplicate something or to hide something awkward.
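One common way to look for cloned regions with local features is to match an image's keypoints against each other and keep pairs whose descriptors are nearly identical but whose positions are far apart. The sketch below illustrates that general idea only and is not necessarily the method developed in this project. The toy 3-D descriptors stand in for real SIFT descriptors, and `clone_candidates`, `ratio` and `min_shift` are hypothetical names and parameters.

```python
from math import dist

def clone_candidates(keypoints, ratio=0.6, min_shift=10.0):
    """Return index pairs of keypoints that may belong to cloned regions.

    keypoints: list of ((x, y), descriptor) tuples from a single image.
    A pair is kept when the best match passes a nearest-neighbour ratio
    test and the two keypoints are at least min_shift pixels apart.
    """
    pairs = set()
    for i, (pos_i, desc_i) in enumerate(keypoints):
        # Distances from keypoint i to every other keypoint, closest first.
        others = sorted(
            (dist(desc_i, desc_j), j)
            for j, (_, desc_j) in enumerate(keypoints)
            if j != i
        )
        if len(others) < 2:
            continue
        (d1, j), (d2, _) = others[0], others[1]
        far_enough = dist(pos_i, keypoints[j][0]) >= min_shift
        if far_enough and d1 < ratio * d2:
            pairs.add((min(i, j), max(i, j)))
    return sorted(pairs)

# Toy data: keypoint 1 has almost the same descriptor as keypoint 0.
keypoints = [
    ((10.0, 10.0), (1.0, 0.0, 0.0)),
    ((100.0, 50.0), (1.0, 0.01, 0.0)),
    ((40.0, 80.0), (0.0, 1.0, 0.0)),
    ((70.0, 20.0), (0.0, 0.0, 1.0)),
]
matches = clone_candidates(keypoints)
```

In a real pipeline the matched pairs would still need to be grouped and checked for a consistent geometric transformation before an area is declared cloned.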
SIFTPose: local pose estimation from a single scale invariant keypoint
March 29, 2010
The aim of this project is to develop a new method of estimating the poses of imaged scene surfaces provided that they can be locally approximated by their tangent planes. Our approach performs an accurate direct estimation by exploiting the robustness of scale invariant feature transform (SIFT). The results are representative of the state of the art for this challenging task.
Intelligent videosurveillance
Scale Invariant 3D Multi-Person Tracking with a PTZ camera
July 14, 2010
This research aims to realize a videosurveillance system for real-time 3D tracking of multiple people moving over an extended area, as seen from a rotating and zooming camera. The proposed method exploits multi-view image matching techniques to obtain dynamic calibration of the camera and to track many ground targets simultaneously, slewing the video sensor from target to target and zooming in and out as necessary.
Optimal face detection and tracking
June 18, 2010
The project’s goal is to develop a reliable face detector and tracker for indoor video surveillance. The problem we have been asked to address is providing good quality face images of people entering restricted areas. Those images will be used for face recognition, and the face recognition system will provide feedback stating whether the person has been recognized.
Particle filter-based visual tracking
June 11, 2010
The project’s goal is to develop a computationally efficient, robust, real-time particle filter-based visual tracker. In particular, we aim to increase the robustness of the tracker when it is used in conjunction with a weak (but computationally efficient) appearance model, such as color histograms. To achieve this goal, we have proposed an adaptive parameter estimation method that estimates the statistical parameters of the particle filter on-line, so that the uncertainty in the filter can be increased or reduced depending on a measure of its performance (tracking quality).
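The idea of widening or narrowing the filter's uncertainty according to tracking quality can be sketched with a toy 1-D particle filter. This is an illustration under stated assumptions, not the project's estimator: the state is a single coordinate, the motion model is a random walk, and the normalized effective sample size is used as a stand-in for the tracking quality measure. The function names, thresholds and measurements are hypothetical.

```python
import math
import random

def filter_step(particles, measurement, sigma, rng, meas_sigma=1.0):
    """One predict/weight/resample cycle of a 1-D particle filter."""
    # Predict with a random-walk motion model whose spread is sigma.
    predicted = [p + rng.gauss(0.0, sigma) for p in particles]
    # Weight each particle by a Gaussian likelihood of the measurement.
    weights = [math.exp(-0.5 * ((measurement - p) / meas_sigma) ** 2)
               for p in predicted]
    total = sum(weights)
    n = len(predicted)
    weights = [w / total for w in weights] if total > 0.0 else [1.0 / n] * n
    estimate = sum(w * p for w, p in zip(weights, predicted))
    # Stand-in quality measure: normalized effective sample size in (0, 1].
    quality = 1.0 / (n * sum(w * w for w in weights))
    resampled = rng.choices(predicted, weights=weights, k=n)
    return resampled, estimate, quality

def adapt_sigma(sigma, quality, low=0.3, high=0.7, factor=1.5,
                bounds=(0.1, 5.0)):
    """Raise the motion noise when quality is poor, lower it when good."""
    if quality < low:
        sigma *= factor   # few particles explain the data: search wider
    elif quality > high:
        sigma /= factor   # most particles agree: tighten the search
    return min(max(sigma, bounds[0]), bounds[1])

rng = random.Random(0)
particles = [rng.uniform(-5.0, 5.0) for _ in range(200)]
sigma = 1.0
for measurement in [0.0, 0.5, 1.0, 4.0, 4.5]:  # toy observations
    particles, estimate, quality = filter_step(particles, measurement, sigma, rng)
    sigma = adapt_sigma(sigma, quality)
```

The thresholds and the multiplicative update are arbitrary choices for the sketch; the point is only that the noise parameter is re-estimated at every step from a quality signal instead of being fixed in advance.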
Mobile Robot Path Tracking with uncalibrated cameras
April 29, 2010
This transfer project addresses the motion control problem of a wheeled mobile robot (WMR) as observed from uncalibrated ceiling cameras. We develop a method that localizes the robot in real time and drives it along a path in a large environment with a pure pursuit controller, achieving a cross-track error of less than 5 pixels. Experiments are reported for Ambrogio, a two-wheel differentially-driven mobile robot provided by Zucchetti Centro Sistemi.
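For readers unfamiliar with pure pursuit, the textbook control law can be sketched as follows. The sketch assumes the robot pose and the lookahead point are expressed in a common planar frame, whereas the project works with uncalibrated camera images and its handling of that is not shown. The function names and the numeric values are hypothetical.

```python
import math

def pure_pursuit_curvature(pose, goal):
    """Curvature of the arc that takes the robot from its pose to the goal.

    pose: (x, y, heading in radians); goal: (x, y) lookahead point on the path.
    Uses the standard relation curvature = 2 * lateral_offset / distance**2.
    """
    x, y, theta = pose
    dx, dy = goal[0] - x, goal[1] - y
    dist_sq = dx * dx + dy * dy
    if dist_sq == 0.0:
        return 0.0
    # Lateral offset of the goal in the robot frame (positive to the left).
    lateral = -math.sin(theta) * dx + math.cos(theta) * dy
    return 2.0 * lateral / dist_sq

def wheel_speeds(v, curvature, wheel_base):
    """Left and right wheel speeds of a differential drive for speed v."""
    omega = v * curvature  # angular velocity along the arc
    return v - omega * wheel_base / 2.0, v + omega * wheel_base / 2.0

# Robot at the origin facing +x, lookahead point ahead and to the left.
curvature = pure_pursuit_curvature((0.0, 0.0, 0.0), (1.0, 1.0))
left, right = wheel_speeds(1.0, curvature, wheel_base=0.5)
```

A goal to the left yields a positive curvature, so the right wheel turns faster than the left and the robot steers toward the path.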
3D Mesh Partitioning
March 29, 2010
In this research, a model is proposed for decomposition of 3D objects based on Reeb graphs. The model is motivated by perceptual principles and supports identification of salient object protrusions. Experimental results demonstrate the effectiveness of the proposed approach with respect to alternative solutions that have appeared in the literature, and with reference to ground-truth data obtained by manually decomposing 3D objects.
3D Face Recognition
March 29, 2010
In this research, we present a novel approach to 3D face matching that shows high effectiveness in distinguishing facial differences between distinct individuals from differences induced by non-neutral expressions within the same individual. We present an extensive comparative evaluation of performance with the FRGC v2.0 dataset and the SHREC08 dataset.
Internet applications
LIT: Lexicon of the Italian Television
July 20, 2010
LIT (Lexicon of the Italian Television) is a project conceived by the Accademia della Crusca, the leading research institution on the Italian language, in collaboration with CLIEO (Center for theoretical and historical Linguistics: Italian, European and Oriental languages). It aims to study the frequencies of the Italian lexicon used in television content and targets the specific sector of web applications for linguistic research. The corpus of transcriptions consists of approximately 170 hours of random television recordings broadcast by the national broadcaster RAI (Italian Radio Television) during 2006.
IM3I: immersive multimedia interfaces
May 19, 2010
The IM3I project addresses the needs of a new generation of media and communication industry that has to deal not only with changing technologies, but also with a radical change in media consumption behaviour. IM3I will enable new ways of accessing and presenting media content to users, and new ways for users to interact with services, offering a natural and transparent way to deal with the complexities of interaction while hiding them from the user.
Mediateca di Palazzo Medici Riccardi
March 29, 2010
The Mediateca Medicea is a digital archive relating to Palazzo Medici Riccardi, one of the most important buildings in Florence, which now belongs to the Provincial Authority and houses the administrative offices. The Mediateca Medicea is designed in particular for academics and experts in the fields of art, history, the humanities, photography and the conservation of the cultural heritage, but also for students or scholars following up specific strands of research.
Natural interaction
Multi-user interactive table for neurocognitive and neuromotor rehabilitation
March 29, 2010
This project concerns the design and development of a multi-touch system that provides innovative tools for neurocognitive and neuromotor rehabilitation for senile diseases. The project is a collaboration between MICC, the Faculty of Psychology (University of Florence) and Montedomini A.S.P., a public agency for self-sufficient and disabled elders that offers welfare and health care services.
TANGerINE Grape
March 26, 2010
TANGerINE Grape is a collaborative knowledge sharing system that can be used through natural and tangible interfaces. The final goal is to enable users to enrich their knowledge through the attainment of information both from digital libraries and from the knowledge shared by other users involved in the same interaction session.
Multi-user environment for semantic search of multimedia contents
March 26, 2010
This research project exploits new technologies (multi-touch table and iPhone) to develop a multi-user, multi-role and multi-modal system for multimedia content search, annotation and organization. As a use case we considered the field of broadcast journalism, where editors and archivists work together to create a film report using archive footage.
CocoNUIT
March 26, 2010
This project aims to realize a lightweight, flexible and extensible Cocoa framework for creating multitouch and, more generally, tangible apps. It implements basic gesture recognition and lets each user easily define and set up their own gestures. Because of its nature, we hope this framework will work well with Quartz and Core Animation to build fun and useful apps. It also offers many off-the-shelf widgets for quickly building your own NUI app.
TANGerINE Cities
March 25, 2010
TANGerINE Cities is a research project that investigates collaborative tangible applications. It was developed within the TANGerINE research project. It is ongoing research on TUIs (tangible user interfaces) that combines previous experience with natural vision-based gestural interaction on augmented surfaces and tabletops with the introduction of smart wireless objects and sensor fusion techniques.
Shawbak
March 25, 2010
This is a technology transfer project realized for the international exhibition From Petra to Shawbak: archeology of a frontier. A multi-touch tabletop was built for this exhibition, which presents the results of the latest international archaeological investigations and of the research conducted over the past twenty years by the archaeological mission of the University of Florence in Jordan at the sites of Petra and Shawbak, one of the most important historical areas in the world.