Publications - AImageLab

A semi-automatic video annotation tool with MPEG-7 content collections

Authors: Cucchiara, Rita; Grana, Costantino; Bulgarelli, Daniele; Vezzani, Roberto

In this work, we present a general-purpose system for hierarchical structural segmentation and automatic annotation of video clips by means of standardized low-level features. We propose to automatically extract some prototypes for each class with a context-based intra-class clustering. Clips are annotated following the MPEG-7 standard directives to provide easier portability. Results of automatic annotation and semi-automatic metadata creation are provided.

2006 Conference proceedings paper

A system for automatic face obscuration for privacy purposes

Authors: Cucchiara, Rita; Prati, Andrea; Vezzani, Roberto

Published in: PATTERN RECOGNITION LETTERS

This work proposes a method for automatic face obscuration capable of protecting people's identity. Since face detection heavily benefits from the possibility to exploit tracking, multi-camera people tracking has been integrated with a face detector based on colour clustering and Hough transform. Moreover, the multiple viewpoints provided by multiple cameras are exploited in order to always obtain a good-quality image of the face. The identity of people in different views is kept consistent by means of a geometrical, uncalibrated approach based on homographies. Experimental results show the accuracy of the proposed approach. (c) 2006 Elsevier B.V. All rights reserved.

2006 Journal article

Advanced video surveillance with pan tilt zoom cameras

Authors: Cucchiara, Rita; Prati, Andrea; Vezzani, Roberto

In this paper an advanced video surveillance system is proposed. Our goal is the detection of people's heads to allow their obscuration for privacy issues or to perform recognition tasks. We propose a system based on active PTZ (Pan-Tilt-Zoom) cameras that produce head images of a large enough size and can cover an area larger than still cameras. Since conventional approaches are not suitable for PTZ cameras, the proposed approach is based on the so-called direction histograms to compute the ego-motion and on frame differencing for detecting moving objects. It exploits post-processing and active contours to extract the precise shape of moving objects to be fed to a probabilistic algorithm that tracks moving people in the scene. Person following, instead, is based on simple heuristic rules that move the camera as soon as the selected person is close to the border of the field of view. Finally, a color- and shape-based head detection that takes advantage of the people tracking is presented. Experimental results on a live active camera demonstrate the feasibility of real-time person following and of the subsequent head detection phase.

2006 Conference proceedings paper

Estimating Geospatial Trajectory of a Moving Camera

Authors: Hakeem, A.; Vezzani, Roberto; Shah, S.; Cucchiara, Rita

Published in: INTERNATIONAL CONFERENCE ON PATTERN RECOGNITION

This paper proposes a novel method for estimating the geospatial trajectory of a moving camera. The proposed method uses a set of reference images with known GPS (global positioning system) locations to recover the trajectory of a moving camera using geometric constraints. The proposed method has three main steps. First, scale-invariant feature transform (SIFT) features are detected and matched between the reference images and the video frames to calculate a weighted adjacency matrix (WAM) based on the number of SIFT matches. Second, using the estimated WAM, the maximum matching reference image is selected for the current video frame, which is then used to estimate the relative position (rotation and translation) of the video frame using the fundamental matrix constraint. The relative position is recovered up to a scale factor, and a triangulation among the video frame and two reference images is performed to resolve the scale ambiguity. Third, an outlier rejection and trajectory smoothing (using B-splines) post-processing step is employed. This is because the estimated camera locations may be noisy due to bad point correspondences or degenerate estimates of fundamental matrices. Results of recovering the camera trajectory are reported for real sequences.
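The selection step in the abstract (pick the reference image with the most matches for each frame) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the match counts are made-up numbers, the row normalization is one possible weighting and not necessarily the one used in the paper, and the function names `build_wam` and `best_reference` are hypothetical.

```python
def build_wam(match_counts):
    """Turn raw per-frame match counts into row-normalized weights.

    match_counts[f][r] is the (hypothetical) number of feature matches
    between video frame f and reference image r. The normalization is an
    assumed weighting chosen only for this sketch.
    """
    wam = []
    for row in match_counts:
        total = sum(row)
        wam.append([count / total if total else 0.0 for count in row])
    return wam


def best_reference(wam_row):
    """Return the index of the highest-weight reference image, or None
    when the frame has no matches at all."""
    if not any(wam_row):
        return None
    return max(range(len(wam_row)), key=lambda r: wam_row[r])


# Made-up match counts: 3 video frames (rows) against 3 reference images (columns).
match_counts = [
    [12, 40, 3],
    [5, 7, 31],
    [0, 0, 0],
]
wam = build_wam(match_counts)
selected = [best_reference(row) for row in wam]
print(selected)
```

A frame without any matches yields `None` here, so a caller could skip pose estimation for it instead of silently pairing it with an arbitrary reference image.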

2006 Conference proceedings paper

MPEG-7 Pictorially Enriched Ontologies for Video Annotation

Authors: Grana, Costantino; Vezzani, Roberto; Bulgarelli, Daniele; Cucchiara, Rita

A system for the automatic creation of Pictorially Enriched Ontologies is presented, that is, ontologies for context-based video digital libraries, enriched by pictorial concepts for video annotation, summarization and similarity-based retrieval. Extraction of pictorial concepts with video clip clustering, ontology storing with MPEG-7, and the use of the ontology for stored video annotation are described. Results on sport videos and TRECVID2005 video material are reported.

2006 Conference proceedings paper

PEANO: Pictorial Enriched Annotation of Video

Authors: Grana, Costantino; Vezzani, Roberto; Bulgarelli, Daniele; Gualdi, Giovanni; Cucchiara, Rita; Bertini, M.; Torniai, C.; Del Bimbo, A.

In this demo, we present a tool set for video digital library management that allows: i) structural annotation of edited videos in MPEG-7 by automatically extracting shots and clips; ii) automatic semantic annotation based on perceptual similarity against a taxonomy enriched with pictorial concepts; iii) video clip access and hierarchical summarization with stand-alone and web interfaces; iv) access to clips from mobile platforms via GPRS-UMTS video streaming. The tools can be applied in different domain-specific Video Digital Libraries. The main novelty is the possibility to enrich the annotation with pictorial concepts that are added to a textual taxonomy in order to make the automatic annotation process faster and often more effective. The resulting multimedia ontology is described in the MPEG-7 framework. The PEANO (Perceptual Annotation of Video) tool has been tested on video art, sport (soccer, Olympic Games 2006, Formula 1) and news clips.

2006 Conference proceedings paper

University of Modena and Reggio Emilia at TRECVID 2006

Authors: Grana, Costantino; Vezzani, Roberto; Cucchiara, Rita

What approach or combination of approaches did you test in each of your submitted runs? TRECVID2005_UNIMORE_??.xml: the same linear transition detector (LTD) was tested for every run, with ten uniformly spaced thresholds for the detection. What, if any, significant differences (in terms of what measures) did you find among the runs? The system behaved as expected: the higher the threshold, the better the recall. Of course, the precision decreased correspondingly. Interestingly enough, it seems that we cannot overcome the overall limit of around 80% for recall and 88% for precision, independently of the other parameter. Based on the results, can you estimate the relative contribution of each component of your system/approach to its effectiveness? One of the main objectives of our system was to test the performance of a single algorithm for both cuts and gradual transitions, so all the merits and demerits are related to our LTD. Overall, what did you learn about runs/approaches and the research question(s) that motivated them? The use of a single algorithm allows the system to be run without training. Just a single parameter may be employed to tune the sensitivity of the system, thus allowing its use in general-purpose, user-friendly systems.

2006 Conference proceedings paper

Ambient Intelligence for Security in Public Parks: the LAICA Project

Authors: Cucchiara, Rita; Prati, Andrea; Vezzani, Roberto

In this paper, we address the exploitation of computer vision techniques to develop multimedia services and automatic monitoring systems related to security and privacy in public areas. The research is part of a two-year Italian project called LAICA, intended to provide advanced services for citizens and public officers. Citizens want fast and friendly web access to public places, to see the environment in real time without violating privacy laws. Public officers and policy centres want a fast and reactive monitoring system, capable of automatically detecting dangerous situations, given the huge number of cameras that cannot be monitored simultaneously by human operators. In this work, we describe the project and the defined methodologies in multi-camera video mosaicing, people tracking and consistent labelling, and access to processed data with face obscuration.

2005 Conference proceedings paper

An Integrated Multi-Modal Sensor Network for Video Surveillance

Authors: Prati, Andrea; Vezzani, Roberto; Benini, L.; Farella, E.; Zappi, P.

To enhance video surveillance systems, multi-modal sensor integration can be a successful strategy. In this work, a computer vision system able to detect and track people from multiple cameras is integrated with a wireless sensor network mounting PIR (Passive InfraRed) sensors. The two subsystems are briefly described and possible cases in which computer vision algorithms are likely to fail are discussed. Then, simple but reliable outputs from the PIR sensor nodes are exploited to improve the accuracy of the vision system. In particular, two case studies are reported: the first uses the presence detection of PIR sensors to disambiguate between an opened door and a moving person, while the second handles motion direction changes during occlusions. Preliminary results are reported and demonstrate the usefulness of the integration of the two subsystems.

2005 Conference proceedings paper

Assessing Temporal Coherence for Posture Classification with Large Occlusions

Authors: Cucchiara, Rita; Vezzani, Roberto

In this paper we present a people posture classification approach especially devoted to coping with occlusions. In particular, the approach aims at assessing temporal coherence of visual data over probabilistic models. A mixed predictive and probabilistic tracking is proposed: a probabilistic tracking maintains over time the actual appearance of detected people and evaluates the occlusion probability; an additional tracking with Kalman prediction improves the estimation of the people's position inside the room. Probabilistic Projection Maps (PPMs) created with a learning phase are matched against the appearance mask of the track. Finally, a Hidden Markov Model formulation of the posture corrects the frame-by-frame classification uncertainties and makes the system reliable even in the presence of occlusions. Results obtained over real indoor sequences are discussed.
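To illustrate how a Hidden Markov Model can correct frame-by-frame classification errors, here is a minimal Viterbi decoding sketch. It is not the formulation from the paper: the posture labels, the "sticky" transition probability `stay`, the uniform prior, and the per-frame scores are all assumptions chosen only to show the idea.

```python
import math

# Hypothetical label set; the paper's actual posture classes are not listed in the abstract.
POSTURES = ["standing", "crouching", "sitting", "lying"]


def viterbi(frame_probs, stay=0.9):
    """Return the most likely posture sequence given per-frame class scores.

    frame_probs[t][k] is an assumed per-frame score for posture k at frame t.
    Transitions are 'sticky': probability `stay` of keeping the same posture,
    with the remainder shared equally among the other postures.
    """
    n = len(POSTURES)
    switch = (1.0 - stay) / (n - 1)
    log_trans = [[math.log(stay if i == j else switch) for j in range(n)]
                 for i in range(n)]
    eps = 1e-9  # avoids log(0) for zero scores

    # Initialization with a uniform prior over postures.
    score = [math.log(1.0 / n) + math.log(p + eps) for p in frame_probs[0]]
    back = []
    for probs in frame_probs[1:]:
        prev, score, pointers = score, [], []
        for j in range(n):
            i_best = max(range(n), key=lambda i: prev[i] + log_trans[i][j])
            pointers.append(i_best)
            score.append(prev[i_best] + log_trans[i_best][j]
                         + math.log(probs[j] + eps))
        back.append(pointers)

    # Backtrack from the best final state.
    state = max(range(n), key=lambda j: score[j])
    path = [state]
    for pointers in reversed(back):
        state = pointers[state]
        path.append(state)
    path.reverse()
    return [POSTURES[s] for s in path]


clear = [0.80, 0.10, 0.05, 0.05]     # confident "standing" frame
occluded = [0.35, 0.10, 0.10, 0.45]  # occlusion makes "lying" score highest
frames = [clear, clear, occluded, clear, clear]

raw = [POSTURES[max(range(4), key=lambda k: f[k])] for f in frames]
smoothed = viterbi(frames)
print(raw)
print(smoothed)
```

In this made-up sequence the per-frame argmax flips to "lying" on the occluded frame, while the decoded path stays on "standing" because two posture switches in a row are far less likely than one ambiguous observation.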

2005 Conference proceedings paper
