Publications by Roberto Vezzani

Explore our research publications: papers, articles, and conference proceedings from AImageLab.

Intelligent Video Surveillance

Authors: Cucchiara, Rita; Prati, Andrea; Vezzani, Roberto

Safety and security concerns are driving the growth of surveillance systems for both prevention and forensic tasks. Unfortunately, most installed systems offer recording capability only, with quality so poor that it makes them completely unhelpful. This chapter introduces the concepts of modern systems for Intelligent Video Surveillance (IVS). It does not claim to provide a complete treatment or a technical description of the topic, but rather a simple and concise panorama of the motivations, components, and trends of these systems. Unlike CCTV systems, IVS should be able, for instance, to monitor people in public areas and smart homes, to control urban traffic, and to perform identity assessment for the security and safety of critical infrastructure.

2012 Book chapter/Essay

People Orientation Recognition by Mixtures of Wrapped Distributions on Random Trees

Authors: Baltieri, Davide; Vezzani, Roberto; Cucchiara, Rita

Published in: LECTURE NOTES IN COMPUTER SCIENCE

The recognition of people's orientation in single images is still an open issue in several real cases, when the image resolution is poor, body parts cannot be distinguished and localized, or motion cannot be exploited. However, the estimation of a person's orientation, even an approximate one, could be very useful to improve people tracking and re-identification systems, or to provide a coarse alignment of body models on the input images. In these situations, holistic features seem to be more effective and faster than model-based 3D reconstructions. In this paper we propose to describe people's appearance with multi-level HoG feature sets and to classify their orientation using an array of Extremely Randomized Trees classifiers trained on quantized directions. The outputs of the classifiers are then integrated into a global continuous probability density function using a Mixture of Approximated Wrapped Gaussian distributions. Experiments on the TUD Multiview Pedestrians, the Sarc3D, and the 3DPeS datasets confirm the efficacy of the method and the improvement with respect to state-of-the-art approaches.

2012 Conference proceedings paper
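
The last step of the abstract above, merging per-direction classifier outputs into one continuous density over angles, can be illustrated with a minimal sketch. It is not the authors' implementation. The function names, the equal spacing of the quantized directions, the shared `sigma`, and the truncation of each wrapped Gaussian to three terms are all assumptions made for illustration.

```python
import math

def wrapped_gaussian_pdf(theta, mu, sigma, wraps=1):
    """Approximate wrapped Gaussian density at angle theta (radians).

    The infinite wrapping sum is truncated to k = -wraps..wraps (an assumption).
    """
    norm = 1.0 / (sigma * math.sqrt(2.0 * math.pi))
    total = 0.0
    for k in range(-wraps, wraps + 1):
        d = theta - mu + 2.0 * math.pi * k
        total += norm * math.exp(-0.5 * (d / sigma) ** 2)
    return total

def orientation_density(theta, scores, sigma=0.4):
    """Mixture density over angles built from one score per quantized direction.

    Direction i is assumed to be centered at 2*pi*i/len(scores).
    """
    total = sum(scores)
    if total <= 0:
        raise ValueError("scores must have a positive sum")
    n = len(scores)
    density = 0.0
    for i, score in enumerate(scores):
        center = 2.0 * math.pi * i / n
        density += (score / total) * wrapped_gaussian_pdf(theta, center, sigma)
    return density

def most_likely_orientation(scores, steps=360):
    """Return the grid angle in [0, 2*pi) with the highest mixture density."""
    grid = [2.0 * math.pi * i / steps for i in range(steps)]
    return max(grid, key=lambda t: orientation_density(t, scores))
```

Calling `most_likely_orientation([0.1, 0.7, 0.1, 0.1])` evaluates the mixture on a one-degree grid and returns the angle where it peaks, so the estimate is continuous rather than restricted to the four quantized directions.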

3DPeS: 3D People Dataset for Surveillance and Forensics

Authors: Baltieri, Davide; Vezzani, Roberto; Cucchiara, Rita

The interest of the research community in creating reference datasets for performance analysis is always very high. Although new datasets collecting large amounts of video footage are spreading in surveillance and forensics, few benchmarks with annotation data are available for testing specific tasks, especially for 3D/multi-view analysis. In this paper we present 3DPeS, a new dataset for 3D/multi-view surveillance and forensic applications. It has been designed for discussing and evaluating research results in people re-identification and other related activities (people detection, people segmentation and people tracking). The new assessed version of the dataset contains hundreds of video sequences of 200 people taken from a multi-camera distributed surveillance system over several days, with different light conditions; each person is detected multiple times and from different points of view. In surveillance scenarios, the dataset can be exploited to evaluate people reacquisition, 3D body models and people activity reconstruction algorithms. It can also be adopted in forensics by relaxing some constraints (e.g. real time) and neglecting some information (e.g. calibration). Some results on this new dataset are presented using state-of-the-art methods for people re-identification as a benchmark for future comparisons.

2011 Conference proceedings paper

Multi-view people surveillance using 3D information

Authors: Baltieri, Davide; Vezzani, Roberto; Cucchiara, Rita; Utasi, A.; Benedek, C.; Sziranyi, T.

In this paper we introduce a novel surveillance system, which uses 3D information extracted from multiple cameras to detect, track and re-identify people. The detection method is based on a 3D Marked Point Process model using two pixel-level features extracted from multi-plane projections of binary foreground masks, and uses a stochastic optimization framework to estimate the position and the height of each person. We apply rule-based Kalman-filter tracking to the detection results to find the object-to-object correspondence between consecutive time steps. Finally, a long-term tracking module based on a 3D body model connects broken tracks and is also used to re-identify people.

2011 Conference proceedings paper

Probabilistic people tracking with appearance models and occlusion classification: The AD-HOC system

Authors: Vezzani, Roberto; Grana, Costantino; Cucchiara, Rita

Published in: PATTERN RECOGNITION LETTERS

AD-HOC (Appearance Driven Human tracking with Occlusion Classification) is a complete framework for multiple people tracking in video surveillance applications in the presence of large occlusions. The appearance-based approach allows the estimation of the pixel-wise shape of each tracked person even during an occlusion. This feature can be very useful for higher-level processes, such as action recognition or event detection. A first step predicts the position of all the objects in the new frame, while a MAP framework provides a solution for best placement. A second step associates each candidate foreground pixel with an object according to mutual object position and color similarity. A novel definition of non-visible regions accounts for the parts of the objects that are not detected in the current frame, classifying them as dynamic, scene or apparent occlusions. Results on surveillance videos are reported, using in-house produced videos and the PETS2006 test set.

2011 Journal article

SARC3D: a new 3D body model for People Tracking and Re-identification

Authors: Baltieri, Davide; Vezzani, Roberto; Cucchiara, Rita

Published in: LECTURE NOTES IN COMPUTER SCIENCE

We propose a new simplified 3D body model (called Sarc3D) for surveillance applications that can be created, updated and compared in real time. People are detected and tracked in each calibrated camera, and their silhouette, appearance, position and orientation are extracted and used to place, scale and orient a 3D body model. For each vertex of the model a signature (color features, reliability and saliency) is computed from the 2D appearance images and exploited for matching. This approach achieves robustness against partial occlusions, pose and viewpoint changes. The complete proposal and a full experimental evaluation are presented, using a new benchmark suite and the PETS2009 dataset.

2011 Conference proceedings paper

3D Body Model Construction and Matching for Real Time People Re-Identification

Authors: Baltieri, Davide; Vezzani, Roberto; Cucchiara, Rita

Wide area video surveillance always requires extracting and integrating information coming from different cameras and views. Re-identification of people captured from different cameras or different views is one of the most challenging problems. In this paper, we present a novel approach for people matching with vertices-based 3D human models. People are detected and tracked in each calibrated camera, and their silhouette, appearance, position and orientation are extracted and used to place, scale and orient a 3D body model. Colour features are computed from the 2D appearance images and mapped to the 3D model vertices, generating the 3D model for each tracked person. A distance function between 3D models is defined in order to find matches among models belonging to the same person. This approach achieves robustness against partial occlusions, pose and viewpoint changes. A first experimental evaluation is conducted using images extracted from a real camera set-up.

2010 Conference proceedings paper

Event Driven Software Architecture for Multi-camera and Distributed Surveillance Research Systems

Authors: Vezzani, Roberto; Cucchiara, Rita

Surveillance of wide areas with several connected cameras integrated in the same automatic system is no longer a chimera, but modular, scalable and flexible architectures are mandatory to manage them. This paper points out the main issues in the development of distributed surveillance systems and proposes an integrated framework particularly suitable for research purposes. First, exploiting a computer architecture analogy, a three-layer tracking system is proposed, which copes with the integration of both overlapping and non-overlapping cameras. Then, a static service-oriented architecture is adopted to collect and manage the plethora of high-level modules, such as face detection and recognition, posture and action classification, and so on. Finally, the overall architecture is controlled by an event-driven communication infrastructure, which ensures the scalability and the flexibility of the system.

2010 Conference proceedings paper

Fast Background Initialization with Recursive Hadamard Transform

Authors: Baltieri, Davide; Vezzani, Roberto; Cucchiara, Rita

In this paper, we present a new and fast technique for background estimation from cluttered image sequences. Most of the background initialization approaches developed so far collect a number of initial frames and then require a slow estimation step which introduces a delay whenever it is applied. Conversely, the proposed technique redistributes the computational load among all the frames by means of a patch by patch preprocessing, which makes the overall algorithm more suitable for real-time applications. For each patch location a prototype set is created and maintained. The background is then iteratively estimated by choosing from each set the most appropriate candidate patch, which should verify a sort of frequency coherence with its neighbors. To this aim, the Hadamard transform has been adopted, which requires less computation time than the commonly used DCT. Finally, a refinement step exploits spatial continuity constraints along the patch borders to prevent erroneous patch selections. The approach has been compared with the state of the art on videos from available datasets (ViSOR and CAVIAR), showing a speed up of about 10 times and an improved accuracy.

2010 Conference proceedings paper
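
The abstract above prefers the Hadamard transform to the DCT because it is cheaper to compute. The sketch below shows why: a generic fast Walsh-Hadamard transform needs only additions and subtractions. It is a textbook butterfly in natural (Hadamard) order, not the recursive variant or the patch-selection logic of the paper, and the function name `fwht` is an assumption.

```python
def fwht(values):
    """Unnormalized fast Walsh-Hadamard transform (natural order).

    The input length must be a power of two. Only additions and
    subtractions are used, with no multiplications or cosines.
    """
    n = len(values)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a power of two")
    out = list(values)
    h = 1
    while h < n:
        # Combine blocks of size h into blocks of size 2*h.
        for start in range(0, n, 2 * h):
            for i in range(start, start + h):
                a, b = out[i], out[i + h]
                out[i], out[i + h] = a + b, a - b
        h *= 2
    return out
```

Applying `fwht` twice returns the input scaled by its length, so the same routine also serves as the inverse transform after dividing by `len(values)`. A 2D patch could be handled by transforming its rows and then its columns.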

HMM Based Action Recognition with Projection Histogram Features

Authors: Vezzani, Roberto; Baltieri, Davide; Cucchiara, Rita

Published in: LECTURE NOTES IN COMPUTER SCIENCE

Hidden Markov Models (HMM) have been widely used for action recognition, since they make it easy to model the temporal evolution of a single numeric feature or a set of numeric features extracted from the data. The selection of the feature set and the related emission probability function are the key issues to be defined. In particular, if the training set is not sufficiently large, a manual or automatic feature selection and reduction is mandatory. In this paper we propose to model the emission probability function as a Mixture of Gaussians and to obtain the feature set from the projection histograms of the foreground mask. The projection histograms contain the number of moving pixels for each row and for each column of the frame, and they provide sufficient information to infer the instantaneous posture of the person. Then, the HMM framework recovers the temporal evolution of the postures, thereby recognizing the global action. The proposed method has been successfully tested on the UT-Tower and on the Weizmann datasets.

2010 Conference proceedings paper
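
The projection histogram feature described in the abstract above is simple enough to sketch directly. The example assumes the foreground mask is a rectangular list of rows in which any truthy value marks a moving pixel. The function name is hypothetical, and any normalization or resampling the paper applies before feeding the HMM is not reproduced.

```python
def projection_histograms(mask):
    """Count foreground pixels per row and per column of a binary mask.

    mask: list of equal-length rows; truthy entries are moving pixels.
    Returns (row_counts, column_counts).
    """
    row_counts = [sum(1 for value in row if value) for row in mask]
    column_counts = [sum(1 for value in column if value) for column in zip(*mask)]
    return row_counts, column_counts
```

For one frame, concatenating the two lists gives a feature vector whose length is the frame height plus the frame width, far smaller than the mask itself.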
