MA3HO'11 foreword
Authors: Cucchiara, R.; Daoudi, M.; Del Bimbo, A.
Authors: Calderara, Simone; Prati, Andrea; Cucchiara, Rita
Published in: INTERNATIONAL JOURNAL OF MULTIMEDIA INTELLIGENCE AND SECURITY
This paper presents a method for recognising human actions by tracking body parts without using artificial markers. A sophisticated appearance-based tracking able to cope with occlusions is exploited to extract a probability map for each moving object. A segmentation technique based on a mixture of Gaussians (MoG) is then employed to extract and track significant points on this map, corresponding to significant regions on the human silhouette. The evolution of the mixture in time is analysed by transforming it into a sequence of symbols (each corresponding to a MoG). The similarity between actions is computed by applying global alignment and dynamic programming techniques to the corresponding sequences and using a variational approximation of the Kullback-Leibler divergence to measure the dissimilarity between two MoGs. Experiments on publicly available datasets and comparison with existing methods are provided.
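As a rough illustration of the last step, the sketch below computes one well-known variational approximation of the Kullback-Leibler divergence between two Gaussian mixtures (the form usually attributed to Hershey and Olsen). Whether the paper uses exactly this form is an assumption; the restriction to 1-D components, the `(weight, mean, std)` tuple layout, and the function names `kl_gauss` and `variational_kl` are likewise illustrative choices, not taken from the paper.

```python
import math


def kl_gauss(m1, s1, m2, s2):
    """Closed-form KL(N(m1, s1^2) || N(m2, s2^2)) for 1-D Gaussians."""
    return math.log(s2 / s1) + (s1 ** 2 + (m1 - m2) ** 2) / (2 * s2 ** 2) - 0.5


def variational_kl(f, g):
    """Variational approximation of KL(f || g) between two 1-D mixtures.

    Each mixture is a list of (weight, mean, std) tuples (assumed layout).
    For every component a of f, the log-ratio compares how well a is
    explained by f's own components versus g's components.
    """
    total = 0.0
    for w_a, m_a, s_a in f:
        num = sum(w * math.exp(-kl_gauss(m_a, s_a, m, s)) for w, m, s in f)
        den = sum(w * math.exp(-kl_gauss(m_a, s_a, m, s)) for w, m, s in g)
        total += w_a * math.log(num / den)
    return total


# Hypothetical mixtures, for illustration only.
f = [(0.5, 0.0, 1.0), (0.5, 4.0, 1.0)]
g = [(0.5, 0.5, 1.0), (0.5, 4.5, 1.0)]
print(variational_kl(f, g))
```

For single-component mixtures this reduces to the exact Gaussian KL divergence, and it is zero when both mixtures are identical. The sketch does not guard against numerical underflow when components are very far apart.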
Authors: Calderara, Simone; Prati, Andrea; Cucchiara, Rita
Published in: IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY
People trajectory analysis is a recurrent task in many pattern recognition applications, such as surveillance, behavior analysis, video annotation, and many others. In this paper we propose a new framework for analyzing trajectory shape, invariant to spatial shifts of the people motion in the scene. In order to cope with the noise and the uncertainty of the trajectory samples, we propose to describe the trajectories as a sequence of angles modelled by distributions of circular statistics, i.e. a mixture of von Mises (MovM) distributions. To deal with MovM, we define a new specific EM algorithm for estimating the parameters and derive a closed form of the Bhattacharyya distance between single vM pdfs. Trajectories are then modelled with a sequence of symbols, corresponding to the most suitable distribution in the mixture, and compared with each other after a global alignment procedure to cope with trajectories of different lengths. The trajectories in the training set are clustered according to their shape similarity in an off-line phase, and testing trajectories are then classified with a specific on-line EM, based on sufficient statistics. The approach is particularly suitable for classifying people trajectories in video surveillance, searching for abnormal (i.e. infrequent) paths. Tests on synthetic and real data are provided, along with a complete comparison with other circular statistical and alignment methods.
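To make the closed-form comparison of two von Mises pdfs concrete, here is a minimal sketch. The Bhattacharyya coefficient follows from integrating the square root of the product of the two densities, which gives I0(R/2) / sqrt(I0(k1) I0(k2)) with R = sqrt(k1^2 + k2^2 + 2 k1 k2 cos(mu1 - mu2)). Taking the distance as the negative logarithm of that coefficient is an assumed convention and may differ from the definition used in the paper; the function names and the series-based `bessel_i0` helper are illustrative.

```python
import math


def bessel_i0(x, terms=60):
    """Modified Bessel function of the first kind, order 0 (power series).

    Adequate for moderate arguments; this is a sketch, not a robust routine.
    """
    total, term = 0.0, 1.0
    q = (x / 2.0) ** 2
    for k in range(terms):
        total += term
        term *= q / ((k + 1) ** 2)
    return total


def vm_bhattacharyya_coeff(mu1, k1, mu2, k2):
    """Bhattacharyya coefficient between vM(mu1, k1) and vM(mu2, k2)."""
    # Amplitude of k1*cos(t - mu1) + k2*cos(t - mu2) written as one cosine.
    r_sq = k1 * k1 + k2 * k2 + 2.0 * k1 * k2 * math.cos(mu1 - mu2)
    r = math.sqrt(max(0.0, r_sq))
    return bessel_i0(r / 2.0) / math.sqrt(bessel_i0(k1) * bessel_i0(k2))


def vm_bhattacharyya_distance(mu1, k1, mu2, k2):
    """Distance as -ln(coefficient); the convention is an assumption."""
    return -math.log(vm_bhattacharyya_coeff(mu1, k1, mu2, k2))


# Hypothetical parameters: two directions about 69 degrees apart.
print(vm_bhattacharyya_distance(0.3, 2.0, 1.5, 4.0))
```

The distance is zero for identical distributions and grows as the mean directions separate or the concentrations diverge.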
Authors: Baltieri, Davide; Vezzani, Roberto; Cucchiara, Rita; Utasi, A.; Benedek, C.; Sziranyi, T.
In this paper we introduce a novel surveillance system, which uses 3D information extracted from multiple cameras to detect, track and re-identify people. The detection method is based on a 3D Marked Point Process model using two pixel-level features extracted from multi-plane projections of binary foreground masks, and uses a stochastic optimization framework to estimate the position and the height of each person. We apply a rule-based Kalman-filter tracking on the detection results to find the object-to-object correspondence between consecutive time steps. Finally, a long-term tracking module based on a 3D body model connects broken tracks and is also used to re-identify people.
Authors: Grana, Costantino; Montangero, Manuela; Borghesani, Daniele; Cucchiara, Rita
Published in: LECTURE NOTES IN ARTIFICIAL INTELLIGENCE
In this paper we present a novel dynamic programming algorithm to synthesize an optimal decision tree from OR-decision tables, an extension of standard decision tables, which allow choosing between several alternative actions in the same rule. Experiments are reported, showing the computational time improvements over state-of-the-art implementations of connected components labeling, using this modelling technique.
Authors: Coppi, Dalia; Calderara, Simone; Cucchiara, Rita
Following people in different video sources is a challenging task: variations in the type of camera, in the lighting conditions, in the scene settings (e.g. crowd or occlusions) and in the point of view must be accounted for. In this paper we propose a system based only on appearance information that, disregarding temporal and spatial information, can be flexibly applied to both moving and static cameras. We exploit the joint use of transductive learning and spectral properties of graph Laplacians, proposing a formulation of the people tracing problem as a semi-supervised classification. The knowledge encoded in two labeled input sets of positive and negative samples of the target person and the continuous spectral update of these models allow us to obtain a robust approach for people tracing in surveillance video sequences. Experiments on publicly available datasets show satisfactory results and exhibit a good robustness in dealing with short and long term occlusions.
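The idea of spreading labels from a few positive and negative samples over an appearance-similarity graph can be sketched with a generic harmonic label propagation. This is a standard transductive technique tied to the graph Laplacian L = D - W, not necessarily the formulation used in the paper; the function name `propagate_labels`, the dense weight matrix, and the 0/1 score encoding are assumptions made for illustration.

```python
def propagate_labels(weights, labels, iters=1000, tol=1e-10):
    """Harmonic label propagation on a weighted graph.

    weights: symmetric n x n list of lists of non-negative affinities,
             with a zero diagonal (assumed).
    labels:  dict mapping node index -> fixed score (1.0 = target person,
             0.0 = not the target); all other nodes are unlabeled.
    Returns a list of scores. At convergence each unlabeled node holds the
    weighted average of its neighbours, i.e. (L f)_i = 0 for L = D - W.
    """
    n = len(weights)
    f = [labels.get(i, 0.5) for i in range(n)]
    for _ in range(iters):
        delta = 0.0
        for i in range(n):
            if i in labels:
                continue  # labeled samples stay clamped
            degree = sum(weights[i])
            if degree == 0:
                continue  # isolated node: nothing to propagate
            new = sum(weights[i][j] * f[j] for j in range(n)) / degree
            delta = max(delta, abs(new - f[i]))
            f[i] = new
        if delta < tol:
            break
    return f


# Hypothetical 4-node chain: node 0 is a positive sample, node 3 a negative.
chain = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]
scores = propagate_labels(chain, {0: 1.0, 3: 0.0})
print(scores)
```

Unlabeled detections whose score ends up above a chosen threshold (for instance 0.5) would be assigned to the target person; the threshold is a design choice outside this sketch.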
Authors: Vezzani, Roberto; Grana, Costantino; Cucchiara, Rita
Published in: PATTERN RECOGNITION LETTERS
AD-HOC (Appearance Driven Human tracking with Occlusion Classification) is a complete framework for multiple people tracking in video surveillance applications in the presence of large occlusions. The appearance-based approach allows the estimation of the pixel-wise shape of each tracked person even during the occlusion. This peculiarity can be very useful for higher level processes, such as action recognition or event detection. A first step predicts the position of all the objects in the new frame while a MAP framework provides a solution for best placement. A second step associates each candidate foreground pixel to an object according to mutual object position and color similarity. A novel definition of non-visible regions accounts for the parts of the objects that are not detected in the current frame, classifying them as dynamic, scene or apparent occlusions. Results on surveillance videos are reported, using in-house produced videos and the PETS2006 test set.
Authors: Grana, Costantino; Borghesani, Daniele; Cucchiara, Rita
This paper provides an analysis of relevance feedback techniques in a multimedia system designed for the interactive exploration and annotation of artistic collections, in particular illuminated manuscripts. Relevance feedback is presented not only as a very effective technique to improve the performance of the system, but also as a clever way to improve the user experience, mixing the interactive surfing through the artistic content with the possibility to gather valuable information from the user, and consequently improving their retrieval satisfaction. We compare a modification of the Mean-Shift Feature Space Warping algorithm, as representative of the standard RF procedures, and a learning-based technique based on transduction, considered in order to overcome some limitations of the previous technique. Experiments are reported regarding the adopted visual features based on covariance matrices.
Authors: Baltieri, Davide; Vezzani, Roberto; Cucchiara, Rita
Published in: LECTURE NOTES IN COMPUTER SCIENCE
We propose a new simplified 3D body model (called Sarc3D) for surveillance applications, which can be created, updated and compared in real-time. People are detected and tracked in each calibrated camera, and their silhouette, appearance, position and orientation are extracted and used to place, scale and orient a 3D body model. For each vertex of the model a signature (color features, reliability and saliency) is computed from the 2D appearance images and exploited for matching. This approach achieves robustness against partial occlusions, pose and viewpoint changes. The complete proposal and a full experimental evaluation are presented, using a new benchmark suite and the PETS2009 dataset.
Authors: Gualdi, Giovanni; Prati, Andrea; Cucchiara, Rita
Despite the many efforts in finding effective feature sets or accurate classifiers for people detection, few works have addressed ways of reducing the computational burden introduced by the sliding window paradigm. This paper proposes a multi-stage procedure for refining the search for pedestrians using the HOG features and the monolithic SVM classifier. The multi-stage procedure is based on particle-based estimation of pdfs and exploits the margin provided by the classifier to draw more particles on the areas where the classifier’s response is higher. This iterative algorithm achieves the same accuracy as sliding window using fewer particles (and thus being more efficient) and, conversely, is more accurate when configured to work at the same computational load. Experimental results on publicly available datasets demonstrate that this method, previously proposed for boosted classifiers only, can be successfully applied to monolithic classifiers.