Publications by Rita Cucchiara

Explore our research publications: papers, articles, and conference proceedings from AImageLab.

Automatic Single-Image People Segmentation and Removal for Cultural Heritage Imaging

Authors: Manfredi, Marco; Grana, Costantino; Cucchiara, Rita

In this paper, the problem of automatic people removal from digital photographs is addressed. Removing unintended people from a scene can be very useful to focus further steps of image analysis only on the object of interest. A supervised segmentation algorithm is presented and tested in several scenarios.

2013 Conference proceedings paper

Beyond Bag of Words for Concept Detection and Search of Cultural Heritage Archives

Authors: Grana, Costantino; Serra, Giuseppe; Manfredi, Marco; Cucchiara, Rita

Published in: LECTURE NOTES IN COMPUTER SCIENCE

Several local features have become quite popular for concept detection and search, due to their ability to capture distinctive details. Typically a Bag of Words approach is followed, where a codebook is built by quantizing the local features. In this paper, we propose to represent SIFT local features extracted from an image as a multivariate Gaussian distribution, obtaining a mean vector and a covariance matrix. Unlike common techniques based on the Bag of Words model, our solution does not rely on the construction of a visual vocabulary, thus removing the dependence of the image descriptors on the specific dataset and allowing the features to be immediately retargeted to different classification and search problems. Experiments are conducted on two very different Cultural Heritage image archives, composed of illuminated manuscript miniatures and pictures of architectural elements collected from the web, on which the proposed approach outperforms the Bag of Words technique in both classification and retrieval.

2013 Conference proceedings paper

Hand Segmentation for Gesture Recognition in EGO-Vision

Authors: Serra, Giuseppe; Camurri, Marco; Baraldi, Lorenzo; Benedetti, Michela; Cucchiara, Rita

Portable devices for first-person camera views will play a central role in future interactive systems. One necessary step for feasible human-computer guided activities is gesture recognition, preceded by a reliable hand segmentation from egocentric vision. In this work we provide a novel hand segmentation algorithm based on Random Forest superpixel classification that integrates light, time and space consistency. We also propose a gesture recognition method based on Exemplar SVMs, since it requires only a small set of positive samples and is therefore well suited to egocentric video applications. Furthermore, this method is enhanced by using segmented images instead of full frames during the test phase. Experimental results show that our hand segmentation algorithm outperforms the state-of-the-art approaches and improves the gesture recognition accuracy on both the publicly available EDSH dataset and our dataset designed for cultural heritage applications.

2013 Conference proceedings paper

Human Behavior Understanding with Wide Area Sensing Floors

Authors: Lombardi, Martino; Pieracci, Augusto; Santinelli, Paolo; Vezzani, Roberto; Cucchiara, Rita

Published in: LECTURE NOTES IN COMPUTER SCIENCE

Research on innovative and natural interfaces aims at developing devices able to capture and understand human behavior without the need for direct interaction. In this paper we propose and describe a framework based on a sensing floor device. The pressure field generated by people or objects standing on the floor is captured and analyzed. Local and global features are computed by a low-level processing unit and sent to high-level interfaces. The framework can be used in different applications, such as entertainment, education or surveillance. A detailed description of the sensing element and the processing architectures is provided, together with some sample applications developed to test the device capabilities.

2013 Conference proceedings paper

Image Classification with Multivariate Gaussian Descriptors

Authors: Grana, Costantino; Serra, Giuseppe; Manfredi, Marco; Cucchiara, Rita

Published in: LECTURE NOTES IN COMPUTER SCIENCE

Techniques based on the Bag of Words approach represent images by quantizing local descriptors and summarizing their distribution in a histogram. In contrast, in this paper we describe an image as a multivariate Gaussian distribution, estimated over the extracted local descriptors. The estimated distribution is mapped to a high-dimensional descriptor by concatenating the mean vector and the projection of the covariance matrix on the Euclidean space tangent to the Riemannian manifold. To deal with large-scale datasets and high-dimensional feature spaces, the Stochastic Gradient Descent solver is adopted. The experimental results on Caltech-101 and ImageCLEF2011 show that the method is competitive with state-of-the-art approaches.

2013 Conference proceedings paper
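The mapping described in this abstract (mean vector concatenated with a tangent-space projection of the covariance matrix) can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes the tangent space is taken at the identity, so that the projection reduces to the matrix logarithm, and the function name `gaussian_descriptor`, the regularization constant `eps`, and the sqrt(2) scaling of off-diagonal entries are illustrative choices not stated in the abstract.

```python
import numpy as np


def gaussian_descriptor(local_descriptors, eps=1e-3):
    """Map a set of local descriptors (n x d) to one fixed-length vector.

    Hypothetical helper: the tangent point (identity) and eps are assumptions.
    """
    x = np.asarray(local_descriptors, dtype=float)
    mean = x.mean(axis=0)
    # Regularize so the covariance is strictly positive definite.
    cov = np.cov(x, rowvar=False) + eps * np.eye(x.shape[1])
    # Matrix logarithm of a symmetric positive definite matrix,
    # computed through its eigendecomposition.
    eigvals, eigvecs = np.linalg.eigh(cov)
    log_cov = (eigvecs * np.log(eigvals)) @ eigvecs.T
    # Keep the upper triangle; scale off-diagonal entries by sqrt(2) so the
    # Euclidean norm of the vector equals the Frobenius norm of log_cov.
    rows, cols = np.triu_indices(x.shape[1])
    scale = np.where(rows == cols, 1.0, np.sqrt(2.0))
    return np.concatenate([mean, scale * log_cov[rows, cols]])


# Usage with synthetic data standing in for SIFT descriptors.
rng = np.random.default_rng(0)
vector = gaussian_descriptor(rng.normal(size=(200, 8)))
print(vector.shape)  # length is d + d * (d + 1) / 2
```

The resulting vector has a fixed length regardless of how many local descriptors the image yields, which is what makes it usable with a linear solver without building a visual vocabulary.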

Intelligent video surveillance as a service

Authors: Prati, A.; Vezzani, R.; Fornaciari, M.; Cucchiara, R.

Intelligent video surveillance has become an essential tool for several security-related applications. With the growth of installed cameras and the increasing complexity of required algorithms, in-house self-contained video surveillance systems become a chimera for most institutions and (small) companies. The paradigm of Video Surveillance as a Service (VSaaS) helps distribute not only storage space in the cloud (necessary for handling large amounts of video data), but also infrastructure and computational power. This chapter briefly introduces the motivations and the main characteristics of a VSaaS system, providing a case study where research-lab computer vision algorithms are integrated in a VSaaS platform. The lessons learnt and some future directions on this topic are also highlighted.

2013 Book chapter/Essay

Learning articulated body models for people re-identification

Authors: Baltieri, Davide; Vezzani, Roberto; Cucchiara, Rita

People re-identification is a challenging problem in surveillance and forensics that aims at associating multiple instances of the same person acquired from different points of view and after a temporal gap. Image-based appearance features are usually adopted but, in addition to their intrinsically low discriminability, they are subject to perspective and view-point issues. We propose to completely change the approach by mapping local descriptors extracted from RGB-D sensors on a 3D body model for creating a view-independent signature. An original bone-wise color descriptor is generated and reduced with PCA to compute the person signature. The virtual bone set used to map appearance features is learned using a recursive splitting approach. Finally, people matching for re-identification is performed using Relaxed Pairwise Metric Learning, which simultaneously provides feature reduction and weighting. Experiments on a specific dataset created with the Microsoft Kinect sensor and the OpenNI libraries prove the advantages of the proposed technique with respect to state-of-the-art methods based on 2D or non-articulated 3D body models.

2013 Conference proceedings paper

Lightweight Sign Recognition for Mobile Devices

Authors: Fornaciari, Michele; Prati, Andrea; Grana, Costantino; Cucchiara, Rita

The diffusion of powerful mobile devices has laid the basis for new applications that run sophisticated computer vision and pattern recognition algorithms directly on the devices (which are embedded devices). This paper describes the implementation of a complete system for automatic recognition of places localized on a map through the recognition of significant signs by means of the camera of a mobile device (smartphone, tablet, etc.). The paper proposes a novel classification algorithm based on the innovative use of bag-of-words on ORB features. The recognition is achieved using a simple yet effective search scheme which exploits GPS localization to limit the possible matches. This simple solution brings several advantages, such as speed even on limited-resource devices, usability even with limited training samples, and ease of adaptation to new training samples and classes. The overall system is based on a REST-JSON client-server architecture. Experiments were conducted in a real scenario, evaluating the different parameters that influence performance.

2013 Conference proceedings paper
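The idea of using GPS localization to limit the possible matches can be sketched as follows. This is only an illustration under stated assumptions, not the system described in the paper: the database layout, the 200 m radius, the squared Euclidean distance between histograms, and the names `haversine_m` and `recognize_sign` are all hypothetical.

```python
import math


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two latitude/longitude points."""
    r = 6371000.0  # mean Earth radius in metres
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))


def recognize_sign(query_hist, query_pos, database, radius_m=200.0):
    """Return the label of the closest histogram among nearby signs, or None.

    database: iterable of (label, (lat, lon), histogram) tuples (assumed layout).
    """
    best_label, best_dist = None, float("inf")
    for label, (lat, lon), hist in database:
        # GPS pre-filter: skip signs outside the search radius.
        if haversine_m(query_pos[0], query_pos[1], lat, lon) > radius_m:
            continue
        # Squared Euclidean distance between bag-of-words histograms.
        dist = sum((q - h) ** 2 for q, h in zip(query_hist, hist))
        if dist < best_dist:
            best_label, best_dist = label, dist
    return best_label


# Usage with made-up signs and coordinates.
signs = [
    ("pharmacy", (44.6471, 10.9252), [0.9, 0.1, 0.0]),
    ("bakery", (44.6475, 10.9255), [0.1, 0.8, 0.1]),
    ("museum", (45.4642, 9.1900), [0.0, 0.1, 0.9]),
]
print(recognize_sign([0.0, 0.2, 0.8], (44.6472, 10.9253), signs))
```

In this toy example the query histogram is most similar to the "museum" entry, but that sign is far from the query position, so only the two nearby candidates are compared. Restricting the candidate set this way keeps the search cheap on a limited-resource device.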

Modeling Local Descriptors with Multivariate Gaussians for Object and Scene Recognition

Authors: Serra, Giuseppe; Grana, Costantino; Manfredi, Marco; Cucchiara, Rita

Common techniques represent images by quantizing local descriptors and summarizing their distribution in a histogram. In this paper we propose to employ a parametric description and compare its capabilities to histogram-based approaches. We use the multivariate Gaussian distribution, applied over the SIFT descriptors, extracted with dense sampling on a spatial pyramid. Every distribution is converted to a high-dimensional descriptor by concatenating the mean vector and the projection of the covariance matrix on the Euclidean space tangent to the Riemannian manifold. Experiments on Caltech-101 and ImageCLEF2011 are performed using the Stochastic Gradient Descent solver, which makes it possible to deal with large-scale datasets and high-dimensional feature spaces.

2013 Conference proceedings paper

On the design of embedded solutions to banknote recognition

Authors: Rashid, A.; Prati, A.; Cucchiara, R.

Published in: OPTICAL ENGINEERING

Banknote recognition systems have many applications in the modern world of automatic monetary transaction machines. They are traditionally based on simple classifiers applied over manually selected areas. A new solution in this field, borrowed from content-based image retrieval (CBIR), which is based on dense scale-invariant feature transform features in a bag-of-words framework followed by a support vector machine (SVM) classifier, is explored. The proposed computer vision system for banknote recognition, on one hand, enables recognition at high accuracy and speed, and, on the other hand, provides a basis for further applications, e.g., counterfeit detection and fitness test. This approach makes the system robust to various defects, which may occur during image acquisition or during the circulation life of a banknote. We implemented and tested three state-of-the-art classification techniques on an embedded platform [SVM, artificial neural network (ANN), and hidden Markov model (HMM)]. The comparative results are reported for accuracy with different sizes of the training datasets and with various types of datasets. In this framework, the SVM classifier outperforms ANN and HMM on the basis of speed and accuracy on our embedded platform. © 2013 Society of Photo-Optical Instrumentation Engineers.

2013 Journal article
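The bag-of-words stage mentioned in this abstract can be illustrated with a small sketch of the quantization step alone: each local descriptor is assigned to its nearest codeword and the assignments are summarized in a normalized histogram, which would then be passed to a classifier such as an SVM. The function name `bow_histogram`, the toy codebook, and the use of squared Euclidean distance are assumptions for illustration; feature extraction, codebook training, and the classifier are not shown.

```python
def bow_histogram(descriptors, codebook):
    """Quantize descriptors against a codebook and return a normalized histogram."""
    counts = [0] * len(codebook)
    for d in descriptors:
        # Index of the codeword closest to this descriptor.
        nearest = min(
            range(len(codebook)),
            key=lambda k: sum((a - b) ** 2 for a, b in zip(d, codebook[k])),
        )
        counts[nearest] += 1
    total = sum(counts)
    if total == 0:
        return [0.0] * len(codebook)
    return [c / total for c in counts]


# Usage with a toy two-word codebook and four 2-D descriptors.
codebook = [[0.0, 0.0], [10.0, 10.0]]
descriptors = [[1.0, 0.0], [0.0, 1.0], [9.0, 9.0], [11.0, 10.0]]
print(bow_histogram(descriptors, codebook))  # [0.5, 0.5]
```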
