Publications by Rita Cucchiara

Explore our research publications: papers, articles, and conference proceedings from AImageLab.

Automatic segmentation of digitalized historical manuscripts

Authors: Grana, Costantino; Borghesani, Daniele; Cucchiara, Rita

Published in: MULTIMEDIA TOOLS AND APPLICATIONS

The artistic content of historical manuscripts provides a lot of challenges in terms of automatic text extraction, picture segmentation and retrieval by similarity. In particular this work addresses the problem of automatic extraction of meaningful pictures, distinguishing them from handwritten text and floral and abstract decorations. The proposed solution firstly employs a circular statistics description of a directional histogram in order to extract text. Then visual descriptors are computed over the pictorial regions of the page: the semantic content is distinguished from the decorative parts using color histograms and a novel texture feature called Gradient Spatial Dependency Matrix. The feature vectors are finally processed using an embedding procedure which allows increased performance in later SVM classification. Results for both feature extraction and embedding based classification are reported, supporting the effectiveness of the proposal on high resolution replicas of artistic manuscripts.
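
As a rough illustration of what a circular-statistics description of a directional histogram can look like, the sketch below computes the circular mean and the mean resultant length of a histogram of directions. The function name `circular_stats`, the bin layout, and the choice of these two statistics are assumptions made for illustration; the abstract does not say which statistics the paper actually uses.

```python
import numpy as np

def circular_stats(hist):
    """Circular mean (radians) and mean resultant length of a directional histogram.

    `hist` holds counts for equally spaced direction bins covering [0, 2*pi).
    This bin layout is an assumption for illustration.
    """
    hist = np.asarray(hist, dtype=float)
    n_bins = hist.size
    # Bin centres expressed as angles on the unit circle.
    angles = (np.arange(n_bins) + 0.5) * 2.0 * np.pi / n_bins
    total = hist.sum()
    if total == 0:
        return 0.0, 0.0
    # Weighted average of the unit vectors pointing at each bin centre.
    c = np.sum(hist * np.cos(angles)) / total
    s = np.sum(hist * np.sin(angles)) / total
    mean_dir = np.arctan2(s, c) % (2.0 * np.pi)
    resultant_len = np.hypot(c, s)
    return mean_dir, resultant_len
```

A resultant length close to 1 indicates that the directions are concentrated around the mean, while a value close to 0 indicates that they are spread evenly around the circle.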

2011 Journal article

Contextual Information and Covariance Descriptors for People Surveillance: An Application for Safety of Construction Workers

Authors: Gualdi, Giovanni; Prati, Andrea; Cucchiara, Rita

Published in: EURASIP JOURNAL ON IMAGE AND VIDEO PROCESSING

In computer science, contextual information can be used both to reduce computations and to increase accuracy. This paper discusses how it can be exploited for people surveillance in very cluttered environments in terms of perspective (i.e., weak scene calibration) and appearance of the objects of interest (i.e., relevance feedback on the training of a classifier). These techniques are applied to a pedestrian detector that uses a LogitBoost classifier, appropriately modified to work with covariance descriptors which lie on Riemannian manifolds. On each detected pedestrian, a similar classifier is employed to obtain a precise localization of the head. Two novelties on the algorithms are proposed in this case: polar image transformations to better exploit the circular feature of the head appearance and multispectral image derivatives that catch not only luminance but also chrominance variations. The complete approach has been tested on the surveillance of a construction site to detect workers that do not wear the hard hat: in such scenarios, the complexity and dynamics are very high, making pedestrian detection a real challenge.
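
To make the notion of a covariance descriptor concrete, here is a minimal sketch that computes one for a grayscale image patch. The function name `region_covariance` and the per-pixel feature set (position, intensity, absolute first derivatives) are assumptions chosen for illustration, not the features used in the paper. The result is a symmetric positive semi-definite matrix, and matrices of this kind (when positive definite) form a Riemannian manifold rather than a vector space.

```python
import numpy as np

def region_covariance(gray):
    """Covariance descriptor of a grayscale patch (2-D array).

    Per-pixel features are [x, y, intensity, |dI/dx|, |dI/dy|];
    this feature set is an illustrative assumption.
    """
    gray = np.asarray(gray, dtype=float)
    h, w = gray.shape
    ys, xs = np.mgrid[0:h, 0:w]
    # np.gradient returns the derivative along rows (y) first, then columns (x).
    dy, dx = np.gradient(gray)
    feats = np.stack([xs.ravel(), ys.ravel(), gray.ravel(),
                      np.abs(dx).ravel(), np.abs(dy).ravel()])
    # Rows of `feats` are variables, columns are pixels.
    return np.cov(feats)
```

The descriptor has a fixed 5 x 5 size regardless of the patch dimensions, which is one reason this representation is convenient for comparing regions of different sizes.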

2011 Journal article

Detecting Anomalies in People’s Trajectories using Spectral Graph Analysis

Authors: Calderara, Simone; Heinemann, Uri; Prati, Andrea; Cucchiara, Rita; Tishby, Naftali

Published in: COMPUTER VISION AND IMAGE UNDERSTANDING

Video surveillance is becoming the technology of choice for monitoring crowded areas for security threats. While video provides ample information for human inspectors, there is a great need for robust automated techniques that can efficiently detect anomalous behavior in streaming video from single or multiple cameras. In this work we synergistically combine two state-of-the-art methodologies. The first is the ability to track and label single person trajectories in a crowded area using multiple video cameras, and the second is a new class of novelty detection algorithms based on spectral analysis of graphs. By representing the trajectories as sequences of transitions between nodes in a graph, shared individual trajectories capture only a small subspace of the possible trajectories on the graph. This subspace is characterized by large connected components of the graph, which are spanned by the eigenvectors with the low eigenvalues of the graph Laplacian matrix. Using this technique, we develop robust invariant distance measures for detecting anomalous trajectories, and demonstrate their application on real video data.
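
The link between connected components and low Laplacian eigenvalues can be sketched as follows. The code builds a symmetric transition-count graph from trajectories given as node sequences and computes the spectrum of the unnormalized Laplacian L = D - W. The function names, the integer node encoding, and the use of the unnormalized Laplacian are assumptions for illustration; the paper's distance measures are not reproduced here.

```python
import numpy as np

def transition_graph(trajectories, n_nodes):
    """Symmetric weight matrix counting transitions between graph nodes."""
    W = np.zeros((n_nodes, n_nodes))
    for traj in trajectories:
        for a, b in zip(traj[:-1], traj[1:]):
            if a != b:  # ignore self-transitions
                W[a, b] += 1.0
                W[b, a] += 1.0
    return W

def laplacian_spectrum(W):
    """Eigenvalues (ascending) and eigenvectors of the Laplacian L = D - W."""
    L = np.diag(W.sum(axis=1)) - W
    vals, vecs = np.linalg.eigh(L)
    return vals, vecs

# Two groups of trajectories that never share a node give two components.
W = transition_graph([[0, 1, 2], [2, 1, 0], [3, 4]], n_nodes=5)
vals, vecs = laplacian_spectrum(W)
n_components = int(np.sum(np.abs(vals) < 1e-9))
```

The number of zero eigenvalues equals the number of connected components, and the corresponding eigenvectors are constant on each component, which is why the low end of the spectrum describes the regions of the graph that normal trajectories occupy.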

2011 Journal article

Energy-efficient Feedback Tracking on Embedded Smart Cameras by Hardware-level Optimization

Authors: Casares, M.; Santinelli, Paolo; Velipasalar, S.; Prati, Andrea; Cucchiara, Rita

Embedded systems have limited processing power, memory and energy. When camera sensors are added to an embedded system, the problem of limited resources becomes even more pronounced. In this paper, we introduce two methodologies to increase the energy-efficiency and battery-life of an embedded smart camera by hardware-level operations when performing object detection and tracking. The CITRIC platform is employed as our embedded smart camera. First, down-sampling is performed at hardware level on the micro-controller of the image sensor rather than performing software-level down-sampling at the main microprocessor of the camera board. In addition, instead of performing object detection and tracking on whole image, we first estimate the location of the target in the next frame, form a search region around it, then crop the next frame by using the HREF and VSYNC signals at the micro-controller of the image sensor, and perform detection and tracking only in the cropped search region. Thus, the amount of data that is moved from the image sensor to the main memory at each frame is optimized. Also, we can adaptively change the size of the cropped window during tracking depending on the object size. Reducing the amount of transferred data, better use of the memory resources, and delegating image down-sampling and cropping tasks to the micro-controller on the image sensor, result in significant decrease in energy consumption and increase in battery-life. Experimental results show that hardware-level down-sampling and cropping, and performing detection and tracking in cropped regions provide 41.24% decrease in energy consumption, and 107.2% increase in battery-life. Compared to performing software-level down-sampling and processing whole frames, proposed methodology provides an additional 8 hours of continuous processing on 4 AA batteries, increasing the lifetime of the camera to 15.5 hours.

2011 Conference proceedings paper

Energy-efficient foreground object detection on embedded smart cameras by hardware-level operations

Authors: Casares, M.; Santinelli, P.; Velipasalar, S.; Prati, A.; Cucchiara, R.

Published in: IEEE COMPUTER SOCIETY CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION WORKSHOPS

Embedded smart cameras have limited processing power, memory and energy. In this paper, we introduce two methodologies to increase the energy-efficiency and the battery-life of an embedded smart camera by hardware-level operations when performing foreground object detection. We use the CITRIC platform as our embedded smart camera. We first perform down-sampling at hardware level on the micro-controller of the image sensor rather than performing software-level down-sampling at the main microprocessor of the camera board. In addition, we crop an image frame at hardware level by using the HREF and VSYNC signals at the micro-controller of the image sensor to perform foreground object detection only in the cropped search region instead of the whole image. Thus, the amount of data that is moved from the image sensor to the main memory at each frame, is greatly reduced. Thanks to reduced data transfer, better use of the memory resources and not occupying the main microprocessor with image down-sampling and cropping tasks, we obtain significant savings in energy consumption and battery-life. Experimental results show that hardware-level down-sampling and cropping, and performing detection in cropped regions provide 54.14% decrease in energy consumption, and 121.25% increase in battery-life compared to performing software-level down-sampling and processing whole frames. © 2011 IEEE.

2011 Conference proceedings paper

Energy-efficient Object Detection and Tracking on Embedded Smart Cameras by Hardware-level Operations at the Image Sensor

Authors: Casares, M.; Santinelli, Paolo; Velipasalar, S.; Prati, Andrea; Cucchiara, Rita

Embedded smart cameras have limited processing power, memory and energy. In this paper, we introduce two methodologies to increase the energy-efficiency and the battery-life of an embedded smart camera by hardware-level operations when performing object detection and tracking. We use the CITRIC platform as our embedded smart camera. We first perform down-sampling at hardware-level on the microcontroller of the image sensor rather than performing software-level down-sampling at the main microprocessor of the camera board. In addition, instead of performing object detection on whole image, we first estimate the location of the target in the next frame, form a search region around it, then crop the next frame by using the HREF and VSYNC signals at the microcontroller of the image sensor, and perform detection and tracking only in the cropped search region. Thus, the amount of data that is moved from the image sensor to the main memory at each frame, is greatly reduced. Thanks to reduced data transfer, better use of the memory resources and not occupying the main microprocessor with image down-sampling and cropping tasks, we obtain significant savings in energy consumption and battery-life. Experimental results show that hardware-level down-sampling and cropping, and performing detection in cropped regions provide 54.14% decrease in energy consumption, and 121.25% increase in battery-life compared to performing software-level down-sampling and processing whole frame.

2011 Conference proceedings paper

Feature Space Warping Relevance Feedback with Transductive Learning

Authors: Borghesani, Daniele; Coppi, Dalia; Grana, Costantino; Calderara, Simone; Cucchiara, Rita

Published in: LECTURE NOTES IN COMPUTER SCIENCE

Relevance feedback is a widely adopted approach to improve content-based information retrieval systems by keeping the user in the retrieval loop. Among the fundamental relevance feedback approaches, feature space warping has been proposed as an effective approach for bridging the gap between high-level semantics and the low-level features. Recently, combination of feature space warping and query point movement techniques has been proposed in contrast to learning based approaches, showing good performance under different data distributions. In this paper we propose to merge feature space warping and transductive learning, in order to benefit from both the ability of adapting data to the user hints and the information coming from unlabeled samples. Experimental results on an image retrieval task reveal significant performance improvements from the proposed method.

2011 Conference proceedings paper

Identification of Intruders in Groups of People using Cameras and RFIDs

Authors: Cucchiara, Rita; Fornaciari, Michele; Haider, Razia; Mandreoli, Federica; Prati, Andrea

The identification of intruders in groups of people moving in wide open areas represents a challenging scenario where coordination between cameras can be certainly used but this solution is not enough. In this paper, we propose to go beyond pure vision-based approaches by integrating the use of distributed cameras with the RFID technology. To this end, we introduce a system that “maps” RFID tags to people detected by cameras by using sophisticated techniques to filter the singular modalities and an evidential fusion architecture, based on Transferable Belief Model, to combine the two sources of information and manage conflict between them. The conducted experimental evaluation shows very promising results, especially in treating groups of people.

2011 Conference proceedings paper

Iterative active querying for surveillance data retrieval in crime detection and forensics

Authors: Coppi, Dalia; Calderara, Simone; Cucchiara, Rita

Large sets of visual data are now available both, in real time and off line, at time of investigation in multimedia forensics, however passive querying systems often encounter difficulties in retrieving significant results. In this paper we propose an iterative active querying system for video surveillance and forensic applications based on the continuous interaction between the user and the system. The positive and negative user feedbacks are exploited as the input of a graph based transductive procedure for iteratively refining the initial query results. Experiments are shown using people trajectories and people appearance as distance metrics.

2011 Conference proceedings paper

Joint ACM workshop on human gesture and behavior understanding (J-HGBU'11)

Authors: Pantic, M.; Pentland, A.; Vinciarelli, A.; Cucchiara, R.; Daoudi, M.; Del Bimbo, A.

The ability to understand social signals of a person we are communicating with is the core of social intelligence. Social Intelligence is a facet of human intelligence that has been argued to be indispensable and perhaps the most important for success in life. At the same time, human-centric multimedia applications for humans and about humans are becoming increasingly important. 3D modeled human-objects, like bodies, heads and faces are exploited for animation, security, and human computer interaction, while three dimensional motion of arms, legs and local body features is used for more complete human gesture, activity and behavior analysis. The Joint Human Gesture and Behavior Understanding (J-HGBU) workshop event consists of two parts focusing on these complementary challenges: the Workshop on Multimedia Access to 3D Human Objects (MA3HO'11) and the Workshop on Social Signal Processing (SSPW'11). © 2011 ACM.

2011 Conference proceedings paper
