Publications by Rita Cucchiara

Explore our research publications: papers, articles, and conference proceedings from AImageLab.

A new era in the study of intergroup nonverbal behaviour: Studying intergroup dyadic interactions “online”

Authors: Di Bernardo, Gian Antonio; Vezzali, Loris; Palazzi, Andrea; Calderara, Simone; Bicocchi, Nicola; Zambonelli, Franco; Cucchiara, Rita; Cadamuro, Alessia

We examined predictors and consequences of intergroup nonverbal behaviour by relying on new technologies and new objective indices. In three studies, both in the laboratory and in the field with children, behaviour was a function of implicit prejudice.

2017 Abstract in conference proceedings

A Video Library System Using Scene Detection and Automatic Tagging

Authors: Baraldi, Lorenzo; Grana, Costantino; Cucchiara, Rita

We present a novel video browsing and retrieval system for edited videos, in which videos are automatically decomposed into meaningful storytelling parts (i.e. scenes) and tagged according to their transcript. The system relies on a Triplet Deep Neural Network which exploits multimodal features, and has been implemented as a set of extensions to the eXo Platform Enterprise Content Management System (ECMS). This set of extensions enables the interactive visualization of a video, its automatic and semi-automatic annotation, as well as a keyword-based search inside the video collection. The platform also allows a natural integration with third-party add-ons, so that automatic annotations can be exploited outside the proposed platform.

2017 Paper in conference proceedings

Affective level design for a role-playing videogame evaluated by a brain–computer interface and machine learning methods

Authors: Balducci, Fabrizio; Grana, Costantino; Cucchiara, Rita

Published in: THE VISUAL COMPUTER

Game science has become a research field that attracts industry attention due to a rich worldwide sales market. To understand the player experience, concepts like flow or boredom mental states require formalization and empirical investigation, taking advantage of the objective data that psychophysiological methods like electroencephalography (EEG) can provide. This work studies affective ludology and presents two different game levels for Neverwinter Nights 2 developed with the aim of manipulating emotions; two sets of affective design guidelines are presented, with a rigorous formalization that considers the characteristics of the role-playing genre and its specific gameplay. An empirical investigation with a brain–computer interface headset has been conducted: by extracting numerical data features, machine learning techniques classify the different activities of the gaming sessions (tasks and events) to verify whether their design differentiation coincides with the affective one. The observed results, also supported by subjective questionnaire data, confirm the validity of the proposed guidelines, suggesting that this evaluation methodology could be extended to other evaluation tasks.

2017 Journal article

Attentive Models in Vision: Computing Saliency Maps in the Deep Learning Era

Authors: Cornia, Marcella; Abati, Davide; Baraldi, Lorenzo; Palazzi, Andrea; Calderara, Simone; Cucchiara, Rita

Published in: LECTURE NOTES IN COMPUTER SCIENCE

Estimating the focus of attention of a person looking at an image or a video is a crucial step which can enhance many vision-based inference mechanisms: image segmentation and annotation, video captioning, and autonomous driving are some examples. The early stages of attentive behavior are typically bottom-up; reproducing the same mechanism means finding the saliency embodied in the images, i.e. which parts of an image pop out of a visual scene. This process has been studied for decades in neuroscience and in terms of computational models for reproducing the human cortical process. In the last few years, early models have been replaced by deep learning architectures that outperform any early approach when compared on public datasets. In this paper, we propose a discussion on why convolutional neural networks (CNNs) are so accurate in saliency prediction. We present our DL architectures which combine both bottom-up cues and higher-level semantics, and incorporate the concept of time in the attentional process through LSTM recurrent architectures. Finally, we present a video-specific architecture based on the C3D network, which can extract spatio-temporal features by means of 3D convolutions to model task-driven attentive behaviors. The merit of this work is to show how these deep networks are not mere brute-force methods tuned on massive amounts of data, but represent well-defined architectures which recall very closely the early saliency models, although improved with the semantics learned from human ground truth.

2017 Paper in conference proceedings

Editorial Message from the Program Chairs

Authors: Cucchiara, R.; Matsushita, Y.; Sebe, N.; Soatto, S.

Published in: PROCEEDINGS IEEE INTERNATIONAL CONFERENCE ON COMPUTER VISION

2017 Paper in conference proceedings

Embedded Recurrent Network for Head Pose Estimation in Car

Authors: Borghi, Guido; Gasparini, Riccardo; Vezzani, Roberto; Cucchiara, Rita

Accurate and fast estimation of the driver's head pose is a rich source of information, in particular in the automotive context. Head pose is a key element for driver behavior investigation, pose analysis, and attention monitoring, and is also a useful component to improve the efficacy of Human-Car Interaction systems. In this paper, a Recurrent Neural Network is exploited to tackle the problem of driver head pose estimation, working directly and only on depth images to be more reliable in the presence of varying or insufficient illumination. Experimental results, obtained from two public datasets, namely Biwi Kinect Head Pose and ICT-3DHP Database, prove the efficacy of the proposed method, which outperforms state-of-the-art works. Besides, the entire system is implemented and tested on two embedded boards with real-time performance.

2017 Paper in conference proceedings

Fast and Accurate Facial Landmark Localization in Depth Images for In-car Applications

Authors: Frigieri, Elia; Borghi, Guido; Vezzani, Roberto; Cucchiara, Rita

A correct and reliable localization of facial landmarks enables several applications in many fields, ranging from Human Computer Interaction to video surveillance. For instance, it can provide a valuable input to monitor the driver's physical state and attention level in the automotive context. In this paper, we tackle the problem of facial landmark localization through a deep approach. The developed system runs in real time and, in particular, is more reliable than state-of-the-art competitors, especially in the presence of light changes and poor illumination, thanks to the use of depth images as input. We also collected and shared a new realistic dataset inside a car, called MotorMark, to train and test the system. In addition, we exploited the public Eurecom Kinect Face Dataset for the evaluation phase, achieving promising results both in terms of accuracy and computational speed.

2017 Paper in conference proceedings

From Depth Data to Head Pose Estimation: a Siamese approach

Authors: Venturelli, Marco; Borghi, Guido; Vezzani, Roberto; Cucchiara, Rita

The correct estimation of the head pose is a problem of great importance for many applications. For instance, it is an enabling technology in the automotive field for driver attention monitoring. In this paper, we tackle the pose estimation problem through a deep learning network working in a regression manner. Traditional methods usually rely on visual facial features, such as facial landmarks or nose tip position. In contrast, we exploit a Convolutional Neural Network (CNN) to perform head pose estimation directly from depth data. We exploit a Siamese architecture and we propose a novel loss function to improve the learning of the regression network layer. The system has been tested on two public datasets, Biwi Kinect Head Pose and ICT-3DHP database. The reported results demonstrate the improvement in accuracy with respect to current state-of-the-art approaches and the real-time capabilities of the overall framework.

2017 Paper in conference proceedings

From Groups to Leaders and Back. Exploring Mutual Predictability Between Social Groups and Their Leaders

Authors: Solera, Francesco; Calderara, Simone; Cucchiara, Rita

Recently, social theories and empirical observations identified small groups and leaders as the basic elements which shape a crowd. This leads to an intermediate level of abstraction that is placed between the crowd as a flow of people, and the crowd as a collection of individuals. Consequently, automatic analysis of crowds in computer vision is also experiencing a shift in focus from individuals to groups and from small groups to their leaders. In this chapter, we present state-of-the-art solutions to the groups and leaders detection problem, which are able to account for physical factors as well as for sociological evidence observed over short time windows. The presented algorithms are framed as structured learning problems over the set of individual trajectories. However, the way trajectories are exploited to predict the structure of the crowd is not fixed but rather learned from recorded and annotated data, enabling the method to adapt these concepts to different scenarios, densities, cultures, and other unobservable complexities. Additionally, we investigate the relation between leaders and their groups and propose the first attempt to exploit leadership as prior knowledge for group detection.

2017 Book chapter/Essay

Generative Adversarial Models for People Attribute Recognition in Surveillance

Authors: Fabbri, Matteo; Calderara, Simone; Cucchiara, Rita

In this paper we propose a deep architecture for detecting people attributes (e.g. gender, race, clothing, ...) in surveillance contexts. Our proposal explicitly deals with the poor resolution and occlusion issues that often occur in surveillance footage by enhancing the images by means of Deep Convolutional Generative Adversarial Networks (DCGAN). Experiments show that by combining both our Generative Reconstruction and Deep Attribute Classification Network we can effectively extract attributes even when resolution is poor and in the presence of strong occlusions of up to 80% of the whole person figure.

2017 Paper in conference proceedings

Page 23 of 51 • Total publications: 505