Publications

Explore our research publications: papers, articles, and conference proceedings from AImageLab.


Optimizing GPU-Based Connected Components Labeling Algorithms

Authors: Allegretti, Stefano; Bolelli, Federico; Cancilla, Michele; Grana, Costantino


Connected Components Labeling (CCL) is a fundamental image processing technique, widely used in various application areas. Computational throughput of Graphical Processing Units (GPUs) makes them eligible for such a kind of algorithms. In the last decade, many approaches to compute CCL on GPUs have been proposed. Unfortunately, most of them have focused on 4-way connectivity neglecting the importance of 8-way connectivity. This paper aims to extend state-of-the-art GPU-based algorithms from 4 to 8-way connectivity and to improve them with additional optimizations. Experimental results revealed the effectiveness of the proposed strategies.
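The difference between 4-way and 8-way connectivity can be illustrated with a minimal CPU sketch (not the paper's GPU kernels): a BFS-based labeling routine where the neighbor set is the only thing that changes. All names here are illustrative.

```python
from collections import deque

N4 = [(-1, 0), (1, 0), (0, -1), (0, 1)]          # 4-way: edge-adjacent neighbors
N8 = N4 + [(-1, -1), (-1, 1), (1, -1), (1, 1)]   # 8-way: adds the diagonals

def label(img, neighbors=N8):
    """Label foreground (non-zero) pixels; returns (label map, component count)."""
    h, w = len(img), len(img[0])
    labels = [[0] * w for _ in range(h)]
    count = 0
    for y in range(h):
        for x in range(w):
            if img[y][x] and not labels[y][x]:
                count += 1
                queue = deque([(y, x)])
                labels[y][x] = count
                while queue:
                    cy, cx = queue.popleft()
                    for dy, dx in neighbors:
                        ny, nx = cy + dy, cx + dx
                        if 0 <= ny < h and 0 <= nx < w and img[ny][nx] and not labels[ny][nx]:
                            labels[ny][nx] = count
                            queue.append((ny, nx))
    return labels, count
```

Two diagonally touching pixels form one component under 8-way connectivity but two components under 4-way, which is why the choice of connectivity matters.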

2018 Conference Proceedings

Paying More Attention to Saliency: Image Captioning with Saliency and Context Attention

Authors: Cornia, Marcella; Baraldi, Lorenzo; Serra, Giuseppe; Cucchiara, Rita

Published in: ACM TRANSACTIONS ON MULTIMEDIA COMPUTING, COMMUNICATIONS AND APPLICATIONS


Image captioning has recently been gaining a lot of attention thanks to the impressive achievements shown by deep captioning architectures, which combine Convolutional Neural Networks to extract image representations, and Recurrent Neural Networks to generate the corresponding captions. At the same time, a significant research effort has been dedicated to the development of saliency prediction models, which can predict human eye fixations. Although saliency information could be useful to condition an image captioning architecture, by providing an indication of what is salient and what is not, no model has yet succeeded in effectively incorporating these two techniques. In this work, we propose an image captioning approach in which a generative recurrent neural network can focus on different parts of the input image during the generation of the caption, by exploiting the conditioning given by a saliency prediction model on which parts of the image are salient and which are contextual. We demonstrate, through extensive quantitative and qualitative experiments on large-scale datasets, that our model achieves superior performance with respect to different image captioning baselines with and without saliency. Finally, we also show that the trained model can focus on salient and contextual regions during the generation of the caption in an appropriate way.
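One way to picture saliency-conditioned attention is as a soft-attention step whose region scores are biased by a saliency map. The sketch below is a hypothetical illustration of that idea, not the paper's architecture; all names and shapes are assumptions.

```python
import math

def softmax(xs):
    m = max(xs)
    e = [math.exp(x - m) for x in xs]
    s = sum(e)
    return [v / s for v in e]

def attend(region_feats, scores, saliency):
    """Blend region features, boosting regions the saliency model marks salient."""
    # add a log-saliency bias to each region's attention score
    combined = [s + math.log(sal + 1e-8) for s, sal in zip(scores, saliency)]
    weights = softmax(combined)
    dim = len(region_feats[0])
    return [sum(w * f[d] for w, f in zip(weights, region_feats)) for d in range(dim)]
```

With a near-zero saliency value, a region's attention weight is driven toward zero, so the attended vector is dominated by salient regions.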

2018 Journal Article

Personality Gaze Patterns Unveiled via Automatic Relevance Determination

Authors: Cuculo, Vittorio; D’Amelio, Alessandro; Lanzarotti, Raffaella; Boccignone, Giuseppe

Published in: LECTURE NOTES IN COMPUTER SCIENCE


Understanding human gaze behaviour in social contexts, such as during a face-to-face interaction, remains an open research issue which is strictly related to personality traits. In the effort to bridge the gap between available data and models, typical approaches focus on the analysis of spatial and temporal preferences of gaze deployment over specific regions of the observed face, while adopting classic statistical methods. In this note we propose a different analysis perspective based on novel data-mining techniques and a probabilistic classification method that relies on Gaussian Processes exploiting an Automatic Relevance Determination (ARD) kernel. Preliminary results obtained on a publicly available dataset are provided.
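The ARD idea can be sketched with the squared-exponential kernel that typically carries it: one length-scale per input dimension, so a very large length-scale effectively switches an irrelevant feature off. This is a generic illustration of an ARD kernel, not the authors' pipeline.

```python
import math

def ard_kernel(x1, x2, lengthscales, variance=1.0):
    """Squared-exponential kernel with per-dimension (ARD) length-scales."""
    s = sum(((a - b) / l) ** 2 for a, b, l in zip(x1, x2, lengthscales))
    return variance * math.exp(-0.5 * s)
```

A difference of 3 in a dimension with length-scale 1e6 barely changes the kernel value, while the same difference in a dimension with length-scale 1 drives it toward zero; fitting the length-scales thus reveals which gaze features are relevant.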

2018 Conference Proceedings

Plug-and-play CNN for crowd motion analysis: An application in abnormal event detection

Authors: Ravanbakhsh, M.; Nabi, M.; Mousavi, H.; Sangineto, E.; Sebe, N.


Most of the crowd abnormal event detection methods rely on complex hand-crafted features to represent the crowd motion and appearance. Convolutional Neural Networks (CNN) have been shown to be a powerful instrument with excellent representational capacities, which can alleviate the need for hand-crafted features. In this paper, we show that keeping track of the changes in the CNN feature across time can be used to effectively detect local anomalies. Specifically, we propose to measure local abnormality by combining semantic information (inherited from existing CNN models) with low-level optical-flow. One of the advantages of this method is that it can be used without the fine-tuning phase. The proposed method is validated on challenging abnormality detection datasets and the results show the superiority of our approach compared with the state-of-the-art methods.
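The core idea, scoring abnormality from how much a pretrained CNN's feature vector changes between frames, combined with optical-flow magnitude, can be sketched as below. This is a hypothetical simplification, not the authors' code; the feature vectors and flow magnitude are assumed to come from an external CNN and optical-flow estimator.

```python
import math

def anomaly_score(feat_prev, feat_curr, flow_magnitude):
    """Semantic change across time (L2 distance between consecutive CNN
    features) weighted by low-level motion (optical-flow magnitude)."""
    semantic_change = math.sqrt(sum((a - b) ** 2 for a, b in zip(feat_prev, feat_curr)))
    return semantic_change * flow_magnitude

def is_abnormal(score, threshold):
    return score > threshold
```

A static region (identical features, zero flow) scores zero, while a region whose semantics and motion both change scores high, no fine-tuning of the CNN is needed for this kind of scoring.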

2018 Conference Proceedings

Predicting Human Eye Fixations via an LSTM-based Saliency Attentive Model

Authors: Cornia, Marcella; Baraldi, Lorenzo; Serra, Giuseppe; Cucchiara, Rita

Published in: IEEE TRANSACTIONS ON IMAGE PROCESSING


Data-driven saliency has recently gained a lot of attention thanks to the use of Convolutional Neural Networks for predicting gaze fixations. In this paper we go beyond standard approaches to saliency prediction, in which gaze maps are computed with a feed-forward network, and present a novel model which can predict accurate saliency maps by incorporating neural attentive mechanisms. The core of our solution is a Convolutional LSTM that focuses on the most salient regions of the input image to iteratively refine the predicted saliency map. Additionally, to tackle the center bias typical of human eye fixations, our model can learn a set of prior maps generated with Gaussian functions. We show, through an extensive evaluation, that the proposed architecture outperforms the current state of the art on public saliency prediction datasets. We further study the contribution of each key component to demonstrate their robustness on different scenarios.
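The learned prior maps for center bias can be illustrated with a single 2-D Gaussian over normalized image coordinates; in the model the Gaussian parameters would be learned, while this sketch fixes them to the image center for illustration. Names and defaults are assumptions.

```python
import math

def gaussian_prior(h, w, mu_y=0.5, mu_x=0.5, sigma=0.25):
    """Build an h-by-w prior map: a 2-D Gaussian in normalized coordinates.
    In the described model, mu and sigma would be learned parameters."""
    prior = []
    for y in range(h):
        row = []
        for x in range(w):
            dy = y / (h - 1) - mu_y
            dx = x / (w - 1) - mu_x
            row.append(math.exp(-(dx * dx + dy * dy) / (2 * sigma * sigma)))
        prior.append(row)
    return prior
```

Multiplying (or concatenating) such maps with the network's features lets the model account for the tendency of human fixations to cluster near the image center.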

2018 Journal Article

RALE051: a novel established cell line of sporadic Burkitt lymphoma

Authors: L’Abbate, Alberto; Iacobucci, Ilaria; Lonoce, Angelo; Turchiano, Antonella; Ficarra, Elisa; Paciello, Giulia; Cattina, Federica; Ferrari, Anna; Imbrogno, Enrica; Agostinelli, Claudio; Zinzani, Pierluigi; Martinelli, Giovanni; Derenzini, Enrico; Storlazzi, Clelia Tiziana

Published in: LEUKEMIA & LYMPHOMA

2018 Journal Article

SACHER Project: A Cloud Platform and Integrated Services for Cultural Heritage and for Restoration

Authors: Bertacchi, Silvia; Al Jawarneh, Isam Mashhour; Apollonio, Fabrizio Ivan; Bertacchi, Gianna; Cancilla, Michele; Foschini, Luca; Grana, Costantino; Martuscelli, Giuseppe; Montanari, Rebecca


The SACHER project provides a distributed, open source and federated cloud platform able to support the life-cycle management of various kinds of data concerning tangible Cultural Heritage. The paper describes the SACHER platform and, in particular, among the various integrated service prototypes, the most important ones to support restoration processes and cultural asset management: (i) 3D Life Cycle Management for Cultural Heritage (SACHER 3D CH), based on 3D digital models of architecture and dedicated to the management of Cultural Heritage and to the storage of the numerous data generated by the team of professionals involved in the restoration process; (ii) Multidimensional Search Engine for Cultural Heritage (SACHER MuSE CH), an advanced multi-level search system designed to manage Heritage data from heterogeneous sources.

2018 Conference Proceedings

SAM: Pushing the Limits of Saliency Prediction Models

Authors: Cornia, Marcella; Baraldi, Lorenzo; Serra, Giuseppe; Cucchiara, Rita


The prediction of human eye fixations has been recently gaining a lot of attention thanks to the improvements shown by deep architectures. In our work, we go beyond classical feed-forward networks to predict saliency maps and propose a Saliency Attentive Model which incorporates neural attention mechanisms to iteratively refine predictions. Experiments demonstrate that the proposed strategy overcomes by a considerable margin the state of the art on the largest dataset available for saliency prediction. Here, we provide experimental results on other popular saliency datasets to confirm the effectiveness and the generalization capabilities of our model, which enable us to reach the state of the art on all considered datasets.

2018 Conference Proceedings

Semantic-Fusion Gans for Semi-Supervised Satellite Image Classification

Authors: Subhankar, Roy; Sangineto, E.; Demir, B.; Sebe, N.

Published in: PROCEEDINGS - INTERNATIONAL CONFERENCE ON IMAGE PROCESSING


Most of the public satellite image datasets contain only a small number of annotated images. The lack of a sufficient quantity of labeled data for training is a bottleneck for the use of modern deep-learning-based classification approaches in this domain. In this paper we propose a semi-supervised approach to deal with this problem. We use the discriminator (D) of a Generative Adversarial Network (GAN) as the final classifier, and we train D using both labeled and unlabeled data. The main novelty we introduce is the representation of the visual information fed to D by means of two different channels: the original image and its “semantic” representation, the latter being obtained by means of an external network trained on ImageNet. The two channels are fused in D and jointly used to classify fake images, real labeled and real unlabeled images. We show that using only 100 labeled images, the proposed approach achieves an accuracy close to 69% and a significant improvement with respect to other GAN-based semi-supervised methods. Although we have tested our approach only on satellite images, we do not use any domain-specific knowledge. Thus, our method can be applied to other semi-supervised domains.
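The two-channel fusion can be pictured as concatenating raw-image features with "semantic" features from the external ImageNet-trained network before classifying into K real classes plus one "fake" class. The sketch below is a hypothetical illustration with a plain linear classifier, not the paper's discriminator; all names are assumptions.

```python
def fuse(image_feats, semantic_feats):
    """Concatenate the two feature channels fed to the discriminator."""
    return image_feats + semantic_feats

def classify(fused, weights, bias):
    """Linear scores over K+1 classes (by convention, last index = fake);
    returns the index of the highest-scoring class."""
    scores = [sum(w * f for w, f in zip(row, fused)) + b
              for row, b in zip(weights, bias)]
    return scores.index(max(scores))
```

Training the classifier jointly on labeled, unlabeled, and generated samples is what lets the small labeled set (e.g. 100 images) be stretched by the unlabeled data.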

2018 Conference Proceedings

Sistema e metodo di autenticazione di persone in ambienti a limitata visibilità (System and method for authenticating people in limited-visibility environments)

Authors: Borghi, Guido; Grazioli, Filippo; Vezzani, Roberto; Pini, Stefano; Cucchiara, Rita

2018 Patent

Page 52 of 106 • Total publications: 1056