Publications - AImageLab

Aligning Text and Document Illustrations: towards Visually Explainable Digital Humanities

Authors: Baraldi, Lorenzo; Cornia, Marcella; Grana, Costantino; Cucchiara, Rita

While several approaches to bring vision and language together are emerging, none of them has yet addressed the digital humanities domain, which is nevertheless a rich source of visual and textual data. To foster research in this direction, we investigate the learning of visual-semantic embeddings for historical document illustrations, devising both supervised and semi-supervised approaches. We exploit the joint visual-semantic embeddings to automatically align illustrations and textual elements, thus providing an automatic annotation of the visual content of a manuscript. Experiments are performed on the Borso d'Este Holy Bible, one of the most sophisticated illuminated manuscripts from the Renaissance, which we manually annotate by aligning every illustration with textual commentaries written by experts. Experimental results quantify the domain shift between ordinary visual-semantic datasets and the proposed one, validate the proposed strategies, and suggest directions for future work along the same line.

2018 Conference proceedings paper

Connected Components Labeling on DRAGs

Authors: Bolelli, Federico; Baraldi, Lorenzo; Cancilla, Michele; Grana, Costantino

Published in: INTERNATIONAL CONFERENCE ON PATTERN RECOGNITION

In this paper we introduce a new Connected Components Labeling (CCL) algorithm which exploits a novel approach to model decision problems as Directed Acyclic Graphs with a root, called Directed Rooted Acyclic Graphs (DRAGs). This structure supports the use of sets of equivalent actions, as required by CCL, and optimally leverages these equivalences to reduce the number of nodes (decision points). The advantage of this representation is that a DRAG, unlike the decision trees usually exploited by state-of-the-art algorithms, contains only the minimum number of nodes required to reach the leaf corresponding to a set of condition values. This combines the benefits of using binary decision trees with a reduction of the machine code size. Experiments show a consistent improvement in execution time when the model is applied to CCL.

2018 Conference proceedings paper

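The node-sharing idea in the abstract above can be illustrated with a minimal Python sketch. It is not the paper's algorithm: it only merges structurally identical subtrees of a binary decision tree into a rooted DAG, and it ignores the sets of equivalent actions that the paper exploits. The tree encoding, the condition names (`a`, `b`, `c`) and the action names are hypothetical.

```python
# Illustrative sketch only: share identical subtrees of a decision tree.
# A tree is either a leaf (an action name, str) or a tuple
# (condition, subtree_if_false, subtree_if_true). All names are hypothetical.

def merge_equal_subtrees(tree, cache=None):
    """Return a rooted DAG in which structurally equal subtrees are one object."""
    if cache is None:
        cache = {}
    if isinstance(tree, str):
        return cache.setdefault(tree, tree)
    cond, low, high = tree
    node = (cond,
            merge_equal_subtrees(low, cache),
            merge_equal_subtrees(high, cache))
    # setdefault returns the first stored equal node, so later copies are shared.
    return cache.setdefault(node, node)


def count_decision_nodes(node, seen=None):
    """Count decision nodes; pass seen=set() to count shared nodes only once."""
    if isinstance(node, str):
        return 0
    if seen is not None:
        if id(node) in seen:
            return 0
        seen.add(id(node))
    _, low, high = node
    return 1 + count_decision_nodes(low, seen) + count_decision_nodes(high, seen)


# The test on condition "b" appears twice in the tree but once in the DAG.
tree = ("a",
        ("b", "skip", "merge"),
        ("c", ("b", "skip", "merge"), "new_label"))
dag = merge_equal_subtrees(tree)
tree_nodes = count_decision_nodes(tree)          # every occurrence counted
dag_nodes = count_decision_nodes(dag, set())     # shared nodes counted once
```

Both structures encode the same decisions; the DAG simply stores the repeated test once, which is the source of the code-size reduction the abstract mentions.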

Improving Skin Lesion Segmentation with Generative Adversarial Networks

Authors: Pollastri, Federico; Bolelli, Federico; Paredes, Roberto; Grana, Costantino

Published in: PROCEEDINGS IEEE INTERNATIONAL SYMPOSIUM ON COMPUTER-BASED MEDICAL SYSTEMS

This paper proposes a novel strategy that employs Generative Adversarial Networks (GANs) to augment data in the image segmentation field, and a Convolutional-Deconvolutional Neural Network (CDNN) to automatically generate lesion segmentation masks from dermoscopic images. Training the CDNN with our GAN-generated data improves on the state of the art.

2018 Conference proceedings paper

Optimizing GPU-Based Connected Components Labeling Algorithms

Authors: Allegretti, Stefano; Bolelli, Federico; Cancilla, Michele; Grana, Costantino

Connected Components Labeling (CCL) is a fundamental image processing technique, widely used in various application areas. The computational throughput of Graphics Processing Units (GPUs) makes them well suited to this kind of algorithm. In the last decade, many approaches to compute CCL on GPUs have been proposed. Unfortunately, most of them have focused on 4-way connectivity, neglecting the importance of 8-way connectivity. This paper aims to extend state-of-the-art GPU-based algorithms from 4-way to 8-way connectivity and to improve them with additional optimizations. Experimental results reveal the effectiveness of the proposed strategies.

2018 Conference proceedings paper

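The difference between 4-way and 8-way connectivity discussed in the abstract above can be seen in a small sequential sketch. This is a plain two-pass union-find labeling in Python, not one of the GPU algorithms from the paper; the function name `label_components` and its interface are assumptions made for illustration.

```python
# Illustrative sketch only: sequential two-pass CCL with union-find.
# The function name and interface are hypothetical, not taken from the paper.

def label_components(image, connectivity=8):
    """Label non-zero pixels of a 2D list; return (label grid, component count)."""
    if connectivity not in (4, 8):
        raise ValueError("connectivity must be 4 or 8")
    rows = len(image)
    cols = len(image[0]) if rows else 0
    parent = [0]  # index 0 is reserved for the background

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    def union(a, b):
        ra, rb = find(a), find(b)
        if ra != rb:
            parent[max(ra, rb)] = min(ra, rb)

    # Neighbours already visited in raster order: up and left,
    # plus the two upper diagonals for 8-way connectivity.
    offsets = [(-1, 0), (0, -1)]
    if connectivity == 8:
        offsets += [(-1, -1), (-1, 1)]

    labels = [[0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            if not image[r][c]:
                continue
            neighbours = [labels[r + dr][c + dc] for dr, dc in offsets
                          if 0 <= r + dr < rows and 0 <= c + dc < cols
                          and labels[r + dr][c + dc]]
            if not neighbours:
                parent.append(len(parent))  # new provisional label
                labels[r][c] = len(parent) - 1
            else:
                labels[r][c] = min(neighbours)
                for n in neighbours:
                    union(labels[r][c], n)

    # Second pass: replace provisional labels with consecutive final ones.
    remap = {}
    for r in range(rows):
        for c in range(cols):
            if labels[r][c]:
                root = find(labels[r][c])
                labels[r][c] = remap.setdefault(root, len(remap) + 1)
    return labels, len(remap)


# Two pixels touching only at a corner: separate under 4-way, joined under 8-way.
diagonal = [[1, 0],
            [0, 1]]
_, count4 = label_components(diagonal, connectivity=4)
_, count8 = label_components(diagonal, connectivity=8)
```

On the diagonal example, 4-way connectivity yields two components and 8-way connectivity yields one, which is why algorithms designed for 4-way connectivity cannot be reused unchanged.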

SACHER Project: A Cloud Platform and Integrated Services for Cultural Heritage and for Restoration

Authors: Bertacchi, Silvia; Al Jawarneh, Isam Mashhour; Apollonio, Fabrizio Ivan; Bertacchi, Gianna; Cancilla, Michele; Foschini, Luca; Grana, Costantino; Martuscelli, Giuseppe; Montanari, Rebecca

The SACHER project provides a distributed, open-source and federated cloud platform able to support the life-cycle management of various kinds of data concerning tangible Cultural Heritage. The paper describes the SACHER platform and, among the various integrated service prototypes, the two most important ones for supporting restoration processes and cultural asset management: (i) 3D Life Cycle Management for Cultural Heritage (SACHER 3D CH), based on 3D digital models of architecture and dedicated to the management of Cultural Heritage and to the storage of the numerous data generated by the team of professionals involved in the restoration process; (ii) Multidimensional Search Engine for Cultural Heritage (SACHER MuSE CH), an advanced multi-level search system designed to manage Heritage data from heterogeneous sources.

2018 Conference proceedings paper

XDOCS: An Application to Index Historical Documents

Authors: Bolelli, Federico; Borghi, Guido; Grana, Costantino

Published in: COMMUNICATIONS IN COMPUTER AND INFORMATION SCIENCE

Dematerialization and digitalization of historical documents are key elements for their availability, preservation and diffusion. Unfortunately, the conversion from handwritten to digitalized documents presents several technical challenges. The XDOCS project was created with the main goal of making historical documents available and extending their usability for a wide variety of audiences, such as scholars, institutions and libraries. In this paper the core elements of XDOCS, i.e. page dewarping and word spotting techniques, are described, and two new applications, i.e. an annotation/indexing tool and a search tool, are presented.

2018 Conference proceedings paper

A Video Library System Using Scene Detection and Automatic Tagging

Authors: Baraldi, Lorenzo; Grana, Costantino; Cucchiara, Rita

We present a novel video browsing and retrieval system for edited videos, in which videos are automatically decomposed into meaningful storytelling parts (i.e. scenes) and tagged according to their transcript. The system relies on a Triplet Deep Neural Network which exploits multimodal features, and has been implemented as a set of extensions to the eXo Platform Enterprise Content Management System (ECMS). This set of extensions enables the interactive visualization of a video, its automatic and semi-automatic annotation, as well as a keyword-based search inside the video collection. The platform also allows a natural integration with third-party add-ons, so that automatic annotations can be exploited outside the proposed platform.

2017 Conference proceedings paper

Affective Classification of Gaming Activities Coming From RPG Gaming Sessions

Authors: Balducci, Fabrizio; Grana, Costantino

Published in: LECTURE NOTES IN COMPUTER SCIENCE

Each human activity involves feelings and subjective emotions: different people will perform and sense the same task with different outcomes and experiences. To understand this experience, concepts like Flow or Boredom must be investigated using objective data provided by methods like electroencephalography (EEG). This work carries on the analysis of EEG data coming from a brain-computer interface and the videogame "Neverwinter Nights 2": we propose an experimental methodology comparing results from different off-the-shelf machine learning techniques, applied to the gaming activities, to check whether each affective state corresponds to the hypothesis fixed in their formal design guidelines.

2017 Conference proceedings paper

Affective level design for a role-playing videogame evaluated by a brain–computer interface and machine learning methods

Authors: Balducci, Fabrizio; Grana, Costantino; Cucchiara, Rita

Published in: THE VISUAL COMPUTER

Game science has become a research field that attracts industry attention due to a rich worldwide market. To understand the player experience, concepts like the flow or boredom mental states require formalization and empirical investigation, taking advantage of the objective data that psychophysiological methods like electroencephalography (EEG) can provide. This work studies affective ludology and presents two different game levels for Neverwinter Nights 2 developed with the aim of manipulating emotions; two sets of affective design guidelines are presented, with a rigorous formalization that considers the characteristics of the role-playing genre and its specific gameplay. An empirical investigation with a brain–computer interface headset has been conducted: by extracting numerical data features, machine learning techniques classify the different activities of the gaming sessions (tasks and events) to verify whether their design differentiation coincides with the affective one. The observed results, also supported by subjective questionnaire data, confirm the validity of the proposed guidelines, suggesting that this evaluation methodology could be extended to other evaluation tasks.

2017 Journal article

Hierarchical Boundary-Aware Neural Encoder for Video Captioning

Authors: Baraldi, Lorenzo; Grana, Costantino; Cucchiara, Rita

Published in: PROCEEDINGS - IEEE COMPUTER SOCIETY CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION

The use of Recurrent Neural Networks for video captioning has recently gained a lot of attention, since they can be used both to encode the input video and to generate the corresponding description. In this paper, we present a recurrent video encoding scheme which can discover and leverage the hierarchical structure of the video. Unlike the classical encoder-decoder approach, in which a video is encoded continuously by a recurrent layer, we propose a novel LSTM cell which can identify discontinuity points between frames or segments and modify the temporal connections of the encoding layer accordingly. We evaluate our approach on three large-scale datasets: the Montreal Video Annotation dataset, the MPII Movie Description dataset and the Microsoft Video Description Corpus. Experiments show that our approach can discover appropriate hierarchical representations of input videos and improve on state-of-the-art results on movie description datasets.

2017 Conference proceedings paper

Publications by Costantino Grana
