Publications - AImageLab

How does Connected Components Labeling with Decision Trees perform on GPUs?

Authors: Allegretti, Stefano; Bolelli, Federico; Cancilla, Michele; Pollastri, Federico; Canalini, Laura; Grana, Costantino

Published in: LECTURE NOTES IN COMPUTER SCIENCE

In this paper the problem of Connected Components Labeling (CCL) in binary images using Graphics Processing Units (GPUs) is tackled from a different perspective. In the last decade, many novel algorithms have been released, specifically designed for GPUs. Because CCL literature concerning sequential algorithms is very rich, and includes many efficient solutions, designers of parallel algorithms were often inspired by techniques that had already proved successful in a sequential environment, such as the Union-Find paradigm for solving equivalences between provisional labels. However, the use of decision trees to minimize memory accesses, which is one of the main features of the best performing sequential algorithms, was never taken into account when designing parallel CCL solutions. In fact, branches in the code tend to cause thread divergence, which usually leads to inefficiency. Still, this consideration does not necessarily apply to every possible scenario. Are we sure that the advantages of decision trees do not compensate for the cost of thread divergence? In order to answer this question, we chose three well-known sequential CCL algorithms, which employ decision trees as the cornerstone of their strategy, and we built a data-parallel version of each of them. Experimental tests on real-case datasets show that, in most cases, these solutions outperform state-of-the-art algorithms, thus demonstrating the effectiveness of decision trees also in a parallel environment.

2019 Conference proceedings paper
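
The abstract above mentions the Union-Find paradigm for solving equivalences between provisional labels. As a rough illustration only, the following sequential Python sketch shows a classical two-pass, 8-connected labeling built on Union-Find. It is not the decision-tree or GPU algorithm from the paper; the function names (find, union, label_components) and the list-of-rows image format are assumptions made for this example.

```python
def find(parent, x):
    """Return the root label of x, compressing the path along the way."""
    root = x
    while parent[root] != root:
        root = parent[root]
    while parent[x] != root:
        parent[x], x = root, parent[x]
    return root


def union(parent, a, b):
    """Merge the sets containing a and b; the smaller root becomes the root."""
    ra, rb = find(parent, a), find(parent, b)
    if ra < rb:
        parent[rb] = ra
    else:
        parent[ra] = rb
    return min(ra, rb)


def label_components(image):
    """Two-pass 8-connected labeling of a binary image (list of rows of 0/1)."""
    rows, cols = len(image), len(image[0])
    labels = [[0] * cols for _ in range(rows)]
    parent = [0]  # index 0 is reserved for the background
    # First pass: assign provisional labels and record equivalences.
    for r in range(rows):
        for c in range(cols):
            if not image[r][c]:
                continue
            # Neighbours already visited in raster order: W, NW, N, NE.
            neighbours = []
            for dr, dc in ((0, -1), (-1, -1), (-1, 0), (-1, 1)):
                rr, cc = r + dr, c + dc
                if 0 <= rr < rows and 0 <= cc < cols and labels[rr][cc]:
                    neighbours.append(labels[rr][cc])
            if not neighbours:
                parent.append(len(parent))  # new provisional label
                labels[r][c] = len(parent) - 1
            else:
                current = neighbours[0]
                for other in neighbours[1:]:
                    current = union(parent, current, other)
                labels[r][c] = current
    # Second pass: replace provisional labels with consecutive final labels.
    final = {}
    for r in range(rows):
        for c in range(cols):
            if labels[r][c]:
                root = find(parent, labels[r][c])
                labels[r][c] = final.setdefault(root, len(final) + 1)
    return labels, len(final)


u_shape = [
    [1, 0, 1],
    [1, 1, 1],
]
labels, count = label_components(u_shape)
print(count)  # the two arms are merged through the bottom row
```

In the U-shaped image the two arms first receive different provisional labels, and the equivalence is only discovered on the second row, which is the situation Union-Find is meant to resolve.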

Improving the Performance of Thinning Algorithms with Directed Rooted Acyclic Graphs

Authors: Bolelli, Federico; Grana, Costantino

Published in: LECTURE NOTES IN COMPUTER SCIENCE

In this paper we propose a strategy to optimize the performance of thinning algorithms. This solution is obtained by combining three proven strategies for binary image neighborhood exploration, namely modeling the problem with an optimal decision tree, reusing pixels from the previous step of the algorithm, and reducing the code footprint by means of Directed Rooted Acyclic Graphs. A complete and open-source benchmarking suite is also provided. Experimental results confirm that the proposed algorithms clearly outperform classical implementations.

2019 Conference proceedings paper

M-VAD Names: a Dataset for Video Captioning with Naming

Authors: Pini, Stefano; Cornia, Marcella; Bolelli, Federico; Baraldi, Lorenzo; Cucchiara, Rita

Published in: MULTIMEDIA TOOLS AND APPLICATIONS

Current movie captioning architectures are not capable of mentioning characters by their proper names, replacing them with a generic "someone" tag. The lack of movie description datasets with characters' visual annotations surely plays a relevant role in this shortcoming. Recently, we proposed to extend the M-VAD dataset by introducing such information. In this paper, we present an improved version of the dataset, namely M-VAD Names, and its semi-automatic annotation procedure. The resulting dataset contains 63k visual tracks and 34k textual mentions, all associated with character identities. To showcase the features of the dataset and quantify the complexity of the naming task, we investigate multimodal architectures to replace the "someone" tags with proper character names in existing video captions. The evaluation is further extended by testing this application on videos outside of the M-VAD Names dataset.

2019 Journal article

Skin Lesion Segmentation Ensemble with Diverse Training Strategies

Authors: Canalini, Laura; Pollastri, Federico; Bolelli, Federico; Cancilla, Michele; Allegretti, Stefano; Grana, Costantino

Published in: LECTURE NOTES IN COMPUTER SCIENCE

This paper presents a novel strategy to perform skin lesion segmentation from dermoscopic images. We design an effective segmentation pipeline, and explore several pre-training methods to initialize the feature extractor, highlighting how different procedures lead the Convolutional Neural Network (CNN) to focus on different features. An encoder-decoder segmentation CNN is employed to take advantage of each pre-trained feature extractor. Experimental results reveal how multiple initialization strategies can be exploited, by means of an ensemble method, to obtain state-of-the-art skin lesion segmentation accuracy.

2019 Conference proceedings paper

A Hierarchical Quasi-Recurrent approach to Video Captioning

Authors: Bolelli, Federico; Baraldi, Lorenzo; Grana, Costantino

Video captioning has attracted considerable attention thanks to the use of Recurrent Neural Networks, since they can be used both to encode the input video and to generate the corresponding description. In this paper, we present a recurrent video encoding scheme which can find and exploit the layered structure of the video. Unlike the established encoder-decoder approach, in which a video is encoded continuously by a recurrent layer, we propose to employ Quasi-Recurrent Neural Networks, further extending their basic cell with a boundary detector which can recognize discontinuity points between frames or segments and modify the temporal connections of the encoding layer accordingly. We assess our approach on a large-scale dataset, the Montreal Video Annotation dataset. Experiments demonstrate that our approach can find suitable levels of representation of the input information, while reducing the computational requirements.

2018 Conference proceedings paper

Connected Components Labeling on DRAGs

Authors: Bolelli, Federico; Baraldi, Lorenzo; Cancilla, Michele; Grana, Costantino

Published in: INTERNATIONAL CONFERENCE ON PATTERN RECOGNITION

In this paper we introduce a new Connected Components Labeling (CCL) algorithm which exploits a novel approach to model decision problems as Directed Acyclic Graphs with a root, which will be called Directed Rooted Acyclic Graphs (DRAGs). This structure supports the use of sets of equivalent actions, as required by CCL, and optimally leverages these equivalences to reduce the number of nodes (decision points). The advantage of this representation is that a DRAG, unlike the decision trees usually exploited by state-of-the-art algorithms, will contain only the minimum number of nodes required to reach the leaf corresponding to a set of condition values. This combines the benefits of using binary decision trees with a reduction of the machine code size. Experiments show a consistent improvement of the execution time when the model is applied to CCL.

2018 Conference proceedings paper
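
The node-reduction idea in the abstract above can be illustrated with a much simpler mechanism than the one the paper describes: sharing structurally identical subtrees of a binary decision tree, so that the tree becomes a rooted DAG. The Python sketch below only counts decision nodes before and after such sharing. The nested-tuple tree encoding, the condition names and the action names are all hypothetical, and the sketch does not handle sets of equivalent actions, which the paper's DRAGs do.

```python
def count_tree_nodes(node):
    """Count decision nodes in a tree; leaves are action names (strings)."""
    if isinstance(node, str):
        return 0
    _, low, high = node
    return 1 + count_tree_nodes(low) + count_tree_nodes(high)


def count_dag_nodes(node, seen=None):
    """Count decision nodes when identical subtrees are stored only once."""
    if seen is None:
        seen = set()
    # Tuples compare and hash structurally, so equal subtrees collapse.
    if isinstance(node, str) or node in seen:
        return 0
    seen.add(node)
    _, low, high = node
    return 1 + count_dag_nodes(low, seen) + count_dag_nodes(high, seen)


# Each decision node is (condition, subtree_if_false, subtree_if_true).
shared = ("c", "action_x", "action_y")
tree = ("a", ("b", shared, "action_z"), ("d", shared, "action_w"))

print(count_tree_nodes(tree))  # decision nodes in the plain tree
print(count_dag_nodes(tree))   # decision nodes once the repeated subtree is shared
```

Here the subtree testing condition "c" appears twice in the tree but only once in the DAG, so the decision-node count drops from 5 to 4.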

Improving Skin Lesion Segmentation with Generative Adversarial Networks

Authors: Pollastri, Federico; Bolelli, Federico; Paredes, Roberto; Grana, Costantino

Published in: PROCEEDINGS IEEE INTERNATIONAL SYMPOSIUM ON COMPUTER-BASED MEDICAL SYSTEMS

This paper proposes a novel strategy that employs Generative Adversarial Networks (GANs) to augment data in the image segmentation field, and a Convolutional-Deconvolutional Neural Network (CDNN) to automatically generate lesion segmentation masks from dermoscopic images. Training the CDNN with our GAN-generated data effectively improves the state-of-the-art.

2018 Conference proceedings paper

Optimizing GPU-Based Connected Components Labeling Algorithms

Authors: Allegretti, Stefano; Bolelli, Federico; Cancilla, Michele; Grana, Costantino

Connected Components Labeling (CCL) is a fundamental image processing technique, widely used in various application areas. The computational throughput of Graphics Processing Units (GPUs) makes them well suited to this kind of algorithm. In the last decade, many approaches to compute CCL on GPUs have been proposed. Unfortunately, most of them have focused on 4-way connectivity, neglecting the importance of 8-way connectivity. This paper aims to extend state-of-the-art GPU-based algorithms from 4-way to 8-way connectivity and to improve them with additional optimizations. Experimental results revealed the effectiveness of the proposed strategies.

2018 Conference proceedings paper
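
To make the difference between 4-way and 8-way connectivity in the abstract above concrete, the following sequential Python sketch counts components with a plain breadth-first flood fill under each neighbourhood. It is not one of the GPU algorithms discussed in the paper; the names OFFSETS_4, OFFSETS_8 and count_components are assumptions made for this example.

```python
from collections import deque

# Neighbour offsets as (row delta, column delta).
OFFSETS_4 = ((-1, 0), (1, 0), (0, -1), (0, 1))
OFFSETS_8 = OFFSETS_4 + ((-1, -1), (-1, 1), (1, -1), (1, 1))


def count_components(image, offsets):
    """Count foreground components of a binary image for a given neighbourhood."""
    rows, cols = len(image), len(image[0])
    seen = [[False] * cols for _ in range(rows)]
    count = 0
    for r in range(rows):
        for c in range(cols):
            if not image[r][c] or seen[r][c]:
                continue
            # Unvisited foreground pixel: start a new component and flood it.
            count += 1
            seen[r][c] = True
            queue = deque([(r, c)])
            while queue:
                y, x = queue.popleft()
                for dy, dx in offsets:
                    ny, nx = y + dy, x + dx
                    if (0 <= ny < rows and 0 <= nx < cols
                            and image[ny][nx] and not seen[ny][nx]):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
    return count


diagonal = [
    [1, 0, 0],
    [0, 1, 0],
    [0, 0, 1],
]
print(count_components(diagonal, OFFSETS_4))  # 3: diagonal pixels are not neighbours
print(count_components(diagonal, OFFSETS_8))  # 1: diagonal pixels are neighbours
```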

XDOCS: An Application to Index Historical Documents

Authors: Bolelli, Federico; Borghi, Guido; Grana, Costantino

Published in: COMMUNICATIONS IN COMPUTER AND INFORMATION SCIENCE

Dematerialization and digitalization of historical documents are key elements for their availability, preservation and diffusion. Unfortunately, the conversion from handwritten to digitalized documents presents several technical challenges. The XDOCS project was created with the main goal of making historical documents available, and extending their usability, for a wide variety of audiences, such as scholars, institutions and libraries. In this paper the core elements of XDOCS, i.e. page dewarping and word spotting techniques, are described and two new applications, i.e. an annotation/indexing tool and a search tool, are presented.

2018 Conference proceedings paper

Historical Handwritten Text Images Word Spotting through Sliding Window HOG Features

Authors: Bolelli, Federico; Borghi, Guido; Grana, Costantino

Published in: LECTURE NOTES IN COMPUTER SCIENCE

In this paper we present an innovative technique to semi-automatically index handwritten word images. The proposed method is based on HOG descriptors and exploits the Dynamic Time Warping technique to compare feature vectors computed from single handwritten words. Our strategy is applied to a new challenging dataset extracted from Italian civil registries of the XIX century. Experimental results, compared with some previously developed word spotting strategies, confirmed that our method outperforms competitors.

2017 Conference proceedings paper
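
The abstract above relies on Dynamic Time Warping to compare sequences of feature vectors. The Python sketch below shows the textbook DTW recurrence on two sequences of equal-dimension vectors, using Euclidean distance as the local cost. The choice of local cost, the lack of any warping-window constraint and the function name dtw_distance are assumptions made for this example; the paper's exact configuration and its HOG feature extraction are not reproduced here.

```python
import math


def dtw_distance(seq_a, seq_b):
    """Accumulated DTW cost between two sequences of feature vectors."""
    n, m = len(seq_a), len(seq_b)
    inf = float("inf")
    # cost[i][j] is the best alignment cost of seq_a[:i] with seq_b[:j].
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = math.dist(seq_a[i - 1], seq_b[j - 1])
            cost[i][j] = local + min(
                cost[i - 1][j],      # advance in seq_a only
                cost[i][j - 1],      # advance in seq_b only
                cost[i - 1][j - 1],  # advance in both sequences
            )
    return cost[n][m]


word = [[0.0, 0.0], [1.0, 1.0], [2.0, 2.0]]
stretched = [[0.0, 0.0], [0.0, 0.0], [1.0, 1.0], [2.0, 2.0]]
print(dtw_distance(word, stretched))  # 0.0: the repeated frame is absorbed by warping
```

Because DTW allows one element to align with several elements of the other sequence, two words written at different widths can still match closely, which is why it suits sliding-window descriptors of variable length.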
