Publications

Explore our research publications: papers, articles, and conference proceedings from AImageLab.

Tip: type @ to pick an author and # to pick a keyword.

Optimized Block-Based Algorithms to Label Connected Components on GPUs

Authors: Allegretti, Stefano; Bolelli, Federico; Grana, Costantino

Published in: IEEE TRANSACTIONS ON PARALLEL AND DISTRIBUTED SYSTEMS

Connected Components Labeling (CCL) is a crucial step of several image processing and computer vision pipelines. Many efficient sequential strategies … (Read full abstract)

Connected Components Labeling (CCL) is a crucial step of several image processing and computer vision pipelines. Many efficient sequential strategies exist, among which one of the most effective is the use of a block-based mask to drastically cut the number of memory accesses. In the last decade, aided by the fast development of Graphics Processing Units (GPUs), a lot of data parallel CCL algorithms have been proposed along with sequential ones. Applications that entirely run in GPU can benefit from parallel implementations of CCL that allow to avoid expensive memory transfers between host and device. In this paper, two new eight-connectivity CCL algorithms are proposed, namely Block-based Union Find (BUF) and Block-based Komura Equivalence (BKE). These algorithms optimize existing GPU solutions introducing a block-based approach. Extensions for three-dimensional datasets are also discussed. In order to produce a fair comparison with previously proposed alternatives, YACCLAB, a public CCL benchmarking framework, has been extended and made suitable for evaluating also GPU algorithms. Moreover, three-dimensional datasets have been added to its collection. Experimental results on real cases and synthetically generated datasets demonstrate the superiority of the new proposals with respect to state-of-the-art, both on 2D and 3D scenarios.

2020 Articolo su rivista

Ottimizzazione di Algoritmi per l’Elaborazione di Immagini Binarie

Authors: Bolelli, Federico

La procedura che rende un algoritmo più efficiente in termini di requisiti di memoria o tempo di esecuzione si chiama … (Read full abstract)

La procedura che rende un algoritmo più efficiente in termini di requisiti di memoria o tempo di esecuzione si chiama ottimizzazione e rappresenta un passaggio cruciale nell'elaborazione di immagini e video. È raro che il processo di ottimizzazione produca un algoritmo ottimo in senso assoluto, ma spesso occorre raggiungere un compromesso tra i requisiti di tempo e quelli di memoria. Ad ogni modo, esistono molti scenari in cui il tempo di esecuzione totale richiesto per completare un'attività è il vincolo più restrittivo. Gli algoritmi di elaborazione di immagini binarie, ad esempio, rappresentano un'operazione fondamentale nella maggior parte dei sistemi di analisi di immagini e video all'avanguardia, anche quando questi sono basati su tecniche di deep learning. Avere un'implementazione efficiente è quindi essenziale, specialmente quando questi sistemi devono essere impiegati in scenari con vincoli temporali, dove compromettere la qualità del risultato, o fare affidamento su hardware più performante, non è una strada percorribile. Questa tesi introduce ed esplora diversi approcci per l'ottimizzazione degli algoritmi di elaborazione di immagini binarie modellabili con tabelle decisionali. Esistono diversi problemi che possono essere definiti in questo modo: l’etichettatura delle componenti connesse, il thinning, il chain code e gli operatori morfologici sono alcuni di questi. In generale, tutti gli algoritmi in cui il valore di output per ciascun pixel dell'immagine è ottenuto dal valore del pixel stesso e di alcuni dei suoi vicini possono essere definiti utilizzando tabelle decisionali. Concentrandosi sull'etichettatura delle componenti connesse, vengono analizzati gli approcci all'avanguardia sia per ambienti sequenziali basati su CPU che per ambienti paralleli basati su CPU e GPU, focalizzandosi su come misurare in modo equo le prestazioni. Vengono quindi introdotti nuovi approcci per migliorare ulteriormente le prestazioni in termini di tempo totale di esecuzione, mostrando come queste tecniche possano essere generalizzate per migliorare qualsiasi algoritmo modellabile con tabelle decisionali. Infine, viene presentato un framework che consente di applicare automaticamente molte delle strategie di ottimizzazione precedentemente descritte ed analizzate ad un determinato algoritmo. Il framework, chiamato GRAPHGEN, prende come input una definizione del problema in termini di condizioni da verificare e azioni da eseguire ed è in grado di produrre come output il codice C/C++ che include tutte le ottimizzazioni necessarie. Rispetto agli approcci esistenti, gli algoritmi generati con GRAPHGEN hanno prestazioni significativamente migliori, sia su set di dati reali che su quelli sintetici.

2020 Tesi di dottorato

Predicting the oncogenic potential of gene fusions using convolutional neural networks

Authors: Lovino, Marta; Urgese, Gianvito; Macii, Enrico; Santa Di Cataldo, ; Ficarra, Elisa

Published in: LECTURE NOTES IN COMPUTER SCIENCE

Predicting the oncogenic potential of a gene fusion transcript is an important and challenging task in the study of cancer … (Read full abstract)

Predicting the oncogenic potential of a gene fusion transcript is an important and challenging task in the study of cancer development. To this date, the available approaches mostly rely on protein domain analysis to provide a probability score explaining the oncogenic potential of a gene fusion. In this paper, a Convolutional Neural Network model is proposed to discriminate gene fusions into oncogenic or non-oncogenic, exploiting only the protein sequence without protein domain information. Our proposed model obtained accuracy value close to 90% on a dataset of fused sequences.

2020 Relazione in Atti di Convegno

Predicting WNV circulation in Italy using earth observation data and extreme gradient boosting model

Authors: Candeloro, L.; Ippoliti, C.; Iapaolo, F.; Monaco, F.; Morelli, D.; Cuccu, R.; Fronte, P.; Calderara, S.; Vincenzi, S.; Porrello, A.; D'Alterio, N.; Calistri, P.; Conte, A.

Published in: REMOTE SENSING

West Nile Disease (WND) is one of the most spread zoonosis in Italy and Europe caused by a vector-borne virus. … (Read full abstract)

West Nile Disease (WND) is one of the most spread zoonosis in Italy and Europe caused by a vector-borne virus. Its transmission cycle is well understood, with birds acting as the primary hosts and mosquito vectors transmitting the virus to other birds, while humans and horses are occasional dead-end hosts. Identifying suitable environmental conditions across large areas containing multiple species of potential hosts and vectors can be difficult. The recent and massive availability of Earth Observation data and the continuous development of innovative Machine Learning methods can contribute to automatically identify patterns in big datasets and to make highly accurate identification of areas at risk. In this paper, we investigated the West Nile Virus (WNV) circulation in relation to Land Surface Temperature, Normalized Difference Vegetation Index and Surface Soil Moisture collected during the 160 days before the infection took place, with the aim of evaluating the predictive capacity of lagged remotely sensed variables in the identification of areas at risk for WNV circulation. WNV detection in mosquitoes, birds and horses in 2017, 2018 and 2019, has been collected from the National Information System for Animal Disease Notification. An Extreme Gradient Boosting model was trained with data from 2017 and 2018 and tested for the 2019 epidemic, predicting the spatio-temporal WNV circulation two weeks in advance with an overall accuracy of 0.84. This work lays the basis for a future early warning system that could alert public authorities when climatic and environmental conditions become favourable to the onset and spread of WNV.

2020 Articolo su rivista

Rethinking Experience Replay: a Bag of Tricks for Continual Learning

Authors: Buzzega, Pietro; Boschini, Matteo; Porrello, Angelo; Calderara, Simone

Published in: INTERNATIONAL CONFERENCE ON PATTERN RECOGNITION

In Continual Learning, a Neural Network is trained on a stream of data whose distribution shifts over time. Under these … (Read full abstract)

In Continual Learning, a Neural Network is trained on a stream of data whose distribution shifts over time. Under these assumptions, it is especially challenging to improve on classes appearing later in the stream while remaining accurate on previous ones. This is due to the infamous problem of catastrophic forgetting, which causes a quick performance degradation when the classifier focuses on learning new categories. Recent literature proposed various approaches to tackle this issue, often resorting to very sophisticated techniques. In this work, we show that naïve rehearsal can be patched to achieve similar performance. We point out some shortcomings that restrain Experience Replay (ER) and propose five tricks to mitigate them. Experiments show that ER, thus enhanced, displays an accuracy gain of 51.2 and 26.9 percentage points on the CIFAR-10 and CIFAR-100 datasets respectively (memory buffer size 1000). As a result, it surpasses current state-of-the-art rehearsal-based methods.

2020 Relazione in Atti di Convegno

Robust Re-Identification by Multiple Views Knowledge Distillation

Authors: Porrello, Angelo; Bergamini, Luca; Calderara, Simone

Published in: LECTURE NOTES IN COMPUTER SCIENCE

To achieve robustness in Re-Identification, standard methods leverage tracking information in a Video-To-Video fashion. However, these solutions face a large … (Read full abstract)

To achieve robustness in Re-Identification, standard methods leverage tracking information in a Video-To-Video fashion. However, these solutions face a large drop in performance for single image queries (e.g., Image-To-Video setting). Recent works address this severe degradation by transferring temporal information from a Video-based network to an Image-based one. In this work, we devise a training strategy that allows the transfer of a superior knowledge, arising from a set of views depicting the target object. Our proposal - Views Knowledge Distillation (VKD) - pins this visual variety as a supervision signal within a teacher-student framework, where the teacher educates a student who observes fewer views. As a result, the student outperforms not only its teacher but also the current state-of-the-art in Image-To-Video by a wide margin (6.3% mAP on MARS, 8.6% on Duke-Video-ReId and 5% on VeRi-776). A thorough analysis - on Person, Vehicle and Animal Re-ID - investigates the properties of VKD from a qualitatively and quantitatively perspective.

2020 Relazione in Atti di Convegno

Scoring pleurisy in slaughtered pigs using convolutional neural networks

Authors: Trachtman, A. R.; Bergamini, L.; Palazzi, A.; Porrello, A.; Capobianco Dondona, A.; Del Negro, E.; Paolini, A.; Vignola, G.; Calderara, S.; Marruchella, G.

Published in: VETERINARY RESEARCH

Diseases of the respiratory system are known to negatively impact the profitability of the pig industry, worldwide. Considering the relatively … (Read full abstract)

Diseases of the respiratory system are known to negatively impact the profitability of the pig industry, worldwide. Considering the relatively short lifespan of pigs, lesions can be still evident at slaughter, where they can be usefully recorded and scored. Therefore, the slaughterhouse represents a key check-point to assess the health status of pigs, providing unique and valuable feedback to the farm, as well as an important source of data for epidemiological studies. Although relevant, scoring lesions in slaughtered pigs represents a very time-consuming and costly activity, thus making difficult their systematic recording. The present study has been carried out to train a convolutional neural network-based system to automatically score pleurisy in slaughtered pigs. The automation of such a process would be extremely helpful to enable a systematic examination of all slaughtered livestock. Overall, our data indicate that the proposed system is well able to differentiate half carcasses affected with pleurisy from healthy ones, with an overall accuracy of 85.5%. The system was better able to recognize severely affected half carcasses as compared with those showing less severe lesions. The training of convolutional neural networks to identify and score pneumonia, on the one hand, and the achievement of trials in large capacity slaughterhouses, on the other, represent the natural pursuance of the present study. As a result, convolutional neural network-based technologies could provide a fast and cheap tool to systematically record lesions in slaughtered pigs, thus supplying an enormous amount of useful data to all stakeholders in the pig industry.

2020 Articolo su rivista

SMArT: Training Shallow Memory-aware Transformers for Robotic Explainability

Authors: Cornia, Marcella; Baraldi, Lorenzo; Cucchiara, Rita

Published in: IEEE INTERNATIONAL CONFERENCE ON ROBOTICS AND AUTOMATION

The ability to generate natural language explanations conditioned on the visual perception is a crucial step towards autonomous agents which … (Read full abstract)

The ability to generate natural language explanations conditioned on the visual perception is a crucial step towards autonomous agents which can explain themselves and communicate with humans. While the research efforts in image and video captioning are giving promising results, this is often done at the expense of the computational requirements of the approaches, limiting their applicability to real contexts. In this paper, we propose a fully-attentive captioning algorithm which can provide state-of-the-art performances on language generation while restricting its computational demands. Our model is inspired by the Transformer model and employs only two Transformer layers in the encoding and decoding stages. Further, it incorporates a novel memory-aware encoding of image regions. Experiments demonstrate that our approach achieves competitive results in terms of caption quality while featuring reduced computational demands. Further, to evaluate its applicability on autonomous agents, we conduct experiments on simulated scenes taken from the perspective of domestic robots.

2020 Relazione in Atti di Convegno

Spaghetti Labeling: Directed Acyclic Graphs for Block-Based Connected Components Labeling

Authors: Bolelli, Federico; Allegretti, Stefano; Baraldi, Lorenzo; Grana, Costantino

Published in: IEEE TRANSACTIONS ON IMAGE PROCESSING

Connected Components Labeling is an essential step of many Image Processing and Computer Vision tasks. Since the first proposal of … (Read full abstract)

Connected Components Labeling is an essential step of many Image Processing and Computer Vision tasks. Since the first proposal of a labeling algorithm, which dates back to the sixties, many approaches have optimized the computational load needed to label an image. In particular, the use of decision forests and state prediction have recently appeared as valuable strategies to improve performance. However, due to the overhead of the manual construction of prediction states and the size of the resulting machine code, the application of these strategies has been restricted to small masks, thus ignoring the benefit of using a block-based approach. In this paper, we combine a block-based mask with state prediction and code compression: the resulting algorithm is modeled as a Directed Rooted Acyclic Graph with multiple entry points, which is automatically generated without manual intervention. When tested on synthetic and real datasets, in comparison with optimized implementations of state-of-the-art algorithms, the proposed approach shows superior performance, surpassing the results obtained by all compared approaches in all settings.

2020 Articolo su rivista

The Role of the Input in Natural Language Video Description

Authors: Cascianelli, Silvia; Costante, Gabriele; Devo, Alessandro; Ciarfuglia, Thomas A; Valigi, Paolo; Fravolini, Mario L

Published in: IEEE TRANSACTIONS ON MULTIMEDIA

2020 Articolo su rivista

Page 42 of 106 • Total publications: 1054