Publications

Explore our research publications: papers, articles, and conference proceedings from AImageLab.

How many Observations are Enough? Knowledge Distillation for Trajectory Forecasting

Authors: Monti, A.; Porrello, A.; Calderara, S.; Coscia, P.; Ballan, L.; Cucchiara, R.

Accurate prediction of future human positions is an essential task for modern video-surveillance systems. Current state-of-the-art models usually rely on a "history" of past tracked locations (e.g., 3 to 5 seconds) to predict a plausible sequence of future locations (e.g., up to the next 5 seconds). We feel that this common schema neglects critical traits of realistic applications: as the collection of input trajectories involves machine perception (i.e., detection and tracking), incorrect detection and fragmentation errors may accumulate in crowded scenes, leading to tracking drifts. On this account, the model would be fed with corrupted and noisy input data, thus fatally affecting its prediction performance. In this regard, we focus on delivering accurate predictions when only a few input observations are used, thus potentially lowering the risks associated with automatic perception. To this end, we conceive a novel distillation strategy that allows a knowledge transfer from a teacher network to a student one, the latter fed with fewer observations (just two). We show that a properly defined teacher supervision allows a student network to perform comparably to state-of-the-art approaches that demand more observations. Besides, extensive experiments on common trajectory forecasting datasets highlight that our student network better generalizes to unseen scenarios.

2022 Conference proceedings paper

Identifying the oncogenic potential of gene fusions exploiting miRNAs

Authors: Lovino, M.; Montemurro, M.; Barrese, V. S.; Ficarra, E.

Published in: JOURNAL OF BIOMEDICAL INFORMATICS

It is estimated that oncogenic gene fusions cause about 20% of human cancer morbidity. Identifying potentially oncogenic gene fusions may improve affected patients’ diagnosis and treatment. Previous approaches to this issue included exploiting specific gene-related information, such as gene function and regulation. Here we propose a model that builds on the previous findings and includes microRNAs in the oncogenic assessment. We present ChimerDriver, a tool to classify gene fusions as oncogenic or not oncogenic. ChimerDriver is based on a specifically designed neural network and trained on genetic and post-transcriptional information to obtain a reliable classification. The designed neural network integrates information related to transcription factors, gene ontologies, microRNAs and other detailed information related to the functions of the genes involved in the fusion and the gene fusion structure. As a result, performance on the test set reached an F1-score of 0.83 and a recall of 96%. The comparison with state-of-the-art tools returned comparable or higher results. Moreover, ChimerDriver performed well in a real-world case where 21 out of 24 validated gene fusion samples were detected by the gene fusion detection tool Starfusion. ChimerDriver integrates transcriptional and post-transcriptional information in an ad-hoc designed neural network to effectively discriminate oncogenic gene fusions from passenger ones. ChimerDriver source code is freely available at https://github.com/martalovino/ChimerDriver.

2022 Journal article

Improving Segmentation of the Inferior Alveolar Nerve through Deep Label Propagation

Authors: Cipriano, Marco; Allegretti, Stefano; Bolelli, Federico; Pollastri, Federico; Grana, Costantino

Published in: PROCEEDINGS IEEE COMPUTER SOCIETY CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION

2022 Conference proceedings paper

Incremental Training of Face Morphing Detectors

Authors: Borghi, Guido; Graffieti, Gabriele; Franco, Annalisa; Maltoni, Davide

Published in: INTERNATIONAL CONFERENCE ON PATTERN RECOGNITION

2022 Conference proceedings paper

Information fusion as an integrative cross-cutting enabler to achieve robust, explainable, and trustworthy medical artificial intelligence

Authors: Holzinger, A.; Dehmer, M.; Emmert-Streib, F.; Cucchiara, R.; Augenstein, I.; Ser, J. D.; Samek, W.; Jurisica, I.; Diaz-Rodriguez, N.

Published in: INFORMATION FUSION

Medical artificial intelligence (AI) systems have been remarkably successful, even outperforming human performance at certain tasks. There is no doubt that AI is important to improve human health in many ways and will disrupt various medical workflows in the future. To use AI to solve problems in medicine beyond the lab, in routine environments, we need to do more than just improve the performance of existing AI methods. Robust AI solutions must be able to cope with imprecision, missing and incorrect information, and explain both the result and the process of how it was obtained to a medical expert. Using conceptual knowledge as a guiding model of reality can help to develop more robust, explainable, and less biased machine learning models that can ideally learn from less data. Achieving these goals will require an orchestrated effort that combines three complementary Frontier Research Areas: (1) Complex Networks and their Inference, (2) Graph causal models and counterfactuals, and (3) Verification and Explainability methods. The goal of this paper is to describe these three areas from a unified view and to motivate how information fusion in a comprehensive and integrative manner can not only help bring these three areas together, but also have a transformative role by bridging the gap between research and practical applications in the context of future trustworthy medical AI. This makes it imperative to include ethical and legal aspects as a cross-cutting discipline, because all future solutions must not only be ethically responsible, but also legally compliant.

2022 Journal article

Investigating Bidimensional Downsampling in Vision Transformer Models

Authors: Bruno, Paolo; Amoroso, Roberto; Cornia, Marcella; Cascianelli, Silvia; Baraldi, Lorenzo; Cucchiara, Rita

Published in: LECTURE NOTES IN COMPUTER SCIENCE

Vision Transformers (ViT) and other Transformer-based architectures for image classification have achieved promising performance in the last two years. However, ViT-based models require large datasets, memory, and computational power to obtain state-of-the-art results compared to more traditional architectures. The generic ViT model, indeed, maintains a full-length patch sequence during inference, which is redundant and lacks hierarchical representation. With the goal of increasing the efficiency of Transformer-based models, we explore the application of a 2D max-pooling operator on the outputs of Transformer encoders. We conduct extensive experiments on the CIFAR-100 dataset and the large ImageNet dataset and consider both accuracy and efficiency metrics, with the final goal of reducing the token sequence length without affecting the classification performance. Experimental results show that bidimensional downsampling can outperform previous classification approaches while requiring relatively limited computational resources.

2022 Conference proceedings paper

LaRA 2: parallel and vectorized program for sequence–structure alignment of RNA sequences

Authors: Winkler, J.; Urgese, G.; Ficarra, E.; Reinert, K.

Published in: BMC BIOINFORMATICS

Background: The function of non-coding RNA sequences is largely determined by their spatial conformation, namely the secondary structure of the molecule, formed by Watson–Crick interactions between nucleotides. Hence, modern RNA alignment algorithms routinely take structural information into account. In order to discover yet unknown RNA families and infer their possible functions, the structural alignment of RNAs is an essential task. This task demands a lot of computational resources, especially for aligning many long sequences, and it therefore requires efficient algorithms that utilize modern hardware when available. A subset of the secondary structures contains overlapping interactions (called pseudoknots), which add additional complexity to the problem and are often ignored in available software. Results: We present the SeqAn-based software LaRA 2 that is significantly faster than comparable software for accurate pairwise and multiple alignments of structured RNA sequences. In contrast to other programs our approach can handle arbitrary pseudoknots. As an improved re-implementation of the LaRA tool for structural alignments, LaRA 2 uses multi-threading and vectorization for parallel execution and a new heuristic for computing a lower boundary of the solution. Our algorithmic improvements yield a program that is up to 130 times faster than the previous version. Conclusions: With LaRA 2 we provide a tool to analyse large sets of RNA secondary structures in relatively short time, based on structural alignment. The produced alignments can be used to derive structural motifs for the search in genomic databases.

2022 Journal article

Learning the Quality of Machine Permutations in Job Shop Scheduling

Authors: Corsini, A.; Calderara, S.; Dell'Amico, M.

Published in: IEEE ACCESS

In recent years, the power demonstrated by Machine Learning (ML) has increasingly attracted the interest of the optimization community, which is starting to leverage ML for enhancing and automating the design of algorithms. One combinatorial optimization problem recently tackled with ML is the Job Shop Scheduling Problem (JSP). Most of the works on the JSP using ML focus on Deep Reinforcement Learning (DRL), and only a few of them leverage supervised learning techniques. The recurrent reasons for avoiding supervised learning seem to be the difficulty in casting the right learning task, i.e., what is meaningful to predict, and how to obtain labels. Therefore, we first propose a novel supervised learning task that aims at predicting the quality of machine permutations. Then, we design an original methodology to estimate this quality, and we use these estimations to create an accurate sequential deep learning model (binary accuracy above 95%). Finally, we empirically demonstrate the value of predicting the quality of machine permutations by enhancing the performance of a simple Tabu Search algorithm inspired by the works in the literature.

2022 Journal article

Long-Range 3D Self-Attention for MRI Prostate Segmentation

Authors: Pollastri, Federico; Cipriano, Marco; Bolelli, Federico; Grana, Costantino

Published in: PROCEEDINGS INTERNATIONAL SYMPOSIUM ON BIOMEDICAL IMAGING

The problem of prostate segmentation from Magnetic Resonance Imaging (MRI) is an intense research area, due to the increased use of MRI in the diagnosis and treatment planning of prostate cancer. The lack of clear boundaries and the huge variation of texture and shapes between patients make the task very challenging, and the 3D nature of the data makes 2D segmentation algorithms suboptimal for the task. With this paper, we propose a novel architecture to fill the gap between the most recent advances in 2D computer vision and 3D semantic segmentation. In particular, the designed model retrieves multi-scale 3D features with dilated convolutions and makes use of a self-attention transformer to gain a global field of view. The proposed Long-Range 3D Self-Attention block allows the convolutional neural network to build significant features by merging contextual information collected at various scales. Experimental results show that the proposed method improves the state-of-the-art segmentation accuracy on MRI prostate segmentation.

2022 Conference proceedings paper

Matching Faces and Attributes Between the Artistic and the Real Domain: the PersonArt Approach

Authors: Cornia, Marcella; Tomei, Matteo; Baraldi, Lorenzo; Cucchiara, Rita

Published in: ACM TRANSACTIONS ON MULTIMEDIA COMPUTING, COMMUNICATIONS AND APPLICATIONS

In this article, we present an approach for retrieving similar faces between the artistic and the real domain. The application we refer to is an interactive exhibition inside a museum, in which visitors can take a photo of themselves and search for a lookalike in the collection of paintings. The task requires not only identifying faces but also extracting discriminative features from artistic and photo-realistic images, tackling a significant domain shift. Our method integrates feature extraction networks which account for the aesthetic similarity of two faces and their correspondences in terms of semantic attributes. Also, it addresses the domain shift between realistic images and paintings by translating photo-realistic images into the artistic domain. Notably, by exploiting the same technique, our model does not need to rely on annotated data in the artistic domain. Experiments are conducted on different paired datasets to show the effectiveness of the proposed solution in terms of identity and attribute preservation. The approach is also evaluated on unpaired settings and in combination with an interactive relevance feedback strategy. Finally, we show how the proposed algorithm has been implemented in a real showcase at the Gallerie Estensi museum in Italy, with the participation of more than 1,100 visitors in just three days.

2022 Journal article
