Publications

Explore our research publications: papers, articles, and conference proceedings from AImageLab.

A Novel Attention-based Aggregation Function to Combine Vision and Language

Authors: Stefanini, Matteo; Cornia, Marcella; Baraldi, Lorenzo; Cucchiara, Rita

Published in: INTERNATIONAL CONFERENCE ON PATTERN RECOGNITION

The joint understanding of vision and language has been recently gaining a lot of attention in both the Computer Vision and Natural Language Processing communities, with the emergence of tasks such as image captioning, image-text matching, and visual question answering. As both images and text can be encoded as sets or sequences of elements - like regions and words - proper reduction functions are needed to transform a set of encoded elements into a single response, like a classification or similarity score. In this paper, we propose a novel fully-attentive reduction method for vision and language. Specifically, our approach computes a set of scores for each element of each modality employing a novel variant of cross-attention, and performs a learnable and cross-modal reduction, which can be used for both classification and ranking. We test our approach on image-text matching and visual question answering, building fair comparisons with other reduction choices, on both COCO and VQA 2.0 datasets. Experimentally, we demonstrate that our approach leads to a performance increase on both tasks. Further, we conduct ablation studies to validate the role of each component of the approach.

2021 Conference proceedings paper

A Novel Proof-of-concept Framework for the Exploitation of ConvNets on Whole Slide Images

Authors: Mascolini, Alessio; Puzzo, S.; Incatasciato, G.; Ponzio, F.; Ficarra, E.; Di Cataldo, S.

Published in: SMART INNOVATION, SYSTEMS AND TECHNOLOGIES

Traditionally, the analysis of histological samples is performed visually by a pathologist, who inspects the tissue samples under the microscope, looking for malignancies and anomalies. This visual assessment is both time-consuming and highly unreliable due to the subjectivity of the evaluation. Hence, there are growing efforts towards the automation of such analysis, oriented to the development of computer-aided diagnostic tools, with an ever-growing role of techniques based on deep learning. In this work, we analyze some of the issues commonly associated with providing deep learning based techniques to medical professionals. We thus introduce a tool, aimed at both researchers and medical professionals, which simplifies and accelerates the training and exploitation of such models. The outcome of the tool is an attention map representing cancer probability distribution on top of the Whole Slide Image, driving the pathologist through a faster and more accurate diagnostic procedure.

2021 Book chapter/Essay

A Systematic Comparison of Depth Map Representations for Face Recognition

Authors: Pini, Stefano; Borghi, Guido; Vezzani, Roberto; Maltoni, Davide; Cucchiara, Rita

Published in: SENSORS

2021 Journal article

A Unified Objective for Novel Class Discovery

Authors: Fini, Enrico; Sangineto, Enver; Lathuilière, Stéphane; Zhong, Zhun; Nabi, Moin; Ricci, Elisa

Published in: PROCEEDINGS IEEE INTERNATIONAL CONFERENCE ON COMPUTER VISION

2021 Conference proceedings paper

AC-VRNN: Attentive Conditional-VRNN for multi-future trajectory prediction

Authors: Bertugli, A.; Calderara, S.; Coscia, P.; Ballan, L.; Cucchiara, R.

Published in: COMPUTER VISION AND IMAGE UNDERSTANDING

Anticipating human motion in crowded scenarios is essential for developing intelligent transportation systems, social-aware robots and advanced video surveillance applications. A key aspect of this task is the inherently multi-modal nature of human paths, which yields multiple socially acceptable futures when human interactions are involved. To this end, we propose a generative architecture for multi-future trajectory predictions based on Conditional Variational Recurrent Neural Networks (C-VRNNs). Conditioning mainly relies on prior belief maps, representing most likely moving directions and forcing the model to consider past observed dynamics in generating future positions. Human interactions are modelled with a graph-based attention mechanism enabling an online attentive hidden state refinement of the recurrent estimation. To corroborate our model, we perform extensive experiments on publicly available datasets (e.g., ETH/UCY, Stanford Drone Dataset, STATS SportVU NBA, Intersection Drone Dataset and TrajNet++) and demonstrate its effectiveness in crowded scenes compared to several state-of-the-art methods.

2021 Journal article

Appearance and Pose-Conditioned Human Image Generation using Deformable GANs

Authors: Siarohin, Aliaksandr; Lathuilière, Stéphane; Sangineto, Enver; Sebe, Niculae

Published in: IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE

In this paper, we address the problem of generating person images conditioned on both pose and appearance information. Specifically, given an image xa of a person and a target pose P(xb), extracted from an image xb, we synthesize a new image of that person in pose P(xb), while preserving the visual details in xa. In order to deal with pixel-to-pixel misalignments caused by the pose differences between P(xa) and P(xb), we introduce deformable skip connections in the generator of our Generative Adversarial Network. Moreover, a nearest-neighbour loss is proposed instead of the common L1 and L2 losses in order to match the details of the generated image with the target image. Quantitative and qualitative results, using common datasets and protocols recently proposed for this task, show that our approach is competitive with respect to the state of the art. Moreover, we conduct an extensive evaluation using off-the-shelf person re-identification (Re-ID) systems trained with person-generation based augmented data, which is one of the main applications for this task. Our experiments show that our Deformable GANs can significantly boost the Re-ID accuracy and are even better than data-augmentation methods specifically trained using Re-ID losses.

2021 Journal article

Assessing the Role of Boundary-level Objectives in Indoor Semantic Segmentation

Authors: Amoroso, Roberto; Baraldi, Lorenzo; Cucchiara, Rita

Published in: LECTURE NOTES IN COMPUTER SCIENCE

Providing fine-grained and accurate segmentation maps of indoor scenes is a challenging task with relevant applications in the fields of augmented reality, image retrieval, and personalized robotics. While most of the recent literature on semantic segmentation has focused on outdoor scenarios, the generation of accurate indoor segmentation maps has been partially under-investigated. With the goal of increasing the accuracy of semantic segmentation in indoor scenarios, we focus on the analysis of boundary-level objectives, which foster the generation of fine-grained boundaries between different semantic classes and which have never been explored in the case of indoor segmentation. In particular, we test and devise variants of both the Boundary and Active Boundary losses, two recent proposals which deal with the prediction of semantic boundaries. Through experiments on the NYUDv2 dataset, we quantify the role of such losses in terms of accuracy and quality of boundary prediction and demonstrate the accuracy gain of the proposed variants.

2021 Conference proceedings paper

Automated Artifact Retouching in Morphed Images with Attention Maps

Authors: Borghi, G.; Franco, A.; Graffieti, G.; Maltoni, D.

Published in: IEEE ACCESS

Morphing attacks are an important security threat for automatic face recognition systems. High-quality morphed images, i.e. images without significant visual artifacts such as ghosts, noise, and blurring, exhibit higher chances of success, being able to fool both human examiners and commercial face verification algorithms. Therefore, the availability of large sets of high-quality morphs is fundamental for training and testing robust morphing attack detection algorithms. However, producing a high-quality morphed image is an expensive and time-consuming task since manual post-processing is generally required to remove the typical artifacts generated by landmark-based morphing techniques. This work describes an approach based on the Conditional Generative Adversarial Network paradigm for automated morphing artifact retouching and the use of Attention Maps to guide the generation process and limit the retouch to specific areas. In order to work with high-resolution images, the framework is applied on different facial crops, which, once processed and retouched, are accurately blended to reconstruct the whole morphed face. Specifically, we focus on four different square face regions, i.e. the right and left eyes, the nose, and the mouth, that are frequently affected by artifacts. Several qualitative and quantitative experimental evaluations have been conducted to confirm the effectiveness of the proposal in terms of, among others, pixel-wise metrics, identity preservation, and human observer analysis. Results confirm the feasibility and the accuracy of the proposed framework.

2021 Journal article

Avalanche: An end-to-end library for continual learning

Authors: Lomonaco, V.; Pellegrini, L.; Cossu, A.; Carta, A.; Graffieti, G.; Hayes, T. L.; De Lange, M.; Masana, M.; Pomponi, J.; Van De Ven, G. M.; Mundt, M.; She, Q.; Cooper, K.; Forest, J.; Belouadah, E.; Calderara, S.; Parisi, G. I.; Cuzzolin, F.; Tolias, A. S.; Scardapane, S.; Antiga, L.; Ahmad, S.; Popescu, A.; Kanan, C.; Van De Weijer, J.; Tuytelaars, T.; Bacciu, D.; Maltoni, D.

Published in: IEEE COMPUTER SOCIETY CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION WORKSHOPS

Learning continually from non-stationary data streams is a long-standing goal and a challenging problem in machine learning. Recently, we have witnessed a renewed and fast-growing interest in continual learning, especially within the deep learning community. However, algorithmic solutions are often difficult to re-implement, evaluate and port across different settings, where even results on standard benchmarks are hard to reproduce. In this work, we propose Avalanche, an open-source end-to-end library for continual learning research based on PyTorch. Avalanche is designed to provide a shared and collaborative codebase for fast prototyping, training, and reproducible evaluation of continual learning algorithms.

2021 Conference proceedings paper

Circular RNA profiling distinguishes medulloblastoma groups and shows aberrant RMST overexpression in WNT medulloblastoma

Authors: Rickert, Daniel; Bartl, Jasmin; Picard, Daniel; Bernardi, Flavia; Qin, Nan; Lovino, Marta; Puget, Stéphanie; Meyer, Frauke-Dorothee; Mahoungou Koumba, Idriss; Beez, Thomas; Varlet, Pascale; Dufour, Christelle; Fischer, Ute; Borkhardt, Arndt; Reifenberger, Guido; Ayrault, Olivier; Remke, Marc

Published in: ACTA NEUROPATHOLOGICA

2021 Journal article
