Publications

Explore our research publications: papers, articles, and conference proceedings from AImageLab.

Out of the Box: Embodied Navigation in the Real World

Authors: Bigazzi, Roberto; Landi, Federico; Cornia, Marcella; Cascianelli, Silvia; Baraldi, Lorenzo; Cucchiara, Rita

Published in: LECTURE NOTES IN COMPUTER SCIENCE

The research field of Embodied AI has witnessed substantial progress in visual navigation and exploration thanks to powerful simulation platforms and the availability of 3D data of indoor and photorealistic environments. These two factors have opened the doors to a new generation of intelligent agents capable of achieving nearly perfect PointGoal Navigation. However, such architectures are commonly trained with millions, if not billions, of frames and tested in simulation. Together with great enthusiasm, these results raise a question: how many researchers will effectively benefit from these advances? In this work, we detail how to transfer the knowledge acquired in simulation into the real world. To that end, we describe the architectural discrepancies that damage the Sim2Real adaptation ability of models trained on the Habitat simulator and propose a novel solution tailored towards the deployment in real-world scenarios. We then deploy our models on a LoCoBot, a Low-Cost Robot equipped with a single Intel RealSense camera. Unlike previous work, our testing scene is unavailable to the agent in simulation. The environment is also inaccessible to the agent beforehand, so it cannot count on scene-specific semantic priors. In this way, we reproduce a setting in which a research group (potentially from other fields) needs to employ the agent's visual navigation capabilities as-a-Service. Our experiments indicate that it is possible to achieve satisfactory results when deploying the obtained model in the real world.

2021 Conference proceedings paper

PhyliCS: a Python library to explore scCNA data and quantify spatial tumor heterogeneity

Authors: Montemurro, M.; Grassi, E.; Pizzino, C. G.; Bertotti, A.; Ficarra, E.; Urgese, G.

Published in: BMC BIOINFORMATICS

Background: Tumors are composed of a number of cancer cell subpopulations (subclones), characterized by a distinguishable set of mutations. This phenomenon, known as intra-tumor heterogeneity (ITH), may be studied using Copy Number Aberrations (CNAs). Nowadays, ITH can be assessed at the highest possible resolution using single-cell DNA (scDNA) sequencing technology. Additionally, single-cell CNA (scCNA) profiles from multiple samples of the same tumor can in principle be exploited to study the spatial distribution of subclones within a tumor mass. However, since the technology required to generate large scDNA sequencing datasets is relatively recent, dedicated analytical approaches are still lacking. Results: We present PhyliCS, the first tool that exploits scCNA data from multiple samples of the same tumor to estimate whether the different clones of a tumor are well mixed or spatially separated. Starting from the CNA data produced with third-party instruments, it computes a score, the Spatial Heterogeneity score, aimed at distinguishing spatially intermixed cell populations from spatially segregated ones. Additionally, it provides functionalities to facilitate scDNA analysis, such as feature selection and dimensionality reduction methods, visualization tools and a flexible clustering module. Conclusions: PhyliCS represents a valuable instrument to explore the extent of spatial heterogeneity in multi-regional tumor sampling, exploiting the potential of scCNA data.
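
To make the idea concrete, the sketch below clusters pooled single-cell CNA profiles and measures how mixed the sample-of-origin labels are within each cluster. It is not the PhyliCS API, and the score is a generic entropy-based notion of mixing, not the paper's exact Spatial Heterogeneity score.

    # Not the PhyliCS API: cluster pooled single-cell CNA profiles and measure how
    # mixed the sample-of-origin labels are within clusters. Scores near 1 suggest
    # spatially intermixed clones, scores near 0 spatially segregated ones.
    import numpy as np
    from scipy.stats import entropy
    from sklearn.cluster import KMeans

    def sample_mixing_score(cna_matrix, sample_labels, k=5):
        """cna_matrix: (cells, genomic bins) copy-number values pooled over samples;
        sample_labels: sample of origin for each cell (at least two samples)."""
        clusters = KMeans(n_clusters=k, n_init=10, random_state=0).fit_predict(cna_matrix)
        samples = np.unique(sample_labels)
        per_cluster = []
        for c in np.unique(clusters):
            members = sample_labels[clusters == c]
            counts = np.array([(members == s).sum() for s in samples], dtype=float)
            # normalized entropy of the sample composition inside the cluster
            per_cluster.append(entropy(counts / counts.sum(), base=2) / np.log2(len(samples)))
        return float(np.mean(per_cluster))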

2021 Journal article

RefiNet: 3D Human Pose Refinement with Depth Maps

Authors: D’Eusanio, Andrea; Pini, Stefano; Borghi, Guido; Vezzani, Roberto; Cucchiara, Rita

Published in: INTERNATIONAL CONFERENCE ON PATTERN RECOGNITION

Human Pose Estimation is a fundamental task for many applications in the Computer Vision community and it has been widely investigated in the 2D domain, i.e. intensity images. Therefore, most of the available methods for this task are mainly based on 2D Convolutional Neural Networks and huge manually-annotated RGB datasets, achieving stunning results. In this paper, we propose RefiNet, a multi-stage framework that regresses an extremely precise 3D human pose from a given 2D pose and a depth map. The framework consists of three different modules, each one specialized in a particular refinement and data representation, i.e. depth patches, 3D skeleton and point clouds. Moreover, we present a new dataset, called Baracca, acquired with RGB, depth and thermal cameras and specifically created for the automotive context. Experimental results confirm the quality of the refinement procedure, which largely improves the human pose estimates of off-the-shelf 2D methods.
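
As background for the setup above, the sketch below shows the standard, non-learned step of lifting 2D keypoints into 3D camera coordinates by sampling a depth map and back-projecting with a pinhole camera model. It is not RefiNet's learned refinement modules; the function name and intrinsics handling are illustrative assumptions.

    # Not RefiNet itself: lift 2D keypoints to coarse 3D camera coordinates by
    # sampling a depth map and back-projecting with a pinhole camera model.
    # A learned refinement stage, as in the paper, would then correct these points.
    import numpy as np

    def lift_keypoints(keypoints_2d, depth, fx, fy, cx, cy):
        """keypoints_2d: (J, 2) pixel coordinates; depth: (H, W) map in meters;
        fx, fy, cx, cy: pinhole intrinsics of the depth camera."""
        joints_3d = np.zeros((keypoints_2d.shape[0], 3))
        for j, (u, v) in enumerate(keypoints_2d.astype(int)):
            z = depth[v, u]                          # depth at the joint location
            joints_3d[j] = [(u - cx) * z / fx,       # back-project x
                            (v - cy) * z / fy,       # back-project y
                            z]
        return joints_3d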

2021 Conference proceedings paper

Revisiting The Evaluation of Class Activation Mapping for Explainability: A Novel Metric and Experimental Analysis

Authors: Poppi, Samuele; Cornia, Marcella; Baraldi, Lorenzo; Cucchiara, Rita

Published in: IEEE COMPUTER SOCIETY CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION WORKSHOPS

As the demand for deep learning solutions increases, the need for explainability becomes even more fundamental. In this setting, particular attention has been given to visualization techniques, which try to attribute the right relevance to each input pixel with respect to the output of the network. In this paper, we focus on Class Activation Mapping (CAM) approaches, which provide an effective visualization by taking weighted averages of the activation maps. To enhance the evaluation and the reproducibility of such approaches, we propose a novel set of metrics to quantify explanation maps, which show better effectiveness and simplify comparisons between approaches. To evaluate the appropriateness of the proposal, we compare different CAM-based visualization methods on the entire ImageNet validation set, fostering proper comparisons and reproducibility.
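
For readers unfamiliar with the weighted-average idea the abstract refers to, here is a minimal sketch of a CAM-style explanation map. The per-channel weights are assumed to be given (for example, the classifier weights of the target class, as in the original CAM formulation); this is not the set of metrics proposed in the paper.

    # A CAM-style explanation map: a weighted average of the last convolutional
    # activation maps, rectified and normalized. The weights are assumed given
    # (e.g. classifier weights for the target class); not the paper's metrics.
    import numpy as np

    def class_activation_map(activations, weights):
        """activations: (C, H, W) feature maps; weights: (C,) per-channel weights."""
        cam = np.tensordot(weights, activations, axes=1)   # weighted sum over channels
        cam = np.maximum(cam, 0)                           # keep positive evidence only
        cam -= cam.min()
        return cam / (cam.max() + 1e-8)                    # normalize to [0, 1]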

2021 Conference proceedings paper

RMS-Net: Regression and Masking for Soccer Event Spotting

Authors: Tomei, Matteo; Baraldi, Lorenzo; Calderara, Simone; Bronzin, Simone; Cucchiara, Rita

Published in: INTERNATIONAL CONFERENCE ON PATTERN RECOGNITION

2021 Conference proceedings paper

SHREC 2021: Skeleton-based hand gesture recognition in the wild

Authors: Caputo, Ariel; Giacchetti, Andrea; Soso, Simone; Pintani, Deborah; D'Eusanio, Andrea; Pini, Stefano; Borghi, Guido; Simoni, Alessandro; Vezzani, Roberto; Cucchiara, Rita; Ranieri, Andrea; Giannini, Franca; Lupinetti, Katia; Monti, Marina; Maghoumi, Mehran; Laviola Jr, Joseph; Le, Minh-Quan; Nguyen, Hai-Dang; Tran, Minh-Triet

Published in: COMPUTERS & GRAPHICS

This paper presents the results of the Eurographics 2019 SHape Retrieval Contest track on online gesture recognition. The goal of this contest was to test state-of-the-art methods that can be used to detect command gestures online from tracked hand movements, on a basic benchmark where simple gestures are performed interleaved with other actions. Unlike previous contests and benchmarks on trajectory-based gesture recognition, we proposed an online gesture recognition task, not providing pre-segmented gestures but asking the participants to find gestures within recorded trajectories. The results submitted by the participants show that online detection and recognition of sets of very simple gestures from 3D trajectories captured with a cheap sensor can be performed effectively. The best methods proposed could therefore be directly exploited to design effective gesture-based interfaces to be used in different contexts, from Virtual and Mixed Reality applications to the remote control of home devices.
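
A rough sketch of the online setting evaluated by the track is given below: a fixed-size window slides over a streamed hand trajectory and a classifier, which must include a "no gesture" class, labels each window as frames arrive. The window length and the classifier interface are assumptions, not a contest entry.

    # Illustrative online setting, not a contest entry: slide a fixed-size window
    # over a streamed hand trajectory and let a classifier (which must include a
    # "no gesture" class) label each window as frames arrive.
    from collections import deque
    import numpy as np

    WINDOW = 30  # frames per window (assumed)

    def online_detection(frame_stream, classifier):
        """frame_stream yields per-frame joint coordinates; classifier.predict maps
        a (WINDOW, joints, 3) array to a gesture label or 'no_gesture'."""
        buffer = deque(maxlen=WINDOW)
        for t, frame in enumerate(frame_stream):
            buffer.append(frame)
            if len(buffer) == WINDOW:
                label = classifier.predict(np.stack(buffer))
                if label != "no_gesture":
                    yield t, label   # frame index at which the gesture was detected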

2021 Journal article

Smart Node Networks Orchestration: A New E2E Approach for Analysis and Design for Agile 4.0 Implementation

Authors: Bertoli, Annalisa; Cervo, Andrea; Rosati, Carlo Alberto; Fantuzzi, Cesare

Published in: SENSORS

Smoothing the Disentangled Latent Style Space for Unsupervised Image-to-Image Translation

Authors: Liu, Yahui; Sangineto, Enver; Chen, Yajing; Bao, Linchao; Zhang, Haoxian; Sebe, Nicu; Lepri, Bruno; Wang, Wei; Nadai, Marco De

Published in: PROCEEDINGS - IEEE COMPUTER SOCIETY CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION

2021 Conference proceedings paper

Supporting Skin Lesion Diagnosis with Content-Based Image Retrieval

Authors: Allegretti, Stefano; Bolelli, Federico; Pollastri, Federico; Longhitano, Sabrina; Pellacani, Giovanni; Grana, Costantino

Published in: INTERNATIONAL CONFERENCE ON PATTERN RECOGNITION

In recent years, many attempts have been dedicated to the creation of automated devices that could assist both expert and beginner dermatologists towards fast and early diagnosis of skin lesions. Tasks such as skin lesion classification and segmentation have been extensively addressed with deep learning algorithms, which in some cases reach a diagnostic accuracy comparable to that of expert physicians. However, the general lack of interpretability and reliability severely hinders the ability of those approaches to actually support dermatologists in the diagnosis process. In this paper, a novel skin image retrieval system is presented, which exploits features extracted by Convolutional Neural Networks to gather similar images from a publicly available dataset, in order to assist the diagnosis process of both expert and novice practitioners. In the proposed framework, ResNet-50 is initially trained for the classification of dermoscopic images; then, the feature extraction part is isolated, and an embedding network is built on top of it. The embedding network learns an alternative representation, which allows image similarity to be checked by means of a distance measure. Experimental results reveal that the proposed method is able to select meaningful images, which can effectively boost the classification accuracy of human dermatologists.
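
As an illustration of the retrieval step, the sketch below returns the k nearest archive images to a query, given precomputed CNN embeddings; cosine similarity is an assumption here, not necessarily the distance measure used in the paper.

    # Retrieval step only: given precomputed CNN embeddings of a dermoscopic image
    # archive, return the k most similar images to a query embedding. Cosine
    # similarity is an assumption, not necessarily the paper's distance measure.
    import numpy as np

    def retrieve_similar(query_emb, archive_embs, k=5):
        """query_emb: (D,) query embedding; archive_embs: (N, D) archive embeddings."""
        q = query_emb / np.linalg.norm(query_emb)
        a = archive_embs / np.linalg.norm(archive_embs, axis=1, keepdims=True)
        similarity = a @ q                     # cosine similarity to every archive image
        top = np.argsort(-similarity)[:k]      # indices of the k closest images
        return top, similarity[top]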

2021 Conference proceedings paper

The color out of space: learning self-supervised representations for Earth Observation imagery

Authors: Vincenzi, Stefano; Porrello, Angelo; Buzzega, Pietro; Cipriano, Marco; Fronte, Pietro; Cuccu, Roberto; Ippoliti, Carla; Conte, Annamaria; Calderara, Simone

Published in: INTERNATIONAL CONFERENCE ON PATTERN RECOGNITION

The recent growth in the number of satellite images fosters the development of effective deep-learning techniques for Remote Sensing (RS). However, their full potential is untapped due to the lack of large annotated datasets. Such a problem is usually countered by fine-tuning a feature extractor that is previously trained on the ImageNet dataset. Unfortunately, the domain of natural images differs from the RS one, which hinders the final performance. In this work, we propose to learn meaningful representations from satellite imagery, leveraging its high-dimensional spectral bands to reconstruct the visible colors. We conduct experiments on land cover classification (BigEarthNet) and West Nile Virus detection, showing that colorization is a solid pretext task for training a feature extractor. Furthermore, we qualitatively observe that guesses based on natural-image pre-training and on colorization rely on different parts of the input. This paves the way to an ensemble model that eventually outperforms both the above-mentioned techniques.
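
A minimal sketch of the colorization pretext objective described above is shown below in PyTorch; the model architecture, the L1 loss and the function name are assumptions rather than the paper's exact setup.

    # Colorization pretext step (architecture and loss are assumptions): the model
    # sees only the non-visible spectral bands and must reconstruct the RGB bands;
    # its encoder can later be fine-tuned on the downstream task.
    import torch
    import torch.nn as nn

    def pretext_step(model, bands, rgb, optimizer):
        """bands: (B, C_spectral, H, W) non-visible bands; rgb: (B, 3, H, W) targets."""
        optimizer.zero_grad()
        pred_rgb = model(bands)                        # reconstruct the visible colors
        loss = nn.functional.l1_loss(pred_rgb, rgb)
        loss.backward()
        optimizer.step()
        return loss.item()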

2021 Conference proceedings paper
