Publications

Explore our research publications: papers, articles, and conference proceedings from AImageLab.

Robust single-sample face recognition by sparsity-driven sub-dictionary learning using deep features

Authors: Cuculo, Vittorio; D'Amelio, Alessandro; Grossi, Giuliano; Lanzarotti, Raffaella; Lin, Jianyi

Published in: SENSORS

Face recognition using a single reference image per subject is challenging, especially when the gallery of subjects is large. The problem becomes considerably harder when the images are acquired in unconstrained conditions. In this paper we address the challenging Single Sample Per Person (SSPP) problem on large datasets of images acquired in the wild, which may therefore feature illumination, pose, facial expression, partial occlusion, and low-resolution hurdles. The proposed technique alternates a sparse dictionary learning step based on the Method of Optimal Directions (MOD) with the iterative ℓ₀-norm minimization algorithm k-LIMAPS. It works on robust deep-learned features, provided that the image variability is extended by standard augmentation techniques. Experiments show the effectiveness of our method against the difficulties introduced above: first, we report extensive experiments on the unconstrained LFW dataset with large galleries of up to 1680 subjects; second, we present experiments on very low-resolution test images down to 8 × 8 pixels; third, tests on the AR dataset are analyzed against specific disguises such as partial occlusions, facial expressions, and illumination problems. In all three scenarios our method outperforms state-of-the-art approaches adopting similar configurations.
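
To make the sparse-representation idea concrete, here is a minimal, hedged sketch of classification by reconstruction residual over per-subject sub-dictionaries; orthogonal matching pursuit stands in for the paper's k-LIMAPS solver, and the dictionary layout is an assumption.

```python
# Illustrative sketch only: OMP replaces the paper's k-LIMAPS l0-solver,
# and D is assumed to stack per-subject sub-dictionaries of deep features.
import numpy as np
from sklearn.linear_model import OrthogonalMatchingPursuit

def classify(y, D, labels, k=5):
    """y: (d,) probe feature; D: (d, n) dictionary; labels: (n,) subject ids."""
    omp = OrthogonalMatchingPursuit(n_nonzero_coefs=k, fit_intercept=False)
    omp.fit(D, y)
    x = omp.coef_
    # Classify by the smallest reconstruction residual per subject.
    residuals = {c: np.linalg.norm(y - D @ np.where(labels == c, x, 0.0))
                 for c in np.unique(labels)}
    return min(residuals, key=residuals.get)
```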

2019 Journal article

Segmentation Guided Scoring of Pathological Lesions in Swine Through CNNs

Authors: Bergamini, L.; Trachtman, A. R.; Palazzi, A.; Negro, E. D.; Capobianco Dondona, A.; Marruchella, G.; Calderara, S.

Published in: LECTURE NOTES IN ARTIFICIAL INTELLIGENCE

The slaughterhouse is widely recognised as a useful checkpoint for assessing the health status of livestock. At the moment, this is implemented through the application of scoring systems by human experts. The automation of this process would be extremely helpful for veterinarians, enabling a systematic examination of all slaughtered livestock and positively influencing herd management. However, such systems are not yet available, mainly because of a critical lack of annotated data. In this work we: (i) introduce a large-scale dataset to enable the development and benchmarking of these systems, featuring more than 4000 high-resolution swine carcass images annotated by domain experts with pixel-level segmentation; (ii) exploit part of this annotation to train a deep learning model on the task of pleural lesion scoring. In this setting, we propose a segmentation-guided framework which stacks a fully convolutional neural network performing semantic segmentation with a rule-based classifier that integrates a priori veterinary knowledge in the process. Thorough experimental analysis against state-of-the-art baselines shows our method to be superior both in accuracy and in model interpretability. Code and dataset are publicly available here: https://github.com/lucabergamini/swine-lesion-scoring.
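
A hedged sketch of the rule-based half of such a framework: a discrete score derived from a predicted segmentation mask. The class ids and score bands below are invented for illustration, not the paper's values.

```python
import numpy as np

LUNG, LESION = 1, 2  # assumed label ids in the predicted mask

def lesion_score(mask: np.ndarray) -> int:
    """Map the lesion/lung area ratio to a discrete score (made-up bands)."""
    lung_px = np.count_nonzero(mask == LUNG)
    lesion_px = np.count_nonzero(mask == LESION)
    if lung_px + lesion_px == 0:
        return 0  # no lung surface visible: nothing to score
    ratio = lesion_px / (lung_px + lesion_px)
    for score, upper in enumerate((0.05, 0.25, 0.50, 1.01)):
        if ratio < upper:
            return score
```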

2019 Paper in conference proceedings

Self Paced Deep Learning for Weakly Supervised Object Detection

Authors: Sangineto, E.; Nabi, M.; Culibrk, D.; Sebe, N.

Published in: IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE

In a weakly-supervised scenario, object detectors need to be trained using image-level annotations alone. Since bounding-box-level ground truth is not available, most of the solutions proposed so far are based on an iterative Multiple Instance Learning framework in which the current classifier is used to select the highest-confidence boxes in each image, which are treated as pseudo ground truth in the next training iteration. However, the errors of an immature classifier can make the process drift, usually introducing many false positives into the training dataset. To alleviate this problem, we propose a training protocol based on the self-paced learning paradigm. The main idea is to iteratively select a subset of images and boxes that are the most reliable, and use them for training. While similar strategies have been adopted for SVMs and other classifiers in the past few years, we are the first to show that a self-paced approach can be used with deep-network-based classifiers in an end-to-end training pipeline. The method we propose is built on the fully-supervised Fast-RCNN architecture and can be applied to similar architectures which represent the input image as a bag of boxes. We show state-of-the-art results on Pascal VOC 2007, Pascal VOC 2010, and ILSVRC 2013. On ILSVRC 2013, our results based on a low-capacity AlexNet network outperform even those weakly-supervised approaches which are based on much higher-capacity networks.
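
A schematic sketch of a self-paced selection loop of this kind; `detector` and `train_step` are placeholder callables, not the paper's API, and the growth schedule is arbitrary.

```python
def self_paced_training(detector, images, train_step,
                        start_frac=0.3, grow=0.1, rounds=10):
    frac = start_frac
    for _ in range(rounds):
        scored = []
        for img in images:
            boxes, scores = detector(img)              # current-model proposals
            best = max(range(len(scores)), key=scores.__getitem__)
            scored.append((scores[best], img, boxes[best]))
        scored.sort(key=lambda t: t[0], reverse=True)  # "easiest" images first
        keep = scored[: max(1, int(frac * len(scored)))]
        # The top box of each retained image acts as pseudo ground truth.
        train_step([(img, box) for _, img, box in keep])
        frac = min(1.0, frac + grow)                   # self-paced schedule
```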

2019 Journal article

Self-Supervised Optical Flow Estimation by Projective Bootstrap

Authors: Alletto, Stefano; Abati, Davide; Calderara, Simone; Cucchiara, Rita; Rigazio, Luca

Published in: IEEE TRANSACTIONS ON INTELLIGENT TRANSPORTATION SYSTEMS

Dense optical flow estimation is complex and time consuming, with state-of-the-art methods relying either on large synthetic datasets or on pipelines requiring up to a few minutes per frame pair. In this paper, we address the problem of optical flow estimation in the automotive scenario in a self-supervised manner. We argue that optical flow can be cast as a geometrical warping between two successive video frames and devise a deep architecture to estimate such a transformation in two stages. First, a dense pixel-level flow is computed with a projective bootstrap on rigid surfaces. We show how this global transformation can be approximated with a homography, and we extend spatial transformer layers so that they can be employed to compute the flow field implied by such a transformation. Subsequently, we refine the prediction by feeding it to a second, deeper network that accounts for moving objects. A final reconstruction loss compares the warping of frame Xₜ with the subsequent frame Xₜ₊₁ and guides both estimates. The model has the speed advantages of end-to-end deep architectures while achieving competitive performance, both outperforming recent unsupervised methods and showing good generalization capabilities on new automotive datasets.
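
To illustrate the first stage, the sketch below computes the dense flow field implied by a homography between two frames; H is assumed to be given (e.g., estimated on the rigid background).

```python
import numpy as np

def homography_flow(H: np.ndarray, h: int, w: int) -> np.ndarray:
    """Per-pixel (u, v) displacement implied by the 3x3 homography H."""
    ys, xs = np.mgrid[0:h, 0:w]
    pts = np.stack([xs.ravel(), ys.ravel(), np.ones(h * w)])  # homogeneous
    warped = H @ pts
    warped = warped[:2] / warped[2]           # back to Cartesian coordinates
    return (warped - pts[:2]).T.reshape(h, w, 2)
```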

2019 Journal article

Show, Control and Tell: A Framework for Generating Controllable and Grounded Captions

Authors: Cornia, Marcella; Baraldi, Lorenzo; Cucchiara, Rita

Published in: IEEE CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION

Current captioning approaches describe images using black-box architectures whose behavior is hardly controllable or explainable from the outside. As an image can be described in infinite ways depending on the goal and context at hand, a higher degree of controllability is needed to apply captioning algorithms in complex scenarios. In this paper, we introduce a novel framework for image captioning which can generate diverse descriptions by allowing both grounding and controllability. Given a control signal in the form of a sequence or set of image regions, we generate the corresponding caption through a recurrent architecture which predicts textual chunks explicitly grounded on regions, following the constraints of the given control. Experiments are conducted on Flickr30k Entities and on COCO Entities, an extended version of COCO in which we add grounding annotations collected in a semi-automatic manner. Results demonstrate that our method achieves state-of-the-art performance on controllable image captioning, in terms of caption quality and diversity. Code and annotations are publicly available at: https://github.com/aimagelab/show-control-and-tell.
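
Schematically, decoding under such a control signal can be pictured as below; `decode_chunk` is a placeholder for the recurrent language model, not the released code.

```python
def controlled_caption(regions, decode_chunk):
    """Emit one textual chunk per region in the control sequence."""
    caption, state = [], None
    for region in regions:             # the control signal: ordered regions
        words, state = decode_chunk(region, state)
        caption.extend(words)          # chunk explicitly grounded on `region`
    return " ".join(caption)
```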

2019 Paper in conference proceedings

SHREC 2019 Track: Online Gesture Recognition

Authors: Caputo, F. M.; Burato, S.; Pavan, G.; Voillemin, T.; Wannous, H.; Vandeborre, J. P.; Maghoumi, M.; Taranta, E. M.; Razmjoo, A.; LaViola Jr., J. J.; Manganaro, Fabio; Pini, S.; Borghi, G.; Vezzani, R.; Cucchiara, R.; Nguyen, H.; Tran, M. T.; Giachetti, A.

This paper presents the results of the Eurographics 2019 SHape Retrieval Contest track on online gesture recognition. The goal of this contest was to test state-of-the-art methods for the online detection of command gestures from tracked hand movements, on a basic benchmark where simple gestures are performed interleaved with other actions. Unlike previous contests and benchmarks on trajectory-based gesture recognition, we proposed an online gesture recognition task: rather than providing pre-segmented gestures, we asked the participants to find gestures within recorded trajectories. The submitted results show that online detection and recognition of sets of very simple gestures from 3D trajectories captured with a cheap sensor can be performed effectively. The best methods proposed could therefore be directly exploited to design effective gesture-based interfaces for different contexts, from Virtual and Mixed Reality applications to the remote control of home devices.
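
A minimal sketch of the online setting: a sliding window over the incoming 3D trajectory is classified at every step; `classify` is a placeholder model, and the window size and confidence threshold are assumptions.

```python
from collections import deque

def spot_gestures(stream, classify, win=30, thresh=0.8):
    """stream yields (x, y, z) hand positions; returns (start, end, label) events."""
    window, events = deque(maxlen=win), []
    for t, point in enumerate(stream):
        window.append(point)
        if len(window) == win:
            label, conf = classify(list(window))
            if label is not None and conf >= thresh:
                events.append((t - win + 1, t, label))
    return events
```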

2019 Paper in conference proceedings

Single-cell DNA Sequencing Data: a Pipeline for Multi-Sample Analysis

Authors: Montemurro, Marilisa; Grassi, Elena; Urgese, Gianvito; Parisi, Emanuele; Pizzino, Carmelo Gabriele; Bertotti, Andrea; Ficarra, Elisa

Single-cell DNA (sc-DNA) sequencing is proving to be a valuable instrument for investigating intra- and inter-tumor heterogeneity and inferring its evolutionary dynamics, thanks to the high-resolution data it produces. This is why the demand for analytical tools to manage this kind of data is increasing. Here we propose a pipeline capable of performing multi-sample copy-number variation (CNV) analysis on large-scale single-cell DNA sequencing data and of investigating spatial and temporal tumor heterogeneity.
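
One representative step of such a pipeline, sketched under stated assumptions (bin layout, ploidy, and normalization are illustrative, not the pipeline's actual interface):

```python
import numpy as np

def call_copy_numbers(counts: np.ndarray, ploidy: int = 2) -> np.ndarray:
    """counts: (cells, bins) raw read counts per genomic bin per cell."""
    depth = counts.sum(axis=1, keepdims=True)               # per-cell coverage
    norm = counts / np.maximum(depth, 1) * counts.shape[1]  # per-bin mean ~ 1
    return np.rint(norm * ploidy).astype(int)               # integer CN calls
```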

2019 Paper in conference proceedings

Single-cell DNA Sequencing Data: a Pipeline for Multi-Sample Analysis

Authors: Montemurro, Marilisa; Grassi, Elena; Urgese, Gianvito; Pizzino, Carmelo Gabriele; Bertotti, Andrea; Ficarra, Elisa

To help cancer researchers understand tumor heterogeneity and its evolutionary dynamics, we propose a software pipeline to explore intra-tumor heterogeneity by means of sc-DNA sequencing data.

2019 Abstract in conference proceedings

Skin Lesion Segmentation Ensemble with Diverse Training Strategies

Authors: Canalini, Laura; Pollastri, Federico; Bolelli, Federico; Cancilla, Michele; Allegretti, Stefano; Grana, Costantino

Published in: LECTURE NOTES IN COMPUTER SCIENCE

This paper presents a novel strategy to perform skin lesion segmentation from dermoscopic images. We design an effective segmentation pipeline and explore several pre-training methods to initialize the feature extractor, highlighting how different procedures lead the Convolutional Neural Network (CNN) to focus on different features. An encoder-decoder segmentation CNN is employed to take advantage of each pre-trained feature extractor. Experimental results reveal how multiple initialization strategies can be exploited, by means of an ensemble method, to obtain state-of-the-art skin lesion segmentation accuracy.
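
The ensembling step can be sketched as a soft vote over the differently pre-trained networks; the `models` interface below is assumed for illustration.

```python
import numpy as np

def ensemble_mask(image, models, thresh=0.5):
    """models: callables mapping an image to an (H, W) probability map."""
    probs = [m(image) for m in models]
    return np.mean(probs, axis=0) >= thresh  # average, then binarize
```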

2019 Paper in conference proceedings

Social traits from stochastic paths in the core affect space

Authors: Boccignone, Giuseppe; Cuculo, Vittorio; D'Amelio, Alessandro; Lanzarotti, Raffaella

We discuss a preliminary investigation into the feasibility of inferring traits of social participation from the observable behaviour of individuals involved in dyadic interactions. Trait inference relies on a stochastic model of the dynamics occurring in the individual's core affect state space. Results obtained on a publicly available interaction dataset are presented and examined.
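
As a purely illustrative example of a stochastic path in the 2D core affect (valence, arousal) plane, here is an Ornstein-Uhlenbeck simulation; the OU choice and all parameters are assumptions, not the paper's model.

```python
import numpy as np

def ou_path(steps=1000, dt=0.01, theta=1.5, sigma=0.4):
    """Simulate a mean-reverting path in the (valence, arousal) plane."""
    rng = np.random.default_rng(0)
    x = np.zeros((steps, 2))
    for t in range(1, steps):
        drift = -theta * x[t - 1] * dt  # pull back toward neutral affect
        x[t] = x[t - 1] + drift + sigma * np.sqrt(dt) * rng.standard_normal(2)
    return x
```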

2019 Paper in conference proceedings
