Publications

Explore our research publications: papers, articles, and conference proceedings from AImageLab.


Coarse-to-fine gaze redirection with numerical and pictorial guidance

Authors: Chen, J.; Zhang, J.; Sangineto, E.; Chen, T.; Fan, J.; Sebe, N.

Gaze redirection aims at manipulating the gaze of a given face image with respect to a desired direction (i.e., a reference angle) and it can be applied to many real-life scenarios, such as video-conferencing or taking group photos. However, previous work on this topic mainly suffers from two limitations: (1) low-quality image generation and (2) low redirection precision. In this paper, we propose to alleviate these problems by means of a novel gaze redirection framework which exploits both a numerical and a pictorial direction guidance, jointly with a coarse-to-fine learning strategy. Specifically, the coarse branch learns the spatial transformation which warps the input image according to the desired gaze. On the other hand, the fine-grained branch consists of a generator network with conditional residual image learning and a multi-task discriminator. This second branch reduces the gap between the previously warped image and the ground-truth image and recovers finer texture details. Moreover, we propose a numerical and pictorial guidance module (NPG) which uses a pictorial gazemap description and numerical angles as an extra guide to further improve the precision of gaze redirection. Extensive experiments on a benchmark dataset show that the proposed method outperforms the state-of-the-art approaches in terms of both image quality and redirection precision. The code is available at https://github.com/jingjingchen777/CFGR

2021 Conference proceedings paper

Confidence Calibration for Deep Renal Biopsy Immunofluorescence Image Classification

Authors: Pollastri, Federico; Maroñas, Juan; Bolelli, Federico; Ligabue, Giulia; Paredes, Roberto; Magistroni, Riccardo; Grana, Costantino

Published in: INTERNATIONAL CONFERENCE ON PATTERN RECOGNITION

With this work we tackle immunofluorescence classification in renal biopsy, employing state-of-the-art Convolutional Neural Networks. In this setting, the aim of the probabilistic model is to assist an expert practitioner in identifying the location pattern of antibody deposits within a glomerulus. Since modern neural networks often provide overconfident outputs, we stress the importance of having a reliable prediction, demonstrating that Temperature Scaling (TS), a recently introduced re-calibration technique, can be successfully applied to immunofluorescence classification in renal biopsy. Experimental results demonstrate that the designed model yields good accuracy on the specific task, and that TS is able to provide reliable probabilities, which are highly valuable for such a task given the low inter-rater agreement.
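
As a rough illustration of the Temperature Scaling idea mentioned in the abstract above, the following minimal Python sketch divides the logits by a single scalar T before the softmax and selects T by minimizing the negative log-likelihood on held-out data. The grid search, the function names, and the toy logits are assumptions made for illustration and are not taken from the paper; in practice the temperature is typically fitted with a gradient-based optimizer.

```python
import math


def softmax(logits, temperature=1.0):
    """Softmax of the logits after dividing them by a positive temperature."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(z - m) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def nll(logit_rows, labels, temperature):
    """Mean negative log-likelihood of the true labels at a given temperature."""
    loss = 0.0
    for logits, label in zip(logit_rows, labels):
        loss -= math.log(softmax(logits, temperature)[label])
    return loss / len(labels)


def fit_temperature(logit_rows, labels, candidates=None):
    """Pick the candidate temperature with the lowest validation NLL (grid search)."""
    if candidates is None:
        candidates = [0.5 + 0.05 * i for i in range(91)]  # 0.5 .. 5.0
    return min(candidates, key=lambda t: nll(logit_rows, labels, t))


# Hypothetical validation logits: very confident, yet one of four is wrong.
val_logits = [[8.0, 0.0, 0.0], [0.0, 7.0, 0.0], [0.0, 0.0, 9.0], [6.0, 0.0, 0.0]]
val_labels = [0, 1, 2, 1]

T = fit_temperature(val_logits, val_labels)
calibrated = softmax(val_logits[0], T)
```

Because every logit of a sample is divided by the same positive scalar, the predicted class does not change; only the confidence attached to it is softened or sharpened.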

2021 Conference proceedings paper

DAG-Net: Double Attentive Graph Neural Network for Trajectory Forecasting

Authors: Monti, Alessio; Bertugli, Alessia; Calderara, Simone; Cucchiara, Rita

Published in: INTERNATIONAL CONFERENCE ON PATTERN RECOGNITION

Understanding human motion behaviour is a critical task for several possible applications like self-driving cars or social robots, and in general for all those settings where an autonomous agent has to navigate inside a human-centric environment. This is non-trivial because human motion is inherently multi-modal: given a history of human motion paths, there are many plausible ways by which people could move in the future. Additionally, people's activities are often driven by goals, e.g. reaching particular locations or interacting with the environment. We address the aforementioned aspects by proposing a new recurrent generative model that considers both single agents' future goals and interactions between different agents. The model exploits a double attention-based graph neural network to collect information about the mutual influences among different agents and to integrate it with data about agents' possible future objectives. Our proposal is general enough to be applied to different scenarios: the model achieves state-of-the-art results in both urban environments and sports applications.

2021 Conference proceedings paper

Data‐based design of robust fault detection and isolation residuals via LASSO optimization and Bayesian filtering

Authors: Cascianelli, Silvia; Costante, Gabriele; Crocetti, Francesco; Ricci, Elisa; Valigi, Paolo; Luca Fravolini, Mario

Published in: ASIAN JOURNAL OF CONTROL

2021 Journal article

Efficient Training of Visual Transformers with Small-Size Datasets

Authors: Liu, Yahui; Sangineto, Enver; Bi, Wei; Sebe, Nicu; Lepri, Bruno; De Nadai, Marco

Published in: ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS

2021 Conference proceedings paper

Estimating (and fixing) the Effect of Face Obfuscation in Video Recognition

Authors: Tomei, Matteo; Baraldi, Lorenzo; Bronzin, Simone; Cucchiara, Rita

Published in: IEEE COMPUTER SOCIETY CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION WORKSHOPS

2021 Conference proceedings paper

Exploration of Convolutional Neural Network models for source code classification

Authors: Barchi, F.; Parisi, E.; Urgese, G.; Ficarra, E.; Acquaviva, A.

Published in: ENGINEERING APPLICATIONS OF ARTIFICIAL INTELLIGENCE

The application of Artificial Intelligence is becoming common in many engineering fields. Among them, one of the newest and most rapidly evolving is software generation, where AI can be used to automatically optimise the implementation of an algorithm for a given computing platform. In particular, Deep Learning technologies can be used to decide how to allocate pieces of code to hardware platforms with multiple cores and accelerators, which are common in high performance and edge computing applications. In this work, we explore the use of Convolutional Neural Networks (CNNs) to analyse the application source code and decide the best compute unit to minimise the execution time. We demonstrate that CNN models can be successfully applied to source code classification, providing higher accuracy with consistently reduced learning time with respect to state-of-the-art methods. Moreover, we show the robustness of the method with respect to source code pre-processing, compiler options and hyper-parameters selection.

2021 Journal article

Explore and Explain: Self-supervised Navigation and Recounting

Authors: Bigazzi, Roberto; Landi, Federico; Cornia, Marcella; Cascianelli, Silvia; Baraldi, Lorenzo; Cucchiara, Rita

Published in: INTERNATIONAL CONFERENCE ON PATTERN RECOGNITION

Embodied AI has been recently gaining attention as it aims to foster the development of autonomous and intelligent agents. In this paper, we devise a novel embodied setting in which an agent needs to explore a previously unknown environment while recounting what it sees during the path. In this context, the agent needs to navigate the environment driven by an exploration goal, select proper moments for description, and output natural language descriptions of relevant objects and scenes. Our model integrates a novel self-supervised exploration module with penalty, and a fully-attentive captioning model for explanation. Also, we investigate different policies for selecting proper moments for explanation, driven by information coming from both the environment and the navigation. Experiments are conducted on photorealistic environments from the Matterport3D dataset and investigate the navigation and explanation capabilities of the agent as well as the role of their interactions.

2021 Conference proceedings paper

Extracting accurate long-term behavior changes from a large pig dataset

Authors: Bergamini, L.; Pini, S.; Simoni, A.; Vezzani, R.; Calderara, S.; Eath, R. B. D.; Fisher, R. B.

Visual observation of uncontrolled real-world behavior leads to noisy observations, complicated by occlusions, ambiguity, variable motion rates, detection and tracking errors, slow transitions between behaviors, etc. We show in this paper that reliable estimates of long-term trends can be extracted given enough data, even though estimates from individual frames may be noisy. We validate this concept using a new public dataset of approximately 20+ million daytime pig observations over 6 weeks of their main growth stage, and we provide annotations for various tasks including 5 individual behaviors. Our pipeline chains detection, tracking and behavior classification, combining deep and shallow computer vision techniques. While individual detections may be noisy, we show that long-term behavior changes can still be extracted reliably, and we validate these results qualitatively on the full dataset. Ultimately, starting from raw RGB video data, we are able to tell both what the pigs' main daily activities are and how these change through time.

2021 Conference proceedings paper

FashionSearch++: Improving Consumer-to-Shop Clothes Retrieval with Hard Negatives

Authors: Morelli, Davide; Cornia, Marcella; Cucchiara, Rita

Published in: CEUR WORKSHOP PROCEEDINGS

Consumer-to-shop clothes retrieval has recently emerged in computer vision and multimedia communities with the development of architectures that can find similar in-shop clothing images given a query photo. Due to its nature, the main challenge lies in the domain gap between user-acquired and in-shop images. In this paper, we follow the most recent successful research in this area employing convolutional neural networks as feature extractors and propose to enhance the training supervision through a modified triplet loss that takes into account hard negative examples. We test the proposed approach on the Street2Shop dataset, achieving results comparable to state-of-the-art solutions and demonstrating good generalization properties when dealing with different settings and clothing categories.
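
To make the notion of a triplet loss with hard negatives more concrete, here is a minimal Python sketch of one common formulation: for each consumer query, the closest non-matching shop item in the batch is used as the negative. The abstract does not spell out the exact modification used by the authors, so the in-batch mining rule, the margin value, the function names, and the toy embeddings are all assumptions made for illustration.

```python
import math


def euclidean(u, v):
    """Euclidean distance between two embedding vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def hard_negative_triplet_loss(queries, shop_items, margin=0.2):
    """Mean triplet loss using the hardest in-batch negative for each query.

    queries[i] is assumed to match shop_items[i]; every other shop item
    in the batch is treated as a negative for that query.
    """
    losses = []
    for i, q in enumerate(queries):
        d_pos = euclidean(q, shop_items[i])
        # Hardest negative: the non-matching shop item closest to the query.
        d_neg = min(euclidean(q, s) for j, s in enumerate(shop_items) if j != i)
        losses.append(max(0.0, margin + d_pos - d_neg))
    return sum(losses) / len(losses)


# Hypothetical 2-D embeddings: the first batch is well separated,
# while in the second batch each query has a confusable negative nearby.
easy_loss = hard_negative_triplet_loss(
    [[0.0, 0.0], [10.0, 0.0]], [[0.0, 1.0], [10.0, 1.0]], margin=0.5
)
hard_loss = hard_negative_triplet_loss(
    [[0.0, 0.0], [1.0, 0.0]], [[0.0, 1.0], [1.0, 1.0]], margin=0.5
)
```

Only negatives that fall within the margin of the positive distance contribute to the loss, so training effort concentrates on the shop items most easily confused with the correct match.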

2021 Conference proceedings paper
