Publications by Guido Borghi

Explore our research publications: papers, articles, and conference proceedings from AImageLab.


A Transformer-Based Network for Dynamic Hand Gesture Recognition

Authors: D’Eusanio, Andrea; Simoni, Alessandro; Pini, Stefano; Borghi, Guido; Vezzani, Roberto; Cucchiara, Rita

Transformer-based neural networks represent a successful self-attention mechanism that achieves state-of-the-art results in language understanding and sequence modeling. However, their application to visual data and, in particular, to the dynamic hand gesture recognition task has not yet been deeply investigated. In this paper, we propose a transformer-based architecture for the dynamic hand gesture recognition task. We show that the employment of a single active depth sensor, specifically the usage of depth maps and the surface normals estimated from them, achieves state-of-the-art results, overcoming all the methods available in the literature on two automotive datasets, namely NVidia Dynamic Hand Gesture and Briareo. Moreover, we test the method with other data types available with common RGB-D devices, such as infrared and color data. We also assess the performance in terms of inference time and number of parameters, showing that the proposed framework is suitable for an online in-car infotainment system.

2020 Conference Proceedings Paper

Anomaly Detection for Vision-based Railway Inspection

Authors: Gasparini, Riccardo; Pini, Stefano; Borghi, Guido; Scaglione, Giuseppe; Calderara, Simone; Fedeli, Eugenio; Cucchiara, Rita

Published in: COMMUNICATIONS IN COMPUTER AND INFORMATION SCIENCE

2020 Conference Proceedings Paper

Anomaly Detection, Localization and Classification for Railway Inspection

Authors: Gasparini, Riccardo; D'Eusanio, Andrea; Borghi, Guido; Pini, Stefano; Scaglione, Giuseppe; Calderara, Simone; Fedeli, Eugenio; Cucchiara, Rita

Published in: INTERNATIONAL CONFERENCE ON PATTERN RECOGNITION

2020 Conference Proceedings Paper

Baracca: a Multimodal Dataset for Anthropometric Measurements in Automotive

Authors: Pini, Stefano; D'Eusanio, Andrea; Borghi, Guido; Vezzani, Roberto; Cucchiara, Rita

2020 Conference Proceedings Paper

Face-from-Depth for Head Pose Estimation on Depth Images

Authors: Borghi, Guido; Fabbri, Matteo; Vezzani, Roberto; Calderara, Simone; Cucchiara, Rita

Published in: IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE

Depth cameras make it possible to set up reliable solutions for people monitoring and behavior understanding, especially when unstable or poor illumination conditions make common RGB sensors unusable. Therefore, we propose a complete framework for the estimation of the head and shoulder pose based on depth images only. A head detection and localization module is also included, in order to develop a complete end-to-end system. The core element of the framework is a Convolutional Neural Network, called POSEidon+, that receives as input three types of images and provides the 3D angles of the pose as output. Moreover, a Face-from-Depth component based on a Deterministic Conditional GAN model is able to hallucinate a face from the corresponding depth image. We empirically demonstrate that this positively impacts the system performance. We test the proposed framework on two public datasets, namely Biwi Kinect Head Pose and ICT-3DHP, and on Pandora, a new challenging dataset mainly inspired by the automotive setup. Experimental results show that our method outperforms several recent state-of-the-art works based on both intensity and depth input data, running in real time at more than 30 frames per second.

2020 Journal Article

Learn to See by Events: Color Frame Synthesis from Event and RGB Cameras

Authors: Pini, Stefano; Borghi, Guido; Vezzani, Roberto

Event cameras are biologically-inspired sensors that gather the temporal evolution of the scene. They capture pixel-wise brightness variations and output a corresponding stream of asynchronous events. Despite having multiple advantages with respect to traditional cameras, their use is partially prevented by the limited applicability of traditional data processing and vision algorithms. To this aim, we present a framework which exploits the output stream of event cameras to synthesize RGB frames, relying on an initial or a periodic set of color key-frames and the sequence of intermediate events. Differently from existing work, we propose a deep learning-based frame synthesis method, consisting of an adversarial architecture combined with a recurrent module. Qualitative results and quantitative per-pixel, perceptual, and semantic evaluation on four public datasets confirm the quality of the synthesized images.

2020 Conference Proceedings Paper

Mercury: a vision-based framework for Driver Monitoring

Authors: Borghi, Guido; Pini, Stefano; Vezzani, Roberto; Cucchiara, Rita

In this paper, we propose a complete framework, namely Mercury, that combines Computer Vision and Deep Learning algorithms to continuously monitor the driver during the driving activity. The proposed solution complies with the requirements imposed by the challenging automotive context. First, light invariance is required, in order to have a system able to work regardless of the time of day and the weather conditions. Therefore, infrared-based images, i.e. depth maps (in which each pixel corresponds to the distance between the sensor and that point in the scene), have been exploited in conjunction with traditional intensity images. Second, the non-invasivity of the system is required, since the driver’s movements must not be impeded during the driving activity: in this context, the use of cameras and vision-based algorithms is one of the best solutions. Finally, real-time performance is needed since a monitoring system must immediately react as soon as a situation of potential danger is detected.

2020 Conference Proceedings Paper

Multimodal Hand Gesture Classification for the Human-Car Interaction

Authors: D’Eusanio, Andrea; Simoni, Alessandro; Pini, Stefano; Borghi, Guido; Vezzani, Roberto; Cucchiara, Rita

Published in: INFORMATICS

2020 Journal Article

Driver Face Verification with Depth Maps

Authors: Borghi, Guido; Pini, Stefano; Vezzani, Roberto; Cucchiara, Rita

Published in: SENSORS

Face verification is the task of checking if two provided images contain the face of the same person or not. In this work, we propose a fully-convolutional Siamese architecture to tackle this task, achieving state-of-the-art results on three publicly-released datasets, namely Pandora, High-Resolution Range-based Face Database (HRRFaceD), and CurtinFaces. The proposed method takes depth maps as the input, since depth cameras have been proven to be more reliable in different illumination conditions. Thus, the system is able to work even in the case of the total or partial absence of external light sources, which is a key feature for automotive applications. From the algorithmic point of view, we propose a fully-convolutional architecture with a limited number of parameters, capable of dealing with the small amount of depth data available for training and able to run in real time even on a CPU and embedded boards. The experimental results show acceptable accuracy to allow exploitation in real-world applications with in-board cameras. Finally, exploiting the presence of faces occluded by various head garments and extreme head poses available in the Pandora dataset, we successfully test the proposed system also during strong visual occlusions. The excellent results obtained confirm the efficacy of the proposed method.

2019 Journal Article

Face Verification from Depth using Privileged Information

Authors: Borghi, Guido; Pini, Stefano; Grazioli, Filippo; Vezzani, Roberto; Cucchiara, Rita

In this paper, a deep Siamese architecture for depth-based face verification is presented. The proposed approach efficiently verifies if two face images belong to the same person while handling a great variety of head poses and occlusions. The architecture, namely JanusNet, consists of a combination of a depth, an RGB and a hybrid Siamese network. During the training phase, the hybrid network learns to extract complementary mid-level convolutional features which mimic the features of the RGB network, simultaneously leveraging the light invariance of depth images. At testing time, the model, relying only on depth data, achieves state-of-the-art results and real-time performance, despite the lack of deep-oriented depth-based datasets.

2019 Conference Proceedings Paper
