Publications by Guido Borghi
Explore our research publications: papers, articles, and conference proceedings from AImageLab.
PopEYE - Infrared Ocular Image Dataset for Eye State and Gaze-Direction Classification
Authors: Gibertoni, Giovanni; Borghi, Guido; Rovati, Luigi
The PopEYE dataset is a specialized collection of 14,976 near-infrared (NIR) images of the human eye region, designed to support the development and benchmarking of computer vision algorithms for eye-state detection and coarse gaze-direction classification. Each image is provided at a fixed resolution of 772 × 520 pixels in 8-bit grayscale PNG format. The acquisition was performed frontally using a custom-developed Maxwellian-view optical configuration, consisting of a board-level CMOS camera and a specialized lens system in which the subject's eye is precisely positioned at the focal point. This setup ensures a high-contrast representation of the anterior segment, making the pupil, iris, limbus, and portions of the sclera and eyelids clearly distinguishable under stable 850 nm infrared illumination. The dataset is divided into six mutually exclusive classes identified through manual annotation supported by fixed visual aids and an expert-system algorithm. The classes are: a correct positioning class for eyes that are open and properly aligned for clinical measurements (8,160 images); a closed class representing full eye closures such as blinks or sustained lid closure (1,790 images); and four directional classes representing gaze shifts relative to the central optical axis, namely up (1,379 images), down (1,015 images), left (1,296 images), and right (1,336 images). The data captures the natural anatomical variability of 22 subjects and incorporates common real-world artifacts such as specular reflections from NIR sources and partial pupil occlusions by eyelashes or eyelids. By providing standardized labels and high-resolution NIR imagery, PopEYE serves as a resource for training machine learning models intended for real-time patient monitoring during ophthalmic examinations.
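The class counts reported in the abstract are noticeably imbalanced, with the correct positioning class holding more than half of the images. The following is a minimal sketch of how one might check the totals and derive inverse-frequency class weights for training. The dictionary keys and the weighting scheme are illustrative assumptions, not part of the dataset or the authors' method.

```python
# Class counts as reported in the abstract.
# The key names are hypothetical labels chosen for this sketch.
CLASS_COUNTS = {
    "correct": 8160,
    "closed": 1790,
    "up": 1379,
    "down": 1015,
    "left": 1296,
    "right": 1336,
}


def inverse_frequency_weights(counts):
    """Return a weight per class equal to total / (n_classes * class_count).

    This is one common balancing heuristic (an assumption here): rarer
    classes get larger weights, and the count-weighted sum of the weights
    equals the total number of images.
    """
    total = sum(counts.values())
    n_classes = len(counts)
    return {name: total / (n_classes * n) for name, n in counts.items()}


TOTAL = sum(CLASS_COUNTS.values())
WEIGHTS = inverse_frequency_weights(CLASS_COUNTS)

if __name__ == "__main__":
    for name, n in CLASS_COUNTS.items():
        share = n / TOTAL
        print(f"{name:>8}: {n:5d} images ({share:6.1%}), weight {WEIGHTS[name]:.3f}")
```

The six counts sum to the 14,976 images stated in the abstract, so the per-class figures are internally consistent.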
3D Pose Nowcasting: Forecast the future to improve the present
Authors: Simoni, A.; Marchetti, F.; Borghi, G.; Becattini, F.; Seidenari, L.; Vezzani, R.; Del Bimbo, A.
Published in: COMPUTER VISION AND IMAGE UNDERSTANDING
Technologies that enable safe and effective collaboration and coexistence between humans and robots have gained significant importance in the last few years. A critical component for realizing this collaborative paradigm is the understanding of human and robot 3D poses using non-invasive systems. Therefore, in this paper, we propose a novel vision-based system that leverages depth data to accurately establish the 3D locations of skeleton joints. Specifically, we introduce the concept of Pose Nowcasting, denoting the capability of the proposed system to enhance its current pose estimation accuracy by jointly learning to forecast future poses. The experimental evaluation is conducted on two different datasets and shows accurate, real-time performance, confirming the validity of the proposed method in both robotic and human scenarios.
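The idea of improving the current estimate by jointly learning to forecast can be illustrated with a combined training objective. The sketch below is not the paper's loss: the function names, the mean per-joint Euclidean error, and the `future_weight` parameter are all assumptions made for illustration. It adds a weighted forecasting term to the error on the current frame.

```python
import math


def mean_joint_error(pred, target):
    """Mean Euclidean distance between matching 3D joints.

    `pred` and `target` are equal-length lists of (x, y, z) tuples.
    """
    return sum(math.dist(p, t) for p, t in zip(pred, target)) / len(pred)


def nowcasting_loss(pred_frames, target_frames, future_weight=0.5):
    """Hypothetical joint objective: current-frame error plus weighted future error.

    Index 0 of each sequence is the current frame; the remaining entries are
    future frames. `future_weight` is an assumed hyperparameter.
    """
    current = mean_joint_error(pred_frames[0], target_frames[0])
    future_errors = [
        mean_joint_error(p, t)
        for p, t in zip(pred_frames[1:], target_frames[1:])
    ]
    if not future_errors:
        # No forecast horizon: fall back to plain pose estimation error.
        return current
    return current + future_weight * sum(future_errors) / len(future_errors)


if __name__ == "__main__":
    # Two joints, one current frame and two future frames (toy values).
    target = [[(0.0, 0.0, 0.0), (1.0, 1.0, 1.0)] for _ in range(3)]
    pred = [
        [(3.0, 4.0, 0.0), (4.0, 5.0, 1.0)],  # current frame, each joint off by 5
        [(0.0, 0.0, 2.0), (1.0, 1.0, 3.0)],  # future frame, each joint off by 2
        [(0.0, 0.0, 2.0), (1.0, 1.0, 3.0)],  # future frame, each joint off by 2
    ]
    print(nowcasting_loss(pred, target))
```

In a real system the two terms would be computed on network outputs and backpropagated together, so the forecasting task shapes the features used for the current pose.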
AURALYS: smart glasses to improve audio selection and perception in educational and working contexts
Authors: Filippini, Gianluca; Borghi, Guido; Giliberti, Enrico; Damiani, Paola; Vezzani, Roberto
Depth-Based Privileged Information for Boosting 3D Human Pose Estimation on RGB
Authors: Simoni, A.; Marchetti, F.; Borghi, G.; Becattini, F.; Davoli, D.; Garattoni, L.; Francesca, G.; Seidenari, L.; Vezzani, R.
Published in: LECTURE NOTES IN COMPUTER SCIENCE
LLMs and Humanoid Robot Diversity: The Pose Generation Challenge
Authors: Catalini, Riccardo; Biagi, Federico; Salici, Giacomo; Borghi, Guido; Vezzani, Roberto; Biagiotti, Luigi
Humanoid robots are increasingly being integrated into diverse scenarios, such as healthcare facilities, social settings, and workplaces. As the need for intuitive control by non-expert users grows, many studies have explored the use of Artificial Intelligence to enable communication and control. However, these approaches are often tailored to specific robots due to the absence of standardized conventions and notation. This study addresses the challenges posed by these inconsistencies and investigates their impact on the ability of Large Language Models (LLMs) to generate accurate 3D robot poses, even when detailed robot specifications are provided as input.