Publications
Explore our research publications: papers, articles, and conference proceedings from AImageLab.
Bits2Bites: Intra-oral Scans Occlusal Classification
Authors: Borghi, Lorenzo; Lumetti, Luca; Cremonini, Francesca; Rizzo, Federico; Grana, Costantino; Lombardo, Luca; Bolelli, Federico
We introduce Bits2Bites, the first publicly available dataset for occlusal classification from intra-oral scans, comprising 200 paired upper and lower dental arches annotated across multiple clinically relevant dimensions (sagittal, vertical, transverse, and midline relationships). Leveraging this resource, we propose a multi-task learning benchmark that jointly predicts five occlusal traits from raw 3D point clouds using state-of-the-art point-based neural architectures. Our approach includes extensive ablation studies assessing the benefits of multi-task learning against single-task baselines, as well as the impact of automatically-predicted anatomical landmarks as input features. Results demonstrate the feasibility of directly inferring comprehensive occlusion information from unstructured 3D data, achieving promising performance across all tasks. Our entire dataset, code, and pretrained models are publicly released to foster further research in automated orthodontic diagnosis.
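As a rough illustration of the multi-task setup described above, the sketch below pairs a shared point-cloud feature vector with one classification head per occlusal trait; the trait names, class counts, and backbone interface are assumptions and do not reflect the released Bits2Bites code.

```python
# Hypothetical sketch of a multi-task occlusal classifier over a shared
# point-cloud backbone; trait names and class counts are illustrative
# assumptions, not the released Bits2Bites implementation.
import torch
import torch.nn as nn

class MultiTaskOcclusionHead(nn.Module):
    def __init__(self, feat_dim=1024, traits=None):
        super().__init__()
        # One classification head per occlusal trait (assumed class counts).
        traits = traits or {"sagittal_left": 3, "sagittal_right": 3,
                            "vertical": 3, "transverse": 3, "midline": 3}
        self.heads = nn.ModuleDict(
            {name: nn.Linear(feat_dim, n_cls) for name, n_cls in traits.items()}
        )

    def forward(self, global_feat):
        # global_feat: (B, feat_dim) pooled from a point-based encoder
        return {name: head(global_feat) for name, head in self.heads.items()}

def multitask_loss(logits, labels):
    # Jointly train all traits with a sum of per-trait cross-entropy losses
    # (equal weighting assumed).
    ce = nn.CrossEntropyLoss()
    return sum(ce(logits[name], labels[name]) for name in logits)
```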
Causal Graphical Models for Vision-Language Compositional Understanding
Authors: Parascandolo, Fiorenzo; Moratelli, Nicholas; Sangineto, Enver; Baraldi, Lorenzo; Cucchiara, Rita
Closed-Form Merging of Parameter-Efficient Modules for Federated Continual Learning
Authors: Salami, R.; Buzzega, P.; Mosconi, M.; Bonato, J.; Sabetta, L.; Calderara, S.
Model merging has emerged as a crucial technique in Deep Learning, enabling the integration of multiple models into a unified system while preserving performance and scalability. In this respect, the compositional properties of low-rank adaptation techniques (e.g., LoRA) have proven beneficial, as simply averaging LoRA modules yields a single model that largely integrates the capabilities of all individual modules. Building on LoRA, we take a step further by requiring that the merged model match the responses of all learned modules. Solving this objective in closed form yields an indeterminate system with A and B as unknown variables, indicating the existence of infinitely many closed-form solutions. To address this challenge, we introduce LoRM, an alternating optimization strategy that trains one LoRA matrix at a time. This allows solving for each unknown variable individually, thus finding a unique solution. We apply our proposed methodology to Federated Class-Incremental Learning (FCIL), ensuring alignment of model responses both between clients and across tasks. Our method demonstrates state-of-the-art performance across a range of FCIL scenarios. The code to reproduce our experiments is available at this http URL.
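As a loose illustration of the alternating, closed-form idea sketched in the abstract, the snippet below alternately solves least-squares problems for the merged B and A factors against the averaged client updates; the objective, shapes, and initialization are simplifying assumptions and do not reproduce the actual LoRM algorithm.

```python
# Illustrative sketch of alternating closed-form solves for merged LoRA
# factors (B, A). This is a simplification inspired by the abstract, not
# the LoRM objective or code.
import torch

def merge_lora_alternating(client_As, client_Bs, num_iters=10):
    # client_As[i]: (r, d_in), client_Bs[i]: (d_out, r)
    # Goal: a single (B, A) whose update B @ A best matches, in the
    # least-squares sense, the per-client updates B_i @ A_i.
    target = torch.stack([B @ A for A, B in zip(client_As, client_Bs)]).mean(0)
    A = client_As[0].clone()  # initialize from one client (assumption)
    for _ in range(num_iters):
        # Fix A, solve min_B ||B @ A - target||_F in closed form.
        B = target @ torch.linalg.pinv(A)
        # Fix B, solve min_A ||B @ A - target||_F in closed form.
        A = torch.linalg.pinv(B) @ target
    return A, B
```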
Context-guided Prompt Learning for Continual WSI Classification
Authors: Corso, Giulia; Miccolis, Francesca; Porrello, Angelo; Bolelli, Federico; Calderara, Simone; Ficarra, Elisa
Whole Slide Images (WSIs) are crucial in histological diagnostics, providing high-resolution insights into cellular structures. In addition to challenges like the gigapixel scale of WSIs and the lack of pixel-level annotations, privacy restrictions further complicate their analysis. For instance, in a hospital network, different facilities need to collaborate on WSI analysis without the possibility of sharing sensitive patient data. A more practical and secure approach involves sharing models capable of continual adaptation to new data. However, without proper measures, catastrophic forgetting can occur. Traditional continual learning techniques rely on storing previous data, which violates privacy restrictions. To address this issue, this paper introduces Context Optimization Multiple Instance Learning (CooMIL), a rehearsal-free continual learning framework explicitly designed for WSI analysis. It employs a WSI-specific prompt learning procedure to adapt classification models across tasks, efficiently preventing catastrophic forgetting. Evaluated on four public WSI datasets from TCGA projects, our model significantly outperforms state-of-the-art methods within the WSI-based continual learning framework. The source code is available at https://github.com/FrancescaMiccolis/CooMIL.
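For intuition on prompt-based, rehearsal-free adaptation in a MIL setting, the sketch below prepends learnable task prompts to precomputed patch embeddings before an attention pooling step; module names, shapes, and the freezing scheme are assumptions, not the CooMIL implementation.

```python
# Hypothetical sketch of prompt-based adaptation for MIL WSI classification:
# a shared aggregator receives learnable task prompts alongside patch
# embeddings. Names and shapes are assumptions, not the CooMIL code.
import torch
import torch.nn as nn

class PromptedMILClassifier(nn.Module):
    def __init__(self, feat_dim=512, n_prompts=8, n_classes=2):
        super().__init__()
        self.prompts = nn.Parameter(torch.randn(n_prompts, feat_dim) * 0.02)
        self.attn_pool = nn.MultiheadAttention(feat_dim, num_heads=8, batch_first=True)
        self.cls_query = nn.Parameter(torch.randn(1, 1, feat_dim) * 0.02)
        self.classifier = nn.Linear(feat_dim, n_classes)

    def forward(self, patch_feats):
        # patch_feats: (B, N_patches, feat_dim) precomputed WSI patch embeddings
        B = patch_feats.size(0)
        tokens = torch.cat([self.prompts.expand(B, -1, -1), patch_feats], dim=1)
        pooled, _ = self.attn_pool(self.cls_query.expand(B, -1, -1), tokens, tokens)
        return self.classifier(pooled.squeeze(1))

# Continual-learning idea (assumed): keep the aggregator frozen across tasks
# and learn only `prompts` (plus a per-task classifier), avoiding any
# rehearsal buffer of past slides.
```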
Decoding Facial Expressions in Video: A Multiple Instance Learning Perspective on Action Units
Authors: Del Gaudio, Livia; Cuculo, Vittorio; Cucchiara, Rita
Facial expression recognition (FER) in video sequences is a longstanding challenge in affective computing and computer vision, particularly due to the temporal complexity and subtlety of emotional expressions. In this paper, we propose a novel pipeline that leverages facial Action Units (AUs) as structured time series descriptors of facial muscle activity, enabling emotion classification in videos through a Multiple Instance Learning (MIL) framework. Our approach models each video as a bag of AU-based instances, capturing localized temporal patterns, and allows for robust learning even when only coarse video-level emotion labels are available. Crucially, the approach incorporates interpretability mechanisms that highlight the temporal segments most influential to the final prediction, supporting informed decision-making and facilitating downstream analysis. Experimental results on benchmark FER video datasets demonstrate that our method achieves competitive performance using only visual data, without requiring multimodal signals or frame-level supervision. This highlights its potential as an interpretable and efficient solution for weakly supervised emotion recognition in real-world scenarios.
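As a minimal illustration of the bag-of-AU-instances idea, the sketch below applies attention-based MIL pooling to fixed-length Action Unit segments, with the attention weights exposing the most influential temporal segments; segment length, AU count, and layer sizes are illustrative assumptions rather than the paper's exact pipeline.

```python
# Minimal sketch of attention-based MIL over Action Unit time series: each
# video is a bag of AU segments; attention weights indicate which segments
# drive the video-level prediction. Hyperparameters are assumptions.
import torch
import torch.nn as nn

class AUMILClassifier(nn.Module):
    def __init__(self, n_aus=17, seg_len=16, hidden=128, n_emotions=7):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Flatten(start_dim=2),                 # (B, S, seg_len * n_aus)
            nn.Linear(seg_len * n_aus, hidden), nn.ReLU(),
        )
        self.attn = nn.Linear(hidden, 1)             # per-instance attention score
        self.classifier = nn.Linear(hidden, n_emotions)

    def forward(self, bags):
        # bags: (B, S, seg_len, n_aus) -- S AU segments per video
        inst = self.encoder(bags)                    # (B, S, hidden)
        weights = torch.softmax(self.attn(inst), dim=1)   # (B, S, 1)
        video_feat = (weights * inst).sum(dim=1)     # attention-pooled bag feature
        return self.classifier(video_feat), weights.squeeze(-1)
```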
Deep Learning for Classifying Anti-Shigella Opsono-Phagocytosis-Promoting Monoclonal Antibodies
Authors: Pianfetti, Elena; Cardamone, Dario; Roscioli, Emanuele; Ciano, Giorgio; Maccari, Giuseppe; Sala, Claudia; Micoli, Francesca; Rappuoli, Rino; Medini, Duccio; Ficarra, Elisa
Published in: LECTURE NOTES IN COMPUTER SCIENCE
Shigellosis is an acute small intestine infection caused by different species of Shigella. Worldwide, the emergence of antibiotic-resistant strains aggravates the impact of Shigella infections. In this context, human monoclonal antibodies (mAbs) offer an alternative to traditional antimicrobials. However, identifying a potent candidate mAb requires intense and meticulous efforts. Here, we show the potential of Deep Learning to screen mAbs rapidly. We measured the phagocytosis-promoting activity of mAbs by analyzing images collected with a high-throughput and high-content confocal fluorescence microscope. We acquired images of S. sonnei and S. flexneri infecting THP-1-derived macrophages and evaluated the effect of different mAbs and of a wide selection of Deep Learning tools. We found that our model can generalize to strains and mAbs not encountered during training. Importantly, our approach enables the screening and characterization of multiple anti-Shigella mAbs at the same time, facilitating the identification of potent antibacterial candidates. Our code is available on the GitHub repository vOPA_Shigella.
Depth-Based Privileged Information for Boosting 3D Human Pose Estimation on RGB
Authors: Simoni, A.; Marchetti, F.; Borghi, G.; Becattini, F.; Davoli, D.; Garattoni, L.; Francesca, G.; Seidenari, L.; Vezzani, R.
Published in: LECTURE NOTES IN COMPUTER SCIENCE