Publications by Simone Calderara

Explore our research publications: papers, articles, and conference proceedings from AImageLab.

Beyond the Surface: Comprehensive Analysis of Implicit Bias in Vision-Language Models

Authors: Capitani, Giacomo; Lucarini, Alice; Bonicelli, Lorenzo; Bolelli, Federico; Calderara, Simone; Vezzali, Loris; Ficarra, Elisa

Implicit biases, subtle and unconscious attitudes, permeate various facets of human decision-making and are similarly pervasive in Artificial Intelligence (AI) systems. These biases can stem from shortcut learning, where models rely on superficial patterns that do not capture the underlying phenomena. Inspired by social psychology studies, we introduce two novel metrics to analyze implicit biases in vision-language models. Our comprehensive analysis of 90 OpenCLIP models reveals widespread anomalies related to ethnicity and gender. The first metric considers the cosine similarity between images and text prompts related to social stereotypes. The second metric adapts the Implicit Association Test (IAT), which evaluates prejudice and hidden discrimination within human behavior. Our findings illustrate that conventional text-based debiasing efforts can inadvertently amplify second-order biases instead of mitigating them. Furthermore, in expanding our evaluation to multimodal Large Language Models (LLMs), we demonstrate disparities in the tendency to generate semantically positive or negative outputs, depending on the ethnicity or gender of the individuals depicted in the input images.

2024 Conference proceedings paper
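
As an illustration only, a minimal sketch of the first metric's core operation, assuming the open_clip library: the prompt list, the image path, and any aggregation into a bias score are placeholders, not the paper's exact protocol.

```python
# Sketch: cosine similarity between an image and stereotype-related text
# prompts using an OpenCLIP model. Prompts and image path are placeholders.
import torch
import open_clip
from PIL import Image

model, _, preprocess = open_clip.create_model_and_transforms(
    "ViT-B-32", pretrained="laion2b_s34b_b79k"
)
tokenizer = open_clip.get_tokenizer("ViT-B-32")
model.eval()

prompts = ["a photo of a trustworthy person", "a photo of a dangerous person"]  # placeholder stereotypes
image = preprocess(Image.open("face.jpg")).unsqueeze(0)                         # placeholder image

with torch.no_grad():
    img_emb = model.encode_image(image)
    txt_emb = model.encode_text(tokenizer(prompts))
    img_emb = img_emb / img_emb.norm(dim=-1, keepdim=True)
    txt_emb = txt_emb / txt_emb.norm(dim=-1, keepdim=True)
    cos_sim = img_emb @ txt_emb.T   # (1, num_prompts) cosine similarities

# Comparing these similarities across images of different demographic groups
# is the kind of signal the first metric builds on.
print(cos_sim)
```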

CLIP with Generative Latent Replay: a Strong Baseline for Incremental Learning

Authors: Frascaroli, Emanuele; Panariello, Aniello; Buzzega, Pietro; Bonicelli, Lorenzo; Porrello, Angelo; Calderara, Simone

With the emergence of Transformers and Vision-Language Models (VLMs) such as CLIP, fine-tuning large pre-trained models has recently become a prevalent strategy in Continual Learning. This has led to the development of numerous prompting strategies to adapt transformer-based models without incurring catastrophic forgetting. However, these strategies often compromise the original zero-shot capabilities of the pre-trained CLIP model and struggle to adapt to domains that significantly deviate from the pre-training data. In this work, we propose Continual Generative training for Incremental prompt-Learning, a simple and novel approach to mitigate forgetting while adapting CLIP. Briefly, we employ Variational Autoencoders (VAEs) to learn class-conditioned distributions within the embedding space of the visual encoder. We then exploit these distributions to sample new synthetic visual embeddings and train the corresponding class-specific textual prompts during subsequent tasks. Through extensive experiments on different domains, we show that such a generative replay approach can adapt to new tasks while improving zero-shot capabilities, evaluated using a novel metric tailored for CL scenarios. Notably, further analysis reveals that our approach can bridge the gap with joint prompt tuning. The codebase is available at https://github.com/aimagelab/mammoth.

2024 Conference proceedings paper
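
A minimal, hedged sketch of the generative latent replay idea: a small VAE is fitted on the visual embeddings of one class and later sampled to produce synthetic embeddings that can supervise class-specific textual prompts on subsequent tasks. Layer sizes and hyperparameters are illustrative, not the paper's configuration.

```python
# Illustrative sketch of generative latent replay: a per-class VAE over CLIP
# visual embeddings; sampling it on later tasks yields synthetic embeddings
# for training class-specific prompts. All sizes are placeholders.
import torch
import torch.nn as nn

class EmbeddingVAE(nn.Module):
    def __init__(self, dim=512, latent=64):
        super().__init__()
        self.latent = latent
        self.enc = nn.Sequential(nn.Linear(dim, 256), nn.ReLU())
        self.mu, self.logvar = nn.Linear(256, latent), nn.Linear(256, latent)
        self.dec = nn.Sequential(nn.Linear(latent, 256), nn.ReLU(), nn.Linear(256, dim))

    def forward(self, x):
        h = self.enc(x)
        mu, logvar = self.mu(h), self.logvar(h)
        z = mu + torch.randn_like(mu) * torch.exp(0.5 * logvar)   # reparameterization
        return self.dec(z), mu, logvar

    def sample(self, n):
        with torch.no_grad():
            return self.dec(torch.randn(n, self.latent))

vae = EmbeddingVAE()
opt = torch.optim.Adam(vae.parameters(), lr=1e-3)
class_embeddings = torch.randn(128, 512)       # placeholder: visual embeddings of one class

for _ in range(100):                           # fit the class-conditioned distribution
    recon, mu, logvar = vae(class_embeddings)
    kl = -0.5 * torch.mean(1 + logvar - mu.pow(2) - logvar.exp())
    loss = nn.functional.mse_loss(recon, class_embeddings) + 1e-3 * kl
    opt.zero_grad(); loss.backward(); opt.step()

synthetic = vae.sample(32)   # later tasks: rehearse the class without storing any image
```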

ClusterFix: A Cluster-Based Debiasing Approach without Protected-Group Supervision

Authors: Capitani, Giacomo; Bolelli, Federico; Porrello, Angelo; Calderara, Simone; Ficarra, Elisa

The failures of Deep Networks can sometimes be ascribed to biases in the data or algorithmic choices. Existing debiasing approaches exploit prior knowledge to avoid unintended solutions; we acknowledge that, in real-world settings, it could be unfeasible to gather enough prior information to characterize the bias, or it could even raise ethical considerations. We hence propose a novel debiasing approach, termed ClusterFix, which does not require any external hint about the nature of biases. Such an approach alters the standard empirical risk minimization and introduces a per-example weight, encoding how critical and far from the majority an example is. Notably, the weights consider how difficult it is for the model to infer the correct pseudo-label, which is obtained in a self-supervised manner by dividing examples into multiple clusters. Extensive experiments show that the misclassification error incurred in identifying the correct cluster allows for identifying examples prone to bias-related issues. As a result, our approach outperforms existing methods on standard benchmarks for bias removal and fairness.

2024 Conference proceedings paper
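
A rough sketch of the weighting scheme under stated assumptions: cluster pseudo-labels come from k-means over (self-supervised) features, and each example's weight grows with the loss an auxiliary head incurs on its own cluster. Feature dimensions, cluster count, and heads are placeholders, not the paper's exact recipe.

```python
# Sketch of cluster-based example weighting: pseudo-labels from k-means over
# features, and a per-example weight that grows with how hard that example's
# cluster is to predict. Not the paper's exact recipe.
import torch
import torch.nn.functional as F
from sklearn.cluster import KMeans

features = torch.randn(1000, 128)              # placeholder (self-supervised) features
labels = torch.randint(0, 10, (1000,))         # placeholder task labels
pseudo = torch.from_numpy(
    KMeans(n_clusters=20, n_init=10).fit_predict(features.numpy())
).long()

cluster_head = torch.nn.Linear(128, 20)        # auxiliary head predicting the cluster
task_head = torch.nn.Linear(128, 10)

# Per-example difficulty on the pseudo-label task (during training, the
# auxiliary head itself would also be optimized on these pseudo-labels).
cluster_loss = F.cross_entropy(cluster_head(features), pseudo, reduction="none")
weights = (cluster_loss / cluster_loss.mean()).detach()   # harder cluster -> larger weight

task_loss = F.cross_entropy(task_head(features), labels, reduction="none")
loss = (weights * task_loss).mean()            # weighted empirical risk minimization
```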

Is Multiple Object Tracking a Matter of Specialization?

Authors: Mancusi, Gianluca; Bernardi, Mattia; Panariello, Aniello; Porrello, Angelo; Cucchiara, Rita; Calderara, Simone

Published in: ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS

End-to-end transformer-based trackers have achieved remarkable performance on most human-related datasets. However, training these trackers in heterogeneous scenarios poses significant challenges, including negative interference, where the model learns conflicting scene-specific parameters, and limited domain generalization, which often necessitates expensive fine-tuning to adapt the models to new domains. In response to these challenges, we introduce the Parameter-efficient Scenario-specific Tracking Architecture (PASTA), a novel framework that combines Parameter-Efficient Fine-Tuning (PEFT) and Modular Deep Learning (MDL). Specifically, we define key scenario attributes (e.g., camera viewpoint, lighting condition) and train specialized PEFT modules for each attribute. These expert modules are combined in parameter space, enabling systematic generalization to new domains without increasing inference time. Extensive experiments on MOTSynth, along with zero-shot evaluations on MOT17 and PersonPath22, demonstrate that a neural tracker built from carefully selected modules surpasses its monolithic counterpart. We release models and code.

2024 Conference proceedings paper
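
A hypothetical sketch of composing attribute-specific PEFT modules in parameter space: LoRA-style low-rank deltas, one per scenario attribute, are averaged and merged into a frozen base weight, so inference cost does not grow. Attribute names, ranks, and tensor shapes are illustrative.

```python
# Hypothetical sketch of merging attribute-specific PEFT experts in parameter
# space: LoRA-style deltas are averaged and added to a frozen base weight.
import torch

def lora_delta(A, B, alpha=1.0):
    """Low-rank update alpha * B @ A, with A: (r, d_in) and B: (d_out, r)."""
    return alpha * (B @ A)

base_weight = torch.randn(256, 256)            # frozen projection inside the tracker
experts = {                                    # one low-rank expert per scenario attribute
    "outdoor":       (torch.randn(8, 256) * 0.01, torch.randn(256, 8) * 0.01),
    "night":         (torch.randn(8, 256) * 0.01, torch.randn(256, 8) * 0.01),
    "moving_camera": (torch.randn(8, 256) * 0.01, torch.randn(256, 8) * 0.01),
}

def merge(attributes):
    """Average the selected experts' deltas and add them to the base weight."""
    deltas = [lora_delta(A, B) for A, B in (experts[a] for a in attributes)]
    return base_weight + torch.stack(deltas).mean(dim=0)

merged = merge(["outdoor", "night"])           # scenario-specific weights for a new domain
```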

Latent spectral regularization for continual learning

Authors: Frascaroli, Emanuele; Benaglia, Riccardo; Boschini, Matteo; Moschella, Luca; Fiorini, Cosimo; Rodolà, Emanuele; Calderara, Simone

Published in: PATTERN RECOGNITION LETTERS

While biological intelligence grows organically as new knowledge is gathered throughout life, Artificial Neural Networks forget catastrophically whenever they face a changing training data distribution. Rehearsal-based Continual Learning (CL) approaches have been established as a versatile and reliable solution to overcome this limitation; however, sudden input disruptions and memory constraints are known to alter the consistency of their predictions. We study this phenomenon by investigating the geometric characteristics of the learner’s latent space and find that replayed data points of different classes increasingly mix up, interfering with classification. Hence, we propose a geometric regularizer that enforces weak requirements on the Laplacian spectrum of the latent space, promoting a partitioning behavior. Our proposal, called Continual Spectral Regularizer for Incremental Learning (CaSpeR-IL), can be easily combined with any rehearsal-based CL approach and improves the performance of SOTA methods on standard benchmarks.

2024 Journal article
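
One plausible form of a Laplacian-spectrum regularizer, sketched under assumptions: build a kNN affinity graph over a batch of latent vectors and penalize the smallest eigenvalues of its normalized Laplacian, which pushes the batch toward roughly class-many well-separated groups. This is an illustration, not necessarily the exact CaSpeR-IL loss.

```python
# Hedged sketch of a spectral regularizer on latent vectors: penalize the
# smallest eigenvalues of the normalized Laplacian of a kNN affinity graph,
# encouraging a partitioned latent geometry. Illustrative form only.
import torch

def laplacian_spectrum_reg(z, k_neighbors=5, n_classes=10):
    z = torch.nn.functional.normalize(z, dim=1)
    sim = (z @ z.T).clamp(min=0)                         # non-negative cosine affinities
    topk = torch.topk(sim, k_neighbors + 1, dim=1).indices
    mask = torch.zeros_like(sim).scatter_(1, topk, 1.0)
    mask = ((mask + mask.T) > 0).float()                 # symmetric kNN mask
    adj = sim * mask * (1 - torch.eye(z.size(0)))        # weighted graph, no self-loops
    deg = adj.sum(dim=1).clamp(min=1e-6)
    d_inv_sqrt = torch.diag(deg.pow(-0.5))
    lap = torch.eye(z.size(0)) - d_inv_sqrt @ adj @ d_inv_sqrt   # normalized Laplacian
    eigvals = torch.linalg.eigvalsh(lap)                 # ascending eigenvalues
    return eigvals[:n_classes].sum()                     # small when ~n_classes groups separate

embeddings = torch.randn(64, 128, requires_grad=True)   # placeholder latents of replayed samples
reg = laplacian_spectrum_reg(embeddings)                 # added to the rehearsal loss
reg.backward()
```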

May the Forgetting Be with You: Alternate Replay for Learning with Noisy Labels

Authors: Millunzi, Monica; Bonicelli, Lorenzo; Porrello, Angelo; Credi, Jacopo; Kolm, Petter N.; Calderara, Simone

Forgetting presents a significant challenge during incremental training, making it particularly demanding for contemporary AI systems to assimilate new knowledge in streaming data environments. To address this issue, most approaches in Continual Learning (CL) rely on the replay of a restricted buffer of past data. However, the presence of noise in real-world scenarios, where human annotation is constrained by time limitations or where data is automatically gathered from the web, frequently renders these strategies vulnerable. In this study, we address the problem of CL under Noisy Labels (CLN) by introducing Alternate Experience Replay (AER), which takes advantage of forgetting to maintain a clear distinction between clean, complex, and noisy samples in the memory buffer. The idea is that complex or mislabeled examples, which hardly fit the previously learned data distribution, are most likely to be forgotten. To grasp the benefits of such a separation, we equip AER with Asymmetric Balanced Sampling (ABS): a new sample selection strategy that prioritizes purity on the current task while retaining relevant samples from the past. Through extensive computational comparisons, we demonstrate the effectiveness of our approach in terms of both the accuracy and the purity of the obtained buffer, resulting in a remarkable average gain of 4.71 percentage points in accuracy with respect to existing loss-based purification strategies. Code is available at https://github.com/aimagelab/mammoth.

2024 Conference proceedings paper
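
A hedged sketch of the alternate-replay intuition, not the full AER/ABS procedure: replay is applied only on alternate steps so that hard-to-fit buffer samples can be forgotten, and per-sample losses on the current task are used to prefer likely-clean candidates for the buffer. The `buffer` object and its `sample` method are assumed helpers, not part of any specific library.

```python
# Hedged sketch of alternate replay. `buffer` is an assumed helper object
# exposing __len__ and sample(); losses identify likely-clean samples.
import torch
import torch.nn.functional as F

def training_step(model, opt, batch, buffer, step):
    x, y = batch
    loss = F.cross_entropy(model(x), y)
    if step % 2 == 1 and len(buffer) > 0:      # replay only on alternate steps
        bx, by = buffer.sample(32)
        loss = loss + F.cross_entropy(model(bx), by)
    opt.zero_grad(); loss.backward(); opt.step()

def likely_clean(model, x, y, keep_ratio=0.5):
    """Keep the current-task samples the model fits best (more likely clean)."""
    with torch.no_grad():
        per_sample = F.cross_entropy(model(x), y, reduction="none")
    keep = torch.argsort(per_sample)[: int(keep_ratio * len(y))]
    return x[keep], y[keep]
```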

Saliency-driven Experience Replay for Continual Learning

Authors: Bellitto, Giovanni; Proietto Salanitri, Federica; Pennisi, Matteo; Boschini, Matteo; Bonicelli, Lorenzo; Porrello, Angelo; Calderara, Simone; Palazzo, Simone; Spampinato, Concetto

Published in: ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS

2024 Conference proceedings paper

Self-Labeling the Job Shop Scheduling Problem

Authors: Corsini, Andrea; Porrello, Angelo; Calderara, Simone; Dell'Amico, Mauro

Published in: ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS

This work proposes a self-supervised training strategy designed for combinatorial problems. An obstacle in applying supervised paradigms to such problems is the need for costly target solutions, often produced with exact solvers. Inspired by semi- and self-supervised learning, we show that generative models can be trained by sampling multiple solutions and using the best one according to the problem objective as a pseudo-label. In this way, we iteratively improve the model's generation capability by relying only on its self-supervision, eliminating the need for optimality information. We validate this Self-Labeling Improvement Method (SLIM) on the Job Shop Scheduling Problem (JSP), a complex combinatorial problem that is receiving much attention from the neural combinatorial optimization community. We propose a generative model based on the well-known Pointer Network and train it with SLIM. Experiments on popular benchmarks demonstrate the potential of this approach, as the resulting models outperform constructive heuristics and state-of-the-art learning proposals for the JSP. Lastly, we prove the robustness of SLIM to various parameters and its generality by applying it to the Traveling Salesman Problem.

2024 Conference proceedings paper
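
A minimal sketch of the self-labeling loop under assumptions: sample several solutions from the generative policy, keep the one with the best objective value (e.g. lowest makespan for the JSP), and maximize its likelihood. `policy.sample_solution` and `makespan` are hypothetical helpers, not calls from the released codebase.

```python
# Sketch of a self-labeling training step. `policy.sample_solution` and
# `makespan` are hypothetical helpers (a constructive policy and the JSP
# objective); they stand in for problem-specific code.
import torch

def self_labeling_step(policy, optimizer, instance, n_samples=16):
    solutions, log_probs = [], []
    for _ in range(n_samples):
        sol, logp = policy.sample_solution(instance)   # decisions + their total log-prob
        solutions.append(sol)
        log_probs.append(logp)
    costs = torch.tensor([makespan(instance, s) for s in solutions])
    best = torch.argmin(costs)                 # best sampled solution acts as pseudo-label
    loss = -log_probs[best]                    # maximize its likelihood
    optimizer.zero_grad(); loss.backward(); optimizer.step()
    return costs[best].item()
```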

Spotting Culex pipiens from satellite: modeling habitat suitability in central Italy using Sentinel-2 and deep learning techniques

Authors: Ippoliti, Carla; Bonicelli, Lorenzo; De Ascentis, Matteo; Tora, Susanna; Di Lorenzo, Alessio; Gerardo D’Alessio, Silvio; Porrello, Angelo; Bonanni, Americo; Cioci, Daniela; Goffredo, Maria; Calderara, Simone; Conte, Annamaria

Published in: FRONTIERS IN VETERINARY SCIENCE

Culex pipiens, an important vector of many vector-borne diseases, is a species capable of feeding on a wide variety of hosts and adapting to different environments. To predict the potential distribution of Cx. pipiens in central Italy, this study integrated presence/absence data from a four-year entomological survey (2019-2022) carried out in the Abruzzo and Molise regions with a datacube of spectral bands acquired by Sentinel-2 satellites, extracted as 224 x 224 pixel patches at 20 m spatial resolution around each site for each satellite revisit time. We investigated three scenarios: the baseline model, which considers the environmental conditions at the time of collection; the multitemporal model, focusing on conditions in the two months preceding the collection; and the MultiAdjacency Graph Attention Network (MAGAT) model, which accounts for similarities in temperature and nearby sites using a graph architecture. For the baseline scenario, a deep convolutional neural network (DCNN) analyzed a single multi-band Sentinel-2 image. The DCNN in the multitemporal model extracted temporal patterns from a sequence of 10 multispectral images; the MAGAT model incorporated spatial and climatic relationships among sites through a graph neural network aggregation method. For all models, we also evaluated temporal lags between the acquisition date of the multi-band Earth Observation datacube and the mosquito collection, from 0 to 50 days. The study encompassed a total of 2,555 entomological collections and 108,064 images (patches) at 20 m spatial resolution. The baseline model achieved an F1 score higher than 75.8% for any temporal lag, which increased up to 81.4% with the multitemporal model. The MAGAT model recorded the highest F1 score of 80.9%. The study confirms the widespread presence of Cx. pipiens throughout the majority of the surveyed area. Using only Sentinel-2 spectral bands, the models effectively capture, well in advance, the temporal patterns of the mosquito population, offering valuable insights for directing surveillance activities during the vector season. The methodology developed in this study can be scaled up to the national territory and extended to other vectors, in order to support the Ministry of Health in the surveillance and control strategies for the vectors and the diseases they transmit.

2024 Journal article
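
An illustrative, baseline-style setup for classifying multi-band Sentinel-2 patches into presence/absence: a standard ResNet whose first convolution is widened to accept more than three spectral bands. The band count and backbone are placeholders, not the study's exact architecture.

```python
# Illustrative presence/absence classifier for multi-band Sentinel-2 patches:
# a ResNet-18 with its first convolution widened to the number of bands.
import torch
import torch.nn as nn
from torchvision.models import resnet18

n_bands = 10                                   # placeholder: Sentinel-2 bands at 20 m
model = resnet18(weights=None, num_classes=2)  # presence / absence
model.conv1 = nn.Conv2d(n_bands, 64, kernel_size=7, stride=2, padding=3, bias=False)

patch = torch.randn(4, n_bands, 224, 224)      # batch of 224 x 224 multispectral patches
logits = model(patch)                          # (4, 2) presence/absence scores
```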

Buffer-MIL: Robust Multi-instance Learning with a Buffer-Based Approach

Authors: Bontempo, G.; Lumetti, L.; Porrello, A.; Bolelli, F.; Calderara, S.; Ficarra, E.

Published in: LECTURE NOTES IN COMPUTER SCIENCE

Histopathological image analysis is a critical area of research with the potential to aid pathologists in faster and more accurate diagnoses. However, Whole-Slide Images (WSIs) present challenges for deep learning frameworks due to their large size and lack of pixel-level annotations. Multi-Instance Learning (MIL) is a popular approach that can be employed for handling WSIs, treating each slide as a bag composed of multiple patches or instances. In this work, we propose Buffer-MIL, which aims at tackling the covariate shift and class imbalance characterizing most of the existing histopathological datasets. With this goal, a buffer containing the most representative instances of each disease-positive slide of the training set is incorporated into our model. An attention mechanism is then used to compare all the instances against the buffer, to find the most critical ones in a given slide. We evaluate Buffer-MIL on two publicly available WSI datasets, Camelyon16 and TCGA lung cancer, outperforming current state-of-the-art models by 2.2% in accuracy on Camelyon16.

2023 Conference proceedings paper
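
A hedged sketch of the buffer mechanism: each instance of a slide is compared, via attention, against a buffer of representative positive instances, and the resulting scores weight the instances before bag-level pooling. Here the buffer is a random tensor for illustration; in the paper it holds instances selected from disease-positive training slides.

```python
# Hedged sketch of buffer-guided attention for MIL on whole-slide images.
# The buffer is random for illustration only; dimensions are placeholders.
import torch
import torch.nn as nn

class BufferAttentionMIL(nn.Module):
    def __init__(self, dim=512, buffer_size=64):
        super().__init__()
        self.register_buffer("buf", torch.randn(buffer_size, dim))  # placeholder buffer
        self.scale = dim ** -0.5
        self.classifier = nn.Linear(dim, 1)

    def forward(self, instances):                           # (n_patches, dim) for one slide
        attn = (instances @ self.buf.T) * self.scale        # compare instances to the buffer
        scores = attn.max(dim=1).values.softmax(dim=0)      # criticality of each instance
        bag = (scores.unsqueeze(1) * instances).sum(dim=0)  # weighted bag embedding
        return self.classifier(bag)                         # slide-level logit

slide = torch.randn(1000, 512)                  # placeholder patch embeddings of one WSI
logit = BufferAttentionMIL()(slide)
```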
