CAMNet: Leveraging Cooperative Awareness Messages for Vehicle Trajectory Prediction
Authors: Grasselli, Mattia; Porrello, Angelo; Grazia, Carlo Augusto
Authors: Rinaldi, Filippo; Panariello, Aniello; Salici, Giacomo; Liu, Fengyuan; Ciccone, Marco; Porrello, Angelo; Calderara, Simone
When a new release of a foundation model is published, practitioners typically need to repeat fine-tuning, even if the same task was already tackled in the previous version. A promising alternative is to reuse the parameter changes (i.e., task vectors) that capture how a model adapts to a specific task. However, these vectors often fail to transfer across different pre-trained models because their parameter spaces are misaligned. In this work, we show that successful transfer depends strongly on the gradient-sign structure of the new model. Based on this insight, we propose GradFix, which approximates the ideal sign structure and leverages it to transfer knowledge using only a handful of labeled samples. Notably, this requires no additional fine-tuning: we only compute a few target-model gradients without parameter updates and mask the source task vector accordingly. This yields an update that is locally aligned with the target loss landscape, effectively rebasing the task vector onto the new pre-training. We provide a theoretical guarantee that our method ensures first-order descent. Empirically, we demonstrate significant performance gains on vision and language benchmarks, consistently outperforming naive task vector addition and few-shot fine-tuning. We further show that transporting task vectors improves multi-task and multi-source model merging. Code is available at https://github.com/fillo-rinaldi/GradFix.
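As a rough illustration of the masking idea described in this abstract, the following NumPy sketch keeps only the entries of a source task vector whose sign opposes the target-model gradient. This is one plausible reading of the abstract, not the authors' implementation (see the linked repository for that). The function name `sign_masked_task_vector` and the toy values are hypothetical, and `target_grad` is assumed to be a gradient averaged over a handful of labeled samples.

```python
import numpy as np

def sign_masked_task_vector(task_vector, target_grad):
    """Zero out task-vector entries that would increase the target loss.

    Assumption (not from the paper's code): an entry is kept when its sign
    is opposite to the sign of the target-model gradient, so that moving
    along it decreases the loss to first order.
    """
    keep = np.sign(task_vector) == -np.sign(target_grad)
    return np.where(keep, task_vector, 0.0)

# Toy values, purely illustrative.
tau = np.array([0.5, -0.2, 0.3, -0.4])   # source task vector
g = np.array([-1.0, -2.0, 0.7, 0.1])     # averaged target-model gradient

masked = sign_masked_task_vector(tau, g)

# First-order change in the target loss when stepping along the masked vector.
first_order_change = float(np.dot(g, masked))
print(masked, first_order_change)
```

Because every surviving entry has a sign opposite to its gradient entry, each term of the inner product is non-positive, which is the intuition behind a first-order descent guarantee in this simplified setting.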
Authors: Porrello, Angelo; Bonicelli, Lorenzo; Buzzega, Pietro; Millunzi, Monica; Calderara, Simone; Cucchiara, Rita
Authors: Panariello, Aniello; Marczak, Daniel; Magistri, Simone; Porrello, Angelo; Twardowski, Bartłomiej; Bagdanov, Andrew D.; Calderara, Simone; Van De Weijer, Joost
Authors: Menabue, Martin; Frascaroli, Emanuele; Boschini, Matteo; Bonicelli, Lorenzo; Porrello, Angelo; Calderara, Simone
Published in: LECTURE NOTES IN COMPUTER SCIENCE
The field of Continual Learning (CL) has inspired numerous researchers over the years, leading to increasingly advanced countermeasures to the issue of catastrophic forgetting. Most studies have focused on the single-class scenario, where each example comes with a single label, and the recent literature has tackled this setting with impressive results. In contrast, we shift our attention to the multi-label scenario, which we consider more representative of real-world open problems. We show that existing state-of-the-art CL methods fail to achieve satisfactory performance in this setting, which calls into question the real advance claimed in recent years. We therefore assess both old-style and novel strategies and propose, on top of them, an approach called Selective Class Attention Distillation (SCAD). It relies on a knowledge transfer technique that aligns the representations of the student network, which trains continuously and is subject to forgetting, with those of the teacher network, which is pretrained and kept frozen. Importantly, our method selectively transfers the relevant information from the teacher to the student, thereby preventing irrelevant information from harming the student’s performance during online training. To demonstrate the merits of our approach, we conduct experiments on two different multi-label datasets, showing that our method outperforms the current state-of-the-art Continual Learning methods. Our findings highlight the importance of addressing the unique challenges posed by multi-label environments in the field of Continual Learning. The code of SCAD is available at https://github.com/aimagelab/SCAD-LOD-2024.
Authors: Corso, Giulia; Miccolis, Francesca; Porrello, Angelo; Bolelli, Federico; Calderara, Simone; Ficarra, Elisa
Whole Slide Images (WSIs) are crucial in histological diagnostics, providing high-resolution insights into cellular structures. In addition to challenges like the gigapixel scale of WSIs and the lack of pixel-level annotations, privacy restrictions further complicate their analysis. For instance, in a hospital network, different facilities need to collaborate on WSI analysis without the possibility of sharing sensitive patient data. A more practical and secure approach involves sharing models capable of continual adaptation to new data. However, without proper measures, catastrophic forgetting can occur. Traditional continual learning techniques rely on storing previous data, which violates privacy restrictions. To address this issue, this paper introduces Context Optimization Multiple Instance Learning (CooMIL), a rehearsal-free continual learning framework explicitly designed for WSI analysis. It employs a WSI-specific prompt learning procedure to adapt classification models across tasks, efficiently preventing catastrophic forgetting. Evaluated on four public WSI datasets from TCGA projects, our model significantly outperforms state-of-the-art methods within the WSI-based continual learning framework. The source code is available at https://github.com/FrancescaMiccolis/CooMIL.
Authors: Cappellino, Chiara; Mancusi, Gianluca; Mosconi, Matteo; Porrello, Angelo; Calderara, Simone; Cucchiara, Rita
Authors: Sommariva, Thomas; Calderara, Simone; Porrello, Angelo
Authors: Mosconi, Matteo; Sorokin, Andriy; Panariello, Aniello; Porrello, Angelo; Bonato, Jacopo; Cotogni, Marco; Sabetta, Luigi; Calderara, Simone; Cucchiara, Rita
Published in: LECTURE NOTES IN COMPUTER SCIENCE
The use of skeletal data allows deep learning models to perform action recognition efficiently and effectively. Herein, we believe that exploring this problem within the context of Continual Learning is crucial. While numerous studies focus on skeleton-based action recognition from a traditional offline perspective, only a handful venture into online approaches. In this respect, we introduce CHARON (Continual Human Action Recognition On skeletoNs), which maintains consistent performance while operating within an efficient framework. Through techniques like uniform sampling, interpolation, and a memory-efficient training stage based on masking, we achieve improved recognition accuracy while minimizing computational overhead. Our experiments on Split NTU-60 and the proposed Split NTU-120 datasets demonstrate that CHARON sets a new benchmark in this domain. The code is available at https://github.com/Sperimental3/CHARON.
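The abstract names uniform sampling and interpolation without giving details, so the sketch below is only a generic illustration of that pairing: a skeleton sequence is reduced to evenly spaced frames and later restored by linear interpolation. How CHARON actually applies these steps is not stated here. The function names, the `(frames, joints, coords)` layout, and the toy sequence are all assumptions.

```python
import numpy as np

def uniform_subsample(seq, num_keep):
    """Keep `num_keep` evenly spaced frames of a (T, J, C) skeleton sequence."""
    idx = np.linspace(0, len(seq) - 1, num_keep).round().astype(int)
    return seq[idx], idx

def interpolate_back(frames, idx, length):
    """Linearly interpolate the kept frames back to `length` frames."""
    t = np.arange(length)
    flat = frames.reshape(len(frames), -1)
    # Interpolate each joint coordinate independently over time.
    cols = [np.interp(t, idx, flat[:, k]) for k in range(flat.shape[1])]
    return np.stack(cols, axis=1).reshape((length,) + frames.shape[1:])

# Toy sequence: 9 frames, 3 joints, 2 coordinates, moving linearly in time.
seq = np.arange(9, dtype=float)[:, None, None] * np.ones((1, 3, 2))

kept, kept_idx = uniform_subsample(seq, num_keep=5)
restored = interpolate_back(kept, kept_idx, length=len(seq))
print(kept.shape, restored.shape)
```

Storing only the kept frames reduces memory roughly in proportion to `num_keep / T`, at the cost of an approximation error that vanishes only for motion that is linear between the kept frames, as in this toy case.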
Authors: Panariello, Aniello; Frascaroli, Emanuele; Buzzega, Pietro; Bonicelli, Lorenzo; Porrello, Angelo; Calderara, Simone