Publications

Explore our research publications: papers, articles, and conference proceedings from AImageLab.

Metodo di localizzazione (Localization method)

Authors: Masserdotti, Alessandro; Cuculo, Vittorio; Ciminieri, Daniele

The present invention relates to the technical field of localization methods and systems. In particular, the present invention relates to a method for localizing a terminal within a predefined area, and to the corresponding system specifically configured to execute the method. In recent decades, the possibility of providing people with information based on their geographical position has encouraged the development of systems for localizing devices and objects, including inside buildings. This technology is used above all in geomarketing applications, which include, for example, searching for and navigating to retail outlets, targeted advertising, and the analysis of customer flows. However, other scenarios have also benefited from this technology, ranging from the optimization of warehouse logistics to the enhancement of the user experience in museums, and from innovative technologies for healthcare and telemedicine to the monitoring of sports performance.

2022 Patent

On the Effectiveness of Lipschitz-Driven Rehearsal in Continual Learning

Authors: Bonicelli, Lorenzo; Boschini, Matteo; Porrello, Angelo; Spampinato, Concetto; Calderara, Simone

Published in: ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS

Rehearsal approaches enjoy immense popularity with Continual Learning (CL) practitioners. These methods collect samples from previously encountered data distributions in a small memory buffer; subsequently, they repeatedly optimize on the latter to prevent catastrophic forgetting. This work draws attention to a hidden pitfall of this widespread practice: repeated optimization on a small pool of data inevitably leads to tight and unstable decision boundaries, which are a major hindrance to generalization. To address this issue, we propose Lipschitz-DrivEn Rehearsal (LiDER), a surrogate objective that induces smoothness in the backbone network by constraining its layer-wise Lipschitz constants w.r.t. replay examples. By means of extensive experiments, we show that applying LiDER delivers a stable performance gain to several state-of-the-art rehearsal CL methods across multiple datasets, both in the presence and absence of pre-training. Through additional ablative experiments, we highlight peculiar aspects of buffer overfitting in CL and better characterize the effect produced by LiDER. Code is available at https://github.com/aimagelab/LiDER
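As a rough illustration of the idea, a linear layer's Lipschitz constant equals the spectral norm of its weight matrix, which can be estimated by power iteration; a penalty on constants exceeding a budget can then be added to the task loss. This is a simplified weight-space sketch under assumed names (LiDER itself constrains layer-wise constants w.r.t. the replay examples, and the `budget` parameter here is illustrative):

```python
import numpy as np

def spectral_norm(W, n_iter=50, seed=0):
    """Estimate the largest singular value of W (the Lipschitz constant
    of the linear map x -> W @ x) by power iteration."""
    rng = np.random.default_rng(seed)
    v = rng.standard_normal(W.shape[1])
    for _ in range(n_iter):
        u = W @ v
        u = u / np.linalg.norm(u)
        v = W.T @ u
        v = v / np.linalg.norm(v)
    return float(u @ W @ v)

def lipschitz_penalty(weights, budget=1.0):
    """Surrogate penalty: how far each layer's estimated Lipschitz
    constant exceeds the budget. Added to the task loss, it pushes the
    backbone toward smoother decision boundaries."""
    return sum(max(0.0, spectral_norm(W) - budget) for W in weights)
```

For `W = diag(3, 1)` the estimate converges to 3, so with a budget of 1 the layer contributes 2 to the penalty.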

2022 Conference paper

One DAG to Rule Them All

Authors: Bolelli, Federico; Allegretti, Stefano; Grana, Costantino

Published in: IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE

In this paper, we present novel strategies for optimizing the performance of many binary image processing algorithms. These strategies are collected in an open-source framework, GRAPHGEN, that automatically generates optimized C++ source code implementing the desired optimizations. Starting simply from a set of rules, GRAPHGEN can generate decision trees with minimum average path length, possibly taking image pattern frequencies into account, and apply state prediction and code compression through the use of Directed Rooted Acyclic Graphs (DRAGs). Moreover, the proposed algorithmic solutions make it possible to combine different optimization techniques and significantly improve performance. Our proposal is showcased on three classical and widely employed algorithms (namely Connected Components Labeling, Thinning, and Contour Tracing). When compared to existing approaches, in both 2D and 3D, implementations using the generated optimal DRAGs perform significantly better than previous state-of-the-art algorithms, both on CPU and GPU.
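The compression step behind DRAGs, merging identical decision subtrees so each is stored (and emitted) only once, can be pictured with a toy hash-consing builder. The two-condition action table and all names below are hypothetical; GRAPHGEN itself generates optimized C++ from far larger rule sets:

```python
def build_dag(table, conds):
    """Build a binary decision diagram from an action table mapping
    condition tuples (c0, c1, ...) to action labels. Identical subtrees
    are hash-consed so equal nodes are shared: the idea behind
    compressing a decision tree into a DRAG."""
    memo = {}

    def rec(depth, prefix):
        actions = {a for bits, a in table.items() if bits[:depth] == prefix}
        if len(actions) == 1:                 # all remaining rows agree
            node = ('leaf', actions.pop())
        else:
            lo = rec(depth + 1, prefix + (0,))
            hi = rec(depth + 1, prefix + (1,))
            node = ('test', conds[depth], lo, hi)
        return memo.setdefault(node, node)    # share equal subtrees

    return rec(0, ()), len(memo)
```

When the action depends only on the second condition, the two branches of the first test collapse onto one shared subtree: 4 unique nodes instead of the 7 a plain tree would hold.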

2022 Journal article

Predicting gene expression levels from DNA sequences and post-transcriptional information with transformers

Authors: Pipoli, Vittorio; Cappelli, Mattia; Palladini, Alessandro; Peluso, Carlo; Lovino, Marta; Ficarra, Elisa

Published in: COMPUTER METHODS AND PROGRAMS IN BIOMEDICINE

Background and objectives: In recent years, the prediction of gene expression levels has become important due to its potential applications in the clinic. In this context, Xpresso and other methods based on Convolutional Neural Networks and Transformers were first proposed for this purpose. However, all these methods embed data with a standard one-hot encoding algorithm, resulting in extremely sparse matrices. In addition, post-transcriptional regulation processes, which are of utmost importance for gene expression, are not considered in the model.

Methods: This paper presents Transformer DeepLncLoc, a novel method to predict mRNA abundance (i.e., gene expression levels) by processing gene promoter sequences, treating the problem as a regression task. The model exploits a transformer-based architecture and adopts the DeepLncLoc method to perform the data embedding. Since DeepLncLoc is based on the word2vec algorithm, it avoids the sparse-matrix problem.

Results: Post-transcriptional information related to mRNA stability and transcription factors is included in the model, leading to significantly improved performance compared to state-of-the-art works. Transformer DeepLncLoc reached an R² of 0.76, compared to 0.74 for Xpresso.

Conclusion: The Multi-Headed Attention mechanism that characterizes the transformer is well suited to modeling the interactions between DNA locations, outperforming recurrent models. Finally, the integration of transcription factor data into the pipeline leads to impressive gains in predictive power.
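A minimal sketch of why a word2vec-style embedding avoids the sparsity of one-hot encoding: the promoter sequence is split into overlapping k-mers, each mapped to a dense vector. The lookup table below stands in for a trained word2vec vocabulary, and `k=3` is an illustrative choice, not the paper's setting:

```python
import numpy as np

def kmers(seq, k=3):
    """Split a promoter sequence into overlapping k-mers, the 'words'
    a word2vec-style model embeds (instead of one-hot bases)."""
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]

def embed(seq, table, k=3):
    """Dense (len(seq)-k+1, d) matrix of k-mer vectors; `table` stands
    in for a trained word2vec vocabulary."""
    return np.stack([table[w] for w in kmers(seq, k)])
```

Every entry of the resulting matrix is informative, whereas one-hot encoding the same sequence would leave all but one value per row at zero.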

2022 Journal article

pyVHR: a Python framework for remote photoplethysmography

Authors: Boccignone, G.; Conte, Donatello; Cuculo, V.; D’Amelio, Alessandro; Grossi, Giuliano; Lanzarotti, R.; Mortara, Edoardo

Published in: PEERJ. COMPUTER SCIENCE.

Remote photoplethysmography (rPPG) aspires to automatically estimate heart rate (HR) variability from videos in realistic environments. A number of effective methods relying on data-driven, model-based, and statistical approaches have emerged in the past two decades, exhibiting an increasing ability to estimate the blood volume pulse (BVP) signal, from which BPM (beats per minute) can be estimated. Learning-based rPPG methods have also been proposed recently. The pyVHR framework presented here is a multi-stage pipeline covering the whole process of extracting and analyzing HR fluctuations. It is designed for both theoretical studies and practical applications in contexts where wearable sensors are inconvenient to use. Specifically, pyVHR supports both the development, assessment, and statistical analysis of novel rPPG methods, whether traditional or learning-based, and the sound comparison of well-established methods on multiple datasets. It is built on accelerated Python libraries for video and signal processing and is equipped with parallel/accelerated ad-hoc procedures, paving the way to online processing on a GPU. The whole accelerated pipeline runs safely in real time on 30 fps HD videos with an average speedup of around 5×. This paper is shaped as a gentle tutorial presentation of the framework.
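The final stage of any rPPG pipeline, recovering BPM from an estimated BVP trace, can be sketched as a spectral-peak search. The 0.65-4 Hz band and the synthetic 72-BPM signal below are illustrative assumptions, not pyVHR's actual API:

```python
import numpy as np

def bpm_from_bvp(bvp, fps):
    """Estimate beats per minute from a BVP trace by locating the
    dominant spectral peak in the 0.65-4 Hz band (39-240 BPM)."""
    freqs = np.fft.rfftfreq(len(bvp), d=1.0 / fps)
    power = np.abs(np.fft.rfft(bvp - np.mean(bvp))) ** 2
    band = (freqs >= 0.65) & (freqs <= 4.0)
    return 60.0 * freqs[band][np.argmax(power[band])]

# synthetic 72-BPM pulse (1.2 Hz) sampled at 30 fps for 10 s
fps = 30
t = np.arange(0, 10, 1.0 / fps)
bvp = np.sin(2 * np.pi * 1.2 * t)
```

On this synthetic trace the peak lands exactly on the 1.2 Hz bin, so the estimator returns 72 BPM.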

2022 Journal article

Quest for Speed: The Epic Saga of Record-Breaking on OpenCV Connected Components Extraction

Authors: Bolelli, Federico; Allegretti, Stefano; Grana, Costantino

Published in: LECTURE NOTES IN COMPUTER SCIENCE

Connected Components Labeling (CCL) represents an essential part of many Image Processing and Computer Vision pipelines. Given its relevance in the field, it has been included in most cutting-edge Computer Vision libraries. In this paper, all the algorithms added to OpenCV over the years are reviewed, from sequential to parallel/GPU-based implementations. Our goal is to provide a better understanding of what has changed and why one algorithm should be preferred over another, in terms of both memory usage and execution speed.
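For context, the textbook baseline that these optimized variants refine is two-pass labeling with union-find: assign provisional labels on a first raster scan, merge equivalences, then resolve them on a second pass. A minimal 4-connectivity sketch (not OpenCV's implementation):

```python
def label(img):
    """Two-pass, 4-connectivity connected components labeling with
    union-find over provisional labels."""
    h, w = len(img), len(img[0])
    parent = list(range(h * w + 1))   # union-find forest; 0 = background

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]   # path halving
            x = parent[x]
        return x

    out = [[0] * w for _ in range(h)]
    nxt = 1
    for y in range(h):                # pass 1: provisional labels + unions
        for x in range(w):
            if not img[y][x]:
                continue
            up = out[y - 1][x] if y > 0 else 0
            left = out[y][x - 1] if x > 0 else 0
            if not up and not left:
                out[y][x] = nxt       # new provisional label
                nxt += 1
            else:
                out[y][x] = up or left
                if up and left:       # record equivalence of the two labels
                    parent[find(up)] = find(left)
    for y in range(h):                # pass 2: resolve equivalences
        for x in range(w):
            out[y][x] = find(out[y][x])
    return out
```

The returned labels are consistent but not consecutive; a final renumbering pass would compact them.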

2022 Conference paper

Retrieval-Augmented Transformer for Image Captioning

Authors: Sarto, Sara; Cornia, Marcella; Baraldi, Lorenzo; Cucchiara, Rita

Image captioning models aim at connecting Vision and Language by providing natural language descriptions of input images. In the past few years, the task has been tackled by learning parametric models and proposing visual feature extraction advancements or by modeling better multi-modal connections. In this paper, we investigate the development of an image captioning approach with a kNN memory, with which knowledge can be retrieved from an external corpus to aid the generation process. Our architecture combines a knowledge retriever based on visual similarities, a differentiable encoder, and a kNN-augmented attention layer to predict tokens based on the past context and on text retrieved from the external memory. Experimental results, conducted on the COCO dataset, demonstrate that employing an explicit external memory can aid the generation process and increase caption quality. Our work opens up new avenues for improving image captioning models at larger scale.
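The retrieval step can be pictured as a cosine-similarity kNN lookup over an external memory of captions. The toy vectors and function names below are illustrative; the paper's retriever operates on learned visual features:

```python
import numpy as np

def retrieve_captions(query_feat, corpus_feats, captions, k=2):
    """Return the k captions whose stored image features are most
    cosine-similar to the query feature: the external-memory lookup
    that feeds a kNN-augmented attention layer."""
    q = query_feat / np.linalg.norm(query_feat)
    c = corpus_feats / np.linalg.norm(corpus_feats, axis=1, keepdims=True)
    sims = c @ q                     # cosine similarity to each memory entry
    top = np.argsort(-sims)[:k]      # indices of the k nearest neighbours
    return [captions[i] for i in top]
```

The retrieved text is then attended to alongside the past context when predicting each token.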

2022 Conference paper

SARS-CoV-2 variants classification and characterization

Authors: Borgato, S.; Bottino, M.; Lovino, M.; Ficarra, E.

Published in: EPIC SERIES IN COMPUTING

As of late 2019, the SARS-CoV-2 virus has spread globally, giving rise to several variants over time. These variants, unfortunately, differ from the original sequence identified in Wuhan, thus risking compromising the efficacy of the vaccines developed. Some software has been released to recognize currently known and newly spread variants. However, some of these tools are not entirely automatic, while others do not return a detailed characterization of all the mutations in the samples. Such a characterization can help biologists understand the variability between samples. This paper presents a Machine Learning (ML) approach to identifying existing and new variants in a completely automatic way. In addition, a detailed table showing all the alterations and mutations found in the samples is returned to the user. SARS-CoV-2 sequences are obtained from the GISAID database, and a list of features is custom-designed (e.g., the number of mutations in each gene of the virus) to train the algorithm. The recognition of existing variants is performed with a Random Forest classifier, while newly spread variants are identified with the DBSCAN algorithm. Both techniques demonstrated high precision on a new variant that arose during the drafting of this paper (used only in the testing phase of the algorithm). Researchers will therefore benefit significantly from the proposed algorithm and from the detailed output listing the main alterations of the samples. Data availability: the tool is freely available at https://github.com/sofiaborgato/-SARS-CoV-2-variants-classification-and-characterization.
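One of the custom features mentioned, the number of mutations in each gene, can be sketched as a simple per-gene counter over GISAID-style `gene:change` mutation strings. The gene list below is an illustrative assumption, not the paper's exact feature set:

```python
def gene_mutation_counts(mutations, genes=('S', 'N', 'ORF1a', 'ORF1b')):
    """Turn a sample's mutation list (GISAID-style 'gene:change' strings,
    e.g. 'S:D614G') into a fixed-length vector of per-gene mutation
    counts, usable as one row of a training matrix."""
    counts = dict.fromkeys(genes, 0)
    for m in mutations:
        gene = m.split(':', 1)[0]    # gene name precedes the colon
        if gene in counts:
            counts[gene] += 1
    return [counts[g] for g in genes]
```

Rows built this way can feed a Random Forest for known variants or a density-based clusterer such as DBSCAN for spotting new ones.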

2022 Conference paper

SeeFar: Vehicle Speed Estimation and Flow Analysis from a Moving UAV

Authors: Ning, M.; Ma, X.; Lu, Y.; Calderara, S.; Cucchiara, R.

Published in: LECTURE NOTES IN COMPUTER SCIENCE

Visual perception from drones has recently been widely investigated for Intelligent Traffic Monitoring Systems (ITMS). In this paper, we introduce SeeFar to achieve vehicle speed estimation and traffic flow analysis from a moving drone, based on YOLOv5 and DeepSORT. SeeFar differs from previous works in three key ways: the speed estimation and flow analysis components are integrated into a unified framework; our method of predicting car speed imposes the fewest constraints while maintaining high accuracy; and our flow analyser is direction-aware and outlier-aware. Specifically, we design the speed estimator using only the camera imaging geometry, where the transformation between world space and image space is completed by the variable Ground Sampling Distance. Moreover, since previous papers did not evaluate their speed estimators at scale due to the difficulty of obtaining ground truth, we propose a simple yet efficient approach to estimate the true speeds of vehicles from the known size of road signs. We evaluate SeeFar on ten of our videos containing 929 vehicle samples. Experiments on these sequences demonstrate the effectiveness of SeeFar, which achieves 98.0% accuracy in speed estimation and 99.1% accuracy in traffic volume prediction.
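The Ground Sampling Distance reasoning can be sketched for the simple static, nadir-looking case: GSD converts pixel displacement into metres, and dividing by the frame interval gives speed. SeeFar's GSD is variable (the drone moves) and is calibrated from road-sign sizes; the formula and parameter names below are the standard textbook ones, not the paper's code:

```python
def ground_sampling_distance(altitude_m, focal_mm, sensor_w_mm, image_w_px):
    """Metres of ground covered by one pixel for a nadir-looking camera."""
    return (altitude_m * sensor_w_mm) / (focal_mm * image_w_px)

def speed_kmh(px_displacement, gsd_m, dt_s):
    """Vehicle speed from its pixel displacement between two frames,
    converted from m/s to km/h."""
    return px_displacement * gsd_m / dt_s * 3.6
```

For example, at 100 m altitude with a 10 mm lens, 10 mm sensor, and 1000 px image width, each pixel spans 0.1 m, so a 50 px displacement over one second corresponds to 18 km/h.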

2022 Conference paper

Self-configuring BLE deep sleep network for fault tolerant WSN

Authors: Rosati, C. A.; Cervo, A.; Bertoli, A.; Santacaterina, M.; Battilani, N.; Fantuzzi, C.

Published in: IFAC-PAPERSONLINE

This paper focuses on a Wireless Sensor Network (WSN) that leverages Bluetooth Low Energy (BLE) connectivity for low-energy applications and is tolerant to communication path failures. The topic is important for creating a robust sensorized environment, applicable in industrial contexts or smart infrastructures, that enables scheduled monitoring with low power consumption. Currently, BLE applications are mainly designed for smart home solutions, health care, and positioning systems, where the BLE nodes are continuously fed by external power supplies. Our goal is to design a self-configuring network with a synchronized deep-sleep behavior that optimizes energy consumption, with an overall active-time constraint tuned by a data-driven method. The aim is to find a trade-off between the on-time and the ability to collect data from all the nodes, while pursuing low power consumption. Our research is based on BLE protocols, on the interaction between edge systems for data collection and a cloud system for data analysis, and on a software-agent optimization system. The paper analyses different configurations and describes possible optimization algorithms for the software-agent design, in order to reach a fine-tuned control that improves the fault tolerance and fault diagnosis of the system. Finally, experimental results are compared with the estimates obtained via a software simulation tool implemented for this architectural pattern.
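The on-time vs. consumption trade-off can be sketched with the usual duty-cycle energy model: the longer a node stays awake each period, the likelier its data is collected, but the higher its mean current draw. The current and capacity values below are illustrative, not measured:

```python
def avg_current_ma(i_active_ma, i_sleep_ma, t_active_s, period_s):
    """Mean current draw of a node that wakes for t_active_s out of every
    period_s and deep-sleeps the rest: the quantity that tuning the
    active-time interval trades against reliable data collection."""
    duty = t_active_s / period_s
    return i_active_ma * duty + i_sleep_ma * (1.0 - duty)

def battery_life_days(capacity_mah, avg_ma):
    """Idealized lifetime of a battery at that mean current."""
    return capacity_mah / avg_ma / 24.0
```

With 10 mA active, 10 µA in deep sleep, and a 1 s wake window every 100 s, the mean draw is about 0.11 mA, stretching a 220 mAh coin cell to roughly 80 days.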

Page 29 of 106 • Total publications: 1054