Publications

Explore our research publications: papers, articles, and conference proceedings from AImageLab.

Unveiling the Truth: Exploring Human Gaze Patterns in Fake Images

Authors: Cartella, Giuseppe; Cuculo, Vittorio; Cornia, Marcella; Cucchiara, Rita

Published in: IEEE SIGNAL PROCESSING LETTERS

Creating high-quality and realistic images is now possible thanks to the impressive advancements in image generation. A description in natural language of your desired output is all you need to obtain breathtaking results. However, as the use of generative models grows, so do concerns about the propagation of malicious content and misinformation. Consequently, the research community is actively working on the development of novel fake detection techniques, primarily focusing on low-level features and possible fingerprints left by generative models during the image generation process. In a different vein, our work investigates whether human semantic knowledge can be incorporated into fake image detection frameworks. To achieve this, we collect a novel dataset of partially manipulated images using diffusion models and conduct an eye-tracking experiment to record the eye movements of different observers while viewing real and fake stimuli. A preliminary statistical analysis is conducted to explore the distinctive patterns in how humans perceive genuine and altered images. Statistical findings reveal that, when perceiving counterfeit samples, humans tend to focus on more confined regions of the image, in contrast to the more dispersed pattern observed when viewing genuine images. Our dataset is publicly available at: https://github.com/aimagelab/unveiling-the-truth.

2024 Journal article

V-MAD: Video-based Morphing Attack Detection in Operational Scenarios

Authors: Borghi, G.; Franco, A.; Di Domenico, N.; Ferrara, M.; Maltoni, D.

In response to the rising threat of the face morphing attack, this paper introduces and explores the potential of Video-based Morphing Attack Detection (V-MAD) systems in real-world operational scenarios. While current morphing attack detection methods primarily focus on a single image or a pair of images, V-MAD is based on video sequences, exploiting the video streams acquired by face verification tools available, for instance, at airport gates. We show for the first time the advantages that the availability of multiple probe frames brings to the morphing attack detection task, especially in scenarios where the quality of probe images varies. Experimental results on a real operational database demonstrate that video sequences represent valuable information for increasing the performance of morphing attack detection systems.

2024 Conference proceedings paper

Video Surveillance and Privacy: A Solvable Paradox?

Authors: Cucchiara, Rita; Baraldi, Lorenzo; Cornia, Marcella; Sarto, Sara

Published in: COMPUTER

Video Surveillance started decades ago to remotely monitor specific areas and allow control from human inspectors. Later, Computer Vision gradually replaced human monitoring, first through motion alerts and now with Deep Learning techniques. From the beginning of this journey, people have worried about the risk of privacy violations. This article surveys the main steps of Computer Vision in Video Surveillance, from early approaches for people detection and tracking to action analysis and language description, outlining the most relevant directions on the topic to deal with privacy concerns. We show how the relationship between Video Surveillance and privacy is a biased paradox, since surveillance provides increased safety but does not necessarily require identifying people. Through experiments on action recognition and natural language description, we showcase that the paradox of surveillance and privacy can be solved by Artificial Intelligence and that respect for human rights is not an impossible chimera.

2024 Journal article

What’s Outside the Intersection? Fine-grained Error Analysis for Semantic Segmentation Beyond IoU

Authors: Bernhard, Maximilian; Amoroso, Roberto; Kindermann, Yannic; Baraldi, Lorenzo; Cucchiara, Rita; Tresp, Volker; Schubert, Matthias

2024 Conference proceedings paper

Wiki-LLaVA: Hierarchical Retrieval-Augmented Generation for Multimodal LLMs

Authors: Caffagni, Davide; Cocchi, Federico; Moratelli, Nicholas; Sarto, Sara; Cornia, Marcella; Baraldi, Lorenzo; Cucchiara, Rita

Published in: IEEE COMPUTER SOCIETY CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION WORKSHOPS

Multimodal LLMs are the natural evolution of LLMs and enlarge their capabilities so as to work beyond the pure textual modality. As research is being carried out to design novel architectures and vision-and-language adapters, in this paper we concentrate on endowing such models with the capability of answering questions that require external knowledge. Our approach, termed Wiki-LLaVA, aims at integrating an external knowledge source of multimodal documents, which is accessed through a hierarchical retrieval pipeline. With this approach, relevant passages are retrieved from the external knowledge source and employed as additional context for the LLM, augmenting the effectiveness and precision of generated dialogues. We conduct extensive experiments on datasets tailored for visual question answering with external data and demonstrate the appropriateness of our approach.

2024 Conference proceedings paper

A Framework to Improve the Comparability and Reproducibility of Morphing Attack Detectors

Authors: Di Domenico, Nicolò; Borghi, Guido; Franco, Annalisa; Ferrara, Matteo; Maltoni, Davide

2023 Conference proceedings paper

Annotating the Inferior Alveolar Canal: the Ultimate Tool

Authors: Lumetti, Luca; Pipoli, Vittorio; Bolelli, Federico; Grana, Costantino

Published in: LECTURE NOTES IN COMPUTER SCIENCE

The Inferior Alveolar Nerve (IAN) is of main interest in the maxillofacial field, as an accurate localization of this nerve reduces the risks of injury during surgical procedures. Although recent literature has focused on developing novel deep learning techniques to produce accurate segmentation masks of the canal containing the IAN, there are still strong limitations due to the scarce amount of publicly available 3D maxillofacial datasets. In this paper, we present an improved version of a previously released tool, IACAT (Inferior Alveolar Canal Annotation Tool), today used by medical experts to produce 3D ground truth annotation. In addition, we release a new dataset, ToothFairy, which is part of the homonymous MICCAI2023 challenge hosted by the Grand-Challenge platform, as an extension of the previously released Maxillo dataset, which was the only one publicly available. With ToothFairy, the number of annotations has been increased as well as the quality of existing data.

2023 Conference proceedings paper

Artificial intelligence evaluation of confocal microscope prostate images: our preliminary experience

Authors: Bianchi, G.; Puliatti, S.; Rodriguez, N.; Micali, S.; Bertoni, L.; Reggiani Bonetti, L.; Caramaschi, S.; Bolelli, F.; Pinamonti, M.; Rozze, D.; Grana, C.

Published in: MINERVA UROLOGY AND NEPHROLOGY

2023 Journal article

Avoiding the Pitfalls on Stock Market: Challenges and Solutions in Developing Quantitative Strategies

Authors: Bergianti, M.; Cioffo, N.; Del Buono, F.; Paganelli, M.; Porrello, A.

Published in: CEUR WORKSHOP PROCEEDINGS

Quantitative stock trading based on Machine Learning (ML) and Deep Learning (DL) has gained great attention in recent years thanks to the ever-increasing availability of financial data and the ability of this technology to analyze the complex dynamics of the stock market. Despite the plethora of approaches present in literature, a large gap exists between the solutions produced by the scientific community and the practices adopted in real-world systems. Most of these works in fact lack a practical vision of the problem and ignore the main issues afflicting fintech practitioners. To fill such a gap, we provide a systematic review of the main dangers affecting the development of an ML/DL pipeline in the financial domain. They include managing the stochastic and non-stationary characteristics of stock data, various types of bias, overfitting of models and devising impartial valuation methods. Finally, we present possible solutions to these critical issues.

2023 Conference proceedings paper

BERT Classifies SARS-CoV-2 Variants

Authors: Ghione, G.; Lovino, M.; Ficarra, E.; Cirrincione, G.

Published in: SMART INNOVATION, SYSTEMS AND TECHNOLOGIES

Medical diagnostics faced numerous difficulties during the COVID-19 pandemic. One of these has been the need for ongoing monitoring of SARS-CoV-2 mutations. Genomics is the technique most frequently used for precisely identifying variants. The ongoing global gathering of RNA samples of the virus has made such an approach possible. Nevertheless, variant identification techniques are frequently resource-intensive. As a result, the diagnostic capability of small medical laboratories might not be sufficient. In this work, an effective deep learning strategy for identifying SARS-CoV-2 variants is presented. This work makes two contributions: (1) a fine-tuning architecture of Bidirectional Encoder Representations from Transformers (BERT) to identify SARS-CoV-2 variants; (2) providing biological insights by exploiting BERT self-attention. Such an approach enables the analysis of the S gene of the virus to quickly recognize its variant. The selected model, BERT, is a transformer-based neural network first developed for natural language processing. Nonetheless, it has been effectively used in numerous applications, such as genomic sequence analysis. Thus, the fine-tuning of BERT was performed to adapt it to the RNA sequence domain, achieving a 98.59% F1-score on test data: it was successful in identifying variants circulating to date. The interpretability of the model was examined, since BERT utilizes the self-attention mechanism. In fact, it was discovered that by attending to particular areas of the S gene, BERT extracts pertinent biological information on variants. Thus, the presented approach allows obtaining insights into the particular characteristics of SARS-CoV-2 RNA samples.

2023 Book chapter/Essay
