Publications by Fabio Quattrini

Explore our research publications: papers, articles, and conference proceedings from AImageLab.

Tip: type @ to pick an author and # to pick a keyword.

Active filters (Clear): Author: Fabio Quattrini

Alfie: Democratising RGBA image generation with no $$$

Authors: Quattrini, Fabio; Pippi, Vittorio; Cascianelli, Silvia; Cucchiara, Rita

Published in: LECTURE NOTES IN COMPUTER SCIENCE

2025 Relazione in Atti di Convegno

Merging and Splitting Diffusion Paths for Semantically Coherent Panoramas

Authors: Quattrini, F.; Pippi, V.; Cascianelli, S.; Cucchiara, R.

Published in: LECTURE NOTES IN COMPUTER SCIENCE

Diffusion models have become the State-of-the-Art for text-to-image generation, and increasing research effort has been dedicated to adapting the inference … (Read full abstract)

Diffusion models have become the State-of-the-Art for text-to-image generation, and increasing research effort has been dedicated to adapting the inference process of pretrained diffusion models to achieve zero-shot capabilities. An example is the generation of panorama images, which has been tackled in recent works by combining independent diffusion paths over overlapping latent features, which is referred to as joint diffusion, obtaining perceptually aligned panoramas. However, these methods often yield semantically incoherent outputs and trade-off diversity for uniformity. To overcome this limitation, we propose the Merge-Attend-Diffuse operator, which can be plugged into different types of pretrained diffusion models used in a joint diffusion setting to improve the perceptual and semantical coherence of the generated panorama images. Specifically, we merge the diffusion paths, reprogramming self- and cross-attention to operate on the aggregated latent space. Extensive quantitative and qualitative experimental analysis, together with a user study, demonstrate that our method maintains compatibility with the input prompt and visual quality of the generated images while increasing their semantic coherence. We release the code at https://github.com/aimagelab/MAD.

2025 Relazione in Atti di Convegno

Zero-Shot Styled Text Image Generation, but Make It Autoregressive

Authors: Pippi, Vittorio; Quattrini, Fabio; Cascianelli, Silvia; Tonioni, Alessio; Cucchiara, Rita

2025 Relazione in Atti di Convegno

Binarizing Documents by Leveraging both Space and Frequency

Authors: Quattrini, F.; Pippi, V.; Cascianelli, S.; Cucchiara, R.

Published in: LECTURE NOTES IN COMPUTER SCIENCE

Document Image Binarization is a well-known problem in Document Analysis and Computer Vision, although it is far from being solved. … (Read full abstract)

Document Image Binarization is a well-known problem in Document Analysis and Computer Vision, although it is far from being solved. One of the main challenges of this task is that documents generally exhibit degradations and acquisition artifacts that can greatly vary throughout the page. Nonetheless, even when dealing with a local patch of the document, taking into account the overall appearance of a wide portion of the page can ease the prediction by enriching it with semantic information on the ink and background conditions. In this respect, approaches able to model both local and global information have been proven suitable for this task. In particular, recent applications of Vision Transformer (ViT)-based models, able to model short and long-range dependencies via the attention mechanism, have demonstrated their superiority over standard Convolution-based models, which instead struggle to model global dependencies. In this work, we propose an alternative solution based on the recently introduced Fast Fourier Convolutions, which overcomes the limitation of standard convolutions in modeling global information while requiring fewer parameters than ViTs. We validate the effectiveness of our approach via extensive experimental analysis considering different types of degradations.

2024 Relazione in Atti di Convegno

HWD: A Novel Evaluation Score for Styled Handwritten Text Generation

Authors: Pippi, V.; Quattrini, F.; Cascianelli, S.; Cucchiara, R.

2023 Relazione in Atti di Convegno

Volumetric Fast Fourier Convolution for Detecting Ink on the Carbonized Herculaneum Papyri

Authors: Quattrini, F.; Pippi, V.; Cascianelli, S.; Cucchiara, R.

Published in: ... IEEE INTERNATIONAL CONFERENCE ON COMPUTER VISION WORKSHOPS

Recent advancements in Digital Document Restoration (DDR) have led to significant breakthroughs in analyzing highly damaged written artifacts. Among those, … (Read full abstract)

Recent advancements in Digital Document Restoration (DDR) have led to significant breakthroughs in analyzing highly damaged written artifacts. Among those, there has been an increasing interest in applying Artificial Intelligence techniques for virtually unwrapping and automatically detecting ink on the Herculaneum papyri collection. This collection consists of carbonized scrolls and fragments of documents, which have been digitized via X-ray tomography to allow the development of ad-hoc deep learning-based DDR solutions. In this work, we propose a modification of the Fast Fourier Convolution operator for volumetric data and apply it in a segmentation architecture for ink detection on the challenging Herculaneum papyri, demonstrating its suitability via deep experimental analysis. To encourage the research on this task and the application of the proposed operator to other tasks involving volumetric data, we will release our implementation (https://github.com/aimagelab/vffc).

2023 Relazione in Atti di Convegno