Publications

Explore our research publications: papers, articles, and conference proceedings from AImageLab.

Inferior Alveolar Canal Automatic Detection with Deep Learning CNNs on CBCTs: Development of a Novel Model and Release of Open-Source Dataset and Algorithm

Authors: Di Bartolomeo, Mattia; Pellacani, Arrigo; Bolelli, Federico; Cipriano, Marco; Lumetti, Luca; Negrello, Sara; Allegretti, Stefano; Minafra, Paolo; Pollastri, Federico; Nocini, Riccardo; Colletti, Giacomo; Chiarini, Luigi; Grana, Costantino; Anesi, Alexandre

Published in: APPLIED SCIENCES

Introduction: The need for accurate three-dimensional data of anatomical structures is increasing in the surgical field. The development of convolutional neural networks (CNNs) has been helping to fill this gap by providing efficient tools to clinicians. Nonetheless, the lack of fully accessible datasets and open-source algorithms is slowing progress in this field. In this paper, we focus on the fully automatic segmentation of the Inferior Alveolar Canal (IAC), which is of immense interest in dental and maxillo-facial surgery. Conventionally, only a bidimensional annotation of the IAC is used in common clinical practice. A reliable CNN might save time in daily practice and improve the quality of assistance.

Materials and methods: Cone Beam Computed Tomography (CBCT) volumes obtained from a single radiological center using the same machine were gathered and annotated. The course of the IAC was annotated on the CBCT volumes. A secondary dataset with sparse annotations and a primary dataset with both dense and sparse annotations were generated. Three separate experiments were conducted to evaluate the CNN. The IoU and Dice scores of every experiment were recorded as the primary endpoint, while the time needed to produce the annotation was assessed as the secondary endpoint.

Results: A total of 347 CBCT volumes were collected, then divided into primary and secondary datasets. Among the three experiments, an IoU score of 0.64 and a Dice score of 0.79 were obtained by pre-training the CNN on the secondary dataset and creating a novel deep label propagation model, followed by proper training on the primary dataset. To the best of our knowledge, these results are the best ever published for the segmentation of the IAC. The dataset is publicly available and the algorithm is published as open-source software. On average, the CNN could produce a 3D annotation of the IAC in 6.33 s, compared to the 87.3 s needed by the radiology technician to produce a bidimensional annotation.

Conclusions: To summarize, the following achievements have been reached. A new state of the art in terms of Dice score was achieved, exceeding the threshold of 0.75 commonly considered necessary for use in clinical practice. The CNN can fully automatically produce an accurate three-dimensional segmentation of the IAC in a fraction of the time required by the bidimensional annotations commonly used in clinical practice, which are generated in a time-consuming manner. We introduced an innovative deep label propagation method to optimize the performance of the CNN in the segmentation of the IAC. For the first time in this field, the datasets and source code were publicly released, granting reproducibility of the experiments and helping to improve IAC segmentation.
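
For readers unfamiliar with the two metrics reported above, the sketch below shows how IoU and Dice are conventionally computed for binary 3D segmentation masks; this is a generic illustration with hypothetical array names, not code from the released repository.

```python
import numpy as np

def iou_and_dice(pred: np.ndarray, gt: np.ndarray) -> tuple[float, float]:
    """Compute IoU and Dice between two binary 3D segmentation masks."""
    pred = pred.astype(bool)
    gt = gt.astype(bool)
    intersection = np.logical_and(pred, gt).sum()
    union = np.logical_or(pred, gt).sum()
    iou = intersection / union if union > 0 else 1.0
    # Dice = 2|A ∩ B| / (|A| + |B|); it relates to IoU as Dice = 2*IoU/(1+IoU)
    total = pred.sum() + gt.sum()
    dice = 2 * intersection / total if total > 0 else 1.0
    return float(iou), float(dice)

# Sanity check: an IoU of 0.64 corresponds to a Dice of 2*0.64/1.64 ≈ 0.78,
# consistent with the 0.64/0.79 pair reported in the abstract.
```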

2023 Journal article

Inferring Causal Factors of Core Affect Dynamics on Social Participation through the Lens of the Observer

Authors: D'Amelio, Alessandro; Patania, Sabrina; Buršić, Sathya; Cuculo, Vittorio; Boccignone, Giuseppe

Published in: SENSORS

A core endeavour in current affective computing and social signal processing research is the construction of datasets embedding suitable ground truths to foster machine learning methods. This practice brings up hitherto overlooked intricacies. In this paper, we consider causal factors potentially arising when human raters evaluate the affect fluctuations of subjects involved in dyadic interactions and subsequently categorise them in terms of social participation traits. To gauge such factors, we propose an emulator as a statistical approximation of the human rater, and we first discuss the motivations and the rationale behind the approach. The emulator is laid down in the next section as a phenomenological model in which the core affect stochastic dynamics, as perceived by the rater, are captured through an Ornstein-Uhlenbeck process; its parameters are then exploited to infer potential causal effects in the attribution of social traits. Following that, by resorting to a publicly available dataset, the adequacy of the model is evaluated in terms of both human raters' emulation and machine learning predictive capabilities. We then present the results, which are followed by a general discussion concerning findings and their implications, together with advantages and potential applications of the approach.
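
For illustration, the Ornstein-Uhlenbeck process underlying the emulator is a mean-reverting stochastic process that can be simulated with a simple Euler-Maruyama scheme. The sketch below is generic; all parameter values and variable names are assumptions, not the paper's settings.

```python
import numpy as np

def simulate_ou(theta: float, mu: float, sigma: float,
                x0: float, dt: float, n_steps: int,
                rng: np.random.Generator) -> np.ndarray:
    """Euler-Maruyama simulation of dX = theta*(mu - X)*dt + sigma*dW."""
    x = np.empty(n_steps + 1)
    x[0] = x0
    for t in range(n_steps):
        dw = rng.normal(0.0, np.sqrt(dt))  # Wiener increment
        x[t + 1] = x[t] + theta * (mu - x[t]) * dt + sigma * dw
    return x

# Illustrative run: a valence-like trajectory reverting to a neutral baseline.
rng = np.random.default_rng(0)
trajectory = simulate_ou(theta=1.5, mu=0.0, sigma=0.3,
                         x0=0.8, dt=0.01, n_steps=500, rng=rng)
```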

2023 Journal article

Input Perturbation Reduces Exposure Bias in Diffusion Models

Authors: Ning, M.; Sangineto, E.; Porrello, A.; Calderara, S.; Cucchiara, R.

Published in: PROCEEDINGS OF MACHINE LEARNING RESEARCH

Denoising Diffusion Probabilistic Models have shown impressive generation quality, although their long sampling chain leads to high computational costs. In this paper, we observe that a long sampling chain also leads to an error accumulation phenomenon, which is similar to the exposure bias problem in autoregressive text generation. Specifically, we note that there is a discrepancy between training and testing, since the former is conditioned on the ground-truth samples, while the latter is conditioned on the previously generated results. To alleviate this problem, we propose a very simple but effective training regularization, which consists of perturbing the ground-truth samples to simulate inference-time prediction errors. We empirically show that, without affecting recall and precision, the proposed input perturbation leads to a significant improvement in sample quality while reducing both the training and the inference times. For instance, on CelebA 64×64, we achieve a new state-of-the-art FID score of 1.27, while saving 37.5% of the training time. The code is available at https://github.com/forever208/DDPM-IP.
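
Below is a minimal sketch of the training-time input perturbation idea described in the abstract: extra noise is injected into the forward process while the loss target stays unchanged. The tensor names and the perturbation strength `gamma` are illustrative assumptions; the authors' actual implementation is in the linked repository.

```python
import torch

def ddpm_ip_training_inputs(x0: torch.Tensor, alpha_bar_t: torch.Tensor,
                            gamma: float = 0.1):
    """Build a perturbed noisy input x_t while keeping the clean noise
    as the regression target, so training mimics inference-time errors."""
    eps = torch.randn_like(x0)        # loss target (unchanged)
    xi = torch.randn_like(x0)         # extra perturbation noise
    eps_perturbed = eps + gamma * xi  # simulate a prediction error
    # Standard DDPM forward process, with the perturbed noise injected
    x_t = alpha_bar_t.sqrt() * x0 + (1 - alpha_bar_t).sqrt() * eps_perturbed
    return x_t, eps                   # the network predicts eps from x_t
```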

2023 Conference paper

LaDI-VTON: Latent Diffusion Textual-Inversion Enhanced Virtual Try-On

Authors: Morelli, Davide; Baldrati, Alberto; Cartella, Giuseppe; Cornia, Marcella; Bertini, Marco; Cucchiara, Rita

The rapidly evolving fields of e-commerce and metaverse continue to seek innovative approaches to enhance the consumer experience. At the same time, recent advancements in the development of diffusion models have enabled generative networks to create remarkably realistic images. In this context, image-based virtual try-on, which consists of generating a novel image of a target model wearing a given in-shop garment, has yet to capitalize on the potential of these powerful generative solutions. This work introduces LaDI-VTON, the first Latent Diffusion textual Inversion-enhanced model for the Virtual Try-ON task. The proposed architecture relies on a latent diffusion model extended with a novel additional autoencoder module that exploits learnable skip connections to enhance the generation process while preserving the model's characteristics. To effectively maintain the texture and details of the in-shop garment, we propose a textual inversion component that maps the visual features of the garment to the CLIP token embedding space and thus generates a set of pseudo-word token embeddings capable of conditioning the generation process. Experimental results on the Dress Code and VITON-HD datasets demonstrate that our approach outperforms the competitors by a consistent margin, achieving a significant milestone for the task.
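
As a rough, purely illustrative sketch of the textual-inversion component described above, the module below maps CLIP visual features of a garment to a handful of pseudo-word token embeddings; the layer sizes, names, and number of pseudo-tokens are assumptions, not the authors' implementation.

```python
import torch
import torch.nn as nn

class GarmentToPseudoTokens(nn.Module):
    """Map CLIP image features to N pseudo-word embeddings living in the
    CLIP token-embedding space, so they can condition a diffusion model."""
    def __init__(self, clip_feat_dim: int = 768, token_dim: int = 768,
                 num_pseudo_tokens: int = 4):
        super().__init__()
        self.num_pseudo_tokens = num_pseudo_tokens
        self.token_dim = token_dim
        self.mapper = nn.Sequential(
            nn.Linear(clip_feat_dim, 1024),
            nn.GELU(),
            nn.Linear(1024, num_pseudo_tokens * token_dim),
        )

    def forward(self, clip_image_features: torch.Tensor) -> torch.Tensor:
        # (batch, clip_feat_dim) -> (batch, num_pseudo_tokens, token_dim)
        out = self.mapper(clip_image_features)
        return out.view(-1, self.num_pseudo_tokens, self.token_dim)

# The pseudo-tokens would then be inserted among the text token embeddings
# before running the text encoder that conditions the diffusion model.
```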

2023 Conference paper

Let's stay close: An examination of the effects of imagined contact on behavior toward children with disability

Authors: Cocco, V. M.; Bisagno, E.; Bernardo, G. A. D.; Bicocchi, N.; Calderara, S.; Palazzi, A.; Cucchiara, R.; Zambonelli, F.; Cadamuro, A.; Stathi, S.; Crisp, R.; Vezzali, L.

Published in: SOCIAL DEVELOPMENT

In line with current developments in the indirect intergroup contact literature, we conducted a field study using the imagined contact paradigm among high-status (Italian children) and low-status (children with foreign origins) group members (N = 122; 53 females, mean age = 7.52 years). The experiment aimed to improve attitudes and behavior toward a different low-status group, children with disability. To assess behavior, we focused on an objective measure that captures the physical distance between participants and a child with disability over the course of a five-minute interaction (i.e., while playing together). Results from a 3-week intervention revealed that, among high-status children, imagined contact, relative to a no-intervention control condition, improved outgroup attitudes and behavior and strengthened helping and contact intentions. These effects, however, did not emerge among low-status children. The results are discussed in the context of the intergroup contact literature, with emphasis on the implications of imagined contact for educational settings.

2023 Journal article

Let's ViCE! Mimicking Human Cognitive Behavior in Image Generation Evaluation

Authors: Betti, Federico; Staiano, Jacopo; Baraldi, Lorenzo; Baraldi, Lorenzo; Cucchiara, Rita; Sebe, Nicu

2023 Conference paper

Method for generating probabilistic representations and deep neural network

Authors: Garattoni, Lorenzo; Francesca, Gianpiero; Pini, Stefano; Simoni, Alessandro; Vezzani, Roberto; Borghi, Guido

2023 Patent

Method for estimating a compliant eye position, ophthalmic examination device implementing such a method, and related electronic kit for upgrading an ophthalmic device

Authors: Gibertoni, Giovanni; Rovati, Luigi; Borghi, Guido

The present invention relates to a method for automatically estimating a compliant position of a patient's pupil during an ophthalmic examination. The method is based on the acquisition of images representative of the pupil and on their processing by means of classification algorithms, including machine learning techniques, in order to determine the position of the pupil with respect to the optical axis of an ophthalmic device or to evaluate a state parameter of the pupil. The invention also relates to an ophthalmic examination device implementing this method, comprising an optical module that includes a dichroic mirror configured to deflect a light signal representative of the pupil towards an optical image-acquisition sensor, while allowing a further light signal representative of the pupil to propagate without significant interference towards the internal optical components of the ophthalmic device for the execution of the examination of interest. The invention further comprises an electronic kit connectable to an existing ophthalmic device, enabling a functional upgrade for estimating the pupil position without altering the device's original diagnostic functionality. The proposed solution improves the reliability, repeatability, and usability of ophthalmic examinations performed by specialized personnel, while maintaining compatibility with existing ophthalmic instrumentation.
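
As a loose illustration of the classification step mentioned in the abstract, the toy model below labels an eye image as compliant or non-compliant with respect to the device's optical axis; the architecture, names, and input format are assumptions and do not come from the patent.

```python
import torch
import torch.nn as nn

class PupilPositionClassifier(nn.Module):
    """Toy classifier: labels an eye image as 'compliant' (pupil aligned
    with the device's optical axis) or 'non-compliant'."""
    def __init__(self, num_classes: int = 2):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 16, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(16, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),
        )
        self.classifier = nn.Linear(32, num_classes)

    def forward(self, eye_image: torch.Tensor) -> torch.Tensor:
        # eye_image: (batch, 1, H, W) grayscale frames from the optical sensor
        return self.classifier(self.features(eye_image).flatten(1))
```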

2023 Patent

MiREx: mRNA levels prediction from gene sequence and miRNA target knowledge

Authors: Pianfetti, E.; Lovino, M.; Ficarra, E.; Martignetti, L.

Published in: BMC BIOINFORMATICS

Messenger RNA (mRNA) has an essential role in the protein production process. Predicting mRNA expression levels accurately is crucial for understanding gene regulation, and various models (statistical and neural network-based) have been developed for this purpose. A few models predict mRNA expression levels from the DNA sequence, exploiting the DNA sequence and gene features (e.g., number of exons/introns, gene length). Other models include information about long-range interaction molecules (i.e., enhancers/silencers) and transcriptional regulators as predictive features, such as transcription factors (TFs) and small RNAs (e.g., microRNAs, miRNAs). Recently, a convolutional neural network (CNN) model, called Xpresso, has been proposed for mRNA expression level prediction leveraging the promoter sequence and mRNAs' half-life features (gene features). To push mRNA level prediction forward, we present MiREx, a CNN-based tool that includes information about miRNA targets and expression levels in the model. Indeed, each miRNA can target specific genes, and the model exploits this information to guide the learning process. In detail, not all miRNAs are included: only a selected subset with the highest impact on the model. MiREx has been evaluated on four cancer primary sites from the Genomics Data Commons (GDC) database: lung, kidney, breast, and corpus uteri. Results show that mRNA level prediction benefits from selected miRNA target and expression information. Future model developments could include other transcriptional regulators or be trained with proteomics data to infer protein levels.
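
As a purely illustrative sketch of the kind of architecture the abstract suggests (a convolutional branch over the one-hot promoter sequence fused with selected miRNA expression features), consider the toy model below; all layer sizes, names, and input shapes are assumptions, not MiREx's actual design.

```python
import torch
import torch.nn as nn

class SequencePlusMirnaModel(nn.Module):
    """Toy regressor: a conv branch over a one-hot promoter sequence,
    fused with a vector of selected miRNA expression features."""
    def __init__(self, n_mirnas: int = 50):
        super().__init__()
        self.conv = nn.Sequential(
            nn.Conv1d(4, 64, kernel_size=9, padding=4),  # 4 = A/C/G/T channels
            nn.ReLU(),
            nn.AdaptiveMaxPool1d(1),                     # global pooling
        )
        self.head = nn.Sequential(
            nn.Linear(64 + n_mirnas, 64),
            nn.ReLU(),
            nn.Linear(64, 1),                            # predicted mRNA level
        )

    def forward(self, seq_onehot: torch.Tensor, mirna_expr: torch.Tensor):
        # seq_onehot: (batch, 4, seq_len); mirna_expr: (batch, n_mirnas)
        seq_feat = self.conv(seq_onehot).squeeze(-1)     # (batch, 64)
        return self.head(torch.cat([seq_feat, mirna_expr], dim=1))
```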

2023 Journal article

Multimodal Garment Designer: Human-Centric Latent Diffusion Models for Fashion Image Editing

Authors: Baldrati, Alberto; Morelli, Davide; Cartella, Giuseppe; Cornia, Marcella; Bertini, Marco; Cucchiara, Rita

Published in: PROCEEDINGS IEEE INTERNATIONAL CONFERENCE ON COMPUTER VISION

Fashion illustration is used by designers to communicate their vision and to bring the design idea from conceptualization to realization, showing how clothes interact with the human body. In this context, computer vision can thus be used to improve the fashion design process. Unlike previous works that mainly focused on the virtual try-on of garments, we propose the task of multimodal-conditioned fashion image editing, guiding the generation of human-centric fashion images by following multimodal prompts, such as text, human body poses, and garment sketches. We tackle this problem by proposing a new architecture based on latent diffusion models, an approach that has not been used before in the fashion domain. Given the lack of existing datasets suitable for the task, we also extend two existing fashion datasets, namely Dress Code and VITON-HD, with multimodal annotations collected in a semi-automatic manner. Experimental results on these new datasets demonstrate the effectiveness of our proposal, both in terms of realism and of coherence with the given multimodal inputs. Source code and collected multimodal annotations are publicly available at: https://github.com/aimagelab/multimodal-garment-designer.
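
One common way to realize the spatial part of the multimodal conditioning described above is to concatenate the pose map and garment sketch to the noisy latent along the channel axis, while text conditions the model through cross-attention. The sketch below illustrates only this input assembly; all channel counts and names are assumptions, not the paper's configuration.

```python
import torch
import torch.nn as nn

class MultimodalConditionedDenoiser(nn.Module):
    """Toy denoiser input assembly: spatial conditions (pose map, garment
    sketch) are concatenated to the noisy latent along the channel axis;
    text conditioning via cross-attention is omitted here."""
    def __init__(self, latent_ch: int = 4, pose_ch: int = 18, sketch_ch: int = 1):
        super().__init__()
        in_ch = latent_ch + pose_ch + sketch_ch
        self.first_conv = nn.Conv2d(in_ch, 320, kernel_size=3, padding=1)

    def forward(self, noisy_latent, pose_map, sketch):
        # All spatial inputs are assumed already at the latent resolution.
        x = torch.cat([noisy_latent, pose_map, sketch], dim=1)
        return self.first_conv(x)  # would feed the rest of a U-Net
```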

2023 Conference paper
