Publications by Elisa Ficarra

Explore our research publications: papers, articles, and conference proceedings from AImageLab.

DEEPrior: a deep learning tool for the prioritization of gene fusions

Authors: Lovino, Marta; Ciaburri, Maria Serena; Urgese, Gianvito; Di Cataldo, Santa; Ficarra, Elisa

Published in: BIOINFORMATICS

Summary: In the last decade, increasing attention has been paid to the study of gene fusions. However, the problem of determining whether a gene fusion is a cancer driver or just a passenger mutation is still an open issue. Here we present DEEPrior, an inherently flexible deep learning tool with two modes (Inference and Retraining). Inference mode predicts the probability of a gene fusion being involved in an oncogenic process by directly exploiting the amino acid sequence of the fused protein. Retraining mode allows users to obtain a custom prediction model that includes new data they provide. Availability and implementation: Both DEEPrior and the protein fusions dataset are freely available from GitHub at https://github.com/bioinformatics-polito/DEEPrior. The tool was designed to operate in Python 3.7, with minimal additional libraries. Supplementary information: Supplementary data are available at Bioinformatics online.

2020 Journal article

Effective evaluation of clustering algorithms on single-cell CNA data

Authors: Montemurro, Marilisa; Urgese, Gianvito; Grassi, Elena; Pizzino, Carmelo Gabriele; Bertotti, Andrea; Ficarra, Elisa

Clustering methods are increasingly applied to single-cell DNA sequencing (scDNAseq) data to infer the subclonal structure of cancer. However, the complexity of these data exacerbates some data-science issues and affects clustering results. Additionally, determining whether such inferences are accurate and clusters recapitulate the real cell phylogeny is not trivial, mainly because ground truth information is not available for most experimental settings. Here, by exploiting simulated sequencing data representing known phylogenies of cancer cells, we propose a formal and systematic assessment of well-known clustering methods to study their performance and identify the approach providing the most accurate reconstruction of phylogenetic relationships.

2020 Conference proceedings paper

Exploiting "uncertain" deep networks for data cleaning in digital pathology

Authors: Ponzio, Francesco; Deodato, Giacomo; Macii, Enrico; Di Cataldo, Santa; Ficarra, Elisa

Published in: PROCEEDINGS INTERNATIONAL SYMPOSIUM ON BIOMEDICAL IMAGING

2020 Conference proceedings paper

Multi-omics Classification on Kidney Samples Exploiting Uncertainty-Aware Models

Authors: Lovino, Marta; Bontempo, Gianpaolo; Cirrincione, Giansalvo; Ficarra, Elisa

Due to the huge amount of available omic data, classifying samples according to various omics is a complex process. One of the most common approaches consists of creating a classifier for each omic and then taking a consensus among the classifiers, assigning to each sample the most voted class among the outputs on the individual omics. However, this approach does not consider the confidence in the prediction, ignoring that biological information coming from a certain omic may be more reliable than that from others. Therefore, a method is proposed here consisting of a tree-based multi-layer perceptron (MLP), which estimates the class-membership probabilities for classification. In this way, it is not only possible to give relevance to all the omics, but also to label as Unknown those samples for which the classifier is uncertain in its prediction. The method was applied to a dataset composed of 909 kidney cancer samples for which these three omics were available: gene expression (mRNA), microRNA expression (miRNA), and methylation profiles (meth) data. The method is also valid for other tissues and other omics (e.g. proteomics, copy number alterations data, single nucleotide polymorphism data). The accuracy and weighted average F1-score of the model are both higher than 95%. This tool can therefore be particularly useful in clinical practice, allowing physicians to focus on the most interesting and challenging samples.

2020 Conference proceedings paper
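
The contrast drawn in the abstract above, between plain majority voting and a decision that takes prediction confidence into account, can be illustrated with a minimal Python sketch. This is not the authors' tree-based MLP: it simply averages per-omic class probabilities and abstains below a cutoff. The class names, the probabilities, and the 0.7 threshold are all hypothetical values chosen for illustration.

```python
from collections import Counter

UNKNOWN = "Unknown"

def majority_vote(per_omic_probs):
    """Baseline: each omic votes for its top class; confidence is ignored."""
    votes = [max(probs, key=probs.get) for probs in per_omic_probs]
    return Counter(votes).most_common(1)[0][0]

def confidence_aware_label(per_omic_probs, threshold=0.7):
    """Average class probabilities across omics and abstain when the best
    averaged probability is below `threshold` (a hypothetical cutoff)."""
    classes = per_omic_probs[0].keys()
    n = len(per_omic_probs)
    mean = {c: sum(p[c] for p in per_omic_probs) / n for c in classes}
    best = max(mean, key=mean.get)
    return best if mean[best] >= threshold else UNKNOWN

# Made-up probabilities for one sample, one dict per omic.
sample = [
    {"tumor": 0.55, "normal": 0.45},  # mRNA: weakly "tumor"
    {"tumor": 0.52, "normal": 0.48},  # miRNA: weakly "tumor"
    {"tumor": 0.05, "normal": 0.95},  # meth: strongly "normal"
]

print(majority_vote(sample))           # two weak votes outweigh one strong vote
print(confidence_aware_label(sample))  # the averaged evidence is too weak to commit
```

In this toy case the vote follows the two weak predictions, while the averaged probabilities favour the other class without reaching the cutoff, so the sample is labelled Unknown.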

Predicting the oncogenic potential of gene fusions using convolutional neural networks

Authors: Lovino, Marta; Urgese, Gianvito; Macii, Enrico; Di Cataldo, Santa; Ficarra, Elisa

Published in: LECTURE NOTES IN COMPUTER SCIENCE

Predicting the oncogenic potential of a gene fusion transcript is an important and challenging task in the study of cancer development. To date, the available approaches mostly rely on protein domain analysis to provide a probability score explaining the oncogenic potential of a gene fusion. In this paper, a Convolutional Neural Network model is proposed to classify gene fusions as oncogenic or non-oncogenic, exploiting only the protein sequence without protein domain information. Our proposed model obtained an accuracy close to 90% on a dataset of fused sequences.

2020 Conference proceedings paper
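
A convolutional network that works directly on a protein sequence needs the sequence as a numeric array first. The abstract above does not say which encoding the authors used; the Python sketch below assumes a one-hot encoding over the 20 standard amino acids with zero padding to a fixed length, which is one common choice. The function name and the example sequence are hypothetical.

```python
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues
INDEX = {aa: i for i, aa in enumerate(AMINO_ACIDS)}

def one_hot_encode(sequence, length):
    """Return a `length` x 20 matrix of 0/1 values.

    Sequences longer than `length` are truncated; shorter ones are padded.
    Padding positions and unrecognised residues stay as all-zero rows.
    """
    matrix = [[0] * len(AMINO_ACIDS) for _ in range(length)]
    for pos, residue in enumerate(sequence[:length].upper()):
        col = INDEX.get(residue)
        if col is not None:
            matrix[pos][col] = 1
    return matrix

# "X" is not a standard residue, so its row stays all zeros.
encoded = one_hot_encode("MKTX", length=6)
print(len(encoded), len(encoded[0]))
```

A matrix of this shape can be passed to a 1D convolution that slides along the sequence axis, with the 20 columns acting as input channels.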

Unification of miRNA and isomiR research: the mirGFF3 format and the mirtop API

Authors: Desvignes, Thomas; Loher, Phillipe; Eilbeck, Karen; Ma, Jeffery; Urgese, Gianvito; Fromm, Bastian; Sydes, Jason; Aparicio-Puerta, Ernesto; Barrera, Victor; Espin, Roderic; Londin, Eric; Telonis, Aristeidis G; Ficarra, Elisa; Friedlander, Marc R; Postlethwait, John H; Rigoutsos, Isidore; Hackenberg, Michael; Vlachos, Ioannis S; Halushka, Marc K.; Pantano, Lorena

Published in: BIOINFORMATICS

Motivation: MicroRNAs (miRNAs) are small RNA molecules (∼22 nucleotides long) involved in post-transcriptional gene regulation. Advances in high-throughput sequencing technologies led to the discovery of isomiRs, which are miRNA sequence variants. While many miRNA-seq analysis tools exist, the diversity of output formats hinders accurate comparisons between tools and precludes data sharing and the development of common downstream analysis methods. Results: To overcome this situation, we present here a community-based project, miRNA Transcriptomic Open Project (miRTOP), working towards the optimization of miRNA analyses. The aim of miRTOP is to promote the development of downstream isomiR analysis tools that are compatible with existing detection and quantification tools. Based on the existing GFF3 format, we first created a new standard format, mirGFF3, for the output of miRNA/isomiR detection and quantification results from small RNA-seq data. Additionally, we developed a command line Python tool, mirtop, to create and manage the mirGFF3 format. Currently, mirtop can convert into mirGFF3 the outputs of commonly used pipelines, such as seqbuster, isomiR-SEA, sRNAbench and Prost!, as well as BAM files. Some tools have also incorporated the mirGFF3 format directly into their code, such as miRge2.0, IsoMIRmap and OptimiR. Its open architecture enables any tool or pipeline to output or convert results into mirGFF3. Collectively, this isomiR categorization system, along with the accompanying mirGFF3 and mirtop API, provides a comprehensive solution for the standardization of miRNA and isomiR annotation, enabling data sharing, reporting, comparative analyses and benchmarking, while promoting the development of common miRNA methods focusing on downstream steps of miRNA detection, annotation and quantification. Availability and implementation: https://github.com/miRTop/mirGFF3/ and https://github.com/miRTop/mirtop.

2020 Journal article
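
Since mirGFF3 builds on GFF3, it may help to see what reading one record of the parent format looks like. The Python sketch below parses a generic GFF3 line: nine tab-separated columns, the last of which holds semicolon-separated `key=value` attributes. It does not use the mirtop API and does not reproduce the mirGFF3 attribute names or syntax, which are defined in the repositories linked above. The example line and its values are made up.

```python
GFF3_COLUMNS = ("seqid", "source", "type", "start", "end",
                "score", "strand", "phase", "attributes")

def parse_gff3_line(line):
    """Parse one generic GFF3 feature line into a dict."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) != len(GFF3_COLUMNS):
        raise ValueError(f"expected 9 tab-separated columns, got {len(fields)}")
    record = dict(zip(GFF3_COLUMNS, fields))
    # Coordinates are 1-based integers in GFF3.
    record["start"] = int(record["start"])
    record["end"] = int(record["end"])
    # Split the attribute column into key/value pairs.
    attributes = {}
    for pair in record["attributes"].split(";"):
        pair = pair.strip()
        if pair:
            key, _, value = pair.partition("=")
            attributes[key] = value
    record["attributes"] = attributes
    return record

# Hypothetical feature line, for illustration only.
line = "chr1\texample\tfeature\t4\t25\t.\t+\t.\tID=f1;Name=demo"
rec = parse_gff3_line(line)
print(rec["start"], rec["end"], rec["attributes"])
```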

Unsupervised Multi-Omic Data Fusion: the Neural Graph Learning Network

Authors: Barbiero, Pietro; Lovino, Marta; Siviero, Mattia; Ciravegna, Gabriele; Randazzo, Vincenzo; Ficarra, Elisa; Cirrincione, Giansalvo

Published in: LECTURE NOTES IN COMPUTER SCIENCE - 16th International Conference on Intelligent Computing, ICIC2020

In recent years, due to the high availability of omic data, data-driven biology has greatly expanded. However, the analysis of different data sources is still an open challenge. A few multi-omics approaches have been proposed in the literature, but none of them takes into consideration the intrinsic topology of each omic. In this work, an unsupervised learning method based on a deep neural network is proposed. For each omic, a separate network is trained, and the network outputs are fused into a single graph; for this purpose, an innovative loss function has been designed to better represent the data cluster manifolds. The graph adjacency matrix is exploited to determine similarities among samples. With this approach, omics having a different number of features are merged into a unique representation. Quantitative and qualitative analyses show that the proposed method has comparable results to the state of the art. The method has great intrinsic flexibility, as it can be customized according to the complexity of the tasks, and it has a lot of room for future improvements compared to more fine-tuned methods, opening the way for future research.

2020 Conference proceedings paper
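
One reason a graph is a convenient meeting point for omics with different numbers of features is that an adjacency matrix is always samples by samples, whatever the feature count. The Python sketch below illustrates only that point. It does not reproduce the paper's networks or loss function: it assumes a simple distance-threshold graph per omic and an element-wise mean as the fusion rule, and all data and the `radius` value are made up.

```python
import math

def adjacency(samples, radius):
    """Connect two samples when their Euclidean distance is at most `radius`."""
    n = len(samples)
    adj = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            if math.dist(samples[i], samples[j]) <= radius:
                adj[i][j] = adj[j][i] = 1
    return adj

def fuse(adjacencies):
    """Element-wise mean of adjacency matrices that share the same samples."""
    n = len(adjacencies[0])
    k = len(adjacencies)
    return [[sum(a[i][j] for a in adjacencies) / k for j in range(n)]
            for i in range(n)]

# Three samples described by two hypothetical omics with different feature counts.
omic_a = [(0.0, 0.1), (0.1, 0.0), (5.0, 5.0)]            # 2 features
omic_b = [(1.0, 1.0, 1.0, 1.0), (1.1, 0.9, 1.0, 1.0),
          (1.0, 1.0, 1.2, 1.0)]                          # 4 features

fused = fuse([adjacency(omic_a, 1.0), adjacency(omic_b, 1.0)])
for row in fused:
    print(row)
```

Entries of the fused matrix range from 0 (no omic links the pair) to 1 (every omic links it), so they can be read as a rough similarity between samples.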

A Deep Learning Approach to the Screening of Oncogenic Gene Fusions in Humans

Authors: Lovino, Marta; Urgese, Gianvito; Macii, Enrico; Di Cataldo, Santa; Ficarra, Elisa

Published in: INTERNATIONAL JOURNAL OF MOLECULAR SCIENCES

Gene fusions have a very important role in the study of cancer development. In this regard, predicting the probability that protein fusion transcripts lead to cancer is a very challenging and not yet fully explored research problem. To date, all the available approaches in the literature try to explain the oncogenic potential of gene fusions based on protein domain analysis, which is cancer-specific and not easy to adapt to newly available information. In our work, we choose the raw protein sequences as the input baseline, and propose the use of deep learning, and more specifically Convolutional Neural Networks, to infer the oncogenicity probability score of gene fusion transcripts and to group them into a number of categories (e.g., oncogenic/not oncogenic). This is an inherently flexible methodology that, unlike previous approaches, can be re-trained with very little effort on newly available data (for example, from a different cancer). Based on experimental results on a large dataset of pre-annotated gene fusions, our method is able to predict the oncogenic potential of gene fusion transcripts with an accuracy of about 72%, which increases to 86% if we consider only the instances that are classified with a high confidence level.

2019 Journal article
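
The last sentence of the abstract above reports two accuracies: one over all instances and one restricted to high-confidence predictions. The Python sketch below shows how such a pair of figures can be computed. The records and the 0.9 confidence cutoff are made up for illustration and do not reproduce the paper's data, threshold, or results.

```python
def accuracy(records):
    """Fraction of (predicted, actual, confidence) records that are correct."""
    if not records:
        raise ValueError("no records to score")
    return sum(pred == actual for pred, actual, _ in records) / len(records)

def high_confidence_subset(records, min_confidence):
    """Keep only records whose confidence reaches `min_confidence`."""
    return [r for r in records if r[2] >= min_confidence]

# Hypothetical (predicted, actual, confidence) triples.
records = [
    ("oncogenic", "oncogenic", 0.97),
    ("oncogenic", "not oncogenic", 0.55),
    ("not oncogenic", "not oncogenic", 0.91),
    ("not oncogenic", "oncogenic", 0.60),
    ("oncogenic", "oncogenic", 0.58),
]

overall = accuracy(records)
confident = accuracy(high_confidence_subset(records, 0.9))
print(overall, confident)
```

The gain on the confident subset comes at the cost of coverage: in this toy data only two of the five instances pass the cutoff.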

Aneuploid acute myeloid leukemia exhibits a signature of genomic alterations in the cell cycle and protein degradation machinery

Authors: Simonetti, Giorgia; Padella, Antonella; Do Valle, Italo Farìa; Fontana, Maria Chiara; Fonzi, Eugenio; Bruno, Samantha; Baldazzi, Carmen; Guadagnuolo, Viviana; Manfrini, Marco; Ferrari, Anna; Paolini, Stefania; Papayannidis, Cristina; Marconi, Giovanni; Franchini, Eugenia; Zuffa, Elisa; Laginestra, Maria Antonella; Zanotti, Federica; Astolfi, Annalisa; Iacobucci, Ilaria; Bernardi, Simona; Sazzini, Marco; Ficarra, Elisa; Hernandez, Jesus Maria; Vandenberghe, Peter; Cools, Jan; Bullinger, Lars; Ottaviani, Emanuela; Testoni, Nicoletta; Cavo, Michele; Haferlach, Torsten; Castellani, Gastone; Remondini, Daniel; Martinelli, Giovanni

Published in: CANCER

2019 Journal article

Dealing with Lack of Training Data for Convolutional Neural Networks: The Case of Digital Pathology

Authors: Ponzio, Francesco; Urgese, Gianvito; Ficarra, Elisa; Di Cataldo, Santa

Published in: ELECTRONICS

Thanks to their capability to learn generalizable descriptors directly from images, deep Convolutional Neural Networks (CNNs) seem the ideal solution to most pattern recognition problems. On the other hand, to learn the image representation, CNNs need huge sets of annotated samples, which are infeasible to obtain in many everyday scenarios. This is the case, for example, of Computer-Aided Diagnosis (CAD) systems for digital pathology, where additional challenges are posed by the high variability of the cancerous tissue characteristics. In our experiments, state-of-the-art CNNs trained from scratch on histological images were less accurate and less robust to variability than a traditional machine learning framework, highlighting all the issues of fully training deep networks with limited data from real patients. To solve this problem, we designed and compared three transfer learning frameworks, leveraging CNNs pre-trained on non-medical images. This approach obtained very high accuracy while requiring far fewer computational resources for training. Our findings demonstrate that transfer learning is a solution to the automated classification of histological samples and solves the problem of designing accurate and computationally efficient CAD systems with limited training data.

2019 Journal article