Publications

Explore our research publications: papers, articles, and conference proceedings from AImageLab.

FuGePrior: A novel gene fusion prioritization algorithm based on accurate fusion structure analysis in cancer RNA-seq samples

Authors: Paciello, Giulia; Ficarra, Elisa

Published in: BMC BIOINFORMATICS

2017 Journal article

Generative Adversarial Models for People Attribute Recognition in Surveillance

Authors: Fabbri, Matteo; Calderara, Simone; Cucchiara, Rita

In this paper we propose a deep architecture for detecting people attributes (e.g. gender, race, clothing) in surveillance contexts. Our proposal explicitly deals with the poor resolution and occlusion issues that often occur in surveillance footage by enhancing the images by means of Deep Convolutional Generative Adversarial Networks (DCGAN). Experiments show that by combining both our Generative Reconstruction and Deep Attribute Classification Network we can effectively extract attributes even when resolution is poor and in the presence of strong occlusions covering up to 80% of the whole person figure.

2017 Conference paper

Guest Editorial Special Issue on Wearable and Ego-Vision Systems for Augmented Experience

Authors: Serra, G.; Cucchiara, R.; Kitani, K. M.; Civera, J.

Published in: IEEE TRANSACTIONS ON HUMAN-MACHINE SYSTEMS

2017 Journal article

Hierarchical Boundary-Aware Neural Encoder for Video Captioning

Authors: Baraldi, Lorenzo; Grana, Costantino; Cucchiara, Rita

Published in: PROCEEDINGS - IEEE COMPUTER SOCIETY CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION

The use of Recurrent Neural Networks for video captioning has recently gained a lot of attention, since they can be used both to encode the input video and to generate the corresponding description. In this paper, we present a recurrent video encoding scheme which can discover and leverage the hierarchical structure of the video. Unlike the classical encoder-decoder approach, in which a video is encoded continuously by a recurrent layer, we propose a novel LSTM cell which can identify discontinuity points between frames or segments and modify the temporal connections of the encoding layer accordingly. We evaluate our approach on three large-scale datasets: the Montreal Video Annotation dataset, the MPII Movie Description dataset and the Microsoft Video Description Corpus. Experiments show that our approach can discover appropriate hierarchical representations of input videos and improve state-of-the-art results on movie description datasets.

2017 Conference paper
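The boundary-aware encoding idea above can be illustrated with a toy sketch in plain Python (a simplified model of the idea, not the paper's LSTM cell): a `step` function updates a hidden state frame by frame, while a `boundary` predicate, standing in for the learnt discontinuity detector, resets the state so that each segment is encoded independently. The function names and the scalar "frames" are illustrative assumptions.

```python
def encode(frames, step, boundary, h0):
    """Boundary-aware recurrent encoding (toy sketch).

    `step(h, x)` updates the hidden state from one frame; when
    `boundary(prev, x)` detects a discontinuity between consecutive
    frames, the finished segment's encoding is emitted and the state
    is reset so the next segment is encoded independently."""
    h = h0
    outputs = []
    prev = None
    for x in frames:
        if prev is not None and boundary(prev, x):
            outputs.append(h)   # emit the finished segment's encoding
            h = h0              # reset the state at the boundary
        h = step(h, x)
        prev = x
    outputs.append(h)           # encoding of the final segment
    return outputs

# Toy example: scalar "frames", a running-sum step, and a boundary
# fired by a large jump between consecutive values.
segments = encode([1, 1, 2, 9, 9],
                  step=lambda h, x: h + x,
                  boundary=lambda a, b: abs(a - b) > 5,
                  h0=0)
# segments == [4, 18]: one encoding per detected segment
```

The reset is the key design choice: it cuts the temporal connections of the encoding layer at discontinuities, which is what lets each segment be summarized on its own.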

Historical Handwritten Text Images Word Spotting through Sliding Window HOG Features

Authors: Bolelli, Federico; Borghi, Guido; Grana, Costantino

Published in: LECTURE NOTES IN COMPUTER SCIENCE

In this paper we present an innovative technique to semi-automatically index handwritten word images. The proposed method is based on HOG descriptors and exploits the Dynamic Time Warping (DTW) technique to compare feature vectors computed from single handwritten words. Our strategy is applied to a new challenging dataset extracted from nineteenth-century Italian civil registries. Experimental results, compared with previously developed word spotting strategies, confirm that our method outperforms its competitors.

2017 Conference paper
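The core matching step, comparing two sequences of per-window descriptors with Dynamic Time Warping, can be sketched as follows. This is a generic DTW with Euclidean local costs, not the authors' implementation, and the toy 2-D "descriptors" stand in for real sliding-window HOG vectors.

```python
import math

def dtw_distance(seq_a, seq_b):
    """DTW distance between two sequences of feature vectors
    (e.g. HOG descriptors extracted by a sliding window)."""
    n, m = len(seq_a), len(seq_b)
    cost = [[float("inf")] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = math.dist(seq_a[i - 1], seq_b[j - 1])   # local cost
            cost[i][j] = d + min(cost[i - 1][j],        # insertion
                                 cost[i][j - 1],        # deletion
                                 cost[i - 1][j - 1])    # match
    return cost[n][m]

# Toy word images as short sequences of 2-D window descriptors:
# a near-duplicate word (one repeated window) scores far lower
# than an unrelated one, which is what makes DTW usable for
# query-by-example word spotting.
word_a = [(0.0, 1.0), (1.0, 2.0), (2.0, 3.0), (3.0, 2.0)]
word_b = word_a + [(3.0, 2.0)]                  # same word, extra window
word_c = [(9.0, 9.0), (8.0, 7.0), (7.0, 6.0)]  # different word
assert dtw_distance(word_a, word_b) < dtw_distance(word_a, word_c)
```

DTW's value here is that it tolerates the elastic horizontal distortions of handwriting: two instances of the same word rarely have the same length, so a fixed element-wise comparison would fail where the warping path succeeds.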

Indexing of Historical Document Images: Ad Hoc Dewarping Technique for Handwritten Text

Authors: Bolelli, Federico

Published in: COMMUNICATIONS IN COMPUTER AND INFORMATION SCIENCE

This work presents a research project, named XDOCS, aimed at making a variety of historical documents published on the web accessible to a much wider audience. The paper presents an overview of the indexing process used to achieve this goal, focusing on the adopted dewarping technique. The proposed dewarping approach relies on a transformation model which maps the projection of a curved surface onto a 2D rectangular area. The novelty of this work is that dewarping can be applied to document images containing both handwritten and typewritten text.

2017 Conference paper

isomiR-SEA: miRNA and isomiR expression level detection in seven RNA-Seq datasets

Authors: Urgese, Gianvito; Paciello, Giulia; Macii, Enrico; Acquaviva, Andrea; Ficarra, Elisa

Background: Massively parallel sequencing of transcriptomes has revealed the presence of miRNA variants named isomiRs. The sequence variations identified within isomiR molecules can affect their targeting activity, with consequences for gene expression and a potential impact on multi-factorial diseases. miRNAs are considered good biomarkers, making their adoption for disease characterization highly desirable. Several methodologies and tools have been devised to identify and quantify miRNAs from sequencing data. However, all these tools are built on top of general-purpose alignment algorithms, providing poorly accurate results and no information concerning isomiRs or conserved miRNA-mRNA interaction sites. Method: To overcome these limitations we developed the isomiR-SEA algorithm. By implementing a miRNA-specific alignment procedure, isomiR-SEA provides accurate miRNA/isomiR expression levels and a precise evaluation of the conserved interaction sites. First, isomiR-SEA identifies miRNA seeds within the tags. If the seed is found, the alignment is extended and the positions of the encountered mismatches are recorded. The collected information is then evaluated to distinguish between miRNAs and isomiRs and to assess the conservation of the interaction sites. Results & Conclusion: isomiR-SEA performance was assessed on 7 public RNA-Seq datasets. 40% of the reads attributed to miRNAs (189M) come from mature miRNAs, 50% derive from 3' isomiRs, and the remaining reads account for 5'/SNP isomiRs or combinations thereof. Furthermore, about 2% of the reads lost some interaction sites. This proves the importance of a miRNA-specific alignment algorithm for correctly evaluating miRNA targeting activity. Expression levels of the isomiRs detected in the two experiments were aggregated and classified at two levels of detail: in experiment 1, isoforms with an indel at one or both ends are grouped together, whereas in experiment 2 we distinguish between reads aligned on the mature miRNA with an insertion (+) or deletion (-) at the 5' or 3' end. This shows the capability of isomiR-SEA to generate enriched results that can be analysed in downstream analyses customized for the investigation purpose.

2017 Poster
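The seed-anchored matching described in the Method section can be sketched in a few lines. This is a deliberately simplified model of the idea, not the isomiR-SEA implementation: the example sequence, the seed coordinates, and the isomiR labels are hypothetical, and real isomiR classification handles many more cases.

```python
def classify_tag(tag, mature, seed_start=1, seed_len=7):
    """Seed-anchored tag classification (simplified sketch).

    Looks for the mature miRNA's seed inside the sequenced tag;
    if found, extends the alignment in both directions, records
    mismatch positions, and labels the tag accordingly."""
    seed = mature[seed_start:seed_start + seed_len]
    hit = tag.find(seed)
    if hit == -1:
        return None                     # seed absent: tag not assigned
    off = hit - seed_start              # 5' shift of tag vs mature
    # positions (mature coordinates) where the aligned bases differ
    mism = [i for i in range(len(mature))
            if 0 <= i + off < len(tag) and tag[i + off] != mature[i]]
    if off == 0 and len(tag) == len(mature) and not mism:
        return "mature"
    if mism:
        return "snp-isomiR"
    return "5p-isomiR" if off != 0 else "3p-isomiR"

# Hypothetical let-7-like sequence used purely for illustration.
mature = "UGAGGUAGUAGGUUGUAUAGUU"
assert classify_tag(mature, mature) == "mature"
assert classify_tag(mature[:-2], mature) == "3p-isomiR"   # 3' trimming
assert classify_tag("A" + mature, mature) == "5p-isomiR"  # 5' addition
assert classify_tag("GGGG", mature) is None               # no seed hit
```

Anchoring on the seed rather than on the whole read is what lets a miRNA-specific aligner keep end variations and internal mismatches separate, instead of discarding them as generic alignment errors.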

Layout analysis and content classification in digitized books

Authors: Corbelli, Andrea; Baraldi, Lorenzo; Balducci, Fabrizio; Grana, Costantino; Cucchiara, Rita

Published in: COMMUNICATIONS IN COMPUTER AND INFORMATION SCIENCE

Automatic layout analysis has proven to be extremely important in the process of digitizing large amounts of documents. In this paper we present a mixed approach to layout analysis, introducing an SVM-aided layout segmentation process and a classification process based on local and geometrical features. The final output of the automatic analysis algorithm is a complete and structured annotation in JSON format, containing the digitized text as well as references to the illustrations of the input page, which can be used by both visualization and annotation interfaces. We evaluate our algorithm on a large dataset built upon the first volume of the “Enciclopedia Treccani”.

2017 Conference paper
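A structured per-page annotation of the kind described above might look something like the following sketch; the field names and values here are purely illustrative assumptions, not the project's actual schema.

```python
import json

# Hypothetical per-page annotation: each region carries its type,
# a bounding box, and either transcribed text or an image reference.
page_annotation = {
    "page": 1,
    "regions": [
        {"type": "text",
         "bbox": [120, 80, 980, 640],
         "content": "Transcribed paragraph text"},
        {"type": "illustration",
         "bbox": [1010, 80, 1400, 520],
         "image_ref": "vol01_page001_fig01.png"},
    ],
}

# Serialize for consumption by a visualization or annotation interface.
serialized = json.dumps(page_annotation, indent=2)
```

Keeping text content and illustration references in one region list lets a single annotation file drive both a reading view and a page-layout view.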

Learning to Map Vehicles into Bird's Eye View

Authors: Palazzi, Andrea; Borghi, Guido; Abati, Davide; Calderara, Simone; Cucchiara, Rita

Awareness of the road scene is an essential component of both autonomous vehicles and Advanced Driver Assistance Systems, and is gaining importance in both academia and the automotive industry. This paper presents a way to learn a semantic-aware transformation which maps detections from a dashboard camera view onto a broader bird's eye occupancy map of the scene. To this end, a huge synthetic dataset featuring 1M pairs of frames, taken from both the car dashboard and a bird's eye view, has been collected and automatically annotated. A deep network is then trained to warp detections from the first view to the second. We demonstrate the effectiveness of our model against several baselines and observe that it is able to generalize to real-world data despite having been trained solely on synthetic data.

2017 Conference paper

Learning Where to Attend Like a Human Driver

Authors: Palazzi, Andrea; Solera, Francesco; Calderara, Simone; Alletto, Stefano; Cucchiara, Rita

Despite the advent of autonomous cars, it is likely, at least in the near future, that human attention will still maintain a central role as a guarantee of legal responsibility during the driving task. In this paper we study the dynamics of the driver's gaze and use it as a proxy to understand the related attentional mechanisms. First, we build our analysis upon two questions: where, and at what, is the driver looking? Second, we model the driver's gaze by training a coarse-to-fine convolutional network on short sequences extracted from the DR(eye)VE dataset. Experimental comparison against different baselines reveals that the driver's gaze can indeed be learnt to some extent, despite i) being highly subjective and ii) only one driver's gaze being available for each sequence due to the irreproducibility of the scene. Finally, we advocate a new assisted driving paradigm which suggests to the driver, without intervening, where she should focus her attention.

2017 Conference paper

Total publications: 1056