Publications
Explore our research publications: papers, articles, and conference proceedings from AImageLab.
Gesture Recognition using Wearable Vision Sensors to Enhance Visitors' Museum Experiences
Authors: Baraldi, Lorenzo; Paci, Francesco; Serra, Giuseppe; Cucchiara, Rita
Published in: IEEE SENSORS JOURNAL
We introduce a novel approach to cultural heritage experience: by means of ego-vision embedded devices we develop a system that offers a more natural and entertaining way of accessing museum knowledge. Our method is based on distributed self-gesture and artwork recognition, and needs neither fixed cameras nor radio-frequency identification sensors. We propose the use of dense trajectories sampled around the hand region to perform self-gesture recognition, understanding the way a user naturally interacts with an artwork, and demonstrate that our approach can benefit from distributed training. We test our algorithms on publicly available data sets and we extend our experiments to both virtual and real museum scenarios, where our method shows robustness when challenged with real-world data. Furthermore, we run an extensive performance analysis on our ARM-based wearable device.
GOLD: Gaussians of Local Descriptors for Image Representation
Authors: Serra, Giuseppe; Grana, Costantino; Manfredi, Marco; Cucchiara, Rita
Published in: COMPUTER VISION AND IMAGE UNDERSTANDING
The Bag of Words paradigm has been the baseline from which several successful image classification solutions were developed in the last decade. These represent images by quantizing local descriptors and summarizing their distribution. The quantization step introduces a dependency on the dataset that, even if it significantly boosts performance in some contexts, severely limits generalization. In contrast, in this paper we propose to model the local feature distribution with a multivariate Gaussian, without any quantization. The full-rank covariance matrix, which lies on a Riemannian manifold, is projected onto the tangent Euclidean space and concatenated to the mean vector. The resulting representation, a Gaussian of local descriptors (GOLD), allows the dot product to be used to closely approximate a distance between distributions without the need for expensive kernel computations. We describe an image by an improved spatial pyramid, which avoids boundary effects with soft assignment: local descriptors contribute to neighboring Gaussians, forming a weighted spatial pyramid of GOLD descriptors. In addition, we extend the model by leveraging dataset characteristics in a mixture-of-Gaussians formulation, further improving the classification accuracy. To deal with large-scale datasets and high-dimensional feature spaces, the Stochastic Gradient Descent solver is adopted. Experimental results on several publicly available datasets show that the proposed method obtains state-of-the-art performance.
Innovative IoT-aware Services for a Smart Museum
Authors: Mighali, Vincenzo; Del Fiore, Giuseppe; Patrono, Luigi; Mainetti, Luca; Alletto, Stefano; Serra, Giuseppe; Cucchiara, Rita
Smart cities are a trending topic in both the academic literature and the industrial world. The capability to provide users with added-value services through low-power and low-cost smart objects is very attractive in many fields. Among these, art and culture represent very interesting examples, as tourism is one of the main driving engines of modern society. In this paper, we propose an IoT-aware architecture to improve the cultural experience of the user, by involving the most important recent innovations in the ICT field. The main components of the proposed architecture are: (i) an indoor localization service based on the Bluetooth Low Energy technology, (ii) a wearable device able to capture and process images related to the user's point of view, (iii) the user's mobile device, used to display customized cultural contents and to share multimedia data in the Cloud, and (iv) a processing center that manages the core of the whole business logic. In particular, the processing center interacts with both wearable and mobile devices, and communicates with the outside world to retrieve contents from the Cloud and to provide services also to external users. The proposal is currently under development and it will be validated in the MUST museum in Lecce.
Learning to Divide and Conquer for Online Multi-Target Tracking
Authors: Solera, Francesco; Calderara, Simone; Cucchiara, Rita
Published in: PROCEEDINGS IEEE INTERNATIONAL CONFERENCE ON COMPUTER VISION
Online Multiple Target Tracking (MTT) is often addressed within the tracking-by-detection paradigm. Detections are first extracted independently in each frame and then object trajectories are built by maximizing specifically designed coherence functions. Nevertheless, ambiguities arise in the presence of occlusions or detection errors. In this paper we claim that the ambiguities in tracking could be solved by a selective use of the features, by working with more reliable features if possible and exploiting a deeper representation of the target only if necessary. To this end, we propose an online divide and conquer tracker for static camera scenes, which partitions the assignment problem into local subproblems and solves them by selectively choosing and combining the best features. The complete framework is cast as a structural learning task that unifies these phases and learns tracker parameters from examples. Experiments on two different datasets highlight a significant improvement of tracking performance (MOTA +10%) over the state of the art.
Learning to identify leaders in crowd
Authors: Solera, Francesco; Calderara, Simone; Cucchiara, Rita
Published in: IEEE COMPUTER SOCIETY CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION WORKSHOPS
Leader identification is a crucial task in social analysis, crowd management and emergency planning. In this paper, we investigate a computational model for the identification of leaders in crowded scenes. We deal with the lack of a formal definition of leadership by learning, in a supervised fashion, a metric space based exclusively on people's spatiotemporal information. Based on Tarde's work on crowd psychology, individuals are modeled as nodes of a directed graph and leaders inherit their relevance from other members' references. We note this is analogous to the way websites are ranked by the PageRank algorithm. During experiments, we observed different feature weights depending on the specific type of crowd, highlighting the impossibility of providing a unique interpretation of leadership. To our knowledge, this is the first attempt to study leader identification as a metric learning problem.
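The PageRank analogy mentioned in this abstract can be illustrated with a minimal sketch. It is not the paper's method, which learns a metric space from spatiotemporal features; it only shows how relevance can flow along directed references. The function name `pagerank`, the toy graph `follows`, the edge direction (follower points to the person followed), and the damping value 0.85 are all assumptions made for illustration.

```python
def pagerank(edges, damping=0.85, iterations=100):
    """Power-iteration PageRank on a directed graph.

    `edges` maps each node to the list of nodes it references
    (here, hypothetically: the people it follows).
    """
    nodes = set(edges)
    for targets in edges.values():
        nodes.update(targets)
    nodes = sorted(nodes)
    n = len(nodes)

    # Start from a uniform distribution over all individuals.
    rank = {v: 1.0 / n for v in nodes}
    for _ in range(iterations):
        # Rank held by nodes with no outgoing references is spread uniformly.
        dangling = sum(rank[v] for v in nodes if not edges.get(v))
        new_rank = {v: (1.0 - damping) / n + damping * dangling / n for v in nodes}
        # Every other node splits its rank evenly among the nodes it references.
        for src, targets in edges.items():
            if targets:
                share = damping * rank[src] / len(targets)
                for dst in targets:
                    new_rank[dst] += share
        rank = new_rank
    return rank


# Hypothetical toy crowd: "a" and "b" follow "leader", and "c" follows "b".
follows = {"a": ["leader"], "b": ["leader"], "c": ["b"], "leader": []}
scores = pagerank(follows)
top = max(scores, key=scores.get)
```

In this toy graph the node referenced by the most (and best-referenced) members ends up with the highest score, which is the sense in which a leader inherits relevance from followers.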
Mapping Appearance Descriptors on 3D Body Models for People Re-identification
Authors: Baltieri, Davide; Vezzani, Roberto; Cucchiara, Rita
Published in: INTERNATIONAL JOURNAL OF COMPUTER VISION
People Re-identification aims at associating multiple instances of a person’s appearance acquired from different points of view, different cameras, or after a spatial or a limited temporal gap to the same identifier. The basic hypothesis is that the person’s appearance is mostly constant. Many appearance descriptors have been adopted in the past, but they are often subject to severe perspective and view-point issues. In this paper, we propose a complete re-identification framework which exploits non-articulated 3D body models to spatially map appearance descriptors (color and gradient histograms) into the vertices of a regularly sampled 3D body surface. The matching and the shot integration steps are directly handled in the 3D body model, reducing the effects of occlusions, partial views or pose changes, which normally afflict 2D descriptors. A fast and effective model-to-image alignment is also proposed. It allows operation on common surveillance cameras or image collections. A comprehensive experimental evaluation is presented using the benchmark suite 3DPeS.
Measuring scene detection performance
Authors: Baraldi, Lorenzo; Grana, Costantino; Cucchiara, Rita
Published in: LECTURE NOTES IN ARTIFICIAL INTELLIGENCE
In this paper we evaluate the performance of scene detection techniques, starting from the classic precision/recall approach, moving to the better designed coverage/overflow measures, and finally proposing an improved metric that addresses frequently observed cases in which the numeric interpretation differs from the expected results. Numerical evaluation is performed on two recent proposals for automatic scene detection, which are compared with a simple but effective novel approach. Experimental results show how different measures may lead to different interpretations.
MicroRNA/mRNA interactions underlying colorectal cancer molecular subtypes
Authors: Cantini, Laura; Isella, Claudio; Petti, Consalvo; Picco, Gabriele; Chiola, Simone; Ficarra, Elisa; Caselle, Michele; Medico, Enzo
Published in: NATURE COMMUNICATIONS
Colorectal cancer (CRC) molecular subtypes have been recently identified by gene expression profiling. To search for microRNAs potentially driving the subtypes, we designed an analytical pipeline, microRNA Master Regulator Analysis (MMRA). As input, MMRA requires a paired microRNA/mRNA expression dataset, with samples subdivided into two or more subgroups, and gene expression signatures specific for each subgroup. MMRA then identifies candidate regulator microRNAs by assessing their subtype-specific expression, target gene enrichment in subtype signatures and network analysis-based contribution to subtype gene expression. MMRA was applied to a CRC dataset of 450 samples, assigned to various subtypes by three different transcriptional classifiers. In total, 24 microRNAs were associated with subtypes, in most cases negatively contributing to the stem/serrated/mesenchymal (SSM) poor prognosis subtype. Functional validation in CRC cell lines confirmed downregulation of the SSM subtype by miR-194, miR-200b, miR-203 and miR-429, and highlighted shared target genes and pathways mediating this effect.