Publications by Lorenzo Baraldi

Explore our research publications: papers, articles, and conference proceedings from AImageLab.


Context Change Detection for an Ultra-Low Power Low-Resolution Ego-Vision Imager

Authors: Paci, Francesco; Baraldi, Lorenzo; Serra, Giuseppe; Cucchiara, Rita; Benini, Luca

Published in: LECTURE NOTES IN COMPUTER SCIENCE

With the increasing popularity of wearable cameras, such as GoPro or Narrative Clip, research on continuous activity monitoring from egocentric cameras has received a lot of attention. Research in hardware and software is devoted to finding new efficient, stable, and long-running solutions; however, devices are too power-hungry for truly always-on operation, and are aggressively duty-cycled to achieve acceptable lifetimes. In this paper we present a wearable system for context change detection based on an egocentric camera with ultra-low power consumption that can collect data 24/7. Although the resolution of the captured images is low, experimental results in real scenarios demonstrate how our approach, based on Siamese Neural Networks, can achieve visual context awareness. In particular, we compare our solution with hand-crafted features and with state-of-the-art techniques, and propose a novel and challenging dataset composed of roughly 30,000 low-resolution images.

2016 Conference proceedings paper

Historical Document Digitization through Layout Analysis and Deep Content Classification

Authors: Corbelli, Andrea; Baraldi, Lorenzo; Grana, Costantino; Cucchiara, Rita

Document layout segmentation and recognition is an important task in the creation of digitized document collections, especially when dealing with historical documents. This paper presents a hybrid approach to layout segmentation as well as a strategy to classify document regions, which is applied to the process of digitization of a historical encyclopedia. Our layout analysis method merges a classic top-down approach and a bottom-up classification process based on local geometrical features, while regions are classified by means of features extracted from a Convolutional Neural Network combined with a Random Forest classifier. Experiments are conducted on the first volume of the "Enciclopedia Treccani", a large dataset containing 999 manually annotated pages from the historical Italian encyclopedia.

2016 Conference proceedings paper

Multi-Level Net: a Visual Saliency Prediction Model

Authors: Cornia, Marcella; Baraldi, Lorenzo; Serra, Giuseppe; Cucchiara, Rita

Published in: LECTURE NOTES IN COMPUTER SCIENCE

State-of-the-art approaches for saliency prediction are based on Fully Convolutional Networks, in which saliency maps are built using the last layer. In contrast, we present a novel model that predicts saliency maps by exploiting a non-linear combination of features coming from different layers of the network. We also present a new loss function to deal with the imbalance issue on saliency masks. Extensive results on three public datasets demonstrate the robustness of our solution. Our model outperforms the state of the art on SALICON, which is the largest unconstrained dataset available, and obtains competitive results on the MIT300 and CAT2000 benchmarks.

2016 Conference proceedings paper

Optimized Connected Components Labeling with Pixel Prediction

Authors: Grana, Costantino; Baraldi, Lorenzo; Bolelli, Federico

Published in: LECTURE NOTES IN COMPUTER SCIENCE

In this paper we propose a new paradigm for connected components labeling, which employs a general approach to minimize the number of memory accesses by exploiting the information provided by already seen pixels, removing the need to check them again. The scan phase of our proposed algorithm is ruled by a forest of decision trees connected into a single graph. Every tree derives from a reduction of the complete optimal decision tree. Experimental results demonstrate that, on low-density images, our method is slightly faster than the fastest conventional labeling algorithms.

2016 Conference proceedings paper

Scene-driven Retrieval in Edited Videos using Aesthetic and Semantic Deep Features

Authors: Baraldi, Lorenzo; Grana, Costantino; Cucchiara, Rita

This paper presents a novel retrieval pipeline for video collections, which aims to retrieve the most significant parts of an edited video for a given query and to represent them with thumbnails that are both semantically meaningful and aesthetically remarkable. Videos are first segmented into coherent, story-telling scenes; a retrieval algorithm based on deep learning is then proposed to retrieve the most significant scenes for a textual query. Finally, a ranking strategy based on deep features is used to tackle the problem of visualizing the best thumbnail. Qualitative and quantitative experiments are conducted on a collection of edited videos to demonstrate the effectiveness of our approach.

2016 Conference proceedings paper

Shot, scene and keyframe ordering for interactive video re-use

Authors: Baraldi, L.; Grana, C.; Borghi, G.; Vezzani, R.; Cucchiara, R.

This paper presents a complete system for shot and scene detection in broadcast videos, as well as a method to select the best representative key-frames, which could be used in new interactive interfaces for accessing large collections of edited videos. The final goal is to enable improved access to video footage and the re-use of video content through the direct management of user-selected video clips.

2016 Conference proceedings paper

YACCLAB - Yet Another Connected Components Labeling Benchmark

Authors: Grana, Costantino; Bolelli, Federico; Baraldi, Lorenzo; Vezzani, Roberto

Published in: INTERNATIONAL CONFERENCE ON PATTERN RECOGNITION

The problem of labeling the connected components (CCL) of a binary image is well-defined and several proposals have been presented in the past. Since an exact solution to the problem exists and must be provided as output, algorithms mainly differ in their execution speed. In this paper, we propose and describe YACCLAB, Yet Another Connected Components Labeling Benchmark. Together with a rich and varied dataset, YACCLAB contains an open source platform to test new proposals and to compare them with publicly available competitors. Textual and graphical outputs are automatically generated for three kinds of tests, which analyze the methods from different perspectives. The fairness of the comparisons is guaranteed by running on the same system and over the same datasets. Examples of usage and the corresponding comparisons among state-of-the-art techniques are reported to confirm the potential of the benchmark.

2016 Conference proceedings paper

A Deep Siamese Network for Scene Detection in Broadcast Videos

Authors: Baraldi, Lorenzo; Grana, Costantino; Cucchiara, Rita

We present a model that automatically divides broadcast videos into coherent scenes by learning a distance measure between shots. Experiments are performed to demonstrate the effectiveness of our approach by comparing our algorithm against recent proposals for automatic scene segmentation. We also propose an improved performance measure that aims to reduce the gap between numerical evaluation and expected results, and we introduce and release a new benchmark dataset.

2015 Conference proceedings paper

Gesture Recognition using Wearable Vision Sensors to Enhance Visitors' Museum Experiences

Authors: Baraldi, Lorenzo; Paci, Francesco; Serra, Giuseppe; Cucchiara, Rita

Published in: IEEE SENSORS JOURNAL

We introduce a novel approach to cultural heritage experience: by means of ego-vision embedded devices we develop a system that offers a more natural and entertaining way of accessing museum knowledge. Our method is based on distributed self-gesture and artwork recognition, and needs neither fixed cameras nor radio-frequency identification sensors. We propose the use of dense trajectories sampled around the hand region to perform self-gesture recognition, understanding the way a user naturally interacts with an artwork, and demonstrate that our approach can benefit from distributed training. We test our algorithms on publicly available data sets and we extend our experiments to both virtual and real museum scenarios, where our method shows robustness when challenged with real-world data. Furthermore, we run an extensive performance analysis on our ARM-based wearable device.

2015 Journal article

Measuring scene detection performance

Authors: Baraldi, Lorenzo; Grana, Costantino; Cucchiara, Rita

Published in: LECTURE NOTES IN ARTIFICIAL INTELLIGENCE

In this paper we evaluate the performance of scene detection techniques, starting from the classic precision/recall approach, moving to the better designed coverage/overflow measures, and finally proposing an improved metric to address frequently observed cases in which the numeric interpretation is different from the expected results. Numerical evaluation is performed on two recent proposals for automatic scene detection, which are compared with a simple but effective novel approach. Experiments are conducted to show how different measures may lead to different interpretations.

2015 Conference proceedings paper
