Source author record

Ferran Diego

Ferran Diego appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

ResearcherUnclaimed source record

Computer Vision Computation and Language Information Retrieval Multimedia Neurons and Cognition

Catalog footprint

What is connected

3works

5topics

4close collaborators

Actions

Connect this record

Open graph Browse works

Inspect adjacent papers, topics, institutions and collaborators without losing the researcher page.

Building this map preview

BZPEER is loading the nearby papers, people, topics and institutions for this page.

preprint2020arXiv

Transcription-Enriched Joint Embeddings for Spoken Descriptions of Images and Videos

In this work, we propose an effective approach for training unique embedding representations by combining three simultaneous modalities: image and spoken and textual narratives. The proposed methodology departs from a baseline system that spawns a embedding space trained with only spoken narratives and image cues. Our experiments on the EPIC-Kitchen and Places Audio Caption datasets show that introducing the human-generated textual transcriptions of the spoken narratives helps to the training procedure yielding to get better embedding representations. The triad speech, image and words allows for a better estimate of the point embedding and show an improving of the performance within tasks like image and speech retrieval, even when text third modality, text, is not present in the task.

preprint2016arXiv

Sparse convolutional coding for neuronal ensemble identification

Cell ensembles, originally proposed by Donald Hebb in 1949, are subsets of synchronously firing neurons and proposed to explain basic firing behavior in the brain. Despite having been studied for many years no conclusive evidence has been presented yet for their existence and involvement in information processing such that their identification is still a topic of modern research, especially since simultaneous recordings of large neuronal population have become possible in the past three decades. These large recordings pose a challenge for methods allowing to identify individual neurons forming cell ensembles and their time course of activity inside the vast amounts of spikes recorded. Related work so far focused on the identification of purely simulta- neously firing neurons using techniques such as Principal Component Analysis. In this paper we propose a new algorithm based on sparse convolution coding which is also able to find ensembles with temporal structure. Application of our algorithm to synthetically generated datasets shows that it outperforms previous work and is able to accurately identify temporal cell ensembles even when those contain overlapping neurons or when strong background noise is present.

preprint2014arXiv

Road Detection via On--line Label Transfer

Vision-based road detection is an essential functionality for supporting advanced driver assistance systems (ADAS) such as road following and vehicle and pedestrian detection. The major challenges of road detection are dealing with shadows and lighting variations and the presence of other objects in the scene. Current road detection algorithms characterize road areas at pixel level and group pixels accordingly. However, these algorithms fail in presence of strong shadows and lighting variations. Therefore, we propose a road detection algorithm based on video alignment. The key idea of the algorithm is to exploit the similarities occurred when a vehicle follows the same trajectory more than once. In this way, road areas are learned in a first ride and then, this road knowledge is used to infer areas depicting drivable road surfaces in subsequent rides. Two different experiments are conducted to validate the proposal on different video sequences taken at different scenarios and different daytime. The former aims to perform on-line road detection. The latter aims to perform off-line road detection and is applied to automatically generate the ground-truth necessary to validate road detection algorithms. Qualitative and quantitative evaluations prove that the proposed algorithm is a valid road detection approach.