Source author record

Hugo Latapie

Hugo Latapie appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

ResearcherUnclaimed source record

Computer Vision Artificial Intelligence eess.IV eess.AS Machine Learning Sound

Catalog footprint

What is connected

5works

6topics

4close collaborators

Actions

Connect this record

Open graph Browse works

Inspect adjacent papers, topics, institutions and collaborators without losing the researcher page.

Building this map preview

BZPEER is loading the nearby papers, people, topics and institutions for this page.

preprint2022arXiv

Learning Omnidirectional Flow in 360-degree Video via Siamese Representation

Optical flow estimation in omnidirectional videos faces two significant issues: the lack of benchmark datasets and the challenge of adapting perspective video-based methods to accommodate the omnidirectional nature. This paper proposes the first perceptually natural-synthetic omnidirectional benchmark dataset with a 360-degree field of view, FLOW360, with 40 different videos and 4,000 video frames. We conduct comprehensive characteristic analysis and comparisons between our dataset and existing optical flow datasets, which manifest perceptual realism, uniqueness, and diversity. To accommodate the omnidirectional nature, we present a novel Siamese representation Learning framework for Omnidirectional Flow (SLOF). We train our network in a contrastive manner with a hybrid loss function that combines contrastive loss and optical flow loss. Extensive experiments verify the proposed framework's effectiveness and show up to 40% performance improvement over the state-of-the-art approaches. Our FLOW360 dataset and code are available at https://siamlof.github.io/.

preprint2021arXiv

A Metamodel and Framework for Artificial General Intelligence From Theory to Practice

This paper introduces a new metamodel-based knowledge representation that significantly improves autonomous learning and adaptation. While interest in hybrid machine learning / symbolic AI systems leveraging, for example, reasoning and knowledge graphs, is gaining popularity, we find there remains a need for both a clear definition of knowledge and a metamodel to guide the creation and manipulation of knowledge. Some of the benefits of the metamodel we introduce in this paper include a solution to the symbol grounding problem, cumulative learning, and federated learning. We have applied the metamodel to problems ranging from time series analysis, computer vision, and natural language understanding and have found that the metamodel enables a wide variety of learning mechanisms ranging from machine learning, to graph network analysis and learning by reasoning engines to interoperate in a highly synergistic way. Our metamodel-based projects have consistently exhibited unprecedented accuracy, performance, and ability to generalize. This paper is inspired by the state-of-the-art approaches to AGI, recent AGI-aspiring work, the granular computing community, as well as Alfred Korzybski's general semantics. One surprising consequence of the metamodel is that it not only enables a new level of autonomous learning and optimal functioning for machine intelligences, but may also shed light on a path to better understanding how to improve human cognition.

preprint2021arXiv

Learning Audio-Visual Correlations from Variational Cross-Modal Generation

People can easily imagine the potential sound while seeing an event. This natural synchronization between audio and visual signals reveals their intrinsic correlations. To this end, we propose to learn the audio-visual correlations from the perspective of cross-modal generation in a self-supervised manner, the learned correlations can be then readily applied in multiple downstream tasks such as the audio-visual cross-modal localization and retrieval. We introduce a novel Variational AutoEncoder (VAE) framework that consists of Multiple encoders and a Shared decoder (MS-VAE) with an additional Wasserstein distance constraint to tackle the problem. Extensive experiments demonstrate that the optimized latent representation of the proposed MS-VAE can effectively learn the audio-visual correlations and can be readily applied in multiple audio-visual downstream tasks to achieve competitive performance even without any given label information during training.

preprint2020arXiv

A Metamodel and Framework for AGI

Can artificial intelligence systems exhibit superhuman performance, but in critical ways, lack the intelligence of even a single-celled organism? The answer is clearly 'yes' for narrow AI systems. Animals, plants, and even single-celled organisms learn to reliably avoid danger and move towards food. This is accomplished via a physical knowledge preserving metamodel that autonomously generates useful models of the world. We posit that preserving the structure of knowledge is critical for higher intelligences that manage increasingly higher levels of abstraction, be they human or artificial. This is the key lesson learned from applying AGI subsystems to complex real-world problems that require continuous learning and adaptation. In this paper, we introduce the Deep Fusion Reasoning Engine (DFRE), which implements a knowledge-preserving metamodel and framework for constructing applied AGI systems. The DFRE metamodel exhibits some important fundamental knowledge preserving properties such as clear distinctions between symmetric and antisymmetric relations, and the ability to create a hierarchical knowledge representation that clearly delineates between levels of abstraction. The DFRE metamodel, which incorporates these capabilities, demonstrates how this approach benefits AGI in specific ways such as managing combinatorial explosion and enabling cumulative, distributed and federated learning. Our experiments show that the proposed framework achieves 94% accuracy on average on unsupervised object detection and recognition. This work is inspired by the state-of-the-art approaches to AGI, recent AGI-aspiring work, the granular computing community, as well as Alfred Korzybski's general semantics.

preprint2020arXiv

Exocentric to Egocentric Image Generation via Parallel Generative Adversarial Network

Cross-view image generation has been recently proposed to generate images of one view from another dramatically different view. In this paper, we investigate exocentric (third-person) view to egocentric (first-person) view image generation. This is a challenging task since egocentric view sometimes is remarkably different from exocentric view. Thus, transforming the appearances across the two views is a non-trivial task. To this end, we propose a novel Parallel Generative Adversarial Network (P-GAN) with a novel cross-cycle loss to learn the shared information for generating egocentric images from exocentric view. We also incorporate a novel contextual feature loss in the learning procedure to capture the contextual information in images. Extensive experiments on the Exo-Ego datasets show that our model outperforms the state-of-the-art approaches.

Hugo Latapie

What is connected

Connect this record

See the researcher in context

Building this map preview

5 published item(s)

Learning Omnidirectional Flow in 360-degree Video via Siamese Representation

A Metamodel and Framework for Artificial General Intelligence From Theory to Practice

Learning Audio-Visual Correlations from Variational Cross-Modal Generation

A Metamodel and Framework for AGI

Exocentric to Egocentric Image Generation via Parallel Generative Adversarial Network