Source author record

Fredrik Kahl

Fredrik Kahl appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

ResearcherUnclaimed source record

Computer Vision Machine Learning Artificial Intelligence math.OC Robotics

Catalog footprint

What is connected

9works

5topics

4close collaborators

Actions

Connect this record

Open graph Browse works

Inspect adjacent papers, topics, institutions and collaborators without losing the researcher page.

Building this map preview

BZPEER is loading the nearby papers, people, topics and institutions for this page.

preprint2023arXiv

TimePillars: Temporally-Recurrent 3D LiDAR Object Detection

Object detection applied to LiDAR point clouds is a relevant task in robotics, and particularly in autonomous driving. Single frame methods, predominant in the field, exploit information from individual sensor scans. Recent approaches achieve good performance, at relatively low inference time. Nevertheless, given the inherent high sparsity of LiDAR data, these methods struggle in long-range detection (e.g. 200m) which we deem to be critical in achieving safe automation. Aggregating multiple scans not only leads to a denser point cloud representation, but it also brings time-awareness to the system, and provides information about how the environment is changing. Solutions of this kind, however, are often highly problem-specific, demand careful data processing, and tend not to fulfil runtime requirements. In this context we propose TimePillars, a temporally-recurrent object detection pipeline which leverages the pillar representation of LiDAR data across time, respecting hardware integration efficiency constraints, and exploiting the diversity and long-range information of the novel Zenseact Open Dataset (ZOD). Through experimentation, we prove the benefits of having recurrency, and show how basic building blocks are enough to achieve robust and efficient results.

preprint2022arXiv

A case for using rotation invariant features in state of the art feature matchers

The aim of this paper is to demonstrate that a state of the art feature matcher (LoFTR) can be made more robust to rotations by simply replacing the backbone CNN with a steerable CNN which is equivariant to translations and image rotations. It is experimentally shown that this boost is obtained without reducing performance on ordinary illumination and viewpoint matching sequences.

preprint2022arXiv

DoubleMatch: Improving Semi-Supervised Learning with Self-Supervision

Following the success of supervised learning, semi-supervised learning (SSL) is now becoming increasingly popular. SSL is a family of methods, which in addition to a labeled training set, also use a sizable collection of unlabeled data for fitting a model. Most of the recent successful SSL methods are based on pseudo-labeling approaches: letting confident model predictions act as training labels. While these methods have shown impressive results on many benchmark datasets, a drawback of this approach is that not all unlabeled data are used during training. We propose a new SSL algorithm, DoubleMatch, which combines the pseudo-labeling technique with a self-supervised loss, enabling the model to utilize all unlabeled data in the training process. We show that this method achieves state-of-the-art accuracies on multiple benchmark datasets while also reducing training times compared to existing SSL methods. Code is available at https://github.com/walline/doublematch.

preprint2022arXiv

ZZ-Net: A Universal Rotation Equivariant Architecture for 2D Point Clouds

In this paper, we are concerned with rotation equivariance on 2D point cloud data. We describe a particular set of functions able to approximate any continuous rotation equivariant and permutation invariant function. Based on this result, we propose a novel neural network architecture for processing 2D point clouds and we prove its universality for approximating functions exhibiting these symmetries. We also show how to extend the architecture to accept a set of 2D-2D correspondences as indata, while maintaining similar equivariance properties. Experiments are presented on the estimation of essential matrices in stereo vision.

preprint2021arXiv

How Privacy-Preserving are Line Clouds? Recovering Scene Details from 3D Lines

Visual localization is the problem of estimating the camera pose of a given image with respect to a known scene. Visual localization algorithms are a fundamental building block in advanced computer vision applications, including Mixed and Virtual Reality systems. Many algorithms used in practice represent the scene through a Structure-from-Motion (SfM) point cloud and use 2D-3D matches between a query image and the 3D points for camera pose estimation. As recently shown, image details can be accurately recovered from SfM point clouds by translating renderings of the sparse point clouds to images. To address the resulting potential privacy risks for user-generated content, it was recently proposed to lift point clouds to line clouds by replacing 3D points by randomly oriented 3D lines passing through these points. The resulting representation is unintelligible to humans and effectively prevents point cloud-to-image translation. This paper shows that a significant amount of information about the 3D scene geometry is preserved in these line clouds, allowing us to (approximately) recover the 3D point positions and thus to (approximately) recover image content. Our approach is based on the observation that the closest points between lines can yield a good approximation to the original 3D points. Code is available at https://github.com/kunalchelani/Line2Point.

preprint2020arXiv

Pose Proposal Critic: Robust Pose Refinement by Learning Reprojection Errors

In recent years, considerable progress has been made for the task of rigid object pose estimation from a single RGB-image, but achieving robustness to partial occlusions remains a challenging problem. Pose refinement via rendering has shown promise in order to achieve improved results, in particular, when data is scarce. In this paper we focus our attention on pose refinement, and show how to push the state-of-the-art further in the case of partial occlusions. The proposed pose refinement method leverages on a simplified learning task, where a CNN is trained to estimate the reprojection error between an observed and a rendered image. We experiment by training on purely synthetic data as well as a mixture of synthetic and real data. Current state-of-the-art results are outperformed for two out of three metrics on the Occlusion LINEMOD benchmark, while performing on-par for the final metric.

preprint2020arXiv

Single-Image Depth Prediction Makes Feature Matching Easier

Good local features improve the robustness of many 3D re-localization and multi-view reconstruction pipelines. The problem is that viewing angle and distance severely impact the recognizability of a local feature. Attempts to improve appearance invariance by choosing better local feature points or by leveraging outside information, have come with pre-requisites that made some of them impractical. In this paper, we propose a surprisingly effective enhancement to local feature extraction, which improves matching. We show that CNN-based depths inferred from single RGB images are quite helpful, despite their flaws. They allow us to pre-warp images and rectify perspective distortions, to significantly enhance SIFT and BRISK features, enabling more good matches, even when cameras are looking at the same scene but in opposite directions.

preprint2015arXiv

Multiresolution Search of the Rigid Motion Space for Intensity Based Registration

We study the relation between the target functions of low-resolution and high-resolution intensity-based registration for the class of rigid transformations. Our results show that low resolution target values can tightly bound the high-resolution target function in natural images. This can help with analyzing and better understanding the process of multiresolution image registration. It also gives a guideline for designing multiresolution algorithms in which the search space in higher resolution registration is restricted given the fitness values for lower resolution image pairs. To demonstrate this, we incorporate our multiresolution technique into a Lipschitz global optimization framework. We show that using the multiresolution scheme can result in large gains in the efficiency of such algorithms. The method is evaluated by applying to 2D and 3D registration problems as well as the detection of reflective symmetry in 2D and 3D images.

preprint2011arXiv

A linear framework for region-based image segmentation and inpainting involving curvature penalization

We present the first method to handle curvature regularity in region-based image segmentation and inpainting that is independent of initialization. To this end we start from a new formulation of length-based optimization schemes, based on surface continuation constraints, and discuss the connections to existing schemes. The formulation is based on a \emph{cell complex} and considers basic regions and boundary elements. The corresponding optimization problem is cast as an integer linear program. We then show how the method can be extended to include curvature regularity, again cast as an integer linear program. Here, we are considering pairs of boundary elements to reflect curvature. Moreover, a constraint set is derived to ensure that the boundary variables indeed reflect the boundary of the regions described by the region variables. We show that by solving the linear programming relaxation one gets quite close to the global optimum, and that curvature regularity is indeed much better suited in the presence of long and thin objects compared to standard length regularity.

Fredrik Kahl

What is connected

Connect this record

See the researcher in context

Building this map preview

9 published item(s)

TimePillars: Temporally-Recurrent 3D LiDAR Object Detection

A case for using rotation invariant features in state of the art feature matchers

DoubleMatch: Improving Semi-Supervised Learning with Self-Supervision

ZZ-Net: A Universal Rotation Equivariant Architecture for 2D Point Clouds

How Privacy-Preserving are Line Clouds? Recovering Scene Details from 3D Lines

Pose Proposal Critic: Robust Pose Refinement by Learning Reprojection Errors

Single-Image Depth Prediction Makes Feature Matching Easier

Multiresolution Search of the Rigid Motion Space for Intensity Based Registration

A linear framework for region-based image segmentation and inpainting involving curvature penalization