Source author record

Steven L. Waslander

Steven L. Waslander appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher · Unclaimed source record

Computer Vision; Robotics; Multiagent Systems; Networking and Internet Architecture

Catalog footprint

What is connected

11 works

4 topics

4 close collaborators


preprint · 2022 · arXiv

Learned Camera Gain and Exposure Control for Improved Visual Feature Detection and Matching

Successful visual navigation depends upon capturing images that contain sufficient useful information. In this letter, we explore a data-driven approach to account for environmental lighting changes, improving the quality of images for use in visual odometry (VO) or visual simultaneous localization and mapping (SLAM). We train a deep convolutional neural network model to predictively adjust camera gain and exposure time parameters such that consecutive images contain a maximal number of matchable features. The training process is fully self-supervised: our training signal is derived from an underlying VO or SLAM pipeline and, as a result, the model is optimized to perform well with that specific pipeline. We demonstrate through extensive real-world experiments that our network can anticipate and compensate for dramatic lighting changes (e.g., transitions into and out of road tunnels), maintaining a substantially higher number of inlier feature matches than competing camera parameter control algorithms.

preprint · 2022 · arXiv

Next-Best-View Prediction for Active Stereo Cameras and Highly Reflective Objects

Depth acquisition with an active stereo camera is a challenging task for highly reflective objects. When the setup permits, multi-view fusion can provide increased levels of depth completion. However, due to the slow acquisition speed of high-end active stereo cameras, collecting a large number of viewpoints for a single scene is generally not practical. In this work, we propose a next-best-view framework to strategically select camera viewpoints for completing depth data on reflective objects. In particular, we explicitly model the specular reflection of reflective surfaces based on the Phong reflection model and a photometric response function. Given the object CAD model and grayscale image, we employ an RGB-based pose estimator to obtain current pose predictions from the existing data, which are used to form predicted surface normal and depth hypotheses and allow us to assess the information gain from a subsequent frame for any candidate viewpoint. Using this formulation, we implement an active perception pipeline which is evaluated on a challenging real-world dataset. The evaluation results demonstrate that our active depth acquisition method outperforms two strong baselines for both depth completion and object pose estimation performance.
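The abstract names the Phong reflection model as the basis for its specular term. As a rough illustration only (not the paper's pipeline, which also involves a photometric response function and pose hypotheses), the textbook Phong specular term can be sketched as below. The function names, the `k_s` coefficient, and the `shininess` exponent are hypothetical choices made for this sketch.

```python
import math

def normalize(v):
    """Return v scaled to unit length."""
    n = math.sqrt(sum(c * c for c in v))
    return tuple(c / n for c in v)

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def phong_specular(normal, to_light, to_viewer, k_s=1.0, shininess=32.0):
    """Textbook Phong specular term: k_s * max(R . V, 0) ** shininess.

    R is the light direction mirrored about the surface normal,
    R = 2 (N . L) N - L. All parameter names are assumptions.
    """
    n, l, v = normalize(normal), normalize(to_light), normalize(to_viewer)
    n_dot_l = dot(n, l)
    if n_dot_l <= 0.0:
        return 0.0  # light is behind the surface, no highlight
    r = tuple(2.0 * n_dot_l * nc - lc for nc, lc in zip(n, l))
    return k_s * max(dot(r, v), 0.0) ** shininess
```

A viewpoint along the mirror direction sees the strongest highlight, which is the kind of viewpoint where an active stereo sensor is likely to lose depth data on a shiny surface.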

preprint · 2022 · arXiv

POCD: Probabilistic Object-Level Change Detection and Volumetric Mapping in Semi-Static Scenes

Maintaining an up-to-date map to reflect recent changes in the scene is very important, particularly in situations involving repeated traversals by a robot operating in an environment over an extended period. Undetected changes may cause a deterioration in map quality, leading to poor localization, inefficient operations, and lost robots. Volumetric methods, such as truncated signed distance functions (TSDFs), have quickly gained traction due to their real-time production of a dense and detailed map, though map updating in scenes that change over time remains a challenge. We propose a framework that introduces a novel probabilistic object state representation to track object pose changes in semi-static scenes. The representation jointly models a stationarity score and a TSDF change measure for each object. A Bayesian update rule that incorporates both geometric and semantic information is derived to achieve consistent online map maintenance. To extensively evaluate our approach alongside the state-of-the-art, we release a novel real-world dataset in a warehouse environment. We also evaluate on the public ToyCar dataset. Our method outperforms state-of-the-art methods on the reconstruction quality of semi-static environments.

preprint · 2022 · arXiv

Point Density-Aware Voxels for LiDAR 3D Object Detection

LiDAR has become one of the primary 3D object detection sensors in autonomous driving. However, LiDAR's diverging point pattern with increasing distance results in a non-uniformly sampled point cloud ill-suited to discretized volumetric feature extraction. Current methods either rely on voxelized point clouds or use inefficient farthest point sampling to mitigate detrimental effects caused by density variation, but largely ignore point density as a feature and its predictable relationship with distance from the LiDAR sensor. Our proposed solution, Point Density-Aware Voxel network (PDV), is an end-to-end two-stage LiDAR 3D object detection architecture that is designed to account for these point density variations. PDV efficiently localizes voxel features from the 3D sparse convolution backbone through voxel point centroids. The spatially localized voxel features are then aggregated through a density-aware RoI grid pooling module using kernel density estimation (KDE) and self-attention with point density positional encoding. Finally, we exploit LiDAR's point-density-to-distance relationship to refine our final bounding box confidences. PDV outperforms all state-of-the-art methods on the Waymo Open Dataset and achieves competitive results on the KITTI dataset. We provide a code release for PDV which is available at https://github.com/TRAILab/PDV.
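To make the idea of point density as a feature concrete, here is a minimal sketch of a standard isotropic Gaussian kernel density estimate evaluated at a 3D query location. It is not taken from the PDV code release; the function name, the `bandwidth` parameter, and the choice of a Gaussian kernel are assumptions made for illustration.

```python
import math

def gaussian_kde_density(query, points, bandwidth=0.5):
    """Gaussian kernel density estimate at `query` from a list of 3D points.

    Uses the textbook form: mean over points of an isotropic 3D Gaussian
    with standard deviation `bandwidth` centred on each point.
    """
    if not points:
        return 0.0
    # Normalising constant of a 3D isotropic Gaussian.
    norm = (2.0 * math.pi) ** 1.5 * bandwidth ** 3
    total = 0.0
    for p in points:
        sq_dist = sum((q - c) ** 2 for q, c in zip(query, p))
        total += math.exp(-sq_dist / (2.0 * bandwidth ** 2))
    return total / (len(points) * norm)
```

Evaluated at grid locations inside a region of interest, an estimate like this is higher where returns cluster and lower where the cloud thins out, which is the variation the abstract describes for distant objects.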

preprint · 2020 · arXiv

Confidence Guided Stereo 3D Object Detection with Split Depth Estimation

Accurate and reliable 3D object detection is vital to safe autonomous driving. Despite recent developments, the performance gap between stereo-based methods and LiDAR-based methods is still considerable. Accurate depth estimation is crucial to the performance of stereo-based 3D object detection methods, particularly for those pixels associated with objects in the foreground. Moreover, stereo-based methods suffer from high variance in the depth estimation accuracy, which is often not considered in the object detection pipeline. To tackle these two issues, we propose CG-Stereo, a confidence-guided stereo 3D object detection pipeline that uses separate decoders for foreground and background pixels during depth estimation, and leverages the confidence estimation from the depth estimation network as a soft attention mechanism in the 3D object detector. Our approach outperforms all state-of-the-art stereo-based 3D detectors on the KITTI benchmark.

preprint · 2020 · arXiv

Object-Centric Stereo Matching for 3D Object Detection

Safe autonomous driving requires reliable 3D object detection: determining the 6 DoF pose and dimensions of objects of interest. Using stereo cameras to solve this task is a cost-effective alternative to the widely used LiDAR sensor. The current state-of-the-art for stereo 3D object detection takes the existing PSMNet stereo matching network with no modifications, converts the estimated disparities into a 3D point cloud, and feeds this point cloud into a LiDAR-based 3D object detector. The issue with existing stereo matching networks is that they are designed for disparity estimation, not 3D object detection; the shape and accuracy of object point clouds are not the focus. Stereo matching networks commonly suffer from inaccurate depth estimates at object boundaries, which we define as streaking, because background and foreground points are jointly estimated. Existing networks also penalize disparity instead of the estimated position of object point clouds in their loss functions. We propose a novel 2D box association and object-centric stereo matching method that only estimates the disparities of the objects of interest to address these two issues. Our method achieves state-of-the-art results on the KITTI 3D and BEV benchmarks.
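The step of converting estimated disparities into a 3D point cloud follows the standard rectified-stereo relation Z = f·B/d. The sketch below assumes an ideal rectified pinhole pair; the function and parameter names (`focal_px`, `baseline_m`, `cx`, `cy`) are illustrative and do not come from the paper.

```python
def disparity_to_point(u, v, disparity, focal_px, baseline_m, cx, cy):
    """Back-project pixel (u, v) with a disparity in pixels to camera-frame XYZ.

    Assumes a rectified pinhole stereo pair: Z = f * B / d.
    Returns None for non-positive disparities (no valid depth).
    """
    if disparity <= 0:
        return None
    z = focal_px * baseline_m / disparity
    x = (u - cx) * z / focal_px
    y = (v - cy) * z / focal_px
    return (x, y, z)

def disparity_map_to_cloud(disp, focal_px, baseline_m, cx, cy):
    """Convert a row-major disparity map (list of rows) to a list of 3D points."""
    cloud = []
    for v, row in enumerate(disp):
        for u, d in enumerate(row):
            p = disparity_to_point(u, v, d, focal_px, baseline_m, cx, cy)
            if p is not None:
                cloud.append(p)
    return cloud
```

Because depth varies with 1/d, a fixed disparity error produces a depth error that grows roughly with the square of the distance. This is one reason a loss on disparity and a loss on 3D position can rank the same estimates differently.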

preprint · 2020 · arXiv

Vehicle Scheduling Problem

We define a new problem called the Vehicle Scheduling Problem (VSP). The goal is to minimize an objective function, such as the number of tardy vehicles, over a transportation network, subject to maintaining safety distances, meeting hard deadlines, and maintaining speeds on each link between the allowed minimums and maximums. We prove VSP is an NP-hard problem for multiple objective functions that are commonly used in the context of job shop scheduling. With the number of tardy vehicles as the objective function, we formulate VSP as a Mixed Integer Linear Program (MIP) and design a heuristic algorithm. We analyze the complexity of our algorithm and compare the quality of its solutions to the optimal solution of the MIP formulation on small instances. Our main motivation for defining VSP is the upcoming integration of Unmanned Aerial Vehicles (UAVs) into the airspace, for which this novel scheduling framework is of paramount importance.
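Two pieces of arithmetic in this problem statement can be shown directly: the tardy-vehicle objective and the traversal-time window implied by speed bounds on a link. The sketch below is not the paper's MIP or heuristic. It assumes each vehicle has a single arrival time and a due time, and all names are hypothetical.

```python
def count_tardy(arrival_times, due_times):
    """Number of vehicles whose arrival time is strictly later than its due time.

    Both arguments are dicts keyed by vehicle id (an assumed representation).
    """
    return sum(1 for vid, t in arrival_times.items() if t > due_times[vid])

def link_time_window(length, v_min, v_max):
    """Feasible traversal-time interval for a link given speed bounds.

    The fastest traversal takes length / v_max; the slowest takes length / v_min.
    """
    if not 0 < v_min <= v_max:
        raise ValueError("speed bounds must satisfy 0 < v_min <= v_max")
    return (length / v_max, length / v_min)
```

For example, a 100 m link with speeds restricted to 5 to 20 m/s can be traversed in no less than 5 s and no more than 20 s, which bounds how far a scheduler can delay or advance a vehicle on that link.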

preprint · 2018 · arXiv

Aerial Imagery for Roof Segmentation: A Large-Scale Dataset towards Automatic Mapping of Buildings

arXiv admin note: This version has been removed as the user did not have the right to agree to the license at the time of submission.

preprint · 2016 · arXiv

Internet of Drones

The Internet of Drones (IoD) is a layered network control architecture designed mainly for coordinating the access of unmanned aerial vehicles to controlled airspace, and providing navigation services between locations referred to as nodes. The IoD provides generic services for various drone applications such as package delivery, traffic surveillance, search and rescue, and more. In this paper, we present a conceptual model of how such an architecture can be organized and we specify the features that an IoD system based on our architecture should implement. To do so, we extract key concepts from three existing large-scale networks, namely the air traffic control network, the cellular network, and the Internet, and explore their connections to our novel architecture for drone traffic management.

preprint · 2015 · arXiv

3D Scan Registration using Curvelet Features in Planetary Environments

Topographic mapping in planetary environments relies on accurate 3D scan registration methods. However, most global registration algorithms relying on features such as FPFH and Harris-3D show poor alignment accuracy in these settings due to the poor structure of the Mars-like terrain and variable-resolution, occluded, sparse range data that is hard to register without some a priori knowledge of the environment. In this paper, we propose an alternative approach to 3D scan registration using the curvelet transform that performs multi-resolution geometric analysis to obtain a set of coefficients indexed by scale (coarsest to finest), angle and spatial position. Features are detected in the curvelet domain to take advantage of the directional selectivity of the transform. A descriptor is computed for each feature by calculating the 3D spatial histogram of the image gradients, and nearest-neighbor-based matching is used to calculate the feature correspondences. Correspondence rejection using Random Sample Consensus identifies inliers, and a locally optimal Singular Value Decomposition-based estimation of the rigid-body transformation aligns the laser scans given the re-projected correspondences in the metric space. Experimental results on a publicly available dataset of a planetary analogue indoor facility, as well as simulated and real-world scans from Neptec Design Group's IVIGMS 3D laser rangefinder at the outdoor CSA Mars yard, demonstrate improved performance over existing methods in the challenging sparse Mars-like terrain.

preprint · 2015 · arXiv

Degenerate Motions in Multicamera Cluster SLAM with Non-overlapping Fields of View

An analysis of the relative motion and point feature model configurations leading to solution degeneracy is presented for the case of a Simultaneous Localization and Mapping system using multicamera clusters with non-overlapping fields of view. The SLAM optimization system seeks to minimize image space reprojection error and is formulated for a cluster containing any number of component cameras, observing any number of point features over two keyframes. The measurement Jacobian is transformed to expose a reduced-dimension representation such that the degeneracy of the system can be determined by the rank of a dense submatrix. A set of relative motions sufficient for degeneracy is identified for certain cluster configurations, independent of target model geometry. Furthermore, it is shown that increasing the number of cameras within the cluster and observing features across different cameras over the two keyframes reduces the size of the degenerate motion sets significantly.
