Source author record

Kevin Karsch

Kevin Karsch appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

ResearcherUnclaimed source record

Computer Vision Graphics eess.IV

Catalog footprint

What is connected

5works

3topics

4close collaborators

Actions

Connect this record

Open graph Browse works

Inspect adjacent papers, topics, institutions and collaborators without losing the researcher page.

Building this map preview

BZPEER is loading the nearby papers, people, topics and institutions for this page.

preprint2019arXiv

Automatic Scene Inference for 3D Object Compositing

We present a user-friendly image editing system that supports a drag-and-drop object insertion (where the user merely drags objects into the image, and the system automatically places them in 3D and relights them appropriately), post-process illumination editing, and depth-of-field manipulation. Underlying our system is a fully automatic technique for recovering a comprehensive 3D scene model (geometry, illumination, diffuse albedo and camera parameters) from a single, low dynamic range photograph. This is made possible by two novel contributions: an illumination inference algorithm that recovers a full lighting model of the scene (including light sources that are not directly visible in the photograph), and a depth estimation algorithm that combines data-driven depth transfer with geometric reasoning about the scene layout. A user study shows that our system produces perceptually convincing results, and achieves the same level of realism as techniques that require significant user interaction.

preprint2019arXiv

Depth Extraction from Video Using Non-parametric Sampling

We describe a technique that automatically generates plausible depth maps from videos using non-parametric depth sampling. We demonstrate our technique in cases where past methods fail (non-translating cameras and dynamic scenes). Our technique is applicable to single images as well as videos. For videos, we use local motion cues to improve the inferred depth maps, while optical flow is used to ensure temporal depth consistency. For training and evaluation, we use a Kinect-based system to collect a large dataset containing stereoscopic videos with known depths. We show that our depth estimation technique outperforms the state-of-the-art on benchmark databases. Our technique can be used to automatically convert a monoscopic video into stereo for 3D visualization, and we demonstrate this through a variety of visually pleasing results for indoor and outdoor scenes, including results from the feature film Charade.

preprint2019arXiv

DepthTransfer: Depth Extraction from Video Using Non-parametric Sampling

preprint2019arXiv

Inverse Rendering Techniques for Physically Grounded Image Editing

From a single picture of a scene, people can typically grasp the spatial layout immediately and even make good guesses at materials properties and where light is coming from to illuminate the scene. For example, we can reliably tell which objects occlude others, what an object is made of and its rough shape, regions that are illuminated or in shadow, and so on. It is interesting how little is known about our ability to make these determinations; as such, we are still not able to robustly "teach" computers to make the same high-level observations as people. This document presents algorithms for understanding intrinsic scene properties from single images. The goal of these inverse rendering techniques is to estimate the configurations of scene elements (geometry, materials, luminaires, camera parameters, etc) using only information visible in an image. Such algorithms have applications in robotics and computer graphics. One such application is in physically grounded image editing: photo editing made easier by leveraging knowledge of the physical space. These applications allow sophisticated editing operations to be performed in a matter of seconds, enabling seamless addition, removal, or relocation of objects in images.

preprint2019arXiv

Lightform: Procedural Effects for Projected AR

Projected augmented reality, also called projection mapping or video mapping, is a form of augmented reality that uses projected light to directly augment 3D surfaces, as opposed to using pass-through screens or headsets. The value of projected AR is its ability to add a layer of digital content directly onto physical objects or environments in a way that can be instantaneously viewed by multiple people, unencumbered by a screen or additional setup. Because projected AR typically involves projecting onto non-flat, textured objects (especially those that are conventionally not used as projection surfaces), the digital content needs to be mapped and aligned to precisely fit the physical scene to ensure a compelling experience. Current projected AR techniques require extensive calibration at the time of installation, which is not conducive to iteration or change, whether intentional (the scene is reconfigured) or not (the projector is bumped or settles). The workflows are undefined and fragmented, thus making it confusing and difficult for many to approach projected AR. For example, a digital artist may have the software expertise to create AR content, but could not complete an installation without experience in mounting, blending, and realigning projector(s); the converse is true for many A/V installation teams/professionals. Projection mapping has therefore been limited to high-end event productions, concerts, and films, because it requires expensive, complex tools, and skilled teams ($100K+ budgets). Lightform provides a technology that makes projected AR approachable, practical, intelligent, and robust through integrated hardware and computer-vision software. Lightform brings together and unites a currently fragmented workflow into a single cohesive process that provides users with an approachable and robust method to create and control projected AR experiences.

Kevin Karsch

What is connected

Connect this record

See the researcher in context

Building this map preview

5 published item(s)

Automatic Scene Inference for 3D Object Compositing

Depth Extraction from Video Using Non-parametric Sampling

DepthTransfer: Depth Extraction from Video Using Non-parametric Sampling

Inverse Rendering Techniques for Physically Grounded Image Editing

Lightform: Procedural Effects for Projected AR