Researcher profile

Kevin Karsch

Kevin Karsch contributes to research discovery and scholarly infrastructure.

ResearcherAffiliation not importedOpen to collaborate

Trust snapshot

Quick read

Trust 19 - UnverifiedVerification L1Unclaimed author
5works
0followers
3topics
4close collaborators

Actions

Decide how to stay connected

Follow researcher0

Identity and collaboration

How to connect with this researcher

Claiming links this public author record to a researcher profile and unlocks direct collaboration workflows.

Log in to claim

Direct collaboration

Open a focused conversation when the fit is right

Claim this author entity first to unlock direct invitations.

Research graph

See the researcher in context

Open full explorer

Inspect adjacent work, topics, institutions and collaborators without jumping out to a separate graph page.

Building this graph slice

BZPEER is loading the nearby papers, people, topics and institutions for this page.

Published work

5 published item(s)

preprint2019arXiv

Automatic Scene Inference for 3D Object Compositing

We present a user-friendly image editing system that supports a drag-and-drop object insertion (where the user merely drags objects into the image, and the system automatically places them in 3D and relights them appropriately), post-process illumination editing, and depth-of-field manipulation. Underlying our system is a fully automatic technique for recovering a comprehensive 3D scene model (geometry, illumination, diffuse albedo and camera parameters) from a single, low dynamic range photograph. This is made possible by two novel contributions: an illumination inference algorithm that recovers a full lighting model of the scene (including light sources that are not directly visible in the photograph), and a depth estimation algorithm that combines data-driven depth transfer with geometric reasoning about the scene layout. A user study shows that our system produces perceptually convincing results, and achieves the same level of realism as techniques that require significant user interaction.

preprint2019arXiv

Depth Extraction from Video Using Non-parametric Sampling

We describe a technique that automatically generates plausible depth maps from videos using non-parametric depth sampling. We demonstrate our technique in cases where past methods fail (non-translating cameras and dynamic scenes). Our technique is applicable to single images as well as videos. For videos, we use local motion cues to improve the inferred depth maps, while optical flow is used to ensure temporal depth consistency. For training and evaluation, we use a Kinect-based system to collect a large dataset containing stereoscopic videos with known depths. We show that our depth estimation technique outperforms the state-of-the-art on benchmark databases. Our technique can be used to automatically convert a monoscopic video into stereo for 3D visualization, and we demonstrate this through a variety of visually pleasing results for indoor and outdoor scenes, including results from the feature film Charade.

preprint2019arXiv

DepthTransfer: Depth Extraction from Video Using Non-parametric Sampling

We describe a technique that automatically generates plausible depth maps from videos using non-parametric depth sampling. We demonstrate our technique in cases where past methods fail (non-translating cameras and dynamic scenes). Our technique is applicable to single images as well as videos. For videos, we use local motion cues to improve the inferred depth maps, while optical flow is used to ensure temporal depth consistency. For training and evaluation, we use a Kinect-based system to collect a large dataset containing stereoscopic videos with known depths. We show that our depth estimation technique outperforms the state-of-the-art on benchmark databases. Our technique can be used to automatically convert a monoscopic video into stereo for 3D visualization, and we demonstrate this through a variety of visually pleasing results for indoor and outdoor scenes, including results from the feature film Charade.

preprint2019arXiv

Inverse Rendering Techniques for Physically Grounded Image Editing

From a single picture of a scene, people can typically grasp the spatial layout immediately and even make good guesses at materials properties and where light is coming from to illuminate the scene. For example, we can reliably tell which objects occlude others, what an object is made of and its rough shape, regions that are illuminated or in shadow, and so on. It is interesting how little is known about our ability to make these determinations; as such, we are still not able to robustly "teach" computers to make the same high-level observations as people. This document presents algorithms for understanding intrinsic scene properties from single images. The goal of these inverse rendering techniques is to estimate the configurations of scene elements (geometry, materials, luminaires, camera parameters, etc) using only information visible in an image. Such algorithms have applications in robotics and computer graphics. One such application is in physically grounded image editing: photo editing made easier by leveraging knowledge of the physical space. These applications allow sophisticated editing operations to be performed in a matter of seconds, enabling seamless addition, removal, or relocation of objects in images.

preprint2019arXiv

Lightform: Procedural Effects for Projected AR

Projected augmented reality, also called projection mapping or video mapping, is a form of augmented reality that uses projected light to directly augment 3D surfaces, as opposed to using pass-through screens or headsets. The value of projected AR is its ability to add a layer of digital content directly onto physical objects or environments in a way that can be instantaneously viewed by multiple people, unencumbered by a screen or additional setup. Because projected AR typically involves projecting onto non-flat, textured objects (especially those that are conventionally not used as projection surfaces), the digital content needs to be mapped and aligned to precisely fit the physical scene to ensure a compelling experience. Current projected AR techniques require extensive calibration at the time of installation, which is not conducive to iteration or change, whether intentional (the scene is reconfigured) or not (the projector is bumped or settles). The workflows are undefined and fragmented, thus making it confusing and difficult for many to approach projected AR. For example, a digital artist may have the software expertise to create AR content, but could not complete an installation without experience in mounting, blending, and realigning projector(s); the converse is true for many A/V installation teams/professionals. Projection mapping has therefore been limited to high-end event productions, concerts, and films, because it requires expensive, complex tools, and skilled teams ($100K+ budgets). Lightform provides a technology that makes projected AR approachable, practical, intelligent, and robust through integrated hardware and computer-vision software. Lightform brings together and unites a currently fragmented workflow into a single cohesive process that provides users with an approachable and robust method to create and control projected AR experiences.