Source author record

Karthik Desingh

Karthik Desingh appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.


Robotics, Computer Vision, Artificial Intelligence, Human-Computer Interaction, Machine Learning

Catalog footprint

What is connected

5 works

5 topics

4 close collaborators

preprint, 2022, arXiv

Break and Make: Interactive Structural Understanding Using LEGO Bricks

Visual understanding of geometric structures with complex spatial relationships is a fundamental component of human intelligence. As children, we learn how to reason about structure not only from observation, but also by interacting with the world around us -- by taking things apart and putting them back together again. The ability to reason about structure and compositionality allows us to not only build things, but also understand and reverse-engineer complex systems. In order to advance research in interactive reasoning for part-based geometric understanding, we propose a challenging new assembly problem using LEGO bricks that we call Break and Make. In this problem an agent is given a LEGO model and attempts to understand its structure by interactively inspecting and disassembling it. After this inspection period, the agent must then prove its understanding by rebuilding the model from scratch using low-level action primitives. In order to facilitate research on this problem we have built LTRON, a fully interactive 3D simulator that allows learning agents to assemble, disassemble and manipulate LEGO models. We pair this simulator with a new dataset of fan-made LEGO creations that have been uploaded to the internet in order to provide complex scenes containing over a thousand unique brick shapes. We take a first step towards solving this problem using sequence-to-sequence models that provide guidance for how to make progress on this challenging problem. Our simulator and data are available at github.com/aaronwalsman/ltron. Additional training code and PyTorch examples are available at github.com/aaronwalsman/ltron-torch-eccv22.

preprint, 2022, arXiv

SORNet: Spatial Object-Centric Representations for Sequential Manipulation

Sequential manipulation tasks require a robot to perceive the state of an environment and plan a sequence of actions leading to a desired goal state. In such tasks, the ability to reason about spatial relations among object entities from raw sensor inputs is crucial in order to determine when a task has been completed and which actions can be executed. In this work, we propose SORNet (Spatial Object-Centric Representation Network), a framework for learning object-centric representations from RGB images conditioned on a set of object queries, represented as image patches called canonical object views. With only a single canonical view per object and no annotation, SORNet generalizes zero-shot to object entities whose shape and texture are both unseen during training. We evaluate SORNet on various spatial reasoning tasks such as spatial relation classification and relative direction regression in complex tabletop manipulation scenarios and show that SORNet significantly outperforms baselines including state-of-the-art representation learning techniques. We also demonstrate the application of the representation learned by SORNet on visual-servoing and task planning for sequential manipulation on a real robot.

preprint, 2021, arXiv

Differentiable Nonparametric Belief Propagation

We present a differentiable approach to learn the probabilistic factors used for inference by a nonparametric belief propagation algorithm. Existing nonparametric belief propagation methods rely on domain-specific features encoded in the probabilistic factors of a graphical model. In this work, we replace each crafted factor with a differentiable neural network enabling the factors to be learned using an efficient optimization routine from labeled data. By combining differentiable neural networks with an efficient belief propagation algorithm, our method learns to maintain a set of marginal posterior samples using end-to-end training. We evaluate our differentiable nonparametric belief propagation (DNBP) method on a set of articulated pose tracking tasks and compare performance with a recurrent neural network. Results from this comparison demonstrate the effectiveness of using learned factors for tracking and suggest the practical advantage over hand-crafted approaches. The project webpage is available at: progress.eecs.umich.edu/projects/dnbp.

preprint, 2020, arXiv

A Sketch-Based System for Human-Guided Constrained Object Manipulation

In this paper, we present an easy-to-use sketch-based interface to extract geometries and generate affordance files from 3D point clouds for robot-object interaction tasks. Using our system, even novice users can perform robot task planning by employing such sketch tools. Our focus in this paper is employing a human-in-the-loop approach to assist in the generation of more accurate affordance templates and to guide the robot through the task execution process. Since we do not employ any unsupervised learning to generate affordance templates, our system performs much faster and is more versatile for template generation. Our system is based on the extraction of geometries for generalized cylindrical and cuboid shapes. After the geometries are extracted, affordances are generated for objects by applying simple sketches. We evaluated our technique by asking users to define affordances with sketches on 3D scenes of a door handle and a drawer handle, and then used the resulting affordance template files to have the robot turn the door handle and open the drawer.

preprint, 2020, arXiv

Parts-Based Articulated Object Localization in Clutter Using Belief Propagation

Robots working in human environments must be able to perceive and act on challenging objects with articulations, such as a pile of tools. Articulated objects increase the dimensionality of the pose estimation problem, and partial observations under clutter create additional challenges. To address this problem, we present a generative-discriminative parts-based recognition and localization method for articulated objects in clutter. We formulate the problem of articulated object pose estimation as a Markov Random Field (MRF). Hidden nodes in this MRF express the pose of the object parts, and edges express the articulation constraints between parts. Localization is performed within the MRF using an efficient belief propagation method. The method is informed by both part segmentation heatmaps over the observation, generated by a neural network, and the articulation constraints between object parts. Our generative-discriminative approach allows the proposed method to function in cluttered environments by inferring the pose of occluded parts using hypotheses from the visible parts. We demonstrate the efficacy of our methods in a tabletop environment for recognizing and localizing hand tools in uncluttered and cluttered configurations.
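
The MRF formulation in this abstract can be illustrated with a small sketch. It is not the authors' method: the paper describes an efficient belief propagation method over object part poses, whereas the sketch below swaps in plain discrete sum-product belief propagation on a three-part chain. The function name chain_bp, the three candidate states per part, and all numeric values are hypothetical and chosen only for illustration. The unary terms stand in for part segmentation evidence, and the pairwise terms stand in for articulation constraints between neighboring parts.

```python
# Minimal sketch: discrete sum-product belief propagation on a chain MRF.
# Assumption: poses are reduced to 3 candidate states per part; the paper
# itself works with object part poses, not a toy discrete state space.


def normalize(values):
    """Scale a list of non-negative numbers so that it sums to 1."""
    total = sum(values)
    return [v / total for v in values]


def chain_bp(unary, pairwise):
    """Return the marginal belief of every node in a chain MRF.

    unary[i][a]       evidence that part i is in state a
    pairwise[i][a][b] compatibility of part i in state a with part i+1 in state b
    """
    n, k = len(unary), len(unary[0])

    # fwd[i] is the message arriving at node i from node i-1.
    fwd = [[1.0] * k for _ in range(n)]
    for i in range(1, n):
        fwd[i] = normalize([
            sum(fwd[i - 1][a] * unary[i - 1][a] * pairwise[i - 1][a][b]
                for a in range(k))
            for b in range(k)
        ])

    # bwd[i] is the message arriving at node i from node i+1.
    bwd = [[1.0] * k for _ in range(n)]
    for i in range(n - 2, -1, -1):
        bwd[i] = normalize([
            sum(bwd[i + 1][b] * unary[i + 1][b] * pairwise[i][a][b]
                for b in range(k))
            for a in range(k)
        ])

    # Belief = local evidence times both incoming messages, normalized.
    return [
        normalize([unary[i][a] * fwd[i][a] * bwd[i][a] for a in range(k)])
        for i in range(n)
    ]


# Hypothetical evidence: parts 0 and 2 are visible and favor state 1;
# part 1 is occluded, so its evidence is uniform.
unary = [
    [0.1, 0.8, 0.1],
    [1 / 3, 1 / 3, 1 / 3],
    [0.1, 0.8, 0.1],
]

# Hypothetical articulation constraint: neighboring parts prefer equal states.
same_state = [[1.0 if a == b else 0.1 for b in range(3)] for a in range(3)]
pairwise = [same_state, same_state]

beliefs = chain_bp(unary, pairwise)
print([round(p, 3) for p in beliefs[1]])
```

In this toy setup the occluded middle part receives no evidence of its own, yet its belief concentrates on state 1 because both visible neighbors vote for it through the pairwise terms. This mirrors, in a much simplified form, the abstract's point about inferring the pose of occluded parts from hypotheses of the visible parts.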
