Researcher profile

Jeffrey Zhang

Jeffrey Zhang contributes to research discovery and scholarly infrastructure.

Researcher | Affiliation not imported | Open to collaborate

Trust snapshot

Quick read

Trust 19 - Unverified | Verification L1 | Unclaimed author
5 works
0 followers
8 topics
4 close collaborators

Published work

5 published items

preprint | 2024 | arXiv

Preserving Image Properties Through Initializations in Diffusion Models

Retail photography imposes specific requirements on images. For instance, images may need uniform background colors, consistent model poses, centered products, and consistent lighting. Minor deviations from these standards impact a site's aesthetic appeal, making the images unsuitable for use. We show that Stable Diffusion methods, as currently applied, do not respect these requirements. The usual practice of training the denoiser with a very noisy image and starting inference with a sample of pure noise leads to inconsistent generated images during inference. This inconsistency occurs because it is easy to tell the difference between samples of the training and inference distributions. As a result, a network trained with centered retail product images with uniform backgrounds generates images with erratic backgrounds. The problem is easily fixed by initializing inference with samples from an approximation of noisy images. However, in using such an approximation, the joint distribution of text and noisy image at inference time still slightly differs from that at training time. This discrepancy is corrected by training the network with samples from the approximate noisy image distribution. Extensive experiments on real application data show significant qualitative and quantitative improvements in performance from adopting these procedures. Finally, our procedure can interact well with other control-based methods to further enhance the controllability of diffusion-based methods.
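
The mismatch described in this abstract can be illustrated with a minimal sketch. It assumes a standard DDPM-style forward process with a linear beta schedule and, as a purely hypothetical approximation of the noisy-image distribution, noising the mean training image to the final step; the abstract does not specify the approximation the authors use. All function names and parameter values here are illustrative.

```python
import math
import random


def alpha_bar(num_steps, beta_start=1e-4, beta_end=0.02):
    """Cumulative signal level: product of (1 - beta_t) over a linear beta schedule."""
    prod = 1.0
    for t in range(num_steps):
        beta = beta_start + (beta_end - beta_start) * t / (num_steps - 1)
        prod *= 1.0 - beta
    return prod


def noisy_sample(x0, a_bar, rng):
    """Forward-process sample: sqrt(a_bar) * x0 + sqrt(1 - a_bar) * eps, eps ~ N(0, 1)."""
    signal, noise = math.sqrt(a_bar), math.sqrt(1.0 - a_bar)
    return [signal * v + noise * rng.gauss(0.0, 1.0) for v in x0]


def pure_noise_init(num_pixels, rng):
    """Usual inference start: a sample of pure Gaussian noise."""
    return [rng.gauss(0.0, 1.0) for _ in range(num_pixels)]


def approximate_noisy_init(mean_image, a_bar, rng):
    """Hypothetical approximation: noise the mean training image to the final step."""
    return noisy_sample(mean_image, a_bar, rng)


rng = random.Random(0)
a_bar_T = alpha_bar(1000)
# Assumed toy data: a flattened white-background image scaled to [-1, 1].
mean_image = [1.0] * 4096
x_T = approximate_noisy_init(mean_image, a_bar_T, rng)
```

Under these assumptions `a_bar_T` is small but not zero, so a training-time noisy image keeps a faint trace of the original (for example, the background color) that a pure-noise sample lacks. Starting inference from `approximate_noisy_init` rather than `pure_noise_init` is one way to close that gap.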

preprint | 2022 | arXiv

Real Robot Challenge: A Robotics Competition in the Cloud

Dexterous manipulation remains an open problem in robotics. To coordinate efforts of the research community towards tackling this problem, we propose a shared benchmark. We designed and built robotic platforms that are hosted at MPI for Intelligent Systems and can be accessed remotely. Each platform consists of three robotic fingers that are capable of dexterous object manipulation. Users are able to control the platforms remotely by submitting code that is executed automatically, akin to a computational cluster. Using this setup, i) we host robotics competitions, where teams from anywhere in the world access our platforms to tackle challenging tasks, ii) we publish the datasets collected during these competitions (consisting of hundreds of robot hours), and iii) we give researchers access to these platforms for their own projects.

preprint | 2021 | arXiv

Exploring Extended Reality with ILLIXR: A New Playground for Architecture Research

As we enter the era of domain-specific architectures, systems researchers must understand the requirements of emerging application domains. Augmented and virtual reality (AR/VR) or extended reality (XR) is one such important domain. This paper presents ILLIXR, the first open source end-to-end XR system (1) with state-of-the-art components, (2) integrated with a modular and extensible multithreaded runtime, (3) providing an OpenXR compliant interface to XR applications (e.g., game engines), and (4) with the ability to report (and trade off) several quality of experience (QoE) metrics. We analyze performance, power, and QoE metrics for the complete ILLIXR system and for its individual components. Our analysis reveals several properties with implications for architecture and systems research. These include demanding performance, power, and QoE requirements, a large diversity of critical tasks, inter-dependent execution pipelines with challenges in scheduling and resource management, and a large tradeoff space between performance/power and human perception related QoE metrics. ILLIXR and our analysis have the potential to propel new directions in architecture and systems research in general, and impact XR in particular. ILLIXR is open-source and available at https://illixr.github.io

preprint | 2020 | arXiv

Complexity Aspects of Fundamental Questions in Polynomial Optimization

In this thesis, we settle the computational complexity of some fundamental questions in polynomial optimization. These include the questions of (i) finding a local minimum, (ii) testing local minimality of a point, and (iii) deciding attainment of the optimal value. Our results characterize the complexity of these three questions for all degrees of the defining polynomials left open by prior literature. Regarding (i) and (ii), we show that unless P=NP, there cannot be a polynomial-time algorithm that finds a point within Euclidean distance $c^n$ (for any constant $c$) of a local minimum of an $n$-variate quadratic program. By contrast, we show that a local minimum of a cubic polynomial can be found efficiently by semidefinite programming (SDP). We prove that second-order points of cubic polynomials admit an efficient semidefinite representation, even though their critical points are NP-hard to find. We also give an efficiently-checkable necessary and sufficient condition for local minimality of a point for a cubic polynomial. Regarding (iii), we prove that testing whether a quadratically constrained quadratic program with a finite optimal value has an optimal solution is NP-hard. We also show that testing coercivity of the objective function, compactness of the feasible set, and the Archimedean property associated with the description of the feasible set are all NP-hard. We also give a new characterization of coercive polynomials that lends itself to a hierarchy of SDPs. In our final chapter, we present an SDP relaxation for finding approximate Nash equilibria in bimatrix games. We show that for a symmetric game, a $1/3$-Nash equilibrium can be efficiently recovered from any rank-2 solution to this relaxation. We also propose SDP relaxations for NP-hard problems related to Nash equilibria, such as that of finding the highest achievable welfare under any Nash equilibrium.
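
For readers unfamiliar with the terminology, the notions of critical point and second-order point used above can be written out as follows. This assumes the standard definitions from nonlinear optimization; the symbols $p$ and $\bar{x}$ are introduced here for illustration and do not appear in the abstract.

$$
\bar{x} \text{ is a critical point of } p:\mathbb{R}^n \to \mathbb{R} \quad \Longleftrightarrow \quad \nabla p(\bar{x}) = 0,
$$

$$
\bar{x} \text{ is a second-order point of } p \quad \Longleftrightarrow \quad \nabla p(\bar{x}) = 0 \ \text{ and } \ \nabla^2 p(\bar{x}) \succeq 0 .
$$

Every local minimum is a second-order point, but the converse fails in general: $p(x) = x^3$ has a second-order point at $\bar{x} = 0$ that is not a local minimum.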

preprint | 2020 | arXiv

Memory-Efficient Incremental Learning Through Feature Adaptation

We introduce an approach for incremental learning that preserves feature descriptors of training images from previously learned classes, instead of the images themselves, unlike most existing work. Keeping the much lower-dimensional feature embeddings of images reduces the memory footprint significantly. We assume that the model is updated incrementally for new classes as new data becomes available sequentially. This requires adapting the previously stored feature vectors to the updated feature space without having access to the corresponding original training images. Feature adaptation is learned with a multi-layer perceptron, which is trained on feature pairs corresponding to the outputs of the original and updated network on a training image. We validate experimentally that such a transformation generalizes well to the features of the previous set of classes, and maps features to a discriminative subspace in the feature space. As a result, the classifier is optimized jointly over new and old classes without requiring old class images. Experimental results show that our method achieves state-of-the-art classification accuracy in incremental learning benchmarks, while having at least an order of magnitude lower memory footprint compared to image-preserving strategies.
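
The feature-adaptation idea can be sketched in a few lines. Note the swap: the abstract describes a multi-layer perceptron, whereas this toy version fits a plain linear map by gradient descent so that it stays short and dependency-free. The 2-D features, the `true_shift` matrix and all function names are hypothetical.

```python
import random


def apply_map(W, x):
    """Apply the linear map W (list of rows) to the feature vector x."""
    return [sum(w * v for w, v in zip(row, x)) for row in W]


def fit_linear_adapter(old_feats, new_feats, lr=0.1, steps=500):
    """Fit W minimizing the mean squared error between W @ old and new features."""
    d_in, d_out = len(old_feats[0]), len(new_feats[0])
    n = len(old_feats)
    W = [[0.0] * d_in for _ in range(d_out)]
    for _ in range(steps):
        grad = [[0.0] * d_in for _ in range(d_out)]
        for x, y in zip(old_feats, new_feats):
            pred = apply_map(W, x)
            for i in range(d_out):
                err = pred[i] - y[i]
                for j in range(d_in):
                    grad[i][j] += 2.0 * err * x[j] / n
        for i in range(d_out):
            for j in range(d_in):
                W[i][j] -= lr * grad[i][j]
    return W


rng = random.Random(0)
# Toy stand-in for how the feature space moved after the network update.
true_shift = [[0.8, -0.3], [0.2, 1.1]]
# Feature pairs: the same current training images passed through the
# original network (old_feats) and the updated network (new_feats).
old_feats = [[rng.gauss(0, 1), rng.gauss(0, 1)] for _ in range(200)]
new_feats = [apply_map(true_shift, f) for f in old_feats]

W = fit_linear_adapter(old_feats, new_feats)

# A stored feature from a previously learned class; its image is no longer available.
stored_old_class_feature = [0.5, -1.0]
adapted = apply_map(W, stored_old_class_feature)
```

In this noise-free toy setting the fitted map recovers `true_shift`, so `adapted` lands close to `[0.7, -1.0]`. The point of the sketch is only the data flow: the adapter is trained on pairs from images that are still available and then applied to stored features whose images are gone.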