Source author record

Vladimir Stojanovic

Vladimir Stojanovic appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

ResearcherUnclaimed source record

Artificial Intelligence Machine Learning Distributed, Parallel, and Cluster Computing Information Theory math.IT physics.optics

Catalog footprint

What is connected

5works

6topics

4close collaborators

Actions

Connect this record

Open graph Browse works

Inspect adjacent papers, topics, institutions and collaborators without losing the researcher page.

Building this map preview

BZPEER is loading the nearby papers, people, topics and institutions for this page.

preprint2022arXiv

JUMBO: Scalable Multi-task Bayesian Optimization using Offline Data

The goal of Multi-task Bayesian Optimization (MBO) is to minimize the number of queries required to accurately optimize a target black-box function, given access to offline evaluations of other auxiliary functions. When offline datasets are large, the scalability of prior approaches comes at the expense of expressivity and inference quality. We propose JUMBO, an MBO algorithm that sidesteps these limitations by querying additional data based on a combination of acquisition signals derived from training two Gaussian Processes (GP): a cold-GP operating directly in the input domain and a warm-GP that operates in the feature space of a deep neural network pretrained using the offline data. Such a decomposition can dynamically control the reliability of information derived from the online and offline data and the use of pretrained neural networks permits scalability to large offline datasets. Theoretically, we derive regret bounds for JUMBO and show that it achieves no-regret under conditions analogous to GP-UCB (Srinivas et. al. 2010). Empirically, we demonstrate significant performance improvements over existing approaches on two real-world optimization problems: hyper-parameter optimization and automated circuit design.

preprint2020arXiv

GACEM: Generalized Autoregressive Cross Entropy Method for Multi-Modal Black Box Constraint Satisfaction

In this work we present a new method of black-box optimization and constraint satisfaction. Existing algorithms that have attempted to solve this problem are unable to consider multiple modes, and are not able to adapt to changes in environment dynamics. To address these issues, we developed a modified Cross-Entropy Method (CEM) that uses a masked auto-regressive neural network for modeling uniform distributions over the solution space. We train the model using maximum entropy policy gradient methods from Reinforcement Learning. Our algorithm is able to express complicated solution spaces, thus allowing it to track a variety of different solution regions. We empirically compare our algorithm with variations of CEM, including one with a Gaussian prior with fixed variance, and demonstrate better performance in terms of: number of diverse solutions, better mode discovery in multi-modal problems, and better sample efficiency in certain cases.

preprint2020arXiv

Tuning Algorithms and Generators for Efficient Edge Inference

A surge in artificial intelligence and autonomous technologies have increased the demand toward enhanced edge-processing capabilities. Computational complexity and size of state-of-the-art Deep Neural Networks (DNNs) are rising exponentially with diverse network models and larger datasets. This growth limits the performance scaling and energy-efficiency of both distributed and embedded inference platforms. Embedded designs at the edge are constrained by energy and speed limitations of available processor substrates and processor to memory communication required to fetch the model coefficients. While many hardware accelerator and network deployment frameworks have been in development, a framework is needed to allow the variety of existing architectures, and those in development, to be expressed in critical parts of the flow that perform various optimization steps. Moreover, premature architecture-blind network selection and optimization diminish the effectiveness of schedule optimizations and hardware-specific mappings. In this paper, we address these issues by creating a cross-layer software-hardware design framework that encompasses network training and model compression that is aware of and tuned to the underlying hardware architecture. This approach leverages the available degrees of DNN structure and sparsity to create a converged network that can be partitioned and efficiently scheduled on the target hardware platform, minimizing data movement, and improving the overall throughput and energy. To further streamline the design, we leverage the high-level, flexible SoC generator platform based on RISC-V ROCC framework. This integration allows seamless extensions of the RISC-V instruction set and Chisel-based rapid generator design. Utilizing this approach, we implemented a silicon prototype in a 16 nm TSMC process node achieving record processing efficiency of up to 18 TOPS/W.

preprint2015arXiv

Photonics design tool for advanced CMOS nodes

Recently, the authors have demonstrated large-scale integrated systems with several million transistors and hundreds of photonic elements. Yielding such large-scale integrated systems requires a design-for-manufacture rigour that is embodied in the 10 000 to 50 000 design rules that these designs must comply within advanced complementary metal-oxide semiconductor manufacturing. Here, the authors present a photonic design automation tool which allows automatic generation of layouts without design-rule violations. This tool is written in SKILL, the native language of the mainstream electric design automation software, Cadence. This allows seamless integration of photonic and electronic design in a single environment. The tool leverages intuitive photonic layer definitions, allowing the designer to focus on the physical properties rather than on technology-dependent details. For the first time the authors present an algorithm for removal of design-rule violations from photonic layouts based on Manhattan discretisation, Boolean and sizing operations. This algorithm is not limited to the implementation in SKILL, and can in principle be implemented in any scripting language. Connectivity is achieved with software-defined waveguide ports and low-level procedures that enable auto-routing of waveguide connections.

preprint2012arXiv

On U-Statistics and Compressed Sensing II: Non-Asymptotic Worst-Case Analysis

In another related work, U-statistics were used for non-asymptotic "average-case" analysis of random compressed sensing matrices. In this companion paper the same analytical tool is adopted differently - here we perform non-asymptotic "worst-case" analysis. Simple union bounds are a natural choice for "worst-case" analyses, however their tightness is an issue (and questioned in previous works). Here we focus on a theoretical U-statistical result, which potentially allows us to prove that these union bounds are tight. To our knowledge, this kind of (powerful) result is completely new in the context of CS. This general result applies to a wide variety of parameters, and is related to (Stein-Chen) Poisson approximation. In this paper, we consider i) restricted isometries, and ii) mutual coherence. For the bounded case, we show that k-th order restricted isometry constants have tight union bounds, when the measurements m = \mathcal{O}(k (1 + \log(n/k))). Here we require the restricted isometries to grow linearly in k, however we conjecture that this result can be improved to allow them to be fixed. Also, we show that mutual coherence (with the standard estimate \sqrt{(4\log n)/m}) have very tight union bounds. For coherence, the normalization complicates general discussion, and we consider only Gaussian and Bernoulli cases here.