Source author record

Juergen Teich

Juergen Teich appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

ResearcherUnclaimed source record

Data Structures and Algorithms Hardware Architecture Artificial Intelligence Computational Geometry Computer Vision Machine Learning Mathematical Software

Catalog footprint

What is connected

6works

7topics

4close collaborators

Actions

Connect this record

Open graph Browse works

Inspect adjacent papers, topics, institutions and collaborators without losing the researcher page.

Building this map preview

BZPEER is loading the nearby papers, people, topics and institutions for this page.

preprint2020arXiv

Utilizing Explainable AI for Quantization and Pruning of Deep Neural Networks

For many applications, utilizing DNNs (Deep Neural Networks) requires their implementation on a target architecture in an optimized manner concerning energy consumption, memory requirement, throughput, etc. DNN compression is used to reduce the memory footprint and complexity of a DNN before its deployment on hardware. Recent efforts to understand and explain AI (Artificial Intelligence) methods have led to a new research area, termed as explainable AI. Explainable AI methods allow us to understand better the inner working of DNNs, such as the importance of different neurons and features. The concepts from explainable AI provide an opportunity to improve DNN compression methods such as quantization and pruning in several ways that have not been sufficiently explored so far. In this paper, we utilize explainable AI methods: mainly DeepLIFT method. We use these methods for (1) pruning of DNNs; this includes structured and unstructured pruning of \ac{CNN} filters pruning as well as pruning weights of fully connected layers, (2) non-uniform quantization of DNN weights using clustering algorithm; this is also referred to as Weight Sharing, and (3) integer-based mixed-precision quantization; this is where each layer of a DNN may use a different number of integer bits. We use typical image classification datasets with common deep learning image classification models for evaluation. In all these three cases, we demonstrate significant improvements as well as new insights and opportunities from the use of explainable AI in DNN compression.

preprint2014arXiv

A Scala Prototype to Generate Multigrid Solver Implementations for Different Problems and Target Multi-Core Platforms

Many problems in computational science and engineering involve partial differential equations and thus require the numerical solution of large, sparse (non)linear systems of equations. Multigrid is known to be one of the most efficient methods for this purpose. However, the concrete multigrid algorithm and its implementation highly depend on the underlying problem and hardware. Therefore, changes in the code or many different variants are necessary to cover all relevant cases. In this article we provide a prototype implementation in Scala for a framework that allows abstract descriptions of PDEs, their discretization, and their numerical solution via multigrid algorithms. From these, one is able to generate data structures and implementations of multigrid components required to solve elliptic PDEs on structured grids. Two different test problems showcase our proposed automatic generation of multigrid solvers for both CPU and GPU target platforms.

preprint2011arXiv

No-Break Dynamic Defragmentation of Reconfigurable Devices

We propose a new method for defragmenting the module layout of a reconfigurable device, enabled by a novel approach for dealing with communication needs between relocated modules and with inhomogeneities found in commonly used FPGAs. Our method is based on dynamic relocation of module positions during runtime, with only very little reconfiguration overhead; the objective is to maximize the length of contiguous free space that is available for new modules. We describe a number of algorithmic aspects of good defragmentation, and present an optimization method based on tabu search. Experimental results indicate that we can improve the quality of module layout by roughly 50 % over static layout. Among other benefits, this improvement avoids unnecessary rejections of modules

preprint2010arXiv

Maintaining Virtual Areas on FPGAs using Strip Packing with Delays

Every year, the computing resources available on dynamically partially reconfigurable devices increase enormously. In the near future, we expect many applications to run on a single reconfigurable device. In this paper, we present a concept for multitasking on dynamically partially reconfigurable systems called virtual area management. We explain its advantages, show its challenges, and discuss possible solutions. Furthermore, we investigate one problem in more detail: Packing modules with time-varying resource requests. This problem from the reconfigurable computing field results in a completely new optimization problem not tackled before. ILP-based and heuristic approaches are compared in an experimental study and the drawbacks and benefits discussed.

preprint2005arXiv

A Practical Approach for Circuit Routing on Dynamic Reconfigurable Devices

Management of communication by on-line routing in new FPGAs with a large amount of logic resources and partial reconfigurability is a new challenging problem. A Network-on-Chip (NoC) typically uses packet routing mechanism, which has often unsafe data transfers, and network interface overhead. In this paper, circuit routing for such dynamic NoCs is investigated, and a practical 1-dimensional network with an efficient routing algorithm is proposed and implemented. Also, this concept has been extended to the 2-dimensional case. The implementation results show the low area overhead and high performance of this network.

preprint2005arXiv

Optimal Free-Space Management and Routing-Conscious Dynamic Placement for Reconfigurable Devices

We describe algorithmic results for two crucial aspects of allocating resources on computational hardware devices with partial reconfigurability. By using methods from the field of computational geometry, we derive a method that allows correct maintainance of free and occupied space of a set of n rectangular modules in optimal time Theta(n log n); previous approaches needed a time of O(n^2) for correct results and O(n) for heuristic results. We also show that finding an optimal feasible communication-conscious placement (which minimizes the total weighted Manhattan distance between the new module and existing demand points) can be computed in Theta(n log n). Both resulting algorithms are practically easy to implement and show convincing experimental behavior.

Juergen Teich

What is connected

Connect this record

See the researcher in context

Building this map preview

6 published item(s)

Utilizing Explainable AI for Quantization and Pruning of Deep Neural Networks

A Scala Prototype to Generate Multigrid Solver Implementations for Different Problems and Target Multi-Core Platforms

No-Break Dynamic Defragmentation of Reconfigurable Devices

Maintaining Virtual Areas on FPGAs using Strip Packing with Delays

A Practical Approach for Circuit Routing on Dynamic Reconfigurable Devices

Optimal Free-Space Management and Routing-Conscious Dynamic Placement for Reconfigurable Devices