Researcher profile

Mario Günzel

Mario Günzel contributes to research discovery and scholarly infrastructure.

Researcher
Affiliation not imported
Open to collaborate

Trust snapshot

Quick read

Trust 21 - Emerging
Verification L1
Unclaimed author
6 works
0 followers
6 topics
4 close collaborators

Published work

6 published items

Preprint, 2022, arXiv

Parallel Path Progression DAG Scheduling

To satisfy the increasing performance needs of modern cyber-physical systems, multiprocessor architectures are increasingly utilized. To efficiently exploit their potential parallelism in hard real-time systems, appropriate task models and scheduling algorithms that allow providing timing guarantees are required. Such scheduling algorithms and the corresponding worst-case response time analyses usually suffer from resource over-provisioning due to pessimistic analyses based on worst-case assumptions. Hence, scheduling algorithms and analyses with high resource efficiency are required. A prominent parallel task model is the directed-acyclic-graph (DAG) task model, where precedence-constrained subjobs express parallelism. This paper studies the real-time scheduling problem of sporadic arbitrary-deadline DAG tasks. We propose a parallel path progression scheduling property with only two distinct subtask priorities, which allows tracking the execution of a collection of paths simultaneously. This novel approach significantly improves the state-of-the-art response time analyses for parallel DAG tasks for highly parallel DAG structures. Two hierarchical scheduling algorithms are designed based on this property, extending the parallel path progression properties and improving the response time analysis for sporadic arbitrary-deadline DAG task sets.

Preprint, 2021, arXiv

Bit Error Tolerance Metrics for Binarized Neural Networks

To reduce the resource demand of neural network (NN) inference systems, it has been proposed to use approximate memory, in which the supply voltage and the timing parameters are tuned, trading accuracy for energy consumption and performance. Tuning these parameters aggressively leads to bit errors, which can be tolerated by NNs when bit flips are injected during training. However, bit flip training, which is the state of the art for achieving bit error tolerance, does not scale well; it leads to massive overheads and cannot be applied for high bit error rates (BERs). Alternative methods to achieve bit error tolerance in NNs are needed, but the underlying principles behind the bit error tolerance of NNs have not been reported yet. With this lack of understanding, further progress in the research on NN bit error tolerance will be restrained. In this study, our objective is to investigate the internal changes in the NNs that bit flip training causes, with a focus on binarized NNs (BNNs). To this end, we quantify the properties of bit error tolerant BNNs with two metrics. First, we propose a neuron-level bit error tolerance metric, which calculates the margin between the pre-activation values and batch normalization thresholds. Second, to capture the effects of bit error tolerance on the interplay of neurons, we propose an inter-neuron bit error tolerance metric, which measures the importance of each neuron and computes the variance over all importance values. Our experimental results support that these two metrics are strongly related to bit error tolerance.
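
As a rough illustration of the neuron-level metric described in this abstract, the margin can be read as the distance between each pre-activation value and its batch normalization threshold. The following is a minimal sketch under that assumption. The function name `neuron_margins`, the sample values, and the use of the absolute difference are illustrative choices and are not taken from the paper.

```python
def neuron_margins(pre_activations, thresholds):
    """Distance of each pre-activation value from its threshold.

    Hypothetical helper: the paper's exact definition may differ.
    """
    if len(pre_activations) != len(thresholds):
        raise ValueError("need exactly one threshold per neuron")
    return [abs(a - t) for a, t in zip(pre_activations, thresholds)]


# Made-up pre-activation values and thresholds for three neurons.
margins = neuron_margins([5, -2, 1], [1, 0, 1])
smallest_margin = min(margins)
```

Intuitively, a neuron with a larger margin needs more flipped bits before its binarized output changes, so the smallest margin points at the most fragile neuron in this toy setting.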

Preprint, 2021, arXiv

Response-Time Analysis and Optimization for Probabilistic Conditional Parallel DAG Tasks

Real-time systems increasingly use multicore processors in order to satisfy thermal, power, and computational requirements. To exploit the architectural parallelism offered by multicore processors, parallel task models, scheduling algorithms, and response-time analyses with respect to real-time constraints have to be provided. In this paper, we propose a reservation-based scheduling algorithm for sporadic constrained-deadline parallel conditional DAG tasks with probabilistic execution behaviour for applications that can tolerate a bounded number of deadline misses and bounded tardiness. We devise design rules and analyses to guarantee bounded tardiness for a specified bounded probability for $k$ consecutive deadline misses without requiring late jobs to be aborted immediately.

Preprint, 2021, arXiv

Universal Approximation Theorems of Fully Connected Binarized Neural Networks

Neural networks (NNs) are known for their high predictive accuracy in complex learning problems. Besides practical advantages, NNs also exhibit favourable theoretical properties such as universal approximation (UA) theorems. Binarized Neural Networks (BNNs) significantly reduce time and memory demands by restricting the weight and activation domains to two values. Despite the practical advantages, theoretical guarantees based on UA theorems of BNNs are rather sparse in the literature. We close this gap by providing UA theorems for fully connected BNNs under the following scenarios: (1) for binarized inputs, UA can be constructively achieved under one hidden layer; (2) for inputs with real numbers, UA cannot be achieved under one hidden layer but can be constructively achieved under two hidden layers for Lipschitz-continuous functions. Our results indicate that fully connected BNNs can approximate functions universally, under certain conditions.

Preprint, 2020, arXiv

On Schedulability Analysis of EDF Scheduling by Considering Suspension as Blocking

During its execution, a job may suspend itself, i.e., its computation stops until certain activities complete, after which it resumes. This paper provides a counterexample to the schedulability analysis by Devi at the Euromicro Conference on Real-Time Systems (ECRTS) in 2003, which is the only existing suspension-aware analysis specialized for uniprocessor systems when preemptive earliest-deadline-first (EDF) is applied for scheduling dynamic self-suspending tasks.

Preprint, 2020, arXiv

Towards Explainable Bit Error Tolerance of Resistive RAM-Based Binarized Neural Networks

Non-volatile memory, such as resistive RAM (RRAM), is an emerging energy-efficient storage technology, especially for low-power machine learning models on the edge. It is reported, however, that the bit error rate of RRAMs can be up to 3.3% in the ultra-low-power setting, which might be crucial for many use cases. Binary neural networks (BNNs), a resource-efficient variant of neural networks (NNs), can tolerate a certain percentage of errors without a loss in accuracy and demand fewer resources in computation and storage. The bit error tolerance (BET) in BNNs can be achieved by flipping the weight signs during training, as proposed by Hirtzlin et al., but their method has a significant drawback, especially for fully connected neural networks (FCNNs): the FCNNs overfit to the error rate used in training, which leads to low accuracy under lower error rates. In addition, the underlying principles of BET have not been investigated. In this work, we improve the training for BET of BNNs and aim to explain this property. We propose straight-through gradient approximation to improve the weight-sign-flip training, by which BNNs adapt less to the bit error rates. To explain the achieved robustness, we define a metric that aims to measure BET without fault injection. We evaluate the metric and find that it correlates with accuracy over error rate for all FCNNs tested. Finally, we explore the influence of a novel regularizer that optimizes with respect to this metric, with the aim of providing a configurable trade-off between accuracy and BET.
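
The weight-sign flipping mentioned in this abstract can be pictured as simple fault injection on binarized weights. The sketch below is only an illustration under stated assumptions: weights are stored as +1/-1, each weight flips independently with a fixed probability, and the helper name `flip_signs` is hypothetical. It is not the training procedure of Hirtzlin et al. or the straight-through variant proposed in the paper.

```python
import random


def flip_signs(weights, bit_error_rate, rng):
    """Flip each +1/-1 weight independently with probability bit_error_rate.

    Hypothetical helper for illustration only.
    """
    if not 0.0 <= bit_error_rate <= 1.0:
        raise ValueError("bit_error_rate must lie in [0, 1]")
    return [-w if rng.random() < bit_error_rate else w for w in weights]


# Made-up binarized weights; 0.033 mirrors the 3.3% rate quoted in the abstract.
rng = random.Random(0)  # fixed seed so runs are repeatable
weights = [1, -1, 1, 1, -1, -1, 1, -1]
noisy_weights = flip_signs(weights, 0.033, rng)
```

Passing the random generator explicitly keeps the injected errors reproducible, which makes it easier to compare accuracy across different error rates.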