Researcher profile

Jan Moritz Joseph

Jan Moritz Joseph contributes to research discovery and scholarly infrastructure.

ResearcherAffiliation not importedOpen to collaborate

Trust snapshot

Quick read

Trust 19 - UnverifiedVerification L1Unclaimed author
5works
0followers
2topics
4close collaborators

Actions

Decide how to stay connected

Follow researcher0

Identity and collaboration

How to connect with this researcher

Claiming links this public author record to a researcher profile and unlocks direct collaboration workflows.

Log in to claim

Direct collaboration

Open a focused conversation when the fit is right

Claim this author entity first to unlock direct invitations.

Research graph

See the researcher in context

Open full explorer

Inspect adjacent work, topics, institutions and collaborators without jumping out to a separate graph page.

Building this graph slice

BZPEER is loading the nearby papers, people, topics and institutions for this page.

Published work

5 published item(s)

preprint2022arXiv

EmuNoC: Hybrid Emulation for Fast and Flexible Network-on-Chip Prototyping on FPGAs

Networks-on-Chips (NoCs) recently became widely used, from multi-core CPUs to edge-AI accelerators. Emulation on FPGAs promises to accelerate their RTL modeling compared to slow simulations. However, realistic test stimuli are challenging to generate in hardware for diverse applications. In other words, both a fast and flexible design framework is required. The most promising solution is hybrid emulation, in which parts of the design are simulated in software, and the other parts are emulated in hardware. This paper proposes a novel hybrid emulation framework called EmuNoC. We introduce a clock-synchronization method and software-only packet generation that improves the emulation speed by 36.3x to 79.3x over state-of-the-art frameworks while retaining the flexibility of a pure-software interface for stimuli simulation. We also increased the area efficiency to model up to an NoC with 169 routers on a single FPGA, while previous frameworks only achieved 64 routers.

preprint2022arXiv

X-Fault: Impact of Faults on Binary Neural Networks in Memristor-Crossbar Arrays with Logic-in-Memory Computation

Memristor-based crossbar arrays represent a promising emerging memory technology to replace conventional memories by offering a high density and enabling computing-in-memory (CIM) paradigms. While analog computing provides the best performance, non-idealities and ADC/DAC conversion limit memristor-based CIM. Logic-in-Memory (LIM) presents another flavor of CIM, in which the memristors are used in a binary manner to implement logic gates. Since binary neural networks (BNNs) use binary logic gates as the dominant operation, they can benefit from the massively parallel execution of binary operations and better resilience to variations of the memristors. Although conventional neural networks have been thoroughly investigated, the impact of faults on memristor-based BNNs remains unclear. Therefore, we analyze the impact of faults on logic gates in memristor-based crossbar arrays for BNNs. We propose a simulation framework that simulates different traditional faults to examine the accuracy loss of BNNs on memristive crossbar arrays. In addition, we compare different logic families based on the robustness and feasibility to accelerate AI applications.

preprint2021arXiv

Architecture, Dataflow and Physical Design Implications of 3D-ICs for DNN-Accelerators

The everlasting demand for higher computing power for deep neural networks (DNNs) drives the development of parallel computing architectures. 3D integration, in which chips are integrated and connected vertically, can further increase performance because it introduces another level of spatial parallelism. Therefore, we analyze dataflows, performance, area, power and temperature of such 3D-DNN-accelerators. Monolithic and TSV-based stacked 3D-ICs are compared against 2D-ICs. We identify workload properties and architectural parameters for efficient 3D-ICs and achieve up to 9.14x speedup of 3D vs. 2D. We discuss area-performance trade-offs. We demonstrate applicability as the 3D-IC draws similar power as 2D-ICs and is not thermal limited.

preprint2021arXiv

NeuroHammer: Inducing Bit-Flips in Memristive Crossbar Memories

Emerging non-volatile memory (NVM) technologies offer unique advantages in energy efficiency, latency, and features such as computing-in-memory. Consequently, emerging NVM technologies are considered an ideal substrate for computation and storage in future-generation neuromorphic platforms. These technologies need to be evaluated for fundamental reliability and security issues. In this paper, we present \emph{NeuroHammer}, a security threat in ReRAM crossbars caused by thermal crosstalk between memory cells. We demonstrate that bit-flips can be deliberately induced in ReRAM devices in a crossbar by systematically writing adjacent memory cells. A simulation flow is developed to evaluate NeuroHammer and the impact of physical parameters on the effectiveness of the attack. Finally, we discuss the security implications in the context of possible attack scenarios.

preprint2020arXiv

Ratatoskr: An open-source framework for in-depth power, performance and area analysis in 3D NoCs

We introduce ratatoskr, an open-source framework for in-depth power, performance and area (PPA) analysis in NoCs for 3D-integrated and heterogeneous System-on-Chips (SoCs). It covers all layers of abstraction by providing a NoC hardware implementation on RT level, a NoC simulator on cycle-accurate level and an application model on transaction level. By this comprehensive approach, ratatoskr can provide the following specific PPA analyses: Dynamic power of links can be measured within 2.4% accuracy of bit-level simulations while maintaining cycle-accurate simulation speed. Router power is determined from RT level synthesis combined with cycle-accurate simulations. The performance of the whole NoC can be measured both via cycle-accurate and RT level simulations. The performance of individual routers is obtained from RT level including gate-level verification. The NoC area is calculated from RT level. Despite these manifold features, ratatoskr offers easy two-step user interaction: First, a single point-of-entry that allows to set design parameters and second, PPA reports are generated automatically. For both the input and the output, different levels of abstraction can be chosen for high-level rapid network analysis or low-level improvement of architectural details. The synthesize NoC model reduces up to 32% total router power and 3% router area in comparison to a conventional standard router. As a forward-thinking and unique feature not found in other NoC PPA-measurement tools, ratatoskr supports heterogeneous 3D integration that is one of the most promising integration paradigms for upcoming SoCs. Thereby, ratatoskr lies the groundwork to design their communication architectures.