Researcher profile

Hossein Asadi

Hossein Asadi contributes to research discovery and scholarly infrastructure.

Researcher · Affiliation not imported · Open to collaborate

Trust snapshot

Quick read

Trust 19 - Unverified · Verification L1 · Unclaimed author
5 works · 0 followers · 1 topic · 4 close collaborators

Published work

5 published items

preprint · 2026 · arXiv

Enhancing Reliability of STT-MRAM Caches by Eliminating Read Disturbance Accumulation

Spin-Transfer Torque Magnetic RAM (STT-MRAM), one of the most promising replacements for SRAM in on-chip cache memories, benefits from higher density and scalability, near-zero leakage power, and non-volatility, but its reliability is threatened by a high read disturbance error rate. Error-Correcting Codes (ECCs) are conventionally suggested to overcome read disturbance errors in STT-MRAM caches. By employing aggressive ECCs and checking a cache block on every read access, a high level of cache reliability is achieved. However, to minimize cache access time in modern processors, all blocks in the target cache set are read in parallel for the tag comparison, and only the requested block, if any, is sent out after its ECC is checked. These extra block reads, whose ECCs are not checked until the processor requests the blocks, cause read disturbance errors to accumulate, which significantly degrades cache reliability. In this paper, we first introduce and formulate the read disturbance accumulation phenomenon and reveal that the accumulation caused by conventional parallel accesses to cache blocks significantly increases the cache error rate. Then, we propose a simple yet effective scheme, called Read Error Accumulation Preventer cache (REAP-cache), to completely eliminate the accumulation of read disturbances without compromising cache performance. Our evaluations show that the proposed REAP-cache extends the cache Mean Time To Failure (MTTF) by 171x while increasing the cache area by less than 1% and energy consumption by only 2.7%.
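
The accumulation effect described in this abstract can be illustrated with a minimal sketch. It is not the paper's formulation. It assumes that each read flips each cell independently with a fixed probability, that a flipped cell stays flipped until the block is checked and corrected, and that the ECC corrects up to a fixed number of bit errors per block. The function names and all numeric values are hypothetical and chosen only for illustration.

```python
from math import comb


def accumulated_flip_prob(per_read_prob, unchecked_reads):
    """Probability that a cell has flipped after several unchecked reads.

    Assumption: reads are independent and a flipped cell stays flipped.
    """
    return 1.0 - (1.0 - per_read_prob) ** unchecked_reads


def block_failure_prob(bit_flip_prob, bits, correctable):
    """Probability that more than `correctable` of `bits` cells are flipped."""
    ok = sum(
        comb(bits, k) * bit_flip_prob**k * (1.0 - bit_flip_prob) ** (bits - k)
        for k in range(correctable + 1)
    )
    return 1.0 - ok


# Illustrative values only (not taken from the paper).
PER_READ_PROB = 1e-4   # assumed per-cell disturbance probability per read
BLOCK_BITS = 512       # assumed block size in bits
CORRECTABLE = 1        # assumed single-error-correcting ECC

checked_every_read = block_failure_prob(
    accumulated_flip_prob(PER_READ_PROB, 1), BLOCK_BITS, CORRECTABLE
)
checked_after_8_reads = block_failure_prob(
    accumulated_flip_prob(PER_READ_PROB, 8), BLOCK_BITS, CORRECTABLE
)
print(checked_every_read, checked_after_8_reads)
```

Under these assumptions, a block that is read eight times before its ECC is checked fails with a noticeably higher probability than one checked on every read, which is the qualitative behavior the abstract attributes to parallel set accesses.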

preprint · 2026 · arXiv

ROBIN: Incremental Oblique Interleaved ECC for Reliability Improvement in STT-MRAM Caches

Spin-Transfer Torque Magnetic RAM (STT-MRAM) is a promising alternative to SRAM in on-chip cache memories. Despite its advantages, the high error rate of STT-MRAM is a major limiting factor for on-chip cache memories. In this paper, we first present a comprehensive analysis revealing that conventional Error-Correcting Codes (ECCs) lose their efficiency due to data-dependent error patterns, and then propose an efficient ECC configuration, called ROBIN, to improve the correction capability. The evaluations show that the inefficiency of conventional ECC increases the cache error rate by an average of 151.7%, while ROBIN reduces this value by more than 28.6x.

preprint · 2022 · arXiv

A System-Level Framework for Analytical and Empirical Reliability Exploration of STT-MRAM Caches

Spin-Transfer Torque Magnetic RAM (STT-MRAM) is known as the most promising replacement for SRAM technology in large Last-Level Caches (LLCs). Despite its major advantages of high density, non-volatility, near-zero leakage power, and immunity to radiation, an STT-MRAM-based cache suffers from high error rates, mainly due to retention failure, read disturbance, and write failure. Existing studies are limited to estimating the rate of only one or two of these error types for STT-MRAM caches. However, the overall vulnerability of STT-MRAM caches, whose estimation is essential for designing cost-efficient reliable caches, has not been addressed in any previous study. In this paper, we propose a system-level framework for reliability exploration and characterization of error behavior in STT-MRAM caches. To this end, we formulate the cache vulnerability considering the inter-correlation of all three error types as well as the dependency of error rates on workload behavior and Process Variations (PVs). Our analysis reveals that STT-MRAM cache vulnerability is highly workload-dependent and varies by orders of magnitude across different cache access patterns. Our analytical study also shows that this vulnerability divergence increases significantly with process variations in STT-MRAM cells. To evaluate the framework, we implement the error types in the gem5 full-system simulator, and the experimental results show that the total error rate in a shared LLC varies by 32.0x across different workloads. A further 6.5x vulnerability variation is observed when considering PVs in the STT-MRAM cells. In addition, the contribution of each error type to the total LLC vulnerability varies widely across cache access patterns, and the error rates are affected differently by PVs.

preprint · 2022 · arXiv

CoPA: Cold Page Awakening to Overcome Retention Failures in STT-MRAM Based I/O Buffers

Performance and reliability are two prominent factors in the design of data storage systems. To achieve higher performance, storage system designers have recently adopted DRAM-based buffers. The volatility of DRAM brings up the possibility of data loss, so a part of the main storage is conventionally used as the journal area to recover unflushed data pages in the case of power failure. Moreover, periodically flushing buffered data pages to the main storage is a common mechanism to preserve a high level of reliability, which leads to an increase in storage write traffic. To address this shortcoming, recent studies offer a small NVM as the Persistent Journal Area (PJA) along with DRAM as an efficient approach, named NVM-Backed Buffer (NVB-Buffer). This approach aims to address DRAM vulnerability against power failure while reducing storage write traffic. In this paper, we use STT-MRAM, the most promising of the emerging technologies for the PJA, to meet the requirements of the PJA (high endurance, non-volatility, and DRAM-like latency). However, STT-MRAM faces major reliability challenges, i.e., retention failure, read disturbance, and write failure. We first show that retention failure is the dominant source of errors in NVB-Buffers, as the PJA suffers from long and unpredictable page idle intervals. Then, we propose a novel NVB-Buffer management scheme, named Cold Page Awakening (CoPA), which predictably reduces the idle time of PJA pages. To this end, CoPA employs Distant Refreshing to periodically overwrite the vulnerable PJA page contents using their replicas in the DRAM-based buffer. We compare CoPA with state-of-the-art schemes over several workloads based on physical journaling. Our evaluations show that employing CoPA leads to a three orders of magnitude lower failure rate with negligible performance degradation (1.1%) and memory overhead (1.2%).
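
Why bounding page idle time helps can be sketched with the commonly used exponential retention model, in which a cell left idle for time t fails with probability 1 - exp(-t / (tau0 * exp(delta))), where delta is the thermal stability factor and tau0 is an attempt period of about 1 ns. This is an illustration under that assumed model, not the evaluation method of the paper. The function name, the value of delta, and both intervals are hypothetical.

```python
from math import exp, expm1

TAU0_SECONDS = 1e-9  # assumed attempt period (about 1 ns)


def retention_failure_prob(idle_seconds, delta):
    """Per-cell retention failure probability after `idle_seconds` without a write.

    Assumed model: 1 - exp(-t / (tau0 * exp(delta))).
    expm1 keeps precision when the probability is very small.
    """
    retention_time = TAU0_SECONDS * exp(delta)
    return -expm1(-idle_seconds / retention_time)


# Illustrative values only (not taken from the paper).
DELTA = 40.0                 # assumed thermal stability factor
UNBOUNDED_IDLE = 3600.0      # a page left untouched for one hour
REFRESH_INTERVAL = 60.0      # hypothetical periodic overwrite every minute

p_idle = retention_failure_prob(UNBOUNDED_IDLE, DELTA)
p_refreshed = retention_failure_prob(REFRESH_INTERVAL, DELTA)
print(p_idle, p_refreshed, p_idle / p_refreshed)
```

For small probabilities the model is nearly linear in idle time, so capping the time since the last overwrite reduces the per-cell failure probability roughly in proportion to the cap.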

preprint · 2022 · arXiv

TA-LRW: A Replacement Policy for Error Rate Reduction in STT-MRAM Caches

As the technology process node scales down, on-chip SRAM caches lose their efficiency because of their low scalability, high leakage power, and increasing rate of soft errors. Among emerging memory technologies, Spin-Transfer Torque Magnetic RAM (STT-MRAM) is known as the most promising replacement for SRAM-based cache memories. The main advantages of STT-MRAM are its non-volatility, near-zero leakage power, higher density, soft-error immunity, and higher scalability. Despite these advantages, the high error rate in STT-MRAM cells due to retention failure, write failure, and read disturbance threatens the reliability of cache memories built upon STT-MRAM technology. The error rate increases significantly at higher temperatures, which further affects the reliability of STT-MRAM-based cache memories. The major source of heat generation and temperature increase in STT-MRAM cache memories is write operations, which are managed by the cache replacement policy. In this paper, we first analyze the cache behavior under the conventional LRU replacement policy and demonstrate that the majority of consecutive write operations (more than 66%) are committed to adjacent cache blocks. These adjacent write operations cause accumulated heat and increased temperature, which significantly increases the cache error rate. To eliminate heat accumulation and the adjacency of consecutive writes, we propose a cache replacement policy, named Thermal-Aware Least-Recently Written (TA-LRW), to smoothly distribute the generated heat by conducting consecutive write operations in distant cache blocks. TA-LRW guarantees a distance of at least three blocks between every two consecutive write operations in an 8-way associative cache. This distant write scheme reduces the temperature-induced error rate by 94.8%, on average, compared with the conventional LRU policy, which results in a 6.9x reduction in cache error rate.
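
The distance property stated in this abstract can be illustrated with a simple strided round-robin write order. This is a hypothetical ordering that happens to satisfy the "at least three blocks apart" condition in an 8-way set. It is not claimed to be the TA-LRW policy itself, and it assumes that distance means the absolute difference between way indices. The function names are illustrative.

```python
def strided_write_order(ways=8, stride=3):
    """Visit every way once, stepping `stride` ways each time (modulo `ways`).

    All ways are visited exactly once when `stride` and `ways` are coprime.
    """
    order = []
    way = 0
    for _ in range(ways):
        order.append(way)
        way = (way + stride) % ways
    return order


def min_consecutive_distance(order):
    """Smallest index distance between consecutive writes, including wrap-around."""
    n = len(order)
    return min(abs(order[i] - order[(i + 1) % n]) for i in range(n))


order = strided_write_order()
print(order)                            # [0, 3, 6, 1, 4, 7, 2, 5]
print(min_consecutive_distance(order))  # 3
```

With a stride of 3 in an 8-way set, every pair of consecutive writes lands at least three ways apart, whereas a stride of 1 would place every consecutive write in an adjacent way.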