Source author record

Zhengya Zhang

Zhengya Zhang appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

ResearcherUnclaimed source record

Hardware Architecture Information Theory math.IT Artificial Intelligence Machine Learning

Catalog footprint

What is connected

3works

5topics

4close collaborators

Actions

Connect this record

Open graph Browse works

Inspect adjacent papers, topics, institutions and collaborators without losing the researcher page.

Building this map preview

BZPEER is loading the nearby papers, people, topics and institutions for this page.

preprint2022arXiv

Efficient Post-Processors for Improving Error-Correcting Performance of LDPC Codes

The error floor phenomenon, associated with iterative decoders, is one of the most significant limitations to the applications of low-density parity-check (LDPC) codes. A variety of techniques from code design to decoder implementation have been proposed to address the error floor problem, among which post-processors have shown to be both effective and implementation-friendly. In this work, we take the inspiration from simulated annealing to generalize the post-processor design using three methods: quenching, extended heating, and focused heating, each of which targets a different error structure. The resulting post-processor is demonstrated to lower the error floors by two orders of magnitude for two structured code examples, a (2209, 1978) array LDPC code, and a (1944, 1620) LDPC code used by the IEEE 802.11n standard. The post-processor can be integrated to a belief-propagation decoder with minimal overhead. The post-processor design is equally applicable to other structured LDPC codes.

preprint2022arXiv

HiMA: A Fast and Scalable History-based Memory Access Engine for Differentiable Neural Computer

Memory-augmented neural networks (MANNs) provide better inference performance in many tasks with the help of an external memory. The recently developed differentiable neural computer (DNC) is a MANN that has been shown to outperform in representing complicated data structures and learning long-term dependencies. DNC's higher performance is derived from new history-based attention mechanisms in addition to the previously used content-based attention mechanisms. History-based mechanisms require a variety of new compute primitives and state memories, which are not supported by existing neural network (NN) or MANN accelerators. We present HiMA, a tiled, history-based memory access engine with distributed memories in tiles. HiMA incorporates a multi-mode network-on-chip (NoC) to reduce the communication latency and improve scalability. An optimal submatrix-wise memory partition strategy is applied to reduce the amount of NoC traffic; and a two-stage usage sort method leverages distributed tiles to improve computation speed. To make HiMA fundamentally scalable, we create a distributed version of DNC called DNC-D to allow almost all memory operations to be applied to local memories with trainable weighted summation to produce the global memory output. Two approximation techniques, usage skimming and softmax approximation, are proposed to further enhance hardware efficiency. HiMA prototypes are created in RTL and synthesized in a 40nm technology. By simulations, HiMA running DNC and DNC-D demonstrates 6.47x and 39.1x higher speed, 22.8x and 164.3x better area efficiency, and 6.1x and 61.2x better energy efficiency over the state-of-the-art MANN accelerator. Compared to an Nvidia 3080Ti GPU, HiMA demonstrates speedup by up to 437x and 2,646x when running DNC and DNC-D, respectively.

preprint2011arXiv

Absorbing Set Spectrum Approach for Practical Code Design

This paper focuses on controlling the absorbing set spectrum for a class of regular LDPC codes known as separable, circulant-based (SCB) codes. For a specified circulant matrix, SCB codes all share a common mother matrix, examples of which are array-based LDPC codes and many common quasi-cyclic codes. SCB codes retain the standard properties of quasi-cyclic LDPC codes such as girth, code structure, and compatibility with efficient decoder implementations. In this paper, we define a cycle consistency matrix (CCM) for each absorbing set of interest in an SCB LDPC code. For an absorbing set to be present in an SCB LDPC code, the associated CCM must not be full columnrank. Our approach selects rows and columns from the SCB mother matrix to systematically eliminate dominant absorbing sets by forcing the associated CCMs to be full column-rank. We use the CCM approach to select rows from the SCB mother matrix to design SCB codes of column weight 5 that avoid all low-weight absorbing sets (4, 8), (5, 9), and (6, 8). Simulation results demonstrate that the newly designed code has a steeper error-floor slope and provides at least one order of magnitude of improvement in the low error rate region as compared to an elementary array-based code.