Source author record

Xiaolang Yan

Xiaolang Yan appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher (unclaimed source record)

Topics: Distributed, Parallel, and Cluster Computing; Machine Learning; math.NA; Numerical Analysis; Other Computer Science; Performance

Catalog footprint

What is connected

2 works

6 topics

4 close collaborators

Preprint, 2022, arXiv

Predicting the Output Structure of Sparse Matrix Multiplication with Sampled Compression Ratio

Sparse general matrix multiplication (SpGEMM) is a fundamental building block in numerous scientific applications. One critical task of SpGEMM is to compute or predict the structure of the output matrix (i.e., the number of nonzero elements per output row) for efficient memory allocation and load balancing, both of which affect the overall performance of SpGEMM. Existing work either precisely calculates the output structure or adopts upper-bound or sampling-based methods to predict it. However, these methods either take too much execution time or are not accurate enough. In this paper, we propose a novel sampling-based method with better accuracy and low cost compared to the existing sampling-based method. The proposed method first predicts the compression ratio of SpGEMM by leveraging the number of intermediate products (denoted as FLOP) and the number of nonzero elements (denoted as NNZ) of the same sampled result matrix. The predicted output structure is then obtained by dividing the FLOP per output row by the predicted compression ratio. We also propose a reference design of the existing sampling-based method with optimized computing overheads to demonstrate the better accuracy of the proposed method. We construct 625 test cases with various matrix dimensions and sparse structures to evaluate the prediction accuracy. Experimental results show that the absolute relative errors of the proposed method and the reference design are 1.56% and 8.12%, respectively, on average, and 25% and 156%, respectively, in the worst case.
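
The two steps named in the abstract (estimate a compression ratio from a sample, then divide the per-row FLOP by it) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: it assumes that rows of the left matrix are sampled uniformly at random (the abstract does not specify the sampling scheme), it stores only the column indices of each row, and names such as `predict_output_structure` are hypothetical.

```python
import random


def flop_per_row(a_rows, b_rows):
    """Number of intermediate products for each output row of C = A * B.

    a_rows[i] lists the column indices of the nonzeros in row i of A;
    b_rows[k] lists the column indices of the nonzeros in row k of B.
    """
    return [sum(len(b_rows[k]) for k in cols) for cols in a_rows]


def exact_nnz_of_row(a_cols, b_rows):
    """Exact number of nonzeros in one output row (the costly computation)."""
    out_cols = set()
    for k in a_cols:
        out_cols.update(b_rows[k])
    return len(out_cols)


def predict_output_structure(a_rows, b_rows, sample_size, seed=0):
    """Return (predicted nonzeros per output row, predicted compression ratio).

    Assumption: the sample is a uniform random subset of the rows of A.
    """
    rng = random.Random(seed)
    flop = flop_per_row(a_rows, b_rows)
    sample = rng.sample(range(len(a_rows)), min(sample_size, len(a_rows)))

    # FLOP and NNZ are measured on the same sampled rows of the result.
    sampled_flop = sum(flop[i] for i in sample)
    sampled_nnz = sum(exact_nnz_of_row(a_rows[i], b_rows) for i in sample)
    if sampled_nnz == 0:
        # Empty sample result: fall back to the upper bound (ratio of 1).
        return [float(f) for f in flop], 1.0

    compression_ratio = sampled_flop / sampled_nnz
    predicted = [f / compression_ratio for f in flop]
    return predicted, compression_ratio
```

The exact structure is computed only for the sampled rows, so the cost of the set unions scales with the sample size, while the per-row FLOP count is a cheap sum of row lengths.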

Preprint, 2007, arXiv

Q-DPM: An Efficient Model-Free Dynamic Power Management Technique

When applying Dynamic Power Management (DPM) techniques to pervasively deployed embedded systems, the technique needs to be efficient enough to be implemented on low-end processors with tight memory budgets. Furthermore, it should be able to track time-varying behavior rapidly, because time variation is an inherent characteristic of real-world systems. Existing methods, which are usually model-based, may not satisfy these requirements. In this paper, we propose a model-free DPM technique based on Q-Learning. Q-DPM is much more efficient because it removes the overhead of the parameter estimator and the mode-switch controller. Furthermore, its policy optimization is performed via consecutive online trials, which also leads to a very rapid response to time-varying behavior.
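
As a rough illustration of what a model-free, Q-Learning-based power manager might look like, the sketch below applies the standard tabular Q-Learning update to a toy device. Everything specific here is an assumption made for illustration and is not taken from the paper: the two states (`idle`, `busy`), the two actions, the cost constants, and the random request arrivals are all hypothetical.

```python
import random

ACTIONS = ("stay_active", "sleep")
STATES = ("idle", "busy")

# Hypothetical cost constants (arbitrary units), chosen only for illustration.
ACTIVE_POWER = 1.0
SLEEP_POWER = 0.1
LATENCY_PENALTY = 2.0


def q_update(q, state, action, reward, next_state, alpha=0.1, gamma=0.9):
    """Standard tabular Q-Learning update; no system model is estimated."""
    best_next = max(q[next_state].values())
    q[state][action] += alpha * (reward + gamma * best_next - q[state][action])


def choose_action(q, state, epsilon, rng):
    """Epsilon-greedy selection, so the policy keeps trying actions online."""
    if rng.random() < epsilon:
        return rng.choice(ACTIONS)
    return max(ACTIONS, key=lambda a: q[state][a])


def step(state, action, rng, arrival_prob=0.3):
    """Toy environment: reward is negative power cost plus a latency penalty."""
    if action == "sleep":
        reward = -SLEEP_POWER
        if state == "busy":
            reward -= LATENCY_PENALTY  # sleeping while a request is pending
    else:
        reward = -ACTIVE_POWER
    next_state = "busy" if rng.random() < arrival_prob else "idle"
    return reward, next_state


def train(steps=5000, epsilon=0.1, seed=0):
    """Learn a Q-table purely from online interaction with the toy device."""
    rng = random.Random(seed)
    q = {s: {a: 0.0 for a in ACTIONS} for s in STATES}
    state = "idle"
    for _ in range(steps):
        action = choose_action(q, state, epsilon, rng)
        reward, next_state = step(state, action, rng)
        q_update(q, state, action, reward, next_state)
        state = next_state
    return q
```

Because the table is updated after every step, a change in the request pattern (for example a different `arrival_prob`) would shift the learned values without any separate parameter estimation phase, which is the property the abstract attributes to the model-free approach.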