Source author record

Fangwen Fu

Fangwen Fu appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.


Topics: Machine Learning; Multimedia; Distributed, Parallel, and Cluster Computing; Information Theory; math.IT; Programming Languages; Systems and Control

Catalog footprint

3 works, 7 topics, 4 close collaborators

Preprint, 2021, arXiv

C-for-Metal: High Performance SIMD Programming on Intel GPUs

The SIMT execution model is commonly used for general GPU development. CUDA and OpenCL developers write scalar code that is implicitly parallelized by the compiler and hardware. On Intel GPUs, however, this abstraction has profound performance implications, as the underlying ISA is SIMD and important hardware capabilities cannot be fully utilized. To close this performance gap, we introduce C-for-Metal (CM), an explicit SIMD programming framework designed to deliver close-to-the-metal performance on Intel GPUs. The CM programming language and its vector/matrix types provide an intuitive interface to exploit the underlying hardware features, allowing fine-grained register management, SIMD size control and cross-lane data sharing. Experimental results show that CM applications from different domains outperform the best-known SIMT-based OpenCL implementations, achieving up to 2.7x speedup on the latest Intel GPU.

Preprint, 2010, arXiv

Structural Solutions to Dynamic Scheduling for Multimedia Transmission in Unknown Wireless Environments

In this paper, we propose a systematic solution to the problem of scheduling delay-sensitive media data for transmission over time-varying wireless channels. We first formulate the dynamic scheduling problem as a Markov decision process (MDP) that explicitly considers the users' heterogeneous multimedia data characteristics (e.g., delay deadlines, distortion impacts, and dependencies) and time-varying channel conditions, which are not simultaneously considered in state-of-the-art packet scheduling algorithms. This formulation allows us to make foresighted decisions to schedule multiple data units for transmission at each time in order to optimize the long-term utilities of the multimedia applications. The heterogeneity of the media data enables us to express the transmission priorities between the different data units as a priority graph, which is a directed acyclic graph (DAG). This priority graph provides us with an elegant structure to decompose the multi-data unit foresighted decision at each time into multiple single-data unit foresighted decisions which can be performed sequentially, from the high priority data units to the low priority data units, thereby significantly reducing the computational complexity. When the statistical knowledge of the multimedia data characteristics and channel conditions is unknown a priori, we develop a low-complexity online learning algorithm to update the value functions which capture the impact of the current decision on the future utility. The simulation results show that the proposed solution significantly outperforms existing state-of-the-art scheduling solutions.
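The sequential, priority-ordered decomposition described in this abstract can be illustrated with a small sketch. It is not the paper's algorithm: the per-unit foresighted (MDP) decision is replaced here by a simple greedy "send if it fits" rule, and the unit names, sizes, and capacity are hypothetical values chosen for illustration. The sketch only shows how a priority DAG lets a multi-unit decision be visited one data unit at a time, from high to low priority.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def priority_order(depends_on):
    """Order data units so each one comes after every unit it depends on.

    `depends_on` maps a unit to the set of higher-priority units it needs.
    """
    return list(TopologicalSorter(depends_on).static_order())


def schedule_sequentially(depends_on, sizes, capacity):
    """Visit units from high to low priority, one decision per unit.

    Assumption: a greedy rule stands in for the per-unit foresighted
    decision. A unit is sent only if all its predecessors were sent
    and it still fits in the remaining capacity.
    """
    sent = []
    remaining = capacity
    for unit in priority_order(depends_on):
        predecessors_sent = all(p in sent for p in depends_on.get(unit, ()))
        if predecessors_sent and sizes[unit] <= remaining:
            sent.append(unit)
            remaining -= sizes[unit]
    return sent


# Hypothetical priority graph loosely modeled on video frame dependencies.
depends_on = {"I": set(), "P1": {"I"}, "P2": {"P1"}, "B1": {"I", "P1"}}
sizes = {"I": 4, "P1": 2, "P2": 2, "B1": 1}

order = priority_order(depends_on)
chosen = schedule_sequentially(depends_on, sizes, capacity=7)
print(order)
print(chosen)
```

Because each unit is decided on its own in DAG order, the work grows with the number of units rather than with the number of subsets of units, which is the intuition behind the complexity reduction the abstract mentions.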

Preprint, 2010, arXiv

Structure-Aware Stochastic Control for Transmission Scheduling

In this paper, we consider the problem of real-time transmission scheduling over time-varying channels. We first formulate the transmission scheduling problem as a Markov decision process (MDP) and systematically unravel the structural properties (e.g., concavity in the state-value function and monotonicity in the optimal scheduling policy) exhibited by the optimal solutions. We then propose an online learning algorithm which preserves these structural properties and achieves ε-optimal solutions for an arbitrarily small ε. The advantages of the proposed online method are that: (i) it does not require a priori knowledge of the traffic arrival and channel statistics and (ii) it adaptively approximates the state-value functions using piece-wise linear functions and has low storage and computation complexity. We also extend the proposed low-complexity online learning solution to prioritized data transmission. The simulation results demonstrate that the proposed method achieves significantly better utility (or delay)-energy trade-offs when compared to existing state-of-the-art online optimization methods.
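To make the piece-wise linear approximation idea concrete, here is a minimal sketch under stated assumptions. The function `toy_value` is a hypothetical concave stand-in for a state-value function over a buffer level; it is not taken from the paper, and the knot placement is fixed rather than adaptive. The sketch shows only why concavity helps: storing a few knots and interpolating between them gives an approximation that never exceeds the true concave function and whose error is controlled by the knot spacing.

```python
import bisect


def make_piecewise_linear(knots, values):
    """Build an interpolant from sorted knots and the values stored at them."""
    def approx(x):
        if not knots[0] <= x <= knots[-1]:
            raise ValueError("x is outside the range covered by the knots")
        # Index of the segment [knots[i], knots[i + 1]] that contains x.
        i = min(bisect.bisect_right(knots, x) - 1, len(knots) - 2)
        x0, x1 = knots[i], knots[i + 1]
        w = (x - x0) / (x1 - x0)
        return (1 - w) * values[i] + w * values[i + 1]
    return approx


def toy_value(b):
    """Hypothetical concave value of holding b units in the buffer."""
    return -(b * b) / 10.0


# Store 5 numbers instead of one value per buffer level 0..16.
knots = [0, 4, 8, 12, 16]
values = [toy_value(k) for k in knots]
approx = make_piecewise_linear(knots, values)

# Largest gap between the true function and its approximation on 0..16.
worst_gap = max(toy_value(b) - approx(b) for b in range(17))
print(worst_gap)
```

Halving the knot spacing would cut the worst-case gap for this quadratic by a factor of four, which gives a feel for how accuracy can be traded against storage.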
