Source author record

Zoltan Somogyi

Zoltan Somogyi appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

ResearcherUnclaimed source record

Programming Languages Distributed, Parallel, and Cluster Computing Performance

Catalog footprint

What is connected

3works

3topics

4close collaborators

Actions

Connect this record

Open graph Browse works

Inspect adjacent papers, topics, institutions and collaborators without losing the researcher page.

Building this map preview

BZPEER is loading the nearby papers, people, topics and institutions for this page.

preprint2012arXiv

Region-based memory management for Mercury programs

Region-based memory management (RBMM) is a form of compile time memory management, well-known from the functional programming world. In this paper we describe our work on implementing RBMM for the logic programming language Mercury. One interesting point about Mercury is that it is designed with strong type, mode, and determinism systems. These systems not only provide Mercury programmers with several direct software engineering benefits, such as self-documenting code and clear program logic, but also give language implementors a large amount of information that is useful for program analyses. In this work, we make use of this information to develop program analyses that determine the distribution of data into regions and transform Mercury programs by inserting into them the necessary region operations. We prove the correctness of our program analyses and transformation. To execute the annotated programs, we have implemented runtime support that tackles the two main challenges posed by backtracking. First, backtracking can require regions removed during forward execution to be "resurrected"; and second, any memory allocated during a computation that has been backtracked over must be recovered promptly and without waiting for the regions involved to come to the end of their life. We describe in detail our solution of both these problems. We study in detail how our RBMM system performs on a selection of benchmark programs, including some well-known difficult cases for RBMM. Even with these difficult cases, our RBMM-enabled Mercury system obtains clearly faster runtimes for 15 out of 18 benchmarks compared to the base Mercury system with its Boehm runtime garbage collector, with an average runtime speedup of 24%, and an average reduction in memory requirements of 95%. In fact, our system achieves optimal memory consumption in some programs.

preprint2011arXiv

Estimating the overlap between dependent computations for automatic parallelization

Researchers working on the automatic parallelization of programs have long known that too much parallelism can be even worse for performance than too little, because spawning a task to be run on another CPU incurs overheads. Autoparallelizing compilers have therefore long tried to use granularity analysis to ensure that they only spawn off computations whose cost will probably exceed the spawn-off cost by a comfortable margin. However, this is not enough to yield good results, because data dependencies may \emph{also} limit the usefulness of running computations in parallel. If one computation blocks almost immediately and can resume only after another has completed its work, then the cost of parallelization again exceeds the benefit. We present a set of algorithms for recognizing places in a program where it is worthwhile to execute two or more computations in parallel that pay attention to the second of these issues as well as the first. Our system uses profiling information to compute the times at which a procedure call consumes the values of its input arguments and the times at which it produces the values of its output arguments. Given two calls that may be executed in parallel, our system uses the times of production and consumption of the variables they share to determine how much their executions would overlap if they were run in parallel, and therefore whether executing them in parallel is a good idea or not. We have implemented this technique for Mercury in the form of a tool that uses profiling data to generate recommendations about what to parallelize, for the Mercury compiler to apply on the next compilation of the program. We present preliminary results that show that this technique can yield useful parallelization speedups, while requiring nothing more from the programmer than representative input data for the profiling run.

preprint2011arXiv

Profiling parallel Mercury programs with ThreadScope

The behavior of parallel programs is even harder to understand than the behavior of sequential programs. Parallel programs may suffer from any of the performance problems affecting sequential programs, as well as from several problems unique to parallel systems. Many of these problems are quite hard (or even practically impossible) to diagnose without help from specialized tools. We present a proposal for a tool for profiling the parallel execution of Mercury programs, a proposal whose implementation we have already started. This tool is an adaptation and extension of the ThreadScope profiler that was first built to help programmers visualize the execution of parallel Haskell programs.

Zoltan Somogyi

What is connected

Connect this record

See the researcher in context

Building this map preview

3 published item(s)

Region-based memory management for Mercury programs

Estimating the overlap between dependent computations for automatic parallelization

Profiling parallel Mercury programs with ThreadScope