Researcher profile

James Hanlon

James Hanlon contributes to research discovery and scholarly infrastructure.

Researcher
Affiliation not imported
Open to collaborate

Trust snapshot

Quick read

Trust 15 - Unverified
Verification L1
Unclaimed author
3 works
0 followers
3 topics
3 close collaborators

Published work

3 published items

Preprint, 2022, arXiv

A Fast Hardware Pseudorandom Number Generator Based on xoroshiro128

The Graphcore Intelligence Processing Unit contains an original pseudorandom number generator (PRNG) called xoroshiro128aox, based on the F2-linear generator xoroshiro128. It is designed to be cheap to implement in hardware and provide high-quality statistical randomness. In this paper, we present a rigorous assessment of the generator's quality using standard statistical test suites and compare the results with the fast contemporary PRNGs xoroshiro128+, pcg64 and philox4x32-10. We show that xoroshiro128aox mitigates the known weakness in the lower-order bits of xoroshiro128+ with a new 'AOX' output function by passing the BigCrush and PractRand suites, but we note that the function has some minor non-uniformities. We focus our testing on specific tests for linear artefacts to highlight the weaknesses of both xoroshiro128 PRNGs, but conclude that they are hard to detect, and xoroshiro128aox otherwise provides a good trade-off between statistical quality and hardware implementation cost.
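For orientation, the baseline generator that the abstract compares against, xoroshiro128+, can be sketched in a few lines. This is a minimal illustration rather than the paper's code: the rotation and shift constants (24, 16, 37) are those of the publicly available 2018 reference implementation of xoroshiro128+, and whether the IPU generator uses the same constants is not stated in the abstract. The function and variable names are hypothetical. The 'AOX' output function is not shown because the abstract does not define it.

```python
# Sketch of one step of xoroshiro128+ on a 128-bit state (two 64-bit words).
# Assumption: constants (24, 16, 37) follow the 2018 public reference
# implementation; they are not taken from the abstract above.

MASK64 = (1 << 64) - 1  # keep Python integers within 64 bits


def rotl(x: int, k: int) -> int:
    """Rotate the 64-bit value x left by k bits."""
    return ((x << k) | (x >> (64 - k))) & MASK64


def xoroshiro128plus_next(s0: int, s1: int) -> tuple[int, int, int]:
    """Return (output, new_s0, new_s1) for one generator step."""
    # Output function: plain 64-bit addition of the two state words.
    result = (s0 + s1) & MASK64

    # F2-linear state update: only XOR, shift and rotate operations.
    s1 ^= s0
    new_s0 = rotl(s0, 24) ^ s1 ^ ((s1 << 16) & MASK64)
    new_s1 = rotl(s1, 37)
    return result, new_s0, new_s1


if __name__ == "__main__":
    state = (1, 2)  # any seed except all zeros
    for _ in range(3):
        out, *state = xoroshiro128plus_next(*state)
        print(f"{out:016x}")
```

Because the state update is linear over F2 and the output is a simple sum, the lowest output bit receives no carry and is itself a linear function of the state, which is the low-order-bit weakness the abstract refers to.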

Preprint, 2012, arXiv

Scalable data abstractions for distributed parallel computations

The ability to express a program as a hierarchical composition of parts is an essential tool in managing the complexity of software, and a key abstraction this provides is the separation of the representation of data from the computation. Many current parallel programming models use a shared memory model to provide data abstraction, but this does not scale well with large numbers of cores due to non-determinism and access latency. This paper proposes a simple programming model that allows scalable parallel programs to be expressed with distributed representations of data, and it provides the programmer with the flexibility to employ shared or distributed styles of data-parallelism where applicable. It is capable of an efficient implementation and, with the provision of a small set of primitive capabilities in the hardware, it can be compiled to operate directly on the hardware, in the same way stack-based allocation operates for subroutines in sequential machines.

Preprint, 2011, arXiv

Fast Distributed Process Creation with the XMOS XS1 Architecture

The provision of mechanisms for processor allocation in current distributed parallel programming models is very limited. This makes difficult, or even prohibits, the expression of a large class of programs which require a run-time assessment of their required resources. This includes programs whose structure is irregular, composite or unbounded. Efficient allocation of processors requires a process creation mechanism able to initiate and terminate remote computations quickly. This paper presents the design, demonstration and analysis of an explicit mechanism to do this, implemented on the XMOS XS1 architecture, as a foundation for a more dynamic scheme. It shows that process creation can be made efficient so that it incurs only a fractional overhead of the total runtime and that it can be combined naturally with recursion to enable rapid distribution of computations over a system.