Researcher profile

Alberto Garcia-Ortiz

Alberto Garcia-Ortiz contributes to research discovery and scholarly infrastructure.

ResearcherAffiliation not importedOpen to collaborate

Trust snapshot

Quick read

Trust 11 - UnverifiedVerification L1Unclaimed author
1works
0followers
1topics
4close collaborators

Actions

Decide how to stay connected

Follow researcher0

Identity and collaboration

How to connect with this researcher

Claiming links this public author record to a researcher profile and unlocks direct collaboration workflows.

Log in to claim

Direct collaboration

Open a focused conversation when the fit is right

Claim this author entity first to unlock direct invitations.

Research graph

See the researcher in context

Open full explorer

Inspect adjacent work, topics, institutions and collaborators without jumping out to a separate graph page.

Building this graph slice

BZPEER is loading the nearby papers, people, topics and institutions for this page.

Published work

1 published item(s)

preprint2021arXiv

MemPool-3D: Boosting Performance and Efficiency of Shared-L1 Memory Many-Core Clusters with 3D Integration

Three-dimensional integrated circuits promise power, performance, and footprint gains compared to their 2D counterparts, thanks to drastic reductions in the interconnects' length through their smaller form factor. We can leverage the potential of 3D integration by enhancing MemPool, an open-source many-core design with 256 cores and a shared pool of L1 scratchpad memory connected with a low-latency interconnect. MemPool's baseline 2D design is severely limited by routing congestion and wire propagation delay, making the design ideal for 3D integration. In architectural terms, we increase MemPool's scratchpad memory capacity beyond the sweet spot for 2D designs, improving performance in a common digital signal processing kernel. We propose a 3D MemPool design that leverages a smart partitioning of the memory resources across two layers to balance the size and utilization of the stacked dies. In this paper, we explore the architectural and the technology parameter spaces by analyzing the power, performance, area, and energy efficiency of MemPool instances in 2D and 3D with 1 MiB, 2 MiB, 4 MiB, and 8 MiB of scratchpad memory in a commercial 28 nm technology node. We observe a performance gain of 9.1% when running a matrix multiplication on the MemPool-3D design with 4 MiB of scratchpad memory compared to the MemPool 2D counterpart. In terms of energy efficiency, we can implement the MemPool-3D instance with 4 MiB of L1 memory on an energy budget 15% smaller than its 2D counterpart, and even 3.7% smaller than the MemPool-2D instance with one-fourth of the L1 scratchpad memory capacity.