Researcher profile

Marco Domenico Santambrogio

Marco Domenico Santambrogio contributes to research discovery and scholarly infrastructure.

ResearcherAffiliation not importedOpen to collaborate

Trust snapshot

Quick read

Trust 13 - UnverifiedVerification L1Unclaimed author
2works
0followers
2topics
4close collaborators

Actions

Decide how to stay connected

Follow researcher0

Identity and collaboration

How to connect with this researcher

Claiming links this public author record to a researcher profile and unlocks direct collaboration workflows.

Log in to claim

Direct collaboration

Open a focused conversation when the fit is right

Claim this author entity first to unlock direct invitations.

Research graph

See the researcher in context

Open full explorer

Inspect adjacent work, topics, institutions and collaborators without jumping out to a separate graph page.

Building this graph slice

BZPEER is loading the nearby papers, people, topics and institutions for this page.

Published work

2 published item(s)

preprint2022arXiv

A Mixed Precision, Multi-GPU Design for Large-scale Top-K Sparse Eigenproblems

Graph analytics techniques based on spectral methods process extremely large sparse matrices with millions or even billions of non-zero values. Behind these algorithms lies the Top-K sparse eigenproblem, the computation of the largest eigenvalues and their associated eigenvectors. In this work, we leverage GPUs to scale the Top-K sparse eigenproblem to bigger matrices than previously achieved while also providing state-of-the-art execution times. We can transparently partition the computation across multiple GPUs, process out-of-core matrices, and tune precision and execution time using mixed-precision floating-point arithmetic. Overall, we are 67 times faster than the highly optimized ARPACK library running on a 104-thread CPU and 1.9 times than a recent FPGA hardware design. We also determine how mixed-precision floating-point arithmetic improves execution time by 50% over double-precision, and is 12 times more accurate than single-precision floating-point arithmetic.

preprint2021arXiv

Scaling up HBM Efficiency of Top-K SpMV for Approximate Embedding Similarity on FPGAs

Top-K SpMV is a key component of similarity-search on sparse embeddings. This sparse workload does not perform well on general-purpose NUMA systems that employ traditional caching strategies. Instead, modern FPGA accelerator cards have a few tricks up their sleeve. We introduce a Top-K SpMV FPGA design that leverages reduced precision and a novel packet-wise CSR matrix compression, enabling custom data layouts and delivering bandwidth efficiency often unreachable even in architectures with higher peak bandwidth. With HBM-based boards, we are 100x faster than a multi-threaded CPU implementation and 2x faster than a GPU with 20% higher bandwidth, with 14.2x higher power-efficiency.