Paper detail

A New Parallel Algorithm for Sinkhorn Word-Movers Distance and Its Performance on PIUMA and Xeon CPU

The Word Movers Distance (WMD) measures the semantic dissimilarity between two text documents by computing the cost of optimally moving all words of a source/query document to the most similar words of a target document. Computing WMD between two documents is costly because it requires solving an $O(V^3log(V))$ optimization problem where $V$ is the number of unique words in the document. Fortunately, WMD can be framed as an Earth Mover's Distance (EMD) for which the algorithmic complexity can be reduced to $O(V^2)$ by adding an entropy penalty to the optimization problem and solving it using the Sinkhorn-Knopp algorithm. Additionally, the computation can be made highly parallel by adopting a batching approach, i.e., computing the WMD of a single query document against multiple target documents at once. Sinkhorn WMD is a key kernel used in many ML/NLP applications. and usually gets implemented in Python. However, a straightforward Python implementation may leave significant performance on the table even though it may internally call optimized C++ BLAS routines. We present a new sparse {P}arallel {A}lgorithm for {S}inkhorn-Knopp {W}ord-movers {D}istance to compute the semantic distance of one document to many other documents by adopting the $O(V^2)$ EMD algorithm. We algorithmically transform $O(V^2)$ dense compute-heavy EMD version into an equivalent sparse one using new fused SDDMM-SpMM (sparse selection of dense-dense matrix-, sparse-dense matrix-multiplication) kernels. We implemented and optimized this algorithm for two very different architectures -- the new Intel Programmable Integrated Unified Memory Architecture (PIUMA) and Intel Xeon CPUs. We show that we were able to reach close to peak performance on both platforms.

preprint2022arXivOpen access
0citations
0reviews
0saves
Nocode
Nodataset
0institutions

Next steps

Decide what to do with this paper

Use like or dislike for the fast social read. The more specific scholarly feedback stays available below when needed.

Log in to curate

Reading frame

Keep the important context close to the paper

Keep the important signals around this paper in one place: votes, save state, collection context, reviews and the metadata you need before deciding what to do next.

Institutions

Add specific reaction

Move through the context

Research map

Open full explorer

Move through nearby people, institutions, topics and adjacent work without leaving the paper page.

Building this graph slice

BZPEER is loading the nearby papers, people, topics and institutions for this page.

Structured reviews

0 review(s)

ContributeLeave structured feedbackUse the review template when you have a concrete strength, concern or method question.Open review form

No structured reviews yet. High-signal critique starts here.

Work discussion

0 comment(s)

DiscussAdd a high-signal commentKeep quick notes, caveats and replication pointers separate from formal reviews.Open comment form

No discussion yet. The first strong comment sets the tone.