Researcher profile

Stephen Kobourov

Stephen Kobourov contributes to research discovery and scholarly infrastructure.

Trust snapshot

Quick read

Trust 21 - Emerging
Verification L1
Unclaimed author
15 works
0 followers
10 topics
4 close collaborators

Published work

15 published items

preprint 2026 arXiv

Bridging Graph Drawing and Dimensionality Reduction with Stochastic Stress Optimization

Both Dimensionality Reduction (DR) and Graph Drawing (GD) aim to visualize abstract, non-linear structures, yet rely on different optimization paradigms. This contrast is evident in Multidimensional Scaling (MDS), which typically depends on the SMACOF algorithm despite graph drawing results showing that simpler stochastic optimization schemes can be more effective for the same objective. We bridge these domains by adapting Stochastic Gradient Descent (SGD) techniques from graph drawing to vector data embedding. We present a scikit-learn compatible estimator that minimizes global stress through local pairwise updates, improving upon the existing implementation. Experiments on standard high-dimensional benchmarks show that our stochastic solver converges substantially faster than SMACOF while achieving comparable or lower stress.
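
The pairwise update idea in this abstract can be sketched as follows. This is a minimal illustration, not the estimator the authors describe: the function names, the decaying step schedule, and the step cap are assumptions made for the example, and it does not follow the scikit-learn estimator interface.

```python
import numpy as np

def stress(X, D):
    """Raw stress: sum over pairs i < j of (||x_i - x_j|| - d_ij)^2."""
    diff = X[:, None, :] - X[None, :, :]
    dist = np.sqrt((diff ** 2).sum(-1))
    iu = np.triu_indices(len(X), k=1)
    return float(((dist[iu] - D[iu]) ** 2).sum())

def sgd_mds(D, dim=2, epochs=30, lr=0.5, seed=0):
    """Reduce stress by visiting pairs in random order and moving both endpoints."""
    rng = np.random.default_rng(seed)
    n = D.shape[0]
    X = rng.standard_normal((n, dim))
    pairs = np.array([(i, j) for i in range(n) for j in range(i + 1, n)])
    for epoch in range(epochs):
        step = min(lr / (1 + epoch), 1.0)  # assumed decay schedule, capped at 1
        rng.shuffle(pairs)  # shuffles the rows, i.e. the visiting order
        for i, j in pairs:
            delta = X[i] - X[j]
            dist = np.linalg.norm(delta) + 1e-12
            # Each endpoint absorbs half of the scaled residual along the pair direction.
            move = step * (dist - D[i, j]) / 2 * delta / dist
            X[i] -= move
            X[j] += move
    return X
```

Calling `sgd_mds(D)` on a symmetric distance matrix returns an `(n, 2)` array; `stress(X, D)` can be used to compare the result against the random starting layout.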

preprint 2026 arXiv

Class Angular Distortion Index for Dimensionality Reduction

Dimensionality reduction (DR) techniques are often characterized by whether they preserve global, high-level structures in the data or local, neighborhood structures. This distinction matters in visualization: global methods can obscure clusters while local methods can over-emphasize them. Yet, even when clusters appear distinct, their relative arrangement in the projection may be arbitrary or misleading, a common issue in techniques such as t-SNE and UMAP. Existing cluster quality metrics either only measure cluster separability or assume spherical, globular clusters in the original space. We introduce the Class Angular Distortion Index (CADI), a metric that uses internal angles among point triples to determine the faithfulness of cluster organization in a projection. We show cases on both real and synthetic data where existing cluster metrics fail, but CADI provides an interpretable result. Since it relies on computing angles, CADI is also differentiable, enabling optimization. We demonstrate this with a CADI-based DR technique.
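
The abstract does not give the CADI formula, so the sketch below only illustrates the building block it mentions: the internal angle of a point triple, and how much that angle changes between the original space and a projection. The function names and the choice of an absolute difference are assumptions, not the authors' definition.

```python
import numpy as np

def interior_angle(a, b, c):
    """Angle at vertex b of the triangle (a, b, c), in radians; works in any dimension."""
    u, v = a - b, c - b
    cos = np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v))
    return float(np.arccos(np.clip(cos, -1.0, 1.0)))

def angle_change(high, low, triple):
    """Absolute change of the angle at the middle index of `triple` between two embeddings."""
    i, j, k = triple
    before = interior_angle(high[i], high[j], high[k])
    after = interior_angle(low[i], low[j], low[k])
    return abs(before - after)
```

Both functions are built from differentiable operations away from degenerate triples, which is consistent with the abstract's remark that an angle-based metric can be optimized directly.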

preprint 2022 arXiv

An FPT Algorithm for Bipartite Vertex Splitting

Bipartite graphs model the relationship between two disjoint sets of objects. They have a wide range of applications and are often visualized as a 2-layered drawing, where each set of objects is visualized as a set of vertices (points) on one of the two parallel horizontal lines and the relationships are represented by edges (simple curves) between the two lines connecting the corresponding vertices. One of the common objectives in such drawings is to minimize the number of crossings. This, however, is computationally expensive and may still result in drawings with so many crossings that they affect the readability of the drawing. We consider a recent approach to remove crossings in such visualizations by splitting vertices, where the goal is to find the minimum number of vertices to be split to obtain a planar drawing. We show that determining whether a planar drawing exists after splitting at most $k$ vertices is fixed-parameter tractable in $k$.
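
To make the crossing objective concrete, here is a small sketch that counts crossings in a 2-layered drawing with straight edges, given the left-to-right order on each line. It is not the FPT algorithm from the paper; the function name and input format are assumptions for the example.

```python
from itertools import combinations

def count_crossings(top_order, bottom_order, edges):
    """Count pairwise edge crossings in a 2-layer drawing.

    top_order / bottom_order: vertex lists, left to right, on each line.
    edges: (top_vertex, bottom_vertex) pairs.
    """
    top_pos = {v: i for i, v in enumerate(top_order)}
    bot_pos = {v: i for i, v in enumerate(bottom_order)}
    crossings = 0
    for (a, b), (c, d) in combinations(edges, 2):
        # Two edges cross iff their endpoints appear in opposite order on the two lines.
        if (top_pos[a] - top_pos[c]) * (bot_pos[b] - bot_pos[d]) < 0:
            crossings += 1
    return crossings
```

For example, the complete bipartite graph on top vertices `a`, `b` and bottom vertices `x`, `y` has one crossing. Splitting `b` into two copies `b1` and `b2`, placed on either side of `a` and each keeping one of the original edges, removes it.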

preprint 2022 arXiv

Browser-based Hyperbolic Visualization of Graphs

Hyperbolic geometry offers a natural focus + context for data visualization and has been shown to underlie real-world complex networks. However, current hyperbolic network visualization approaches are limited to special types of networks and do not scale to large datasets. With this in mind, we designed, implemented, and analyzed three methods for hyperbolic visualization of networks in the browser based on inverse projections, generalized force-directed algorithms, and hyperbolic multi-dimensional scaling (H-MDS). A comparison with Euclidean MDS shows that H-MDS produces embeddings with lower distortion for several types of networks. All three methods can handle node-link representations and are available in fully functional web-based systems.
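
Hyperbolic MDS compares target distances with hyperbolic distances between embedded points. The abstract does not say which model of the hyperbolic plane the systems use; the sketch below assumes the Poincaré disk, where the distance has a standard closed form. The function name is chosen for the example.

```python
import math

def poincare_distance(u, v):
    """Hyperbolic distance between two points strictly inside the unit disk."""
    nu = u[0] ** 2 + u[1] ** 2
    nv = v[0] ** 2 + v[1] ** 2
    if nu >= 1 or nv >= 1:
        raise ValueError("points must lie strictly inside the unit disk")
    diff = (u[0] - v[0]) ** 2 + (u[1] - v[1]) ** 2
    # Standard Poincare-disk formula: arcosh(1 + 2|u - v|^2 / ((1 - |u|^2)(1 - |v|^2))).
    return math.acosh(1 + 2 * diff / ((1 - nu) * (1 - nv)))
```

Distances grow without bound as points approach the boundary of the disk, which is what gives the focus + context effect: the region near the center is shown in detail while the periphery is compressed.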

preprint 2022 arXiv

On Random Graph Properties

We consider 15 properties of labeled random graphs that are of interest in the graph-theoretical and the graph mining literature, such as clustering coefficients, centrality measures, spectral radius, degree assortativity, treedepth, and treewidth. We analyze relationships and correlations between these properties. Whereas for graphs on a small number of vertices we can exactly compute the average values and range for each property of interest, this becomes infeasible for larger graphs. We show that graphs generated by the Erdős-Rényi graph generator with $p = 1/2$ model well the underlying space of all labeled graphs with a fixed number of vertices. The latter observation allows us to analyze properties and correlations between these properties for larger graphs. We then use linear and non-linear models to predict a given property based on the others and, for each property, we find the most predictive subset. We experimentally show that pairs and triples of properties have high predictive power, making it possible to estimate properties that are computationally expensive to compute from ones for which there are efficient algorithms.
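
One way to see why $p = 1/2$ is a natural choice: in the standard $G(n, p)$ model every possible edge is included independently, so at $p = 1/2$ every labeled graph on $n$ vertices has the same probability $2^{-\binom{n}{2}}$. The sketch below illustrates this with a sampler and an exact probability function; the function names are assumptions for the example and nothing here reproduces the paper's experiments.

```python
import random
from itertools import combinations

def sample_gnp(n, p, rng):
    """Sample a labeled graph: each of the C(n, 2) possible edges is kept with probability p."""
    return {e for e in combinations(range(n), 2) if rng.random() < p}

def gnp_probability(n, edges, p):
    """Probability that G(n, p) produces exactly this labeled edge set."""
    total = n * (n - 1) // 2
    m = len(edges)
    return p ** m * (1 - p) ** (total - m)
```

With `p = 0.5` the value of `gnp_probability` does not depend on the edge set, so repeated calls to `sample_gnp(n, 0.5, random.Random(seed))` draw uniformly from all labeled graphs on `n` vertices.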

preprint 2022 arXiv

Spherical Graph Drawing by Multi-dimensional Scaling

We describe an efficient and scalable spherical graph embedding method. The method uses a generalization of the Euclidean stress function for Multi-Dimensional Scaling adapted to spherical space, where geodesic pairwise distances are employed instead of Euclidean distances. The resulting spherical stress function is optimized by means of stochastic gradient descent. Quantitative and qualitative evaluations demonstrate the scalability and effectiveness of the proposed method. We also show that some graph families can be embedded with lower distortion on the sphere than in Euclidean and hyperbolic spaces.
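
A stress function on the sphere can be sketched by replacing Euclidean distances with great-circle distances between unit vectors. The version below is an unweighted illustration on the unit sphere; any weighting or scaling of the target distances used in the paper is not reproduced, and the function names are assumptions.

```python
import numpy as np

def geodesic_distances(X):
    """Great-circle distances between the rows of X, assumed to be unit vectors."""
    cos = np.clip(X @ X.T, -1.0, 1.0)  # clip guards against rounding outside [-1, 1]
    return np.arccos(cos)

def spherical_stress(X, D):
    """Sum over pairs i < j of (geodesic distance - target distance)^2."""
    iu = np.triu_indices(len(X), k=1)
    return float(((geodesic_distances(X)[iu] - D[iu]) ** 2).sum())
```

On the unit sphere no two points are more than $\pi$ apart, so target distances larger than that cannot be realized exactly without first rescaling them.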

preprint 2022 arXiv

The Rique-Number of Graphs

We continue the study of linear layouts of graphs in relation to known data structures. At a high level, given a data structure, the goal is to find a linear order of the vertices of the graph and a partition of its edges into pages, such that the edges in each page follow the restriction of the given data structure in the underlying order. In this regard, the most notable representatives are the stack and queue layouts, while there exists some work also for deques. In this paper, we study linear layouts of graphs that follow the restriction of a restricted-input queue (rique), in which insertions occur only at the head, and removals occur both at the head and the tail. We characterize the graphs admitting rique layouts with a single page and we use the characterization to derive a corresponding testing algorithm when the input graph is maximal planar. We finally give bounds on the number of needed pages (so-called rique-number) of complete graphs.
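
The data structure behind this layout model can be sketched directly from the description above: insertions only at the head, removals at the head or the tail. The class and method names are chosen for the example.

```python
from collections import deque

class Rique:
    """Restricted-input queue: push at the head only; pop from the head or the tail."""

    def __init__(self):
        self._items = deque()

    def push_head(self, item):
        self._items.appendleft(item)

    def pop_head(self):
        # Removing at the head behaves like a stack (last in, first out).
        return self._items.popleft()

    def pop_tail(self):
        # Removing at the tail behaves like a queue (first in, first out).
        return self._items.pop()

    def __len__(self):
        return len(self._items)
```

Because both removal ends are available, a rique can imitate a stack or a queue, which is why rique layouts sit between those two layout types and full deque layouts.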

preprint 2022 arXiv

The Segment Number: Algorithms and Universal Lower Bounds for Some Classes of Planar Graphs

The segment number of a planar graph $G$ is the smallest number of line segments needed for a planar straight-line drawing of $G$. Dujmović, Eppstein, Suderman, and Wood [CGTA'07] introduced this measure for the visual complexity of graphs. There are optimal algorithms for trees and worst-case optimal algorithms for outerplanar graphs, 2-trees, and planar 3-trees. It is known that every cubic triconnected planar $n$-vertex graph (except $K_4$) has segment number $n/2+3$, which is the only known universal lower bound for a meaningful class of planar graphs. We show that every triconnected planar 4-regular graph can be drawn using at most $n+3$ segments. This bound is tight up to an additive constant, improves a previous upper bound of $7n/4+2$ implied by a more general result of Dujmović et al., and supplements the result for cubic graphs. We also give a simple optimal algorithm for cactus graphs, generalizing the above-mentioned result for trees. We prove the first linear universal lower bounds for outerpaths, maximal outerplanar graphs, 2-trees, and planar 3-trees. This shows that the existing algorithms for these graph classes are constant-factor approximations. For maximal outerpaths, our bound is best possible and can be generalized to circular arcs.
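
For a given straight-line drawing, the number of segments can be counted by looking at edge ends: an edge end at a vertex is a segment end unless another edge leaves the same vertex in exactly the opposite direction. The sketch below assumes integer coordinates, distinct vertex positions, and a planar drawing; it evaluates one drawing and does not compute the segment number, which is the minimum over all drawings.

```python
from math import gcd

def count_segments(pos, edges):
    """Count maximal line segments in a straight-line drawing with integer coordinates.

    pos: vertex -> (x, y); edges: iterable of (u, v).
    Assumes a planar drawing, so no two edges at a vertex share a direction.
    """
    directions = {v: set() for v in pos}
    for u, v in edges:
        dx, dy = pos[v][0] - pos[u][0], pos[v][1] - pos[u][1]
        g = gcd(dx, dy)  # reduce to a primitive direction vector
        dx, dy = dx // g, dy // g
        directions[u].add((dx, dy))
        directions[v].add((-dx, -dy))
    # Count edge ends that are not continued straight through the vertex.
    ends = sum(
        1
        for dirs in directions.values()
        for (dx, dy) in dirs
        if (-dx, -dy) not in dirs
    )
    return ends // 2  # every segment has exactly two ends
```

A path drawn on a single line counts as one segment, while a triangle needs three.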

preprint 2022 arXiv

Visualizing Evolving Trees

Evolving trees arise in many real-life scenarios from computer file systems and dynamic call graphs, to fake news propagation and disease spread. Most layout algorithms for static trees do not work well in an evolving setting (e.g., they are not designed to be stable between time steps). Dynamic graph layout algorithms are better suited to this task, although they often introduce unnecessary edge crossings. With this in mind, we propose two methods for visualizing evolving trees that guarantee no edge crossings, while optimizing (1) desired edge length realization, (2) layout compactness, and (3) stability. We evaluate the two new methods, along with five prior approaches (three static and two dynamic), on real-world datasets using quantitative metrics: stress, desired edge length realization, layout compactness, stability, and running time. The new methods are fully functional and available on GitHub.

preprint 2021 arXiv

On the Readability of Abstract Set Visualizations

Set systems are used to model data that naturally arises in many contexts: social networks have communities, musicians have genres, and patients have symptoms. Visualizations that accurately reflect the information in the underlying set system make it possible to identify the set elements, the sets themselves, and the relationships between the sets. In static contexts, such as print media or infographics, it is necessary to capture this information without the help of interactions. With this in mind, we consider three different systems for medium-sized set data, LineSets, EulerView, and MetroSets, and report the results of a controlled human-subjects experiment comparing their effectiveness. Specifically, we evaluate the performance, in terms of time and error, on tasks that cover the spectrum of static set-based tasks. We also collect and analyze qualitative data about the three different visualization systems. Our results include statistically significant differences, suggesting that MetroSets performs and scales better.

preprint 2020 arXiv

Drawing Shortest Paths in Geodetic Graphs

Motivated by the fact that in a space where shortest paths are unique, no two shortest paths meet twice, we study a question posed by Greg Bodwin: Given a geodetic graph $G$, i.e., an unweighted graph in which the shortest path between any pair of vertices is unique, is there a philogeodetic drawing of $G$, i.e., a drawing of $G$ in which the curves of any two shortest paths meet at most once? We answer this question in the negative by showing the existence of geodetic graphs that require some pair of shortest paths to cross at least four times. The bound on the number of crossings is tight for the class of graphs we construct. Furthermore, we exhibit geodetic graphs of diameter two that do not admit a philogeodetic drawing.
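
The definition of a geodetic graph can be checked mechanically: run a breadth-first search from every vertex and count shortest paths. The sketch below assumes a connected, unweighted, undirected graph given as an adjacency dictionary; the function name is chosen for the example.

```python
from collections import deque

def is_geodetic(adj):
    """Return True if every pair of vertices has exactly one shortest path.

    adj: dict mapping each vertex to an iterable of neighbours (connected graph assumed).
    """
    for source in adj:
        dist = {source: 0}
        paths = {source: 1}  # number of shortest paths from source to each vertex
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for w in adj[u]:
                if w not in dist:
                    dist[w] = dist[u] + 1
                    paths[w] = paths[u]
                    queue.append(w)
                elif dist[w] == dist[u] + 1:
                    # A second predecessor on a shortest path adds its path count.
                    paths[w] += paths[u]
        if any(count > 1 for count in paths.values()):
            return False
    return True
```

Trees and odd cycles pass this check, while an even cycle fails because opposite vertices are joined by two shortest paths.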

preprint 2020 arXiv

Graph Drawing via Gradient Descent, $(GD)^2$

Readability criteria, such as distance or neighborhood preservation, are often used to optimize node-link representations of graphs to enable the comprehension of the underlying data. With few exceptions, graph drawing algorithms typically optimize one such criterion, usually at the expense of others. We propose a layout approach, Graph Drawing via Gradient Descent, $(GD)^2$, that can handle multiple readability criteria. $(GD)^2$ can optimize any criterion that can be described by a smooth function. If the criterion cannot be captured by a smooth function, a non-smooth function for the criterion is combined with another smooth function, or auto-differentiation tools are used for the optimization. Our approach is flexible and can be used to optimize several criteria that have already been considered earlier (e.g., obtaining ideal edge lengths, stress, neighborhood preservation) as well as other criteria which have not yet been explicitly optimized in such fashion (e.g., vertex resolution, angular resolution, aspect ratio). We provide quantitative and qualitative evidence of the effectiveness of $(GD)^2$ with experimental data and a functional prototype: http://hdc.cs.arizona.edu/~mwli/graph-drawing/.
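
The idea of optimizing several smooth criteria at once can be sketched as gradient descent on a weighted sum of loss functions. This is an illustration only: the two criteria, the weights, the finite-difference gradient (standing in for the auto-differentiation tools the abstract mentions), and the backtracking step rule are all assumptions, not the prototype's implementation.

```python
import numpy as np

def stress(X, D):
    """Sum over pairs i < j of (layout distance - graph distance)^2."""
    iu = np.triu_indices(len(X), k=1)
    dist = np.linalg.norm(X[:, None] - X[None, :], axis=-1)
    return float(((dist[iu] - D[iu]) ** 2).sum())

def edge_length_loss(X, edges, target=1.0):
    """Penalize edges whose drawn length differs from the target length."""
    lengths = np.array([np.linalg.norm(X[u] - X[v]) for u, v in edges])
    return float(((lengths - target) ** 2).sum())

def numeric_grad(f, X, eps=1e-6):
    """Central finite differences; a stand-in for automatic differentiation."""
    grad = np.zeros_like(X)
    for idx in np.ndindex(*X.shape):
        bump = np.zeros_like(X)
        bump[idx] = eps
        grad[idx] = (f(X + bump) - f(X - bump)) / (2 * eps)
    return grad

def optimize(X, criteria, weights, steps=100, lr=0.05):
    """Gradient descent on the weighted sum of the given criteria."""
    def loss(Y):
        return sum(w * c(Y) for w, c in zip(weights, criteria))
    for _ in range(steps):
        g = numeric_grad(loss, X)
        step = lr
        # Halve the step until it strictly improves the combined loss.
        while step > 1e-8 and loss(X - step * g) >= loss(X):
            step /= 2
        if step <= 1e-8:
            break  # no improving step found
        X = X - step * g
    return X
```

Adding another readability criterion then amounts to appending one more function and weight to the lists passed to `optimize`.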

preprint 2020 arXiv

Graph Spanners: A Tutorial Review

This tutorial review provides a guiding reference to researchers who want to have an overview of the large body of literature about graph spanners. It reviews the current literature covering various research streams about graph spanners, such as different formulations, sparsity and lightness results, computational complexity, dynamic algorithms, and applications. As an additional contribution, we offer a list of open problems on graph spanners.
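
As a concrete instance of the topic, here is a sketch of the classical greedy construction of a multiplicative $t$-spanner: scan the edges by increasing weight and keep an edge only if the spanner built so far does not already connect its endpoints within $t$ times its weight. The abstract does not single out this algorithm; it is shown as a well-known example, and the function names are chosen for the illustration.

```python
import heapq

def shortest_distance(adj, source, target, cutoff):
    """Dijkstra from source; returns a value > cutoff (possibly inf) if no path within cutoff exists."""
    dist = {source: 0.0}
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if u == target:
            return d
        if d > dist.get(u, float("inf")) or d > cutoff:
            continue  # stale entry, or already too far to matter
        for v, w in adj.get(u, []):
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return float("inf")

def greedy_spanner(edges, t):
    """Keep an edge (u, v, w) only if the current spanner distance from u to v exceeds t * w."""
    adj = {}
    kept = []
    for u, v, w in sorted(edges, key=lambda e: e[2]):
        if shortest_distance(adj, u, v, t * w) > t * w:
            adj.setdefault(u, []).append((v, w))
            adj.setdefault(v, []).append((u, w))
            kept.append((u, v, w))
    return kept
```

On a triangle with weights 1, 1, and 1.5 and `t = 2`, the heaviest edge is dropped because the two light edges already give a path of length 2, which is within the allowed stretch of 3.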

preprint 2020 arXiv

Kruskal-based approximation algorithm for the multi-level Steiner tree problem

We study the multi-level Steiner tree problem: a generalization of the Steiner tree problem in graphs where terminals $T$ require varying priority, level, or quality of service. In this problem, we seek to find a minimum cost tree containing edges of varying rates such that any two terminals $u$, $v$ with priorities $P(u)$, $P(v)$ are connected using edges of rate $\min\{P(u),P(v)\}$ or better. The case where edge costs are proportional to their rate is approximable to within a constant factor of the optimal solution. For the more general case of non-proportional costs, this problem is hard to approximate with ratio $c \log \log n$, where $n$ is the number of vertices in the graph. A simple greedy algorithm by Charikar et al., however, provides a $\min\{2(\ln |T|+1), \ell ρ\}$-approximation in this setting, where $ρ$ is an approximation ratio for a heuristic solver for the Steiner tree problem and $\ell$ is the number of priorities or levels (Byrka et al. give a Steiner tree algorithm with $ρ\approx 1.39$, for example). In this paper, we describe a natural generalization to the multi-level case of the classical (single-level) Steiner tree approximation algorithm based on Kruskal's minimum spanning tree algorithm. We prove that this algorithm achieves an approximation ratio at least as good as Charikar et al., and experimentally performs better with respect to the optimum solution. We develop an integer linear programming formulation to compute an exact solution for the multi-level Steiner tree problem with non-proportional edge costs and use it to evaluate the performance of our algorithm on both random graphs and multi-level instances derived from SteinLib.
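
For orientation, the single-level starting point can be sketched as Kruskal's algorithm run on the metric closure of the terminals, a standard construction whose cost is within a factor of 2 of the optimal Steiner tree. This is an assumption about which classical variant is meant and it is not the multi-level generalization described in the paper; the function names are chosen for the example.

```python
from itertools import combinations

def metric_closure(vertices, edges):
    """All-pairs shortest path lengths via Floyd-Warshall."""
    inf = float("inf")
    d = {u: {v: (0.0 if u == v else inf) for v in vertices} for u in vertices}
    for u, v, w in edges:
        d[u][v] = d[v][u] = min(d[u][v], w)
    for k in vertices:
        for i in vertices:
            for j in vertices:
                if d[i][k] + d[k][j] < d[i][j]:
                    d[i][j] = d[i][k] + d[k][j]
    return d

def kruskal_on_terminals(vertices, edges, terminals):
    """Kruskal's algorithm on the metric closure restricted to the terminals."""
    d = metric_closure(vertices, edges)
    parent = {t: t for t in terminals}

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    chosen, cost = [], 0.0
    for u, v in sorted(combinations(terminals, 2), key=lambda p: d[p[0]][p[1]]):
        ru, rv = find(u), find(v)
        if ru != rv:
            parent[ru] = rv
            chosen.append((u, v))
            cost += d[u][v]
    return chosen, cost
```

On a star with center `c` and three unit-weight leaves as terminals, this returns cost 4, while the optimal Steiner tree (the star itself) costs 3. Replacing each chosen pair by a shortest path and discarding redundant edges can only lower the cost.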

preprint 2020 arXiv

Multi-Perspective, Simultaneous Embedding

We describe MPSE: a Multi-Perspective Simultaneous Embedding method for visualizing high-dimensional data, based on multiple pairwise distances between the data points. Specifically, MPSE computes positions for the points in 3D and provides different views into the data by means of 2D projections (planes) that preserve each of the given distance matrices. We consider two versions of the problem: fixed projections and variable projections. MPSE with fixed projections takes as input a set of pairwise distance matrices defined on the data points, along with the same number of projections, and embeds the points in 3D so that the pairwise distances are preserved in the given projections. MPSE with variable projections takes as input a set of pairwise distance matrices and embeds the points in 3D while also computing the appropriate projections that preserve the pairwise distances. The proposed approach can be useful in multiple scenarios: from creating simultaneous embedding of multiple graphs on the same set of vertices, to reconstructing a 3D object from multiple 2D snapshots, to analyzing data from multiple points of view. We provide a functional prototype of MPSE that is based on an adaptive and stochastic generalization of multi-dimensional scaling to multiple distances and multiple variable projections. We provide an extensive quantitative evaluation with datasets of different sizes and using different numbers of projections, as well as several examples that illustrate the quality of the resulting solutions.
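
The fixed-projection objective described above can be sketched as a sum of per-view stress terms: each view projects the 3D points to 2D and compares the projected distances with that view's distance matrix. The unweighted form, the function names, and the representation of a projection as a 2x3 matrix are assumptions for the illustration; the prototype's actual objective and optimizer are not reproduced.

```python
import numpy as np

def pairwise(Y):
    """Euclidean distance matrix of the rows of Y."""
    return np.linalg.norm(Y[:, None] - Y[None, :], axis=-1)

def mpse_loss(X, projections, distance_matrices):
    """Sum of per-view stress for fixed projections.

    X: (n, 3) positions; each P in projections: (2, 3) matrix;
    each D in distance_matrices: (n, n) target distances for that view.
    """
    iu = np.triu_indices(len(X), k=1)
    total = 0.0
    for P, D in zip(projections, distance_matrices):
        Y = X @ P.T  # (n, 2) view of the 3D points
        total += float(((pairwise(Y)[iu] - D[iu]) ** 2).sum())
    return total
```

In the variable-projection version, the matrices in `projections` would become optimization variables alongside `X` instead of fixed inputs.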