Source author record

Longkun Guo

Longkun Guo appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher · Unclaimed source record

Data Structures and Algorithms; Discrete Mathematics; Machine Learning; Networking and Internet Architecture

Catalog footprint

What is connected

5 works

4 topics

4 close collaborators


Preprint, 2026, arXiv

Improved Streaming Algorithm for Fair $k$-Center Clustering

Many real-world applications pose challenges in incorporating fairness constraints into the $k$-center clustering problem, where the dataset consists of $m$ demographic groups, each with a specified upper bound on the number of centers to ensure fairness. Focusing on big data scenarios, this paper addresses the problem in a streaming setting, where data points arrive one by one sequentially in a continuous stream. Leveraging a structure called the $λ$-independent center set, we propose a one-pass streaming algorithm that first computes a reserved set of points during the streaming process. Then, for the post-streaming process, we propose an approach for selecting centers from the reserved point set by analyzing all three possible cases, transforming the most complicated one into a specially constrained vertex cover problem in an auxiliary graph. Our algorithm achieves a tight approximation ratio of 5 while consuming $O(k\log n)$ memory. It can also be readily adapted to solve the offline fair $k$-center problem, achieving a 3-approximation ratio that matches the current state of the art. Furthermore, we extend our approach to a semi-structured data stream, where data points from each group arrive in batches. In this setting, we present a 3-approximation algorithm for $m = 2$ and a 4-approximation algorithm for general $m$. Lastly, we conduct extensive experiments to evaluate the performance of our approaches, demonstrating that they outperform existing baselines in both clustering cost and runtime efficiency.
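As a rough illustration of the streaming idea, the sketch below keeps a set of points that are pairwise more than $λ$ apart during a single pass. It assumes that this is what "$λ$-independent" means and that $λ$ is given in advance; the function name `lambda_independent_set` and the Euclidean metric are illustrative choices, and the sketch shows neither the paper's algorithm nor the fairness-aware selection of centers.

```python
import math
from typing import Iterable, List, Tuple

Point = Tuple[float, ...]

def lambda_independent_set(stream: Iterable[Point], lam: float) -> List[Point]:
    """One pass: keep a point only if it is farther than lam from every kept point."""
    kept: List[Point] = []
    for p in stream:
        # Assumption: Euclidean distance; any metric could be substituted here.
        if all(math.dist(p, q) > lam for q in kept):
            kept.append(p)
    return kept

# Hypothetical toy stream of 2-D points.
points = [(0.0, 0.0), (0.5, 0.0), (3.0, 0.0), (3.2, 0.1), (10.0, 0.0)]
centers = lambda_independent_set(points, lam=1.0)
print(centers)
```

By construction, every stream point lies within $λ$ of some kept point, and the kept points are pairwise more than $λ$ apart. If more than $k$ points are kept, two of them must share a center in any $k$-center solution, so the optimal radius exceeds $λ/2$.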

Preprint, 2026, arXiv

Optimized Algorithms for Text Clustering with LLM-Generated Constraints

Clustering is a fundamental tool that has garnered significant interest across a wide range of applications including text analysis. To improve clustering accuracy, many researchers have incorporated background knowledge, typically in the form of must-link and cannot-link constraints, to guide the clustering process. With the recent advent of large language models (LLMs), there is growing interest in improving clustering quality through LLM-based automatic constraint generation. In this paper, we propose a novel constraint-generation approach that reduces resource consumption by generating constraint sets rather than using traditional pairwise constraints. This approach improves both query efficiency and constraint accuracy compared to state-of-the-art methods. We further introduce a constrained clustering algorithm tailored to the characteristics of LLM-generated constraints. Our method incorporates a confidence threshold and a penalty mechanism to address potentially inaccurate constraints. We evaluate our approach on five text datasets, considering both the cost of constraint generation and the overall clustering performance. The results show that our method achieves clustering accuracy comparable to the state-of-the-art algorithms while reducing the number of LLM queries by more than 20 times.
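To make the difference between constraint sets and pairwise constraints concrete, the sketch below merges must-link sets with a union-find structure and then reports cannot-link pairs that contradict them. The helper names and document identifiers are hypothetical, and this is a generic consistency check, not the paper's confidence-threshold or penalty mechanism.

```python
from typing import Dict, Iterable, List, Sequence, Tuple

def find(parent: Dict[str, str], x: str) -> str:
    """Return the representative of x, creating a singleton set if x is new."""
    parent.setdefault(x, x)
    while parent[x] != x:
        parent[x] = parent[parent[x]]  # path halving
        x = parent[x]
    return x

def merge_must_link_sets(sets: Iterable[Sequence[str]]) -> Dict[str, str]:
    """Union all items that appear together in a must-link set."""
    parent: Dict[str, str] = {}
    for group in sets:
        if not group:
            continue
        root = find(parent, group[0])
        for item in group[1:]:
            parent[find(parent, item)] = root
    return parent

def violated_cannot_links(parent: Dict[str, str],
                          cannot: Iterable[Tuple[str, str]]) -> List[Tuple[str, str]]:
    """Cannot-link pairs whose endpoints were forced into the same component."""
    return [(a, b) for a, b in cannot if find(parent, a) == find(parent, b)]

# Hypothetical constraints, e.g. as returned by an LLM.
must = [["d1", "d2", "d3"], ["d3", "d4"], ["d5", "d6"]]
cannot = [("d1", "d4"), ("d1", "d5")]
parent = merge_must_link_sets(must)
conflicts = violated_cannot_links(parent, cannot)
print(conflicts)
```

A single must-link set of size $s$ stands in for $s(s-1)/2$ pairwise constraints, which is one reason set-valued constraints can need fewer queries than pairwise ones.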

Preprint, 2015, arXiv

Efficient Approximation Algorithms for Computing $k$ Disjoint Restricted Shortest Paths

Network applications, such as multimedia streaming and video conferencing, impose growing Quality of Service (QoS) requirements, including bandwidth, delay, and jitter. Meanwhile, networks are expected to be load-balanced, energy-efficient, and resilient to some degree of failure. These requirements can be better met with multiple disjoint QoS paths than with a single one. Let $G=(V,\, E)$ be a digraph with nonnegative integral cost and delay on every edge, let $s,\, t\in V$ be two specified vertices, and let $D\in\mathbb{Z}_{0}^{+}$ be a delay bound (or some other constraint). The \emph{$k$ Disjoint Restricted Shortest Path} (\emph{$k$RSP}) problem is to compute $k$ disjoint paths between $s$ and $t$ with total cost minimized and total delay bounded by $D$. Few efficient algorithms have been developed because of the hardness of the problem. In this paper, we propose efficient algorithms with provable performance guarantees for the $k$RSP problem. We first present a pseudo-polynomial-time approximation algorithm with a bifactor approximation ratio of $(1,\,2)$, then improve the algorithm to polynomial time with a bifactor ratio of $(1+ε,\,2+ε)$ for any fixed $ε>0$, which is better than the current best approximation ratio $(O(1+γ),\, O(1+\frac{1}{γ}))$ for any fixed $γ>0$ \cite{orda2004efficient}. To the best of our knowledge, this is the first constant-factor algorithm that almost strictly obeys the constraint for the $k$RSP problem.
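For intuition about the delay-bounded objective, the sketch below handles only the single-path case ($k=1$) with a textbook pseudo-polynomial dynamic program over the delay budget. It is not the paper's algorithm, and the function name and edge encoding are assumptions made for illustration.

```python
import math
from typing import List, Tuple

Edge = Tuple[int, int, int, int]  # (u, v, cost, delay), directed, nonnegative integers

def restricted_shortest_path_cost(n: int, edges: List[Edge],
                                  s: int, t: int, D: int) -> float:
    """Minimum cost of an s-t path with total delay <= D, or math.inf if none exists."""
    # best[v][d] = cheapest known cost of an s-v path with total delay at most d.
    best = [[math.inf] * (D + 1) for _ in range(n)]
    for d in range(D + 1):
        best[s][d] = 0
    # Bellman-Ford style relaxation; n - 1 rounds cover every simple path.
    for _ in range(n - 1):
        changed = False
        for u, v, cost, delay in edges:
            for d in range(delay, D + 1):
                cand = best[u][d - delay] + cost
                if cand < best[v][d]:
                    best[v][d] = cand
                    changed = True
        if not changed:
            break
    return best[t][D]

# Hypothetical instance: a cheap slow path 0-1-3 and an expensive fast path 0-2-3.
edges = [(0, 1, 1, 5), (1, 3, 1, 5), (0, 2, 4, 1), (2, 3, 4, 1)]
print(restricted_shortest_path_cost(4, edges, 0, 3, 10))
print(restricted_shortest_path_cost(4, edges, 0, 3, 9))
```

The running time is $O(|V||E|D)$, which is pseudo-polynomial because it depends on the numeric value of $D$ rather than on its encoding length.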

Preprint, 2013, arXiv

A Parameterized Approximation Algorithm for The Shallow-Light Steiner Tree Problem

For a given graph $G=(V,\, E)$ with a terminal set $S$ and a selected root $r\in S$, a positive integer cost and a delay on every edge, and a delay constraint $D\in \mathbb{Z}^{+}$, the shallow-light Steiner tree (\emph{SLST}) problem is to compute a minimum-cost tree spanning the terminals of $S$, in which the delay between the root and every vertex is bounded by $D$. This problem is NP-hard and very hard to approximate. According to known inapproximability results, it admits no approximation with ratio better than $(1,\, O(\log^{2}n))$ unless $NP\subseteq DTIME(n^{\log\log n})$ \cite{khandekar2013some}, and no approximation ratio better than $(1,\, O(\log|V|))$ for $D=4$ unless $NP\subseteq DTIME(n^{\log\log n})$ \cite{bar2001generalized}. Hence, this paper focuses on parameterized algorithms for \emph{SLST}. We first present an exact algorithm for \emph{SLST} with time complexity $O(3^{|S|}|V|D+2^{|S|}|V|^{2}D^{2}+|V|^{3}D^{3})$, where $|S|$ and $|V|$ are the numbers of terminals and vertices, respectively. This is a pseudo-polynomial-time parameterized algorithm with respect to the parameter "number of terminals". We then improve this algorithm so that it runs in polynomial time $O(\frac{|V|^{2}}{ε}3^{|S|}+\frac{|V|^{4}}{ε}2^{|S|}+\frac{|V|^{6}}{ε})$ and computes a Steiner tree with delay bounded by $(1+ε)D$ and cost bounded by the cost of an optimum solution, where $ε>0$ is any small real number. To the best of our knowledge, this is the first parameterized approximation algorithm for the \emph{SLST} problem.
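The feasibility condition in the problem statement can be checked mechanically. The sketch below takes a candidate tree, computes the delay from the root to every tree vertex, and returns whether all terminals are spanned within the delay bound, together with the tree cost. The function name and edge encoding are hypothetical, the input is assumed to already be a tree (no cycle check is done), and the code only verifies a solution rather than computing one.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple

TreeEdge = Tuple[str, str, int, int]  # (u, v, cost, delay), undirected

def check_slst_tree(tree_edges: List[TreeEdge], root: str,
                    terminals: Iterable[str], D: float) -> Tuple[bool, int]:
    """Return (feasible, total_cost) for a candidate tree; assumes the edges form a tree."""
    adj: Dict[str, List[Tuple[str, int]]] = defaultdict(list)
    for u, v, _cost, delay in tree_edges:
        adj[u].append((v, delay))
        adj[v].append((u, delay))
    # Depth-first traversal from the root, accumulating delay along tree paths.
    delay_from_root: Dict[str, int] = {root: 0}
    stack = [root]
    while stack:
        u = stack.pop()
        for v, delay in adj[u]:
            if v not in delay_from_root:
                delay_from_root[v] = delay_from_root[u] + delay
                stack.append(v)
    spans_terminals = all(t in delay_from_root for t in terminals)
    within_bound = all(d <= D for d in delay_from_root.values())
    total_cost = sum(cost for _, _, cost, _ in tree_edges)
    return spans_terminals and within_bound, total_cost

# Hypothetical tree rooted at "r" with terminals r, t1, t2 and Steiner vertex a.
tree = [("r", "a", 2, 3), ("a", "t1", 1, 2), ("a", "t2", 4, 1)]
print(check_slst_tree(tree, "r", {"r", "t1", "t2"}, D=5))
print(check_slst_tree(tree, "r", {"r", "t1", "t2"}, D=4))
```

Passing `(1 + eps) * D` as the bound checks the relaxed delay constraint: with `D = 4` and `eps = 0.25`, the same tree becomes feasible.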

Preprint, 2013, arXiv

Constrained Fault-Tolerant Resource Allocation

In the Constrained Fault-Tolerant Resource Allocation (FTRA) problem, we are given a set of sites containing facilities as resources, and a set of clients accessing these resources. Specifically, each site i is allowed to open at most R_i facilities with cost f_i for each opened facility. Each client j requires an allocation of r_j open facilities, and connecting j to any facility at site i incurs a connection cost c_ij. The goal is to minimize the total cost of this resource allocation scenario. FTRA generalizes the Unconstrained Fault-Tolerant Resource Allocation (FTRA_{\infty}) [18] and the classical Fault-Tolerant Facility Location (FTFL) [13] problems: for every site i, FTRA_{\infty} does not have the constraint R_i, whereas FTFL sets R_i=1. These problems are said to be uniform if all r_j's are the same, and general otherwise. For the general metric FTRA, we first give an LP-rounding algorithm achieving an approximation ratio of 4. Then we show that the problem reduces to FTFL, implying the ratio of 1.7245 from [3]. For the uniform FTRA, we provide a 1.52-approximation primal-dual algorithm in O(n^4) time, where n is the total number of sites and clients. We also consider the Constrained Fault-Tolerant k-Resource Allocation (k-FTRA) problem, where additionally the total number of facilities that can be opened across all sites is bounded by k. For the uniform k-FTRA, we give the first constant-factor approximation algorithm, with a factor of 4. Note that the above results carry over to FTRA_{\infty} and k-FTRA_{\infty}.
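The objective can be made concrete by evaluating the cost of a given opening decision. The sketch below assumes that a client may connect to several facilities at the same site, up to the number opened there, and that facilities are shared freely among clients; under those assumptions, once the openings are fixed, each client simply takes its r_j cheapest available connections. The function name and the list-based encoding are hypothetical, and the code evaluates a solution rather than approximating the optimum.

```python
from typing import List

def ftra_cost(opened: List[int], capacity: List[int], facility_cost: List[float],
              demand: List[int], conn_cost: List[List[float]]) -> float:
    """Total cost when site i opens opened[i] facilities; conn_cost[i][j] is c_ij."""
    if any(y > R for y, R in zip(opened, capacity)):
        raise ValueError("a site opens more than R_i facilities")
    total = sum(f * y for f, y in zip(facility_cost, opened))
    for j, r in enumerate(demand):
        remaining = r
        # Greedy per client: cheapest sites first, limited by facilities opened there.
        for i in sorted(range(len(opened)), key=lambda i: conn_cost[i][j]):
            take = min(opened[i], remaining)
            total += take * conn_cost[i][j]
            remaining -= take
            if remaining == 0:
                break
        if remaining > 0:
            raise ValueError(f"client {j} cannot reach r_j open facilities")
    return total

# Hypothetical instance: two sites, two clients.
cost = ftra_cost(opened=[2, 1], capacity=[2, 1], facility_cost=[3, 1],
                 demand=[2, 1], conn_cost=[[1, 4], [2, 1]])
print(cost)
```

In this toy instance the opening cost is 3·2 + 1·1 = 7, client 0 uses both facilities at site 0 for a connection cost of 2, and client 1 uses the facility at site 1 for a connection cost of 1.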
