Researcher profile

Xianmin Liu

Xianmin Liu contributes to research discovery and scholarly infrastructure.

ResearcherAffiliation not importedOpen to collaborate

Trust snapshot

Quick read

Trust 11 - UnverifiedVerification L1Unclaimed author
1works
0followers
1topics
4close collaborators

Actions

Decide how to stay connected

Follow researcher0

Identity and collaboration

How to connect with this researcher

Claiming links this public author record to a researcher profile and unlocks direct collaboration workflows.

Log in to claim

Direct collaboration

Open a focused conversation when the fit is right

Claim this author entity first to unlock direct invitations.

Research graph

See the researcher in context

Open full explorer

Inspect adjacent work, topics, institutions and collaborators without jumping out to a separate graph page.

Building this graph slice

BZPEER is loading the nearby papers, people, topics and institutions for this page.

Published work

1 published item(s)

preprint2020arXiv

Complexity and Efficient Algorithms for Data Inconsistency Evaluating and Repairing

Data inconsistency evaluating and repairing are major concerns in data quality management. As the basic computing task, optimal subset repair is not only applied for cost estimation during the progress of database repairing, but also directly used to derive the evaluation of database inconsistency. Computing an optimal subset repair is to find a minimum tuple set from an inconsistent database whose remove results in a consistent subset left. Tight bound on the complexity and efficient algorithms are still unknown. In this paper, we improve the existing complexity and algorithmic results, together with a fast estimation on the size of optimal subset repair. We first strengthen the dichotomy for optimal subset repair computation problem, we show that it is not only APXcomplete, but also NPhard to approximate an optimal subset repair with a factor better than $17/16$ for most cases. We second show a $(2-0.5^{\tinyσ-1})$-approximation whenever given $σ$ functional dependencies, and a $(2-η_k+\frac{η_k}{k})$-approximation when an $η_k$-portion of tuples have the $k$-quasi-Tur$\acute{\text{a}}$n property for some $k>1$. We finally show a sublinear estimator on the size of optimal \textit{S}-repair for subset queries, it outputs an estimation of a ratio $2n+εn$ with a high probability, thus deriving an estimation of FD-inconsistency degree of a ratio $2+ε$. To support a variety of subset queries for FD-inconsistency evaluation, we unify them as the $\subseteq$-oracle which can answer membership-query, and return $p$ tuples uniformly sampled whenever given a number $p$. Experiments are conducted on range queries as an implementation of $\subseteq$-oracle, and results show the efficiency of our FD-inconsistency degree estimator.