Researcher profile

Val Tannen

Val Tannen contributes to research discovery and scholarly infrastructure.

Researcher
Affiliation not imported
Open to collaborate

Trust snapshot

Quick read

Trust 17 - Unverified
Verification L1
Unclaimed author
4 works
0 followers
5 topics
4 close collaborators

Published work

4 published items

preprint, 2026, arXiv

The Complexity of Finding Missing Answer Repairs

We investigate the problem of identifying database repairs for missing tuples in query answers. We show that when the query is part of the input (the combined complexity setting), determining whether a repair exists is polynomial-time equivalent to the satisfiability problem for classes of queries admitting a weak form of projection and selection. We then identify the sub-classes of unions of conjunctive queries with negated atoms, defined by the relational algebra operations permitted to appear in the query, for which the minimal repair problem can be solved in polynomial time. In contrast, we show that the problem is NP-hard, as well as set cover-hard to approximate via strict reductions, whenever both projection and join are permitted in the input query. Additionally, we show that finding the size of a minimal repair for unions of conjunctive queries (with negated atoms permitted) is OptP[log(n)]-complete, while computing a minimal repair is possible with $O(n^2)$ queries to an NP oracle. With recursion permitted, the combined complexity of all of these variants increases significantly, with an EXP lower bound. However, from the data complexity perspective, we show that minimal repairs can be identified in polynomial time for all queries expressible as semi-positive datalog programs.
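
To make the notion of a minimal repair concrete, the following is a brute-force sketch, not the paper's algorithm. It assumes a repair is a set of tuple insertions only, a fixed toy query Q(x) :- R(x, y), S(y) (a join followed by a projection), and a small finite domain. The names `answers` and `minimal_insert_repair` are hypothetical.

```python
from itertools import combinations


def answers(R, S):
    """Evaluate Q(x) :- R(x, y), S(y): join R with S on y, project onto x."""
    return {x for (x, y) in R if y in S}


def minimal_insert_repair(R, S, missing, domain):
    """Return a smallest list of insertions that makes `missing` an answer.

    Brute force over all candidate insertions by increasing size, so the
    running time is exponential; this is for illustration only.
    """
    candidates = [("R", (x, y)) for x in domain for y in domain if (x, y) not in R]
    candidates += [("S", (y,)) for y in domain if y not in S]
    for size in range(len(candidates) + 1):
        for combo in combinations(candidates, size):
            new_R = set(R) | {t for rel, t in combo if rel == "R"}
            new_S = set(S) | {t[0] for rel, t in combo if rel == "S"}
            if missing in answers(new_R, new_S):
                return list(combo)
    return None  # no insertion-only repair exists over this domain


# Toy database (assumed for illustration): R holds one pair, S is empty.
R = {(1, 2)}
S = set()
repair_for_1 = minimal_insert_repair(R, S, missing=1, domain=[1, 2, 3])
repair_for_3 = minimal_insert_repair(R, S, missing=3, domain=[1, 2, 3])
```

Here the answer 1 needs a single insertion into S, while the answer 3 needs one tuple in each relation. The exhaustive search over subsets is only workable on toy inputs, which fits the hardness the abstract reports once both projection and join are allowed.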

preprint, 2022, arXiv

DBSP: Automatic Incremental View Maintenance for Rich Query Languages

Incremental view maintenance has long been a central problem in database theory. Many solutions have been proposed for restricted classes of database languages, such as the relational algebra or Datalog. These techniques do not naturally generalize to richer languages. In this paper we give a general solution to this problem in three steps: (1) we describe a simple but expressive language called DBSP for describing computations over data streams; (2) we give a general algorithm for solving the incremental view maintenance problem for arbitrary DBSP programs; and (3) we show how to model many rich database query languages (including the full relational queries, grouping and aggregation, monotonic and non-monotonic recursion, and streaming aggregation) using DBSP. As a consequence, we obtain efficient incremental view maintenance techniques for all these rich languages.
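
The idea of maintaining a view from changes rather than recomputing it can be sketched as follows. This is not the DBSP implementation or its API; it is a minimal illustration that assumes relations are stored as weighted sets (row mapped to an integer weight, with negative weights standing for deletions) and that the view is a single join on the first column. The names `zadd`, `zjoin`, and `IncrementalJoin` are hypothetical.

```python
from collections import defaultdict


def zadd(a, b):
    """Pointwise sum of two weighted sets (dicts mapping row -> integer weight)."""
    out = defaultdict(int)
    for z in (a, b):
        for row, w in z.items():
            out[row] += w
    return {row: w for row, w in out.items() if w != 0}


def zjoin(a, b):
    """Join two weighted sets of (key, value) rows on the key; weights multiply."""
    out = defaultdict(int)
    for (ka, va), wa in a.items():
        for (kb, vb), wb in b.items():
            if ka == kb:
                out[(ka, va, vb)] += wa * wb
    return {row: w for row, w in out.items() if w != 0}


class IncrementalJoin:
    """Keeps the accumulated inputs and emits only the change of the join."""

    def __init__(self):
        self.left = {}
        self.right = {}

    def step(self, d_left, d_right):
        # Join is bilinear, so the change of the output is
        #   dL join R_old  +  L_old join dR  +  dL join dR.
        delta = zadd(
            zadd(zjoin(d_left, self.right), zjoin(self.left, d_right)),
            zjoin(d_left, d_right),
        )
        self.left = zadd(self.left, d_left)
        self.right = zadd(self.right, d_right)
        return delta


ij = IncrementalJoin()
view = {}
# Step 1: insert one row on each side.
view = zadd(view, ij.step({(1, "a"): 1}, {(1, "x"): 1}))
# Step 2: replace (1, "a") by (1, "b") on the left; the right side is unchanged.
view = zadd(view, ij.step({(1, "b"): 1, (1, "a"): -1}, {}))
```

After both steps, the accumulated `view` matches what a full recomputation `zjoin(ij.left, ij.right)` would produce, while each step only touched rows affected by the change.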

preprint, 2020, arXiv

Generalized Absorptive Polynomials and Provenance Semantics for Fixed-Point Logic

Semiring provenance is a successful approach to provide detailed information on the combinations of atomic facts that are responsible for the result of a query. In particular, interpretations in general provenance semirings of polynomials or formal power series give precise descriptions of the successful evaluation strategies for the query. While provenance analysis in databases has, for a long time, been largely confined to negation-free query languages, a recent approach extends this to model checking problems for logics with full negation. Algebraically this relies on new quotient semirings of dual-indeterminate polynomials or power series. So far, this approach has been developed mainly for first-order logic and for the positive fragment of least fixed-point logic. What has remained open is an adequate treatment for fixed-point calculi that admit arbitrary interleavings of least and greatest fixed points. We show that an adequate framework for the provenance analysis of full fixed-point logics is provided by semirings that are (1) fully continuous, (2) absorptive, and (3) chain-positive. Full continuity guarantees that provenance values of least and greatest fixed-points are well-defined. Absorptive semirings provide a symmetry between least and greatest fixed-point computations and make sure that provenance values of greatest fixed points are informative. Finally, chain-positivity is responsible for having truth-preserving interpretations, which give non-zero values to all true formulae. We further identify semirings of generalized absorptive polynomials and prove universal properties that make them the most general appropriate semirings for LFP. We illustrate the power of provenance interpretations in these semirings by relating them to provenance values of plays and strategies in the associated model-checking games.
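
As a small illustration of semiring annotations and of absorption (the law a + a·b = a), the sketch below evaluates one positive query in two semirings. It does not cover fixed points or the generalized absorptive polynomials of the paper. The query, the edge data, and the names `COUNTING`, `TROPICAL`, `two_step_paths`, and `is_absorptive_on` are assumptions made for the example.

```python
# A semiring is represented here as a tuple (plus, times, zero, one).
COUNTING = (lambda a, b: a + b, lambda a, b: a * b, 0, 1)
# Tropical semiring over non-negative costs: plus is min, times is addition.
TROPICAL = (min, lambda a, b: a + b, float("inf"), 0)


def two_step_paths(edges, semiring):
    """Annotate Q(x, z) :- E(x, y), E(y, z).

    Joint use of two facts multiplies their annotations; alternative
    derivations of the same answer are added.
    """
    plus, times, zero, _one = semiring
    out = {}
    for (x, y1), k1 in edges.items():
        for (y2, z), k2 in edges.items():
            if y1 == y2:
                out[(x, z)] = plus(out.get((x, z), zero), times(k1, k2))
    return out


def is_absorptive_on(semiring, a, b):
    """Check the absorption law a + a*b == a for one pair of values."""
    plus, times, _zero, _one = semiring
    return plus(a, times(a, b)) == a


pairs = [("a", "b"), ("b", "c"), ("a", "d"), ("d", "c")]
# Counting: every edge annotated with 1, so the result counts derivations.
derivation_counts = two_step_paths({e: 1 for e in pairs}, COUNTING)
# Tropical: edges annotated with costs, so the result is the cheapest derivation.
costs = dict(zip(pairs, [1, 5, 2, 2]))
cheapest = two_step_paths(costs, TROPICAL)
```

With these inputs the counting semiring reports two derivations of the answer (a, c), and the tropical semiring reports the cost of the cheaper one. The tropical semiring satisfies absorption on non-negative values (for example, min(3, 3 + 4) = 3), whereas the counting semiring does not (3 + 3·4 ≠ 3).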

preprint, 2020, arXiv

PrIU: A Provenance-Based Approach for Incrementally Updating Regression Models

The ubiquitous use of machine learning algorithms brings new challenges to traditional database problems such as incremental view update. Much effort is being put into better understanding and debugging machine learning models, as well as into identifying and repairing errors in training datasets. Our focus is on how to assist these activities when the machine learning model has to be retrained, either after removing problematic training samples during cleaning or after selecting different subsets of training data for interpretability. This paper presents an efficient provenance-based approach, PrIU, and its optimized version, PrIU-opt, for incrementally updating model parameters without sacrificing prediction accuracy. We prove the correctness and convergence of the incrementally updated model parameters, and validate them experimentally. Experimental results show that PrIU-opt can achieve speed-ups of up to two orders of magnitude compared to simply retraining the model from scratch, while obtaining highly similar models.
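
The general idea of updating a model after deleting training samples, instead of retraining from scratch, can be illustrated with a much simpler technique than PrIU. The sketch below is not PrIU or PrIU-opt: it assumes one-dimensional ordinary least squares and keeps running sums to which each sample contributes additively, so a sample can be removed by subtracting its contribution. The class name `SimpleLinReg` and the data are hypothetical.

```python
class SimpleLinReg:
    """Least-squares fit of y = a*x + b, stored as running sums over samples."""

    def __init__(self):
        self.n = 0
        self.sx = self.sy = self.sxx = self.sxy = 0.0

    def add(self, x, y, sign=1):
        # Each sample contributes additively to every sum, so sign=-1
        # undoes exactly what sign=+1 added.
        self.n += sign
        self.sx += sign * x
        self.sy += sign * y
        self.sxx += sign * x * x
        self.sxy += sign * x * y

    def remove(self, x, y):
        self.add(x, y, sign=-1)

    def coefficients(self):
        denom = self.n * self.sxx - self.sx ** 2
        if denom == 0:
            raise ValueError("need at least two distinct x values")
        a = (self.n * self.sxy - self.sx * self.sy) / denom
        b = (self.sy - a * self.sx) / self.n
        return a, b


# Three clean samples on the line y = 2x + 1, plus one erroneous sample.
clean = [(0.0, 1.0), (1.0, 3.0), (2.0, 5.0)]
bad = (3.0, 100.0)

model = SimpleLinReg()
for x, y in clean + [bad]:
    model.add(x, y)
model.remove(*bad)                  # incremental update: constant time
a_inc, b_inc = model.coefficients()

scratch = SimpleLinReg()            # baseline: refit on the remaining samples
for x, y in clean:
    scratch.add(x, y)
a_full, b_full = scratch.coefficients()
```

The incrementally updated coefficients agree with those of the refit on the cleaned data. This trick depends on the closed form of least squares; models trained by iterative methods need something more elaborate, which is the setting the abstract addresses.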