Source author record

Michael Shekelyan

Michael Shekelyan appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher · Unclaimed source record

Databases, Cryptography and Security, Data Structures and Algorithms, Machine Learning, Social and Information Networks

Catalog footprint

What is connected

3 works

5 topics

4 close collaborators

preprint · 2022 · arXiv

Differentially Private Top-k Selection via Canonical Lipschitz Mechanism

Selecting the top-$k$ highest scoring items under differential privacy (DP) is a fundamental task with many applications. This work presents three new results. First, the exponential mechanism, permute-and-flip and report-noisy-max, as well as their one-shot variants, are unified into the Lipschitz mechanism, an additive noise mechanism with a single DP proof via a mandated Lipschitz property for the noise distribution. Second, this new generalized mechanism is paired with a canonical loss function to obtain the canonical Lipschitz mechanism, which can directly select $k$-subsets out of $d$ items in $O(dk+d \log d)$ time. The canonical loss function assesses subsets by how many users must change for the subset to become top-$k$. Third, this composition-free approach to subset selection improves utility guarantees by an $\Omega(\log k)$ factor compared to one-by-one selection via sequential composition, and our experiments on synthetic and real-world data indicate substantial utility improvements.
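For orientation, the following is a minimal sketch of a generic one-shot noisy top-$k$ selection of the kind the abstract mentions as a starting point: noise is added once to every score and the $k$ largest noisy scores are reported. It is not the canonical Lipschitz mechanism from the paper. The function name `oneshot_noisy_top_k`, the choice of Gumbel noise, and the `noise_scale` parameter are assumptions made for illustration, and the sketch makes no claim about which scale yields a given privacy guarantee.

```python
import numpy as np


def oneshot_noisy_top_k(scores, k, noise_scale, rng=None):
    """Return the indices of the k largest scores after adding noise once.

    Hypothetical helper for illustration only. Calibrating `noise_scale`
    to a privacy budget is left to the caller and is NOT handled here.
    """
    rng = np.random.default_rng() if rng is None else rng
    scores = np.asarray(scores, dtype=float)
    if not 0 < k <= scores.size:
        raise ValueError("k must be between 1 and the number of items")

    # One independent Gumbel draw per item, added to its score.
    noise = rng.gumbel(loc=0.0, scale=noise_scale, size=scores.shape)
    noisy = scores + noise

    # Sort by noisy score, largest first, and keep the first k indices.
    order = np.argsort(-noisy)
    return order[:k].tolist()
```

With a very small `noise_scale` the result coincides with the exact top-$k$; larger scales trade accuracy for more randomness in the reported set.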

preprint · 2022 · arXiv

Weighted Random Sampling over Joins

Joining records with all other records that meet a linkage condition can result in an astronomically large number of combinations due to many-to-many relationships. For such challenging (acyclic) joins, a random sample over the join result is a practical alternative to working with the oversized join result. Whereas prior works are limited to uniform join sampling where each join row is assigned the same probability, the scope is extended in this work to weighted sampling to support emerging applications such as scientific discovery in observational data and privacy-preserving query answering. Apart from some naive methods, this work presents the first approach for weighted random sampling from join results. Due to a lack of baselines, experiments over various join types and real-world data sets are conducted to show substantial memory savings and competitive performance with main-memory index-based approaches in the equal-probability setting. In contrast to existing uniform sampling approaches that require prepared structures that occupy contested resources to squeeze out slightly faster query times, the proposed approaches exhibit qualities that are urgently needed in practice, namely reduced memory footprint, streaming operation, support for selections, outer joins, semi joins and anti joins, and unequal-probability sampling. All pertinent code and data can be found at: https://github.com/shekelyan/weightedjoinsampling
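To make the idea of weighted join sampling concrete, here is a minimal sketch for a single two-table equi-join that never materializes the join result. It is not the streaming method from the paper or the code in the linked repository. It assumes each join row's weight is the product of its two input row weights, and the function name `sample_weighted_join` and the `(key, payload, weight)` row layout are hypothetical. A left row is drawn in proportion to its weight times the total weight of its matching right rows, then a matching right row is drawn in proportion to its own weight.

```python
import random
from collections import defaultdict


def sample_weighted_join(left, right, n, rng=None):
    """Draw n join rows with replacement, with probability proportional to
    w_left * w_right, without building the full join.

    `left` and `right` are lists of (key, payload, weight) tuples.
    Hypothetical helper for illustration only.
    """
    rng = rng or random.Random()

    # Group right rows by join key and total their weights per key.
    by_key = defaultdict(list)
    for key, payload, weight in right:
        by_key[key].append((payload, weight))
    key_total = {k: sum(w for _, w in rows) for k, rows in by_key.items()}

    # Keep only left rows that have join partners with positive total weight.
    left_rows = [row for row in left if key_total.get(row[0], 0) > 0]
    # Mass of a left row = its weight times the weight of all its partners.
    left_mass = [w * key_total[k] for k, _, w in left_rows]
    if not left_rows or sum(left_mass) <= 0:
        return []

    samples = []
    for key, left_payload, _ in rng.choices(left_rows, weights=left_mass, k=n):
        partners = by_key[key]
        partner_weights = [w for _, w in partners]
        right_payload, _ = rng.choices(partners, weights=partner_weights, k=1)[0]
        samples.append((key, left_payload, right_payload))
    return samples
```

Setting every weight to 1 reduces this sketch to uniform sampling over the join rows, the equal-probability setting mentioned in the abstract.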

preprint · 2014 · arXiv

ParetoPrep: Fast computation of Path Skylines Queries

Computing cost-optimal paths in network data is a very important task in many application areas like transportation networks, computer networks or social graphs. In many cases, the cost of an edge can be described by various cost criteria. For example, in a road network possible cost criteria are distance, time, ascent, energy consumption or toll fees. In such a multicriteria network, a route or path skyline query computes the set of all paths having Pareto-optimal costs, i.e., each result path is optimal for different user preferences. In this paper, we propose a new method for computing route skylines which significantly decreases processing time and memory consumption. Furthermore, our method does not rely on any precomputation or indexing method and is thus suitable for dynamically changing edge costs. Our experiments demonstrate that our method outperforms state-of-the-art approaches and allows highly efficient path skyline computation without any preprocessing.
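The notion of Pareto-optimal path costs can be illustrated with a small sketch. It does not reproduce ParetoPrep and performs no path search: it only filters cost vectors of paths that are assumed to be already enumerated, keeping those that no other vector dominates. The function names `dominates` and `skyline` and the two-criteria example costs are hypothetical, and lower values are assumed to be better for every criterion.

```python
def dominates(a, b):
    """True if cost vector a is no worse than b in every criterion
    and strictly better in at least one (lower is better)."""
    no_worse = all(x <= y for x, y in zip(a, b))
    strictly_better = any(x < y for x, y in zip(a, b))
    return no_worse and strictly_better


def skyline(costs):
    """Return the cost vectors that no other vector dominates,
    in input order and without duplicates (quadratic-time filter)."""
    result = []
    for candidate in costs:
        if any(dominates(other, candidate) for other in costs):
            continue  # some other path is at least as good everywhere
        if candidate not in result:
            result.append(candidate)
    return result


# Hypothetical (distance, time) costs of five candidate routes.
routes = [(10, 5), (8, 7), (12, 4), (9, 9), (10, 6)]
print(skyline(routes))  # [(10, 5), (8, 7), (12, 4)]
```

Here `(9, 9)` is dominated by `(8, 7)` and `(10, 6)` by `(10, 5)`, so only three routes remain, each best under a different trade-off between the two criteria.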