Researcher profile

Theodoros Rekatsinas

Theodoros Rekatsinas contributes to research discovery and scholarly infrastructure.

ResearcherAffiliation not importedOpen to collaborate

Trust snapshot

Quick read

Trust 19 - UnverifiedVerification L1Unclaimed author
5works
0followers
7topics
4close collaborators

Actions

Decide how to stay connected

Follow researcher0

Identity and collaboration

How to connect with this researcher

Claiming links this public author record to a researcher profile and unlocks direct collaboration workflows.

Log in to claim

Direct collaboration

Open a focused conversation when the fit is right

Claim this author entity first to unlock direct invitations.

Research graph

See the researcher in context

Open full explorer

Inspect adjacent work, topics, institutions and collaborators without jumping out to a separate graph page.

Building this graph slice

BZPEER is loading the nearby papers, people, topics and institutions for this page.

Published work

5 published item(s)

preprint2022arXiv

Saga: A Platform for Continuous Construction and Serving of Knowledge At Scale

We introduce Saga, a next-generation knowledge construction and serving platform for powering knowledge-based applications at industrial scale. Saga follows a hybrid batch-incremental design to continuously integrate billions of facts about real-world entities and construct a central knowledge graph that supports multiple production use cases with diverse requirements around data freshness, accuracy, and availability. In this paper, we discuss the unique challenges associated with knowledge graph construction at industrial scale, and review the main components of Saga and how they address these challenges. Finally, we share lessons-learned from a wide array of production use cases powered by Saga.

preprint2020arXiv

An Empirical Analysis of the Impact of Data Augmentation on Knowledge Distillation

Generalization Performance of Deep Learning models trained using Empirical Risk Minimization can be improved significantly by using Data Augmentation strategies such as simple transformations, or using Mixed Samples. We attempt to empirically analyze the impact of such strategies on the transfer of generalization between teacher and student models in a distillation setup. We observe that if a teacher is trained using any of the mixed sample augmentation strategies, such as MixUp or CutMix, the student model distilled from it is impaired in its generalization capabilities. We hypothesize that such strategies limit a model's capability to learn example-specific features, leading to a loss in quality of the supervision signal during distillation. We present a novel Class-Discrimination metric to quantitatively measure this dichotomy in performance and link it to the discriminative capacity induced by the different strategies on a network's latent space.

preprint2020arXiv

Data-Dependent Differentially Private Parameter Learning for Directed Graphical Models

Directed graphical models (DGMs) are a class of probabilistic models that are widely used for predictive analysis in sensitive domains, such as medical diagnostics. In this paper we present an algorithm for differentially private learning of the parameters of a DGM with a publicly known graph structure over fully observed data. Our solution optimizes for the utility of inference queries over the DGM and \textit{adds noise that is customized to the properties of the private input dataset and the graph structure of the DGM}. To the best of our knowledge, this is the first explicit data-dependent privacy budget allocation algorithm for DGMs. We compare our algorithm with a standard data-independent approach over a diverse suite of DGM benchmarks and demonstrate that our solution requires a privacy budget that is $3\times$ smaller to obtain the same or higher utility.

preprint2020arXiv

Principal Component Networks: Parameter Reduction Early in Training

Recent works show that overparameterized networks contain small subnetworks that exhibit comparable accuracy to the full model when trained in isolation. These results highlight the potential to reduce training costs of deep neural networks without sacrificing generalization performance. However, existing approaches for finding these small networks rely on expensive multi-round train-and-prune procedures and are non-practical for large data sets and models. In this paper, we show how to find small networks that exhibit the same performance as their overparameterized counterparts after only a few training epochs. We find that hidden layer activations in overparameterized networks exist primarily in subspaces smaller than the actual model width. Building on this observation, we use PCA to find a basis of high variance for layer inputs and represent layer weights using these directions. We eliminate all weights not relevant to the found PCA basis and term these network architectures Principal Component Networks. On CIFAR-10 and ImageNet, we show that PCNs train faster and use less energy than overparameterized models, without accuracy loss. We find that our transformation leads to networks with up to 23.8x fewer parameters, with equal or higher end-model accuracy---in some cases we observe improvements up to 3%. We also show that ResNet-20 PCNs outperform deep ResNet-110 networks while training faster.

preprint2020arXiv

Record fusion: A learning approach

Record fusion is the task of aggregating multiple records that correspond to the same real-world entity in a database. We can view record fusion as a machine learning problem where the goal is to predict the "correct" value for each attribute for each entity. Given a database, we use a combination of attribute-level, recordlevel, and database-level signals to construct a feature vector for each cell (or (row, col)) of that database. We use this feature vector alongwith the ground-truth information to learn a classifier for each of the attributes of the database. Our learning algorithm uses a novel stagewise additive model. At each stage, we construct a new feature vector by combining a part of the original feature vector with features computed by the predictions from the previous stage. We then learn a softmax classifier over the new feature space. This greedy stagewise approach can be viewed as a deep model where at each stage, we are adding more complicated non-linear transformations of the original feature vector. We show that our approach fuses records with an average precision of ~98% when source information of records is available, and ~94% without source information across a diverse array of real-world datasets. We compare our approach to a comprehensive collection of data fusion and entity consolidation methods considered in the literature. We show that our approach can achieve an average precision improvement of ~20%/~45% with/without source information respectively.