Source author record

Talha Cihad Gulcu

Talha Cihad Gulcu appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher, unclaimed source record

Information Theory, math.IT, Machine Learning

Catalog footprint

What is connected

3 works

3 topics

2 close collaborators

Preprint, 2020, arXiv

Statistical and Algorithmic Insights for Semi-supervised Learning with Self-training

Self-training is a classical approach in semi-supervised learning which is successfully applied to a variety of machine learning problems. The self-training algorithm generates pseudo-labels for the unlabeled examples and progressively refines these pseudo-labels in the hope that they coincide with the actual labels. This work provides theoretical insights into the self-training algorithm with a focus on linear classifiers. We first investigate Gaussian mixture models and provide a sharp non-asymptotic finite-sample characterization of the self-training iterations. Our analysis reveals the provable benefits of rejecting samples with low confidence and demonstrates that self-training iterations gracefully improve the model accuracy even if they do get stuck in sub-optimal fixed points. We then demonstrate that regularization and class margin (i.e., separation) are provably important for success, and that lack of regularization may prevent self-training from identifying the core features in the data. Finally, we discuss statistical aspects of empirical risk minimization with self-training for general distributions. We show how a purely unsupervised notion of generalization based on self-training-based clustering can be formalized in terms of cluster margin. We then establish a connection between self-training-based semi-supervision and the more general problem of learning with heterogeneous data and weak supervision.

Preprint, 2016, arXiv

Achieving Secrecy Capacity of the Wiretap Channel and Broadcast Channel with a Confidential Component

The wiretap channel model of Wyner is one of the first communication models with both reliability and security constraints. Capacity-achieving schemes for various models of the wiretap channel have received considerable attention in recent literature. In this paper, we show that the capacity of the general (not necessarily degraded or symmetric) wiretap channel under a "strong secrecy constraint" can be achieved using a transmission scheme based on polar codes. We also extend our construction to the case of broadcast channels with confidential messages defined by Csiszár and Körner, achieving the entire capacity region of this communication model.

Preprint, 2015, arXiv

Interactive Function Computation via Polar Coding

In a series of papers, N. Ma and P. Ishwar (2011-13) considered a range of distributed source coding problems that arise in the context of iterative computation of functions, characterizing the region of achievable communication rates. We consider the problems of interactive computation of functions by two terminals and interactive computation in a collocated network, showing that the rate regions for both of these problems can be achieved using several rounds of polar-coded transmissions.