Source author record

Mustafa Safa Ozdayi

Mustafa Safa Ozdayi appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.


Topics: Cryptography and Security; Databases; cs.CY; Data Structures and Algorithms; Distributed, Parallel, and Cluster Computing; Machine Learning

Catalog footprint

What is connected

4 works

6 topics

4 close collaborators


Preprint, 2022, arXiv

Fair Machine Learning under Limited Demographically Labeled Data

Research has shown that machine learning models might inherit and propagate undesired social biases encoded in the data. To address this problem, fair training algorithms have been developed. However, most algorithms assume we know demographic/sensitive data features such as gender and race. This assumption falls short in scenarios where collecting demographic information is not feasible due to privacy concerns and data protection policies. A recent line of work develops fair training methods that can function without any demographic features in the data, collectively referred to as Rawlsian methods. Yet we show in experiments that Rawlsian methods tend to exhibit relatively high bias. Given this, we look at the middle ground between the previous approaches and consider a setting where we know the demographic attributes for only a small subset of our data. In such a setting, we design fair training algorithms that exhibit both good utility and low bias. In particular, we show that our techniques can train models to significantly outperform Rawlsian approaches even when 0.1% of demographic attributes are available in the training data. Furthermore, our main algorithm can easily accommodate multiple training objectives. To highlight this property, we extend our main algorithm to achieve robustness to label noise, in addition to fairness, in the limited demographics setting.

Preprint, 2020, arXiv

Achieving Competitiveness in Online Problems

In the setting of online algorithms, the input is not initially present but rather arrives piece by piece over time, and after each arrival the algorithm has to make a decision. Depending on the formulation of the problem, the algorithm may or may not be allowed to change its previous decisions at a later time. We analyze two problems to show that an online algorithm can become more competitive by changing its former decisions. We first consider online edge orientation, in which edges arrive one by one to an initially empty graph and the aim is to orient them so that the maximum in-degree is minimized. We then consider online bipartite b-matching. In this problem, we are given a bipartite graph where one side is initially present and the other side arrives online. The goal is to maintain a matching set such that the maximum degree in the set is minimized. For both problems, the best achievable competitive ratio is $\Theta(\log n)$ over $n$ input arrivals when decisions are irreversible. We study three algorithms for these problems, two for the former and one for the latter, that achieve an $O(1)$ competitive ratio by changing $O(n)$ of their decisions over $n$ arrivals. In addition, we analyze one of the algorithms, the shortest path algorithm, against an adversary. Through this analysis, we prove some new results about the algorithm's performance.
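
To make the edge orientation problem in the abstract above concrete, here is a minimal sketch of a simple greedy rule for the irreversible setting: each arriving edge is pointed at whichever endpoint currently has the smaller in-degree, and no decision is ever revisited. This is an illustrative baseline only, not one of the three algorithms studied in the paper; the function name `orient_greedy` and the tie-breaking rule are assumptions made for the example.

```python
from collections import defaultdict


def orient_greedy(edges):
    """Orient edges one at a time, never changing an earlier decision.

    Hypothetical helper for illustration: each edge (u, v) is directed
    toward the endpoint with the smaller current in-degree.
    Returns the list of (tail, head) pairs and the maximum in-degree.
    """
    indeg = defaultdict(int)
    oriented = []
    for u, v in edges:
        # Point the edge at the less loaded endpoint; ties go to v.
        head = u if indeg[u] < indeg[v] else v
        tail = v if head == u else u
        indeg[head] += 1
        oriented.append((tail, head))
    return oriented, max(indeg.values(), default=0)


if __name__ == "__main__":
    arrivals = [(1, 2), (1, 3), (1, 4)]
    print(orient_greedy(arrivals))
```

An algorithm that is allowed to change earlier decisions would additionally flip some previously oriented edges when a vertex becomes overloaded, which is the kind of reversibility the abstract describes.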

Preprint, 2020, arXiv

Leveraging Blockchain for Immutable Logging and Querying Across Multiple Sites

Blockchain has emerged as a decentralized and distributed framework that enables tamper-resilience and, thus, practical immutability for stored data. This immutability property is important in scenarios where auditability is desired, such as in maintaining access logs for sensitive healthcare and biomedical data. However, the underlying data structure of blockchain, by default, does not provide capabilities to efficiently query the stored data. In this investigation, we show that it is possible to efficiently run complex audit queries over the access log data stored on blockchains by using additional key-value stores. This paper specifically reports on the approach we designed for the blockchain track of the iDASH Privacy & Security Workshop 2018 competition. In particular, we implemented our solution and compared its loading and query-response performance with SQLite, a commonly used relational database, using the data provided by the iDASH 2018 organizers. Depending on the query type and the data size, the run time difference between blockchain-based query-response and SQLite-based query-response ranged from 0.2 seconds to 6 seconds. A deeper inspection revealed that range queries were the bottleneck of our solution, which nevertheless scales up linearly. Concretely, this investigation demonstrates that blockchain-based systems can provide reasonable query-response times to complex queries even if they only use simple key-value stores to manage their data. Consequently, we show that blockchains may be useful for maintaining data with auditability and immutability requirements across multiple sites.
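
The idea in the abstract above, pairing an append-only tamper-evident log with a key-value index so that audit queries do not have to scan every block, can be sketched as follows. This is a toy illustration under stated assumptions, not the authors' iDASH 2018 solution: a single-process SHA-256 hash chain stands in for a blockchain, an in-memory dictionary stands in for the key-value store, and the class name `ChainedLog` and the record fields are hypothetical.

```python
import hashlib
import json
from collections import defaultdict


class ChainedLog:
    """Toy append-only log: each entry commits to the previous entry's hash."""

    GENESIS = "0" * 64

    def __init__(self):
        self.blocks = []
        # Key-value index: (field, value) -> positions of matching entries.
        self.index = defaultdict(list)

    @staticmethod
    def _digest(prev_hash, record):
        payload = json.dumps(record, sort_keys=True)
        return hashlib.sha256((prev_hash + payload).encode()).hexdigest()

    def append(self, record):
        prev = self.blocks[-1]["hash"] if self.blocks else self.GENESIS
        self.blocks.append(
            {"prev": prev, "record": record, "hash": self._digest(prev, record)}
        )
        pos = len(self.blocks) - 1
        for field, value in record.items():
            self.index[(field, value)].append(pos)

    def query(self, field, value):
        # Exact-match lookup through the index instead of a full scan.
        return [self.blocks[i]["record"] for i in self.index.get((field, value), [])]

    def verify(self):
        # Recompute the chain; any edited record breaks a hash link.
        prev = self.GENESIS
        for block in self.blocks:
            if block["prev"] != prev or self._digest(prev, block["record"]) != block["hash"]:
                return False
            prev = block["hash"]
        return True


if __name__ == "__main__":
    log = ChainedLog()
    log.append({"user": "alice", "action": "read", "resource": "record-17"})
    log.append({"user": "bob", "action": "write", "resource": "record-17"})
    print(log.query("user", "alice"), log.verify())
```

This sketch only supports exact-match lookups; range queries, which the abstract identifies as the bottleneck, would need an ordered index rather than a plain dictionary.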

Preprint, 2020, arXiv

Secure IoT Data Analytics in Cloud via Intel SGX

The growing adoption of IoT devices in our daily life is engendering a data deluge, mostly of private information that needs careful maintenance and a secure storage system to ensure data integrity and protection. Also, the prodigious IoT ecosystem has provided users with opportunities to automate systems by interconnecting their devices and other services with rule-based programs. The cloud services that are used to store and process sensitive IoT data turn out to be vulnerable to outside threats. Hence, sensitive IoT data and rule-based programs need to be protected against cyberattacks. To address this important challenge, in this paper, we propose a framework to maintain the confidentiality and integrity of IoT data and rule-based program execution. We design the framework to preserve data privacy by utilizing a Trusted Execution Environment (TEE) such as Intel SGX and an end-to-end data encryption mechanism. We evaluate the framework by securely executing rule-based programs inside SGX with both simulated and real IoT device data.
