Source author record

Saber Salehkaleybar

Saber Salehkaleybar appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher · Unclaimed source record

Machine Learning; Artificial Intelligence; Distributed, Parallel, and Cluster Computing; Computer Science and Game Theory; Data Structures and Algorithms; eess.SP; Information Theory; math.IT; Quantitative Methods

Catalog footprint

What is connected

11 works

9 topics

4 close collaborators

preprint · 2026 · arXiv

Data-Driven Covariate Selection for Nonparametric and Cycle-Agnostic Causal Effect Estimation

Estimating causal effects from observational data requires identifying valid adjustment sets. This task is especially challenging in realistic settings where latent confounding and feedback loops are present. Existing approaches typically assume acyclicity or rely on global causal structure learning, limiting applicability and computational efficiency. In this work, we study a local, data-driven method for covariate selection based on conditional independence information. While this method is known to be sound and complete in acyclic causal models, its validity in the presence of cycles has remained unclear. Our main contribution is to show that these guarantees extend to cyclic causal models. In particular, our result relies on the invariance of conditional independence assertions under $σ$-acyclification. These findings establish a unified, cycle-agnostic perspective on covariate selection and causal effect estimation, showing that the method applies across cyclic and acyclic settings without modification. Empirically, we validate this on extensive synthetic data, showing reliable performance in cyclic causal models.

preprint · 2026 · arXiv

Inference Time Causal Probing in LLMs

Causal probing methods aim to test and control how internal representations influence the behavior of generative models. In causal probing, an intervention modifies hidden states so that a property takes on a different value. Most existing approaches define such interventions by training an auxiliary probe classifier, which ties the method to a specific task or model and risks misalignment with the model's predictive geometry. We propose Hidden-state Driven Margin Intervention (HDMI), a probe-free, gradient-based technique that directly steers hidden states using the model's native output. HDMI applies a margin objective that increases the probability of a target continuation while decreasing that of the source, without relying on probe classifiers. We further introduce a lookahead variant (LA-HDMI) for text editing that backpropagates through the softmax embeddings, modifying the current hidden state so that the likelihood of user-specified tokens increases in next token generations while preserving fluency. To evaluate interventions, we measure completeness (whether the targeted property changes as intended) and selectivity (whether unrelated properties are preserved), and report their harmonic mean as an overall measure of reliability. HDMI consistently achieves higher reliability than prior methods on the LGD agreement corpus and the CausalGym benchmark, across Meta-Llama-3-8B-Instruct and Pythia-70M.
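The margin idea in this abstract can be illustrated with a minimal sketch. It assumes a toy linear readout `W` in place of a language model's output head and a hinge-style margin on logits rather than probabilities. The names `logits`, `margin_step`, `steer`, `margin`, and `lr` are hypothetical and are not taken from the paper.

```python
def logits(W, h):
    """Toy linear readout (an assumption): one logit per row of W."""
    return [sum(w * x for w, x in zip(row, h)) for row in W]


def margin_step(W, h, target, source, margin=1.0, lr=0.5):
    """One gradient step on the hinge loss max(0, margin - (z_target - z_source))."""
    z = logits(W, h)
    if z[target] - z[source] >= margin:
        return list(h)  # margin already satisfied, hidden state unchanged
    # d(loss)/dh = -(W[target] - W[source]); move against the gradient.
    return [x + lr * (wt - ws) for x, wt, ws in zip(h, W[target], W[source])]


def steer(W, h, target, source, margin=1.0, lr=0.5, max_steps=100):
    """Repeat margin steps until the margin holds or the step budget runs out."""
    for _ in range(max_steps):
        new_h = margin_step(W, h, target, source, margin, lr)
        if new_h == h:
            break
        h = new_h
    return h
```

With an identity readout, calling `steer(W, [0.0, 1.0], target=0, source=1)` moves the hidden state away from the source direction and toward the target direction until the logit gap reaches the margin. No probe classifier appears anywhere in the loop, which is the property the abstract emphasizes.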

preprint · 2022 · arXiv

Fast Causal Orientation Learning in Directed Acyclic Graphs

Causal relationships among a set of variables are commonly represented by a directed acyclic graph. The orientations of some edges in the causal DAG can be discovered from observational/interventional data. Further edges can be oriented by iteratively applying so-called Meek rules. Inferring edges' orientations from some previously oriented edges, which we call Causal Orientation Learning (COL), is a common problem in various causal discovery tasks. In these tasks, it is often required to solve multiple COL problems and therefore applying Meek rules could be time-consuming. Motivated by Meek rules, we introduce Meek functions that can be utilized in solving COL problems. In particular, we show that these functions have some desirable properties, enabling us to speed up the process of applying Meek rules. Specifically, we propose a dynamic programming (DP) based method to apply Meek functions. Moreover, based on the proposed DP method, we present a lower bound on the number of edges that can be oriented as a result of intervention. We also propose a method to check whether some oriented edges belong to a causal DAG. Experimental results show that the proposed methods can outperform previous work in several causal discovery tasks in terms of running time.
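To make the iterative application of Meek rules concrete, here is a minimal sketch of the first Meek rule only: if a → b is oriented, b and c share an undirected edge, and a and c are not adjacent, then orient b → c. This is the plain repeated-scan approach, not the Meek functions or the DP method proposed in the paper. The function name and the edge representation (tuples for directed edges, pairs for undirected edges, no self-loops) are assumptions made for illustration.

```python
def apply_meek_rule_1(directed, undirected):
    """Apply Meek rule 1 until no further edge can be oriented.

    directed: iterable of (parent, child) tuples.
    undirected: iterable of 2-element node pairs (no self-loops).
    Returns the updated (directed, undirected) sets.
    """
    directed = set(directed)
    undirected = {frozenset(e) for e in undirected}

    def adjacent(x, y):
        return ((x, y) in directed or (y, x) in directed
                or frozenset((x, y)) in undirected)

    changed = True
    while changed:
        changed = False
        for a, b in list(directed):
            for edge in list(undirected):
                if b not in edge:
                    continue
                (c,) = edge - {b}
                # Orienting c -> b would create a new v-structure a -> b <- c,
                # so the edge must point away from b.
                if c != a and not adjacent(a, c):
                    undirected.discard(edge)
                    directed.add((b, c))
                    changed = True
    return directed, undirected
```

On the chain a → b, b - c, c - d, the rule fires twice and orients b → c and then c → d. Each pass rescans every oriented edge, which is the kind of repeated work that motivates a faster method when many COL problems must be solved.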

preprint · 2021 · arXiv

Deep-Learning Based Blind Recognition of Channel Code Parameters over Candidate Sets under AWGN and Multi-Path Fading Conditions

We consider the problem of recovering channel code parameters over a candidate set by merely analyzing the received encoded signals. We propose a deep learning-based solution that I) is capable of identifying the channel code parameters for any coding scheme (such as LDPC, Convolutional, Turbo, and Polar codes), II) is robust against channel impairments like multi-path fading, III) does not require any previous knowledge or estimation of channel state or signal-to-noise ratio (SNR), and IV) outperforms related works in terms of probability of detecting the correct code parameters.

preprint · 2021 · arXiv

ParaLiNGAM: Parallel Causal Structure Learning for Linear non-Gaussian Acyclic Models

One of the key objectives in many fields in machine learning is to discover causal relationships among a set of variables from observational data. In linear non-Gaussian acyclic models (LiNGAM), it can be shown that the true underlying causal structure can be identified uniquely from merely observational data. The DirectLiNGAM algorithm is a well-known solution to learn the true causal structure in high-dimensional settings. DirectLiNGAM executes in a sequence of iterations and performs a set of comparisons between pairs of variables in each iteration. Unfortunately, the runtime of this algorithm grows significantly as the number of variables increases. In this paper, we propose a parallel algorithm, called ParaLiNGAM, to learn causal structures based on the DirectLiNGAM algorithm. We propose a threshold mechanism that can reduce the number of comparisons remarkably compared with the sequential solution. Moreover, in order to further reduce runtime, we employ a messaging mechanism between workers and derive some mathematical formulations to simplify the execution of comparisons. We also present an implementation of ParaLiNGAM on GPU, considering hardware constraints. Experimental results on synthetic and real data show that the implementation of the proposed algorithm on GPU can outperform DirectLiNGAM by a factor of up to 4600 X.

preprint · 2020 · arXiv

Active Learning of Causal Structures with Deep Reinforcement Learning

We study the problem of experiment design to learn causal structures from interventional data. We consider an active learning setting in which the experimenter decides to intervene on one of the variables in the system in each step and uses the results of the intervention to recover further causal relationships among the variables. The goal is to fully identify the causal structures with the minimum number of interventions. We present the first deep reinforcement learning based solution for the problem of experiment design. In the proposed method, we embed input graphs to vectors using a graph neural network and feed them to another neural network which outputs a variable for performing intervention in each step. Both networks are trained jointly via a Q-iteration algorithm. Experimental results show that the proposed method achieves competitive performance in recovering causal structures with respect to previous works, while significantly reducing execution time in dense graphs.

preprint · 2020 · arXiv

LazyIter: A Fast Algorithm for Counting Markov Equivalent DAGs and Designing Experiments

The causal relationships among a set of random variables are commonly represented by a Directed Acyclic Graph (DAG), where there is a directed edge from variable $X$ to variable $Y$ if $X$ is a direct cause of $Y$. From the purely observational data, the true causal graph can be identified up to a Markov Equivalence Class (MEC), which is a set of DAGs with the same conditional independencies between the variables. The size of an MEC is a measure of complexity for recovering the true causal graph by performing interventions. We propose a method for efficient iteration over possible MECs given intervention results. We utilize the proposed method for computing MEC sizes and experiment design in active and passive learning settings. Compared to previous work for computing the size of MEC, our proposed algorithm reduces the time complexity by a factor of $O(n)$ for sparse graphs where $n$ is the number of variables in the system. Additionally, integrating our approach with dynamic programming, we design an optimal algorithm for passive experiment design. Experimental results show that our proposed algorithms for both computing the size of MEC and experiment design outperform the state of the art.
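The notion of MEC size can be illustrated with a brute-force sketch for very small graphs. It relies on the standard characterization that two DAGs are Markov equivalent when they share the same skeleton and the same v-structures, and it enumerates every orientation of the skeleton. This is exponential in the number of edges and is not the LazyIter algorithm. All function names are hypothetical, and node labels are assumed to be mutually comparable (for example, strings).

```python
from itertools import product


def v_structures(edges):
    """Colliders p -> child <- q where p and q are not adjacent."""
    edges = set(edges)
    skeleton = {frozenset(e) for e in edges}
    parents = {}
    for u, v in edges:
        parents.setdefault(v, set()).add(u)
    found = set()
    for child, ps in parents.items():
        for p in ps:
            for q in ps:
                if p < q and frozenset((p, q)) not in skeleton:
                    found.add((p, child, q))
    return found


def is_acyclic(nodes, edges):
    """Kahn's algorithm: the graph is a DAG iff every node can be removed."""
    indegree = {n: 0 for n in nodes}
    for _, v in edges:
        indegree[v] += 1
    queue = [n for n in nodes if indegree[n] == 0]
    removed = 0
    while queue:
        n = queue.pop()
        removed += 1
        for u, v in edges:
            if u == n:
                indegree[v] -= 1
                if indegree[v] == 0:
                    queue.append(v)
    return removed == len(nodes)


def mec_size(dag_edges):
    """Count DAGs with the same skeleton and v-structures as dag_edges."""
    edges = list(dag_edges)
    nodes = sorted({n for e in edges for n in e})
    target = v_structures(edges)
    count = 0
    for flips in product([False, True], repeat=len(edges)):
        candidate = [(v, u) if f else (u, v) for (u, v), f in zip(edges, flips)]
        if is_acyclic(nodes, candidate) and v_structures(candidate) == target:
            count += 1
    return count
```

For the chain a → b → c the class contains three DAGs, while the collider a → b ← c is alone in its class, which shows why MEC size varies with structure even when the skeleton is fixed.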

preprint · 2019 · arXiv

cuPC: CUDA-based Parallel PC Algorithm for Causal Structure Learning on GPU

The main goal in many fields in the empirical sciences is to discover causal relationships among a set of variables from observational data. The PC algorithm is one of the promising solutions to learn the underlying causal structure by performing a number of conditional independence tests. In this paper, we propose a novel GPU-based parallel algorithm, called cuPC, to execute an order-independent version of PC. The proposed solution has two variants, cuPC-E and cuPC-S, which parallelize PC in two different ways for the multivariate normal distribution. Experimental results show the scalability of the proposed algorithms with respect to the number of variables, the number of samples, and different graph densities. For instance, in one of the most challenging datasets, the runtime is reduced from more than 11 hours to about 4 seconds. On average, cuPC-E and cuPC-S achieve 500 X and 1300 X speedup, respectively, compared to a serial implementation on CPU. The source code of cuPC is available online [1].
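For multivariate normal data, a conditional independence test of the kind PC performs is commonly done with partial correlation and Fisher's z-transform. The sketch below shows that textbook test for a single conditioning variable. It is an illustration only and is not taken from the cuPC source code, and the function names and the default `alpha` are assumptions.

```python
import math
from statistics import NormalDist


def partial_corr(r_xy, r_xz, r_yz):
    """First-order partial correlation of X and Y given one variable Z."""
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz**2) * (1 - r_yz**2))


def is_independent(r, n_samples, cond_size, alpha=0.05):
    """Fisher z-test: True if independence is not rejected at level alpha.

    r: (partial) correlation with |r| < 1.
    cond_size: number of variables in the conditioning set.
    """
    z = math.atanh(r)  # equals 0.5 * ln((1 + r) / (1 - r))
    statistic = math.sqrt(n_samples - cond_size - 3) * abs(z)
    threshold = NormalDist().inv_cdf(1 - alpha / 2)
    return statistic <= threshold
```

Each test depends only on a few correlation entries and the sample size, so tests for different variable pairs and conditioning sets can be evaluated independently of one another. That independence is what makes this step a natural candidate for parallel execution.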

preprint · 2019 · arXiv

One-Shot Federated Learning: Theoretical Limits and Algorithms to Achieve Them

We consider distributed statistical optimization in the one-shot setting, where there are $m$ machines each observing $n$ i.i.d. samples. Based on its observed samples, each machine sends a $B$-bit-long message to a server. The server then collects messages from all machines, and estimates a parameter that minimizes an expected convex loss function. We investigate the impact of the communication constraint, $B$, on the expected error and derive a tight lower bound on the error achievable by any algorithm. We then propose an estimator, which we call Multi-Resolution Estimator (MRE), whose expected error (when $B\ge\log mn$) meets the aforementioned lower bound up to poly-logarithmic factors, and is thereby order optimal. We also address the problem of learning under a tiny communication budget, and present lower and upper error bounds when $B$ is a constant. The expected error of MRE, unlike existing algorithms, tends to zero as the number of machines ($m$) goes to infinity, even when the number of samples per machine ($n$) remains upper bounded by a constant. This property of the MRE algorithm makes it applicable in new machine learning paradigms where $m$ is much larger than $n$.

preprint · 2010 · arXiv

A New Framework for Cognitive Medium Access Control: POSG Approach

In this paper, we propose a new analytical framework to solve the medium access problem for secondary users (SUs) in cognitive radio networks. Partially Observable Stochastic Games (POSG) and Decentralized Partially Observable Markov Decision Processes (Dec-POMDP) are two multi-agent Markovian decision processes which are used to present a solution. A primary network with two SUs is considered as an example to demonstrate our proposed framework. Two different scenarios are assumed. In the first scenario, SUs compete to acquire the licensed channel, which is modeled using the POSG framework. In the second one, SUs cooperate to access the channel, for which the solution is based on Dec-POMDP. Besides, the dominant strategy for both of the above-mentioned scenarios is presented for a three-slot horizon length.

preprint · 2010 · arXiv

QoS-Aware Joint Policies in Cognitive Radio Networks

One of the most challenging problems in Opportunistic Spectrum Access (OSA) is to design a channel-sensing-based protocol for networks with multiple secondary users (SUs). Quality of Service (QoS) requirements for SUs have significant implications on this protocol design. In this paper, we propose a new method to find joint policies for SUs which not only guarantees QoS requirements but also maximizes network throughput. We use Decentralized Partially Observable Markov Decision Process (Dec-POMDP) to formulate interactions between SUs. Meanwhile, a tractable approach for Dec-POMDP is utilized to extract sub-optimum joint policies for large horizons. Among these policies, the joint policy which guarantees QoS requirements is selected as the joint sensing strategy for SUs. To show the efficiency of the proposed method, we consider two SUs trying to access a two-channel primary user (PU) network modeled by discrete Markov chains. Simulations demonstrate three interesting findings: (1) Optimum joint policies for large horizons can be obtained using the proposed method. (2) There exists a joint policy for the assumed QoS constraints. (3) Our method outperforms other related works in terms of network throughput.
