Source author record

Yifei Xie

Yifei Xie appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher, unclaimed source record

Topics: Machine Learning; Methodology; econ.EM; Distributed, Parallel, and Cluster Computing

Catalog footprint

4 works, 4 topics, 4 close collaborators


Preprint, 2026, arXiv

Optimizing View Change for Byzantine Fault Tolerance in Parallel Consensus

The parallel Byzantine Fault Tolerant (BFT) protocol is viewed as a promising solution to the consensus scalability problem of permissioned blockchains. One of the main challenges in parallel BFT is the view change process that happens when the leader node fails, which can lead to performance bottlenecks. Existing parallel BFT protocols typically rely on passive view change mechanisms with blind leader rotation. Such approaches frequently select unavailable or slow nodes as leaders, resulting in degraded performance. To address these challenges, we propose a View Change Optimization (VCO) model based on mixed integer programming that optimizes leader selection and follower reassignment across parallel committees by considering communication delays and failure scenarios. We apply a decomposition method with efficient subproblems and improved Benders cuts to solve the VCO model. Leveraging the results of the improved decomposition method, we propose an efficient iterative backup leader selection algorithm as views proceed. Through experiments in Microsoft Azure cloud environments, we demonstrate that the VCO-driven parallel BFT outperforms existing configuration methods under both normal operation and faulty conditions. The results show that the VCO model remains effective as network size increases, making it a suitable solution for high-performance parallel BFT systems.
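
To make the contrast with blind leader rotation concrete, here is a minimal sketch of delay-aware leader ranking for a single committee. It is not the VCO model from the abstract: it uses exhaustive search rather than mixed integer programming, ignores follower reassignment across committees, and scores a candidate by its worst-case delay to the remaining available members. The function name `rank_leaders`, the delay matrix, and the scoring rule are all illustrative assumptions.

```python
def rank_leaders(delay, members, unavailable=frozenset()):
    """Rank available committee members as leader candidates.

    delay[i][j] is an assumed one-way communication delay from node i to
    node j. A candidate's score is its worst delay to any other available
    member (lower is better). The first entry can serve as leader and the
    rest as ordered backups. Toy stand-in only, not the paper's MIP.
    """
    available = [n for n in members if n not in unavailable]
    if not available:
        raise ValueError("no available leader candidate")

    def worst_delay(leader):
        # Worst-case delay from this candidate to every other available node.
        return max((delay[leader][f] for f in available if f != leader),
                   default=0.0)

    # sorted() is stable, so ties keep the original member order.
    return sorted(available, key=worst_delay)


# Hypothetical symmetric delays (milliseconds) between four nodes.
delay = [
    [0, 4, 8, 6],
    [4, 0, 3, 5],
    [8, 3, 0, 2],
    [6, 5, 2, 0],
]
print(rank_leaders(delay, [0, 1, 2, 3]))                   # [1, 3, 0, 2]
print(rank_leaders(delay, [0, 1, 2, 3], unavailable={1}))  # [3, 0, 2]
```

When node 1 becomes unavailable, the ranking is recomputed over the remaining nodes instead of moving to the next node in a fixed order, which is the basic behaviour a delay-aware view change aims for.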

Preprint, 2026, arXiv

RepFlow: Representation Enhanced Flow Matching for Causal Effect Estimation

Estimating causal effects from observational data has become increasingly critical in diverse fields including healthcare, economics, and social policy. The fundamental challenge in causal inference arises from missing counterfactuals and selection bias. Existing methods are largely limited to point estimates and lack the capacity for distribution modeling. In this work, we propose RepFlow, a novel framework that formulates causal effect estimation as a joint optimization problem integrating representation learning with Conditional Flow Matching (CFM). RepFlow mitigates selection bias by minimizing the entropically regularized Wasserstein distance between treated and control representations. To enhance numerical stability, we further introduce an $L_2$ normalization constraint on latent representations. This balanced representation enables the flow model to accurately capture the distribution of potential outcomes. Extensive experiments across a wide range of benchmarks demonstrate that RepFlow consistently outperforms existing methods in both point and distributional causal effect estimation.
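
The balancing term described above can be sketched with plain Sinkhorn iterations in NumPy. This is an illustration under stated assumptions, not RepFlow's implementation: the function names, the squared Euclidean ground cost, uniform sample weights, the regularization strength `reg`, and the fixed iteration count are all choices made here. The sketch computes the transport cost of the entropy-regularized plan between two sets of $L_2$-normalized representations.

```python
import numpy as np


def l2_normalize(z, eps=1e-12):
    """Scale each row of z to unit Euclidean norm."""
    norms = np.linalg.norm(z, axis=1, keepdims=True)
    return z / np.maximum(norms, eps)


def sinkhorn_cost(x, y, reg=0.1, n_iters=200):
    """Transport cost of the entropy-regularized OT plan between x and y.

    Uses uniform weights and a squared Euclidean ground cost (assumptions).
    """
    n, m = len(x), len(y)
    a = np.full(n, 1.0 / n)
    b = np.full(m, 1.0 / m)
    cost = ((x[:, None, :] - y[None, :, :]) ** 2).sum(axis=2)
    # On the unit sphere the squared distance is at most 4, so the kernel
    # entries stay bounded away from floating-point underflow for this reg.
    kernel = np.exp(-cost / reg)
    u = np.ones(n)
    v = np.ones(m)
    for _ in range(n_iters):
        u = a / (kernel @ v)
        v = b / (kernel.T @ u)
    plan = u[:, None] * kernel * v[None, :]
    return float((plan * cost).sum())


# Synthetic stand-ins for treated and control representations.
rng = np.random.default_rng(0)
treated = l2_normalize(rng.normal(size=(20, 3)) + np.array([2.0, 0.0, 0.0]))
control = l2_normalize(rng.normal(size=(20, 3)) - np.array([2.0, 0.0, 0.0]))
print("imbalanced groups:", sinkhorn_cost(treated, control))
print("identical groups: ", sinkhorn_cost(treated, treated))
```

A training loop would add a differentiable version of this quantity to the loss so that the encoder is pushed toward representations whose treated and control distributions overlap. The bounded ground cost on the unit sphere is one plausible reason normalization helps numerical stability, though the abstract does not spell out the mechanism.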

Preprint, 2021, arXiv

Discrete Choice Analysis with Machine Learning Capabilities

This paper discusses capabilities that are essential to models applied in policy analysis settings and the limitations of direct applications of off-the-shelf machine learning methodologies to such settings. Traditional econometric methodologies for building discrete choice models for policy analysis involve combining data with modeling assumptions guided by subject-matter considerations. Such considerations are typically most useful in specifying the systematic component of random utility discrete choice models but are typically of limited aid in determining the form of the random component. We identify an area where machine learning paradigms can be leveraged, namely in specifying and systematically selecting the best specification of the random component of the utility equations. We review two recent novel applications where mixed-integer optimization and cross-validation are used to algorithmically select optimal specifications for the random utility components of nested logit and logit mixture models subject to interpretability constraints.

Preprint, 2020, arXiv

Sparse Covariance Estimation in Logit Mixture Models

This paper introduces a new data-driven methodology for estimating sparse covariance matrices of the random coefficients in logit mixture models. Researchers typically specify covariance matrices in logit mixture models under one of two extreme assumptions: either an unrestricted full covariance matrix (allowing correlations between all random coefficients), or a restricted diagonal matrix (allowing no correlations at all). Our objective is to find optimal subsets of correlated coefficients for which we estimate covariances. We propose a new estimator, called MISC, that uses a mixed-integer optimization (MIO) program to find an optimal block diagonal structure specification for the covariance matrix, corresponding to subsets of correlated coefficients, for any desired sparsity level using Markov Chain Monte Carlo (MCMC) posterior draws from the unrestricted full covariance matrix. The optimal sparsity level of the covariance matrix is determined using out-of-sample validation. We demonstrate the ability of MISC to correctly recover the true covariance structure from synthetic data. In an empirical illustration using a stated preference survey on modes of transportation, we use MISC to obtain a sparse covariance matrix indicating how preferences for attributes are related to one another.
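
The block diagonal specification sits between the two extremes mentioned above, and a small sketch shows what it means in practice. The code below does not implement MISC: it takes the grouping of coefficients as given, whereas the estimator selects it with a mixed-integer program over MCMC draws. The function name `apply_block_structure` and the example matrix are assumptions for illustration.

```python
import numpy as np


def apply_block_structure(full_cov, blocks):
    """Keep covariances only within each block of coefficient indices.

    full_cov: (k, k) covariance matrix of the random coefficients.
    blocks:   list of index lists, e.g. [[0, 1], [2]].
    Variances on the diagonal are always kept.
    """
    k = full_cov.shape[0]
    mask = np.eye(k, dtype=bool)
    for block in blocks:
        idx = np.asarray(block)
        mask[np.ix_(idx, idx)] = True
    return np.where(mask, full_cov, 0.0)


# Hypothetical full covariance for three random coefficients.
full_cov = np.array([
    [1.0, 0.5, 0.2],
    [0.5, 2.0, 0.3],
    [0.2, 0.3, 3.0],
])

# Coefficients 0 and 1 are allowed to correlate; coefficient 2 stands alone.
print(apply_block_structure(full_cov, [[0, 1], [2]]))
# Singleton blocks reproduce the restricted diagonal case.
print(apply_block_structure(full_cov, [[0], [1], [2]]))
```

Putting every coefficient in one block returns the unrestricted full matrix, and singleton blocks return the diagonal one, so the sparsity level corresponds to how many off-diagonal covariances the chosen blocks retain.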