Researcher profile

Alexander Gutfraind

Alexander Gutfraind contributes to research discovery and scholarly infrastructure.

Researcher | Affiliation not imported | Open to collaborate

Trust snapshot

Quick read

Trust 21 - Emerging
Verification L1
Unclaimed author
8 works
0 followers
11 topics
4 close collaborators


Published work

8 published items

preprint, 2019, arXiv

Using massive health insurance claims data to predict very high-cost claimants: a machine learning approach

Due to escalating healthcare costs, accurately predicting which patients will incur high costs is an important task for payers and providers of healthcare. High-cost claimants (HiCCs) are patients who have annual costs above $250,000 and who represent just 0.16% of the insured population but currently account for 9% of all healthcare costs. In this study, we aimed to develop a high-performance algorithm to predict HiCCs to inform a novel care management system. Using health insurance claims from 48 million people and augmented with census data, we applied machine learning to train binary classification models to calculate the personal risk of HiCC. To train the models, we developed a platform starting with 6,006 variables across all clinical and demographic dimensions and constructed over one hundred candidate models. The best model achieved an area under the receiver operating characteristic curve of 91.2%. The model exceeds the highest published performance (84%) and remains high for patients with no prior history of high-cost status (89%), who have less than a full year of enrollment (87%), or lack pharmacy claims data (88%). It attains an area under the precision-recall curve of 23.1%, and precision of 74% at a threshold of 0.99. A care management program enrolling 500 people with the highest HiCC risk is expected to treat 199 true HiCCs and generate a net savings of $7.3 million per year. Our results demonstrate that high-performing predictive models can be constructed using claims data and publicly available data alone, even for rare high-cost claimants exceeding $250,000. Our model demonstrates the transformational power of machine learning and artificial intelligence in care management, which would allow healthcare payers and providers to introduce the next generation of care management programs.
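The program-level figures in the abstract can be related to one another with a few lines of arithmetic. The sketch below uses only numbers quoted above (500 enrollees, 199 expected true HiCCs, a 0.16% base rate, $7.3 million net savings). The variable names are illustrative, and the derived quantities (precision among the top 500, lift over the base rate, savings per enrollee) are back-of-the-envelope readings, not results reported by the authors.

```python
# Figures quoted in the abstract
enrolled = 500            # people enrolled in the care management program
true_hiccs = 199          # expected true high-cost claimants among them
base_rate = 0.0016        # HiCCs are 0.16% of the insured population
net_savings = 7_300_000   # expected net savings per year, in dollars

# Derived quantities (illustrative, not stated in the abstract)
precision_at_500 = true_hiccs / enrolled        # share of enrollees who are true HiCCs
lift = precision_at_500 / base_rate             # improvement over random selection
savings_per_enrollee = net_savings / enrolled   # average net savings per enrolled person

print(f"precision among top 500: {precision_at_500:.1%}")
print(f"lift over base rate: {lift:.0f}x")
print(f"net savings per enrollee: ${savings_per_enrollee:,.0f}")
```

Note that this precision applies to the 500 highest-risk people, which is a different operating point from the 74% precision quoted at a threshold of 0.99.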

preprint, 2012, arXiv

Multiscale Network Generation

Networks are widely used in science and technology to represent relationships between entities, such as social or ecological links between organisms, enzymatic interactions in metabolic systems, or computer infrastructure. Statistical analyses of networks can provide critical insights into the structure, function, dynamics, and evolution of those systems. However, the structures of real-world networks are often not known completely, and they may exhibit considerable variation so that no single network is sufficiently representative of a system. In such situations, researchers may turn to proxy data from related systems, sophisticated methods for network inference, or synthetic networks. Here, we introduce a flexible method for synthesizing realistic ensembles of networks starting from a known network, through a series of mappings that coarsen and later refine the network structure by randomized editing. The method, MUSKETEER, preserves structural properties with minimal bias, including unknown or unspecified features, while introducing realistic variability at multiple scales. Using examples from several domains, we show that MUSKETEER produces the intended stochasticity while achieving greater fidelity across a suite of network properties than do other commonly used network generation algorithms.

preprint, 2012, arXiv

Optimal recovery of damaged infrastructure network

Natural disasters or attacks may disrupt infrastructure networks on a vast scale. Parts of the damaged network are interdependent, making it difficult to plan and optimally execute the recovery operations. To study how interdependencies affect the recovery schedule, we introduce a new discrete optimization problem where the goal is to minimize the total cost of installing (or recovering) a given network. This cost is determined by the structure of the network and the sequence in which the nodes are installed. Namely, the cost of installing a node is a function of the number of its neighbors that have been installed before it. We analyze the natural case where the cost function is decreasing and convex, and provide bounds on the cost of the optimal solution. We also show that all sequences have the same cost when the cost function is linear and provide an upper bound on the cost of a random solution for an Erdős-Rényi random graph. Examining the computational complexity, we show that the problem is NP-hard when the cost function is arbitrary. Finally, we provide a formulation as an integer program, an exact dynamic programming algorithm, and a greedy heuristic which gives high quality solutions.
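The cost model described in this abstract can be illustrated with a minimal sketch: the cost of installing a node depends on how many of its neighbors are already installed, and the total cost is summed over the installation sequence. The function names, the 4-node path graph, and the two example cost functions below are assumptions chosen for illustration, and the brute-force search stands in for the paper's algorithms, which the abstract does not specify.

```python
from itertools import permutations

def install_cost(adj, order, f):
    """Total cost of installing nodes in `order`, where installing a node
    costs f(k) and k is the number of its neighbors installed earlier."""
    installed = set()
    total = 0
    for v in order:
        k = sum(1 for u in adj[v] if u in installed)
        total += f(k)
        installed.add(v)
    return total

def all_sequence_costs(adj, f):
    """Set of total costs over every installation sequence (tiny graphs only)."""
    return {install_cost(adj, p, f) for p in permutations(adj)}

# Hypothetical example: a path graph 0 - 1 - 2 - 3
adj = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}

def linear(k):
    return 10 - 3 * k      # linear cost function

def convex(k):
    return 2 ** (2 - k)    # decreasing and convex: 4, 2, 1

print(all_sequence_costs(adj, linear))   # one value: every sequence costs the same
print(all_sequence_costs(adj, convex))   # several values: the sequence matters
```

With a linear cost, each edge contributes exactly once (when its later endpoint is installed), so the total depends only on the number of nodes and edges. This matches the abstract's statement that all sequences have the same cost in the linear case.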

preprint, 2011, arXiv

Evader Interdiction and Collateral Damage

In network interdiction problems, evaders (e.g., hostile agents or data packets) may be moving through a network towards targets and we wish to choose locations for sensors in order to intercept the evaders before they reach their destinations. The evaders might follow deterministic routes or Markov chains, or they may be reactive, i.e., able to change their routes in order to avoid sensors placed to detect them. The challenge in such problems is to choose sensor locations economically, balancing security gains with costs, including the inconvenience sensors inflict upon innocent travelers. We study the objectives of 1) maximizing the number of evaders captured when limited by a budget on sensing cost and 2) capturing all evaders as cheaply as possible. We give optimal sensor placement algorithms for several classes of special graphs and hardness and approximation results for general graphs, including for deterministic or Markov chain-based and reactive or oblivious evaders. In a similar-sounding but fundamentally different problem setting posed by Rubinstein and Glazer where both evaders and innocent travelers are reactive, we again give optimal algorithms for special cases and hardness and approximation results on general graphs.

preprint, 2011, arXiv

Lanchester Theory and the Fate of Armed Revolts

Major revolts have recently erupted in parts of the Middle East with substantial international repercussions. Predicting, coping with and winning those revolts have become a grave problem for many regimes and for world powers. We propose a new model of such revolts that describes their evolution by building on the classic Lanchester theory of combat. The model accounts for the split in the population between those loyal to the regime and those favoring the rebels. We show that, contrary to classical Lanchesterian insights regarding traditional force-on-force engagements, the outcome of a revolt is independent of the initial force sizes; it only depends on the fraction of the population supporting each side and their combat effectiveness. We also consider the effects of foreign intervention and of shifting loyalties of the two populations during the conflict. The model's predictions are consistent with the situations currently observed in Afghanistan, Libya and Syria (Spring 2011) and it offers tentative guidance on policy.

preprint, 2010, arXiv

Interdiction of a Markovian Evader

Shortest path network interdiction is a combinatorial optimization problem on an activity network arising in a number of important security-related applications. It is classically formulated as a bilevel maximin problem representing an "interdictor" and an "evader". The evader tries to move from a source node to the target node along a path of the least cost while the interdictor attempts to frustrate this motion by cutting edges or nodes. The interdiction objective is to find the optimal set of edges to cut given that there is a finite interdiction budget and the interdictor must move first. We reformulate the interdiction problem for stochastic evaders by introducing a model in which the evader follows a Markovian random walk guided by the least-cost path to the target. This model can represent incomplete knowledge about the evader, and the resulting model is a nonlinear 0-1 optimization problem. We then introduce an optimization heuristic based on betweenness centrality that can rapidly find high-quality interdiction solutions by providing a global view of the network.

preprint, 2010, arXiv

Optimizing topological cascade resilience based on the structure of terrorist networks

Complex socioeconomic networks such as information, finance and even terrorist networks need resilience to cascades - to prevent the failure of a single node from causing a far-reaching domino effect. We show that terrorist and guerrilla networks are uniquely cascade-resilient while maintaining high efficiency, but they become more vulnerable beyond a certain threshold. We also introduce an optimization method for constructing networks with high passive cascade resilience. The optimal networks are found to be based on cells, where each cell has a star topology. Counterintuitively, we find that there are conditions where networks should not be modified to stop cascades because doing so would come at a disproportionate loss of efficiency. Implementation of these findings can lead to more cascade-resilient networks in many diverse areas.

preprint, 2010, arXiv

Targeting by Transnational Terrorist Groups

Many successful terrorist groups operate across international borders where different countries host different stages of terrorist operations. Often the recruits for the group come from one country or countries, while the targets of the operations are in another. Stopping such attacks is difficult because intervention in any region or route might merely shift the terrorists elsewhere. Here we propose a model of transnational terrorism based on the theory of activity networks. The model represents attacks on different countries as paths in a network. The group is assumed to prefer paths of lowest cost (or risk) and maximal yield from attacks. The parameters of the model are computed for the Islamist-Salafi terrorist movement based on open source data and then used for estimation of risks of future attacks. The central finding is that the USA has an enduring appeal as a target, due to lack of other nations of matching geopolitical weight or openness. It is also shown that countries in Africa and Asia that have been overlooked as terrorist bases may become highly significant threats in the future. The model quantifies the dilemmas facing countries in the effort to cut such networks, and points to a limitation of deterrence against transnational terrorists.