Source author record

Alex Pentland

Alex Pentland appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher (unclaimed source record)

Social and Information Networks; physics.soc-ph; cs.CY; Machine Learning; Artificial Intelligence; Multiagent Systems; Cryptography and Security; Computer Science and Game Theory; Human-Computer Interaction; Applications; Networking and Internet Architecture; Distributed, Parallel, and Cluster Computing

Catalog footprint

What is connected

45 works

12 topics

4 close collaborators

preprint, 2026, arXiv

Generative AI collective behavior needs an interactionist paradigm

In this article, we argue that understanding the collective behavior of agents based on large language models (LLMs) is an essential area of inquiry, with important implications in terms of risks and benefits, impacting us as a society at many levels. We claim that the distinctive nature of LLMs--namely, their initialization with extensive pre-trained knowledge and implicit social priors, together with their capability of adaptation through in-context learning--motivates the need for an interactionist paradigm consisting of alternative theoretical foundations, methodologies, and analytical tools, in order to systematically examine how prior knowledge and embedded values interact with social context to shape emergent phenomena in multi-agent generative AI systems. We propose and discuss four directions that we consider crucial for the development and deployment of LLM-based collectives, focusing on theory, methods, and trans-disciplinary dialogue.

preprint, 2026, arXiv

Permission Manifests for Web Agents

The rise of Large Language Model (LLM)-based web agents represents a significant shift in automated interactions with the web. Unlike traditional crawlers that follow simple conventions, such as robots.txt, modern agents engage with websites in sophisticated ways: navigating complex interfaces, extracting structured information, and completing end-to-end tasks. Existing governance mechanisms were not designed for these capabilities. Without a way to specify what interactions are and are not allowed, website owners increasingly rely on blanket blocking and CAPTCHAs, which undermine beneficial applications such as efficient automation, convenient use of e-commerce services, and accessibility tools. We introduce agent-permissions.json, a robots.txt-style lightweight manifest where websites specify allowed interactions, complemented by API references where available. This framework provides a low-friction coordination mechanism: website owners only need to write a simple JSON file, while agents can easily parse and automatically implement the manifest's provisions. Website owners can then focus on blocking non-compliant agents, rather than agents as a whole. By extending the spirit of robots.txt to the era of LLM-mediated interaction, and complementing data use initiatives such as AIPref, the manifest establishes a compliance framework that enables beneficial agent interactions while respecting site owners' preferences.
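
As a rough illustration of how an agent might consult such a manifest before acting, the sketch below parses a JSON document and checks whether an interaction is permitted. The abstract does not give the file's schema, so the field names (`allowed_interactions`, `disallowed_interactions`, `api`), the interaction labels, and the default-deny policy are all hypothetical, chosen only for this example.

```python
import json

# Hypothetical manifest contents. The field names below are assumptions made
# for illustration; the abstract does not specify the real schema.
EXAMPLE_MANIFEST = """
{
  "allowed_interactions": ["read", "search", "add_to_cart"],
  "disallowed_interactions": ["checkout", "post_review"],
  "api": {"search": "/api/v1/search"}
}
"""


def load_manifest(text: str) -> dict:
    """Parse manifest text and normalize the two interaction lists to sets."""
    raw = json.loads(text)
    return {
        "allowed": set(raw.get("allowed_interactions", [])),
        "disallowed": set(raw.get("disallowed_interactions", [])),
        "api": dict(raw.get("api", {})),
    }


def is_allowed(manifest: dict, interaction: str) -> bool:
    """Return True only if the interaction is explicitly allowed.

    Anything not listed is denied (an assumed default-deny policy), and an
    explicit disallow wins over an allow.
    """
    if interaction in manifest["disallowed"]:
        return False
    return interaction in manifest["allowed"]


def api_for(manifest: dict, interaction: str):
    """Return the API reference for an interaction, or None if absent."""
    return manifest["api"].get(interaction)


manifest = load_manifest(EXAMPLE_MANIFEST)
print(is_allowed(manifest, "search"))
print(is_allowed(manifest, "checkout"))
print(api_for(manifest, "search"))
```

A default-deny reading is only one possible design; a real manifest format may define different defaults or finer-grained rules.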

preprint, 2022, arXiv

Are neighbourhood amenities associated with more walking and less driving? Yes, but only for the wealthy

Cities are home to a vast array of amenities, from local barbers to science museums and shopping malls. But these are unequally distributed across urban space. Using Google Places data combined with trip-based mobility data for Bogotá, Colombia, we shed light on the impact of neighbourhood amenities on urban mobility patterns. Deriving a new accessibility metric that explicitly takes into account spatial range, we find that a higher density of local amenities is associated with a higher likelihood of walking as well as shorter bus and car trips. Digging deeper, we use a sample stratification framework to show that socioeconomic status (SES) modulates these effects. Amenities within about a 1 km radius are strongly associated with a higher propensity to walk and lower driving time only for the wealthiest group. In contrast, a higher density of amenities is associated with shorter bus trips for low and middle SES residents. As cities globally aim to boost public transport and green travel, these findings enable us to better understand how commercial structure shapes urban mobility in highly income-segregated settings.

preprint, 2022, arXiv

Disambiguating Disinformation: Extending Beyond the Veracity of Online Content

Following the 2016 US presidential election and the now overwhelming evidence of Russian interference, there has been an explosion of interest in the phenomenon of "fake news". To date, research on false news has centered around detecting content from low-credibility sources and analyzing how this content spreads across online platforms. Misinformation poses clear risks, yet research agendas that overemphasize veracity miss the opportunity to truly understand the Kremlin-led disinformation campaign that shook so many Americans. In this paper, we present a definition for disinformation - a set or sequence of orchestrated, agenda-driven information actions with the intent to deceive - that is useful in contextualizing Russian interference in 2016 and disinformation campaigns more broadly. We expand on our ongoing work to operationalize this definition and demonstrate how detecting disinformation must extend beyond assessing the credibility of a specific publisher, user, or story.

preprint, 2022, arXiv

Investigating and Modeling the Dynamics of Long Ties

Long ties, the social ties that bridge different communities, are widely believed to play crucial roles in spreading novel information in social networks. However, some existing network theories and prediction models indicate that long ties might dissolve quickly or eventually become redundant, thus putting into question the long-term value of long ties. Our empirical analysis of real-world dynamic networks shows that contrary to such reasoning, long ties are more likely to persist than other social ties, and that many of them constantly function as social bridges without being embedded in local networks. Using a novel cost-benefit analysis model combined with machine learning, we show that long ties are highly beneficial, which instinctively motivates people to expend extra effort to maintain them. This partly explains why long ties are more persistent than what has been suggested by many existing theories and models. Overall, our study suggests the need for social interventions that can promote the formation of long ties, such as mixing people with diverse backgrounds.

preprint, 2022, arXiv

Private and Byzantine-Proof Cooperative Decision-Making

The cooperative bandit problem is a multi-agent decision problem involving a group of agents that interact simultaneously with a multi-armed bandit, while communicating over a network with delays. The central idea in this problem is to design algorithms that can efficiently leverage communication to obtain improvements over acting in isolation. In this paper, we investigate the stochastic bandit problem under two settings - (a) when the agents wish to make their communication private with respect to the action sequence, and (b) when the agents can be byzantine, i.e., they provide (stochastically) incorrect information. For both these problem settings, we provide upper-confidence bound algorithms that obtain optimal regret while being (a) differentially-private and (b) tolerant to byzantine agents. Our decentralized algorithms require no information about the network of connectivity between agents, making them scalable to large dynamic systems. We test our algorithms on a competitive benchmark of random graphs and demonstrate their superior performance with respect to existing robust algorithms. We hope that our work serves as an important step towards creating distributed decision-making systems that maintain privacy.
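
For readers unfamiliar with upper-confidence bound methods, the sketch below shows the standard single-agent UCB1 rule on Bernoulli arms. It is not the paper's algorithm: the privacy mechanism, the Byzantine tolerance, and the network communication are all omitted, and the arm means, round count, and function names are illustrative assumptions.

```python
import math
import random


def ucb1_index(mean: float, pulls: int, t: int) -> float:
    """Standard UCB1 index: empirical mean plus an exploration bonus."""
    return mean + math.sqrt(2.0 * math.log(t) / pulls)


def run_ucb1(arm_means, rounds, seed=0):
    """Play Bernoulli arms with UCB1 and return the pull count per arm."""
    rng = random.Random(seed)
    k = len(arm_means)
    pulls = [0] * k
    totals = [0.0] * k
    for t in range(1, rounds + 1):
        if t <= k:
            arm = t - 1  # pull every arm once to initialize its estimate
        else:
            arm = max(
                range(k),
                key=lambda a: ucb1_index(totals[a] / pulls[a], pulls[a], t),
            )
        reward = 1.0 if rng.random() < arm_means[arm] else 0.0
        pulls[arm] += 1
        totals[arm] += reward
    return pulls


# Illustrative arm means; the last arm is the best one.
counts = run_ucb1([0.2, 0.5, 0.8], rounds=2000)
print(counts)
```

The exploration bonus shrinks as an arm is pulled more often, so pulls concentrate on the arm with the highest mean over time.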

preprint, 2022, arXiv

Zero Botnets: An Observe-Pursue-Counter Approach

Adversarial Internet robots (botnets) represent a growing threat to the safe use and stability of the Internet. Botnets can play a role in launching adversary reconnaissance (scanning and phishing), influence operations (upvoting), and financing operations (ransomware, market manipulation, denial of service, spamming, and ad click fraud) while obfuscating tailored tactical operations. Reducing the presence of botnets on the Internet, with the aspirational target of zero, is a powerful vision for galvanizing policy action. Setting a global goal, encouraging international cooperation, creating incentives for improving networks, and supporting entities for botnet takedowns are among several policies that could advance this goal. These policies raise significant questions regarding proper authorities/access that cannot be answered in the abstract. Systems analysis has been widely used in other domains to achieve sufficient detail to enable these questions to be dealt with in concrete terms. Defeating botnets using an observe-pursue-counter architecture is analyzed, the technical feasibility is affirmed, and the authorities/access questions are significantly narrowed. Recommended next steps include: supporting the international botnet takedown community, expanding network observatories, enhancing the underlying network science at scale, conducting detailed systems analysis, and developing appropriate policy frameworks.

preprint, 2020, arXiv

Analysis of misinformation during the COVID-19 outbreak in China: cultural, social and political entanglements

COVID-19 resulted in an infodemic, which could erode public trust, impede virus containment, and outlive the pandemic itself. The evolving and fragmented media landscape is a key driver of the spread of misinformation. Using misinformation identified by Tencent's fact-checking platform and posts on Weibo, our results showed that the evolution of misinformation follows an issue-attention cycle, pertaining to topics such as city lockdown, cures and preventions, and school reopening. Sources of authority weigh in on these topics, but their influence is complicated by people's pre-existing beliefs and cultural practices. Finally, social media has a complicated relationship with established or legacy media systems. Sometimes they reinforce each other, but in general, social media may have a topic cycle of its own making. Our findings shed light on the distinct characteristics of misinformation during the COVID-19 outbreak and offer insights into combating misinformation in China and across the world at large.

preprint, 2020, arXiv

Cooperative Multi-Agent Bandits with Heavy Tails

We study the heavy-tailed stochastic bandit problem in the cooperative multi-agent setting, where a group of agents interacts with a common bandit problem while communicating over a network with delays. Existing algorithms for the stochastic bandit in this setting utilize confidence intervals arising from an averaging-based communication protocol known as running consensus, which does not lend itself to robust estimation for heavy-tailed settings. We propose MP-UCB, a decentralized multi-agent algorithm for the cooperative stochastic bandit that incorporates robust estimation with a message-passing protocol. We prove optimal regret bounds for MP-UCB for several problem settings, and also demonstrate its superiority to existing methods. Furthermore, we establish the first lower bounds for the cooperative bandit problem, in addition to providing efficient algorithms for robust bandit estimation of location.

preprint, 2020, arXiv

ERC20 Transactions over Ethereum Blockchain: Network Analysis and Predictions

Following the birth of Bitcoin and the introduction of the Ethereum ERC20 protocol a decade ago, recent years have witnessed a growing number of cryptographic tokens being introduced by researchers, private sector companies and NGOs. The ubiquity of such blockchain-based cryptocurrencies gives birth to a new kind of rising economy, which presents great difficulties for modeling its dynamics using conventional semantic properties. Our work presents an analysis of the dynamical properties of ERC20-compliant crypto-coins' trading data through the prism of network theory. We examine the dynamics of ERC20-based networks over time by analyzing a meta-parameter of the network, the power of its degree distribution. Our analysis demonstrates that this parameter can be modeled as an under-damped harmonic oscillator over time, enabling predictions of network parameters a year ahead.
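
To make the modeling claim concrete, an under-damped harmonic oscillator has the textbook closed form x(t) = x_inf + A exp(-zeta * omega0 * t) cos(omega_d * t + phase), with omega_d = omega0 * sqrt(1 - zeta^2) and 0 < zeta < 1. The sketch below only evaluates that generic form. The abstract does not state the paper's parametrization or fitted values, so the parameter names and the numbers used here are illustrative assumptions.

```python
import math


def underdamped(t, x_inf, amplitude, zeta, omega0, phase=0.0):
    """Textbook under-damped oscillator around the equilibrium x_inf.

    x(t) = x_inf + A * exp(-zeta * omega0 * t) * cos(omega_d * t + phase),
    with omega_d = omega0 * sqrt(1 - zeta**2) and 0 < zeta < 1.
    """
    if not 0.0 < zeta < 1.0:
        raise ValueError("under-damped motion requires 0 < zeta < 1")
    omega_d = omega0 * math.sqrt(1.0 - zeta ** 2)
    envelope = amplitude * math.exp(-zeta * omega0 * t)
    return x_inf + envelope * math.cos(omega_d * t + phase)


# Illustrative values only: a quantity oscillating around 2.0 with a
# decaying amplitude, sampled every three time units.
for step in range(0, 13, 3):
    print(step, round(underdamped(step, 2.0, 0.5, 0.2, 1.0), 4))
```

Once such a curve has been fitted to past observations, forecasting amounts to evaluating it at a future time.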

preprint, 2020, arXiv

Give more data, awareness and control to individual citizens, and they will help COVID-19 containment

The rapid dynamics of COVID-19 call for quick and effective tracking of virus transmission chains and early detection of outbreaks, especially in phase 2 of the pandemic, when lockdown and other restriction measures are progressively withdrawn, in order to avoid or minimize contagion resurgence. For this purpose, contact-tracing apps are being proposed for large-scale adoption by many countries. A centralized approach, where data sensed by the app are all sent to a nation-wide server, raises concerns about citizens' privacy and needlessly strong digital surveillance, thus alerting us to the need to minimize personal data collection and avoid location tracking. We advocate the conceptual advantage of a decentralized approach, where both contact and location data are collected exclusively in individual citizens' "personal data stores", to be shared separately and selectively, voluntarily, only when the citizen has tested positive for COVID-19, and with a privacy-preserving level of granularity. This approach better protects the personal sphere of citizens and affords multiple benefits: it allows for detailed information gathering for infected people in a privacy-preserving fashion, and in turn this enables both contact tracing and the early detection of outbreak hotspots on a more finely granulated geographic scale. Our recommendation is two-fold. First, to extend existing decentralized architectures with a light touch, in order to manage the collection of location data locally on the device, and allow the user to share spatio-temporal aggregates - if and when they want, for specific aims - with health authorities, for instance. Second, we favour a longer-term pursuit of realizing a Personal Data Store vision, giving users the opportunity to contribute to collective good in the measure they want, enhancing self-awareness, and cultivating collective efforts for rebuilding society.

preprint, 2020, arXiv

Interpretable Stochastic Block Influence Model: measuring social influence among homophilous communities

Decision-making on networks can be explained by both homophily and social influence. While homophily drives the formation of communities with similar characteristics, social influence occurs both within and between communities. Social influence can be reasoned through role theory, which indicates that the influences among individuals depend on their roles and the behavior of interest. To operationalize these social science theories, we empirically identify the homophilous communities and use the community structures to capture the "roles", which affect the particular decision-making processes. We propose a generative model named Stochastic Block Influence Model and jointly analyze both the network formation and the behavioral influence within and between different empirically-identified communities. To evaluate the performance and demonstrate the interpretability of our method, we study the adoption decisions of microfinance in an Indian village. We show that although individuals tend to form links within communities, there are strong positive and negative social influences between communities, supporting the weak tie theory. Moreover, we find that communities with shared characteristics are associated with positive influence. In contrast, the communities with a lack of overlap are associated with negative influence. Our framework facilitates the quantification of the influences underlying decision communities and is thus a useful tool for driving information diffusion, viral marketing, and technology adoptions.

preprint, 2020, arXiv

Kernel Methods for Cooperative Multi-Agent Contextual Bandits

Cooperative multi-agent decision making involves a group of agents cooperatively solving learning problems while communicating over a network with delays. In this paper, we consider the kernelised contextual bandit problem, where the reward obtained by an agent is an arbitrary linear function of the contexts' images in the related reproducing kernel Hilbert space (RKHS), and a group of agents must cooperate to collectively solve their unique decision problems. For this problem, we propose Coop-KernelUCB, an algorithm that provides near-optimal bounds on the per-agent regret, and is both computationally and communicatively efficient. For special cases of the cooperative problem, we also provide variants of Coop-KernelUCB that provide optimal per-agent regret. In addition, our algorithm generalizes several existing results in the multi-agent bandit setting. Finally, on a series of both synthetic and real-world multi-agent network benchmarks, we demonstrate that our algorithm significantly outperforms existing benchmarks.

preprint, 2020, arXiv

Learning Quadratic Games on Networks

Individuals, or organizations, cooperate with or compete against one another in a wide range of practical situations. Such strategic interactions are often modeled as games played on networks, where an individual's payoff depends not only on her action but also on that of her neighbors. The current literature has largely focused on analyzing the characteristics of network games in the scenario where the structure of the network, which is represented by a graph, is known beforehand. It is often the case, however, that the actions of the players are readily observable while the underlying interaction network remains hidden. In this paper, we propose two novel frameworks for learning, from the observations on individual actions, network games with linear-quadratic payoffs, and in particular, the structure of the interaction network. Our frameworks are based on the Nash equilibrium of such games and involve solving a joint optimization problem for the graph structure and the individual marginal benefits. Both synthetic and real-world experiments demonstrate the effectiveness of the proposed frameworks, which have theoretical as well as practical implications for understanding strategic interactions in a network environment.

preprint, 2020, arXiv

Leveraging Communication Topologies Between Learning Agents in Deep Reinforcement Learning

A common technique to improve learning performance in deep reinforcement learning (DRL) and many other machine learning algorithms is to run multiple learning agents in parallel. A neglected component in the development of these algorithms has been how best to arrange the learning agents involved to improve distributed search. Here we draw upon results from the networked optimization literature suggesting that arranging learning agents in communication networks other than fully connected topologies (the implicit way agents are commonly arranged) can improve learning. We explore the relative performance of four popular families of graphs and observe that one such family (Erdős-Rényi random graphs) empirically outperforms the de facto fully-connected communication topology across several DRL benchmark tasks. Additionally, we observe that 1000 learning agents arranged in an Erdős-Rényi graph can perform as well as 3000 agents arranged in the standard fully-connected topology, showing the large learning improvement possible when carefully designing the topology over which agents communicate. We complement these empirical results with a theoretical investigation of why our alternate topologies perform better. Overall, our work suggests that distributed machine learning algorithms could be made more effective if the communication topology between learning agents were optimized.
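
The two topologies contrasted above can be sketched in a few lines. The example below builds a fully connected graph and an Erdős-Rényi G(n, p) graph over the same set of agents, using only the standard library. The agent count, edge probability, and helper names are illustrative assumptions, and the learning algorithm that would run on top of the graph is not shown.

```python
import itertools
import random


def fully_connected(n):
    """Every agent communicates with every other agent."""
    return set(itertools.combinations(range(n), 2))


def erdos_renyi(n, p, seed=0):
    """G(n, p): keep each possible undirected edge independently with probability p."""
    rng = random.Random(seed)
    return {
        edge
        for edge in itertools.combinations(range(n), 2)
        if rng.random() < p
    }


def neighbors(edges, node):
    """Agents that share an edge with `node`."""
    return {b if a == node else a for a, b in edges if node in (a, b)}


n_agents = 10
dense = fully_connected(n_agents)
sparse = erdos_renyi(n_agents, p=0.3, seed=1)
print(len(dense), len(sparse))
print(sorted(neighbors(sparse, 0)))
```

In the sparse graph each agent exchanges information with only a subset of peers, which is the structural difference the abstract credits for the improved distributed search.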

preprint, 2020, arXiv

Mobile phone data and COVID-19: Missing an opportunity?

This paper describes how mobile phone data can guide government and public health authorities in determining the best course of action to control the COVID-19 pandemic and in assessing the effectiveness of control measures such as physical distancing. It identifies key gaps and reasons why such data are only scarcely used, although their value in similar epidemics has been proven in a number of use cases. It presents ways to overcome these gaps and key recommendations for urgent action, most notably the establishment of mixed expert groups at the national and regional levels, and the inclusion and support of governments and public authorities early on. It is authored by a group of experienced data scientists, epidemiologists, demographers and representatives of mobile network operators who jointly put their work at the service of the global effort to combat the COVID-19 pandemic.

preprint, 2020, arXiv

Privacy-Preserving Claims Exchange Networks for Virtual Asset Service Providers

In order for VASPs to fulfill the regulatory requirements from the FATF and the Travel Rule, VASPs need access to truthful information regarding originators, beneficiaries and other VASPs involved in a virtual asset transfer instance. Additionally, in seeking data regarding subjects (individuals or organizations), VASPs are faced with privacy regulations such as the GDPR and CCPA. In this paper we propose a privacy-preserving claims issuance model that carries indicators of the provenance of the data and the algorithms used to derive the claim or assertion. This allows VASPs to obtain originator and beneficiary information without necessarily having access to the private data about these entities. Second, we propose the use of a consortium trust network arrangement for VASPs to exchange signed claims about subjects and their public-key information or certificate.

preprint, 2020, arXiv

Segregated interactions in urban and online space

Urban income segregation is a widespread phenomenon that challenges societies across the globe. Classical studies on segregation have largely focused on the geographic distribution of residential neighborhoods rather than on patterns of social behaviors and interactions. In this study, we analyze segregation in economic and social interactions by observing credit card transactions and Twitter mentions among thousands of individuals in three culturally different metropolitan areas. We show that segregated interaction is amplified relative to the expected effects of geographic segregation in terms of both purchase activity and online communication. Furthermore, we find that segregation increases with difference in socio-economic status but is asymmetric for purchase activity, i.e., the amount of interaction from poorer to wealthier neighborhoods is larger than vice versa. Our results provide novel insights into the understanding of behavioral segregation in human interactions with significant socio-political and economic implications.

preprint, 2020, arXiv

Social Learning and the Accuracy-Risk Trade-off in the Wisdom of the Crowd

How do we design and deploy crowdsourced prediction platforms for real-world applications where risk is an important dimension of prediction performance? To answer this question, we conducted a large online Wisdom of the Crowd study where participants predicted the prices of real financial assets (e.g., the S&P 500). We observe a Pareto frontier between accuracy of prediction and risk, and find that this trade-off is mediated by social learning, i.e., as social learning is increasingly leveraged, it leads to lower accuracy but also lower risk. We also observe that social learning leads to superior accuracy during one of our rounds that occurred during the high market uncertainty of the Brexit vote. Our results have implications for the design of crowdsourced prediction platforms: for example, they suggest that the performance of the crowd should be more comprehensively characterized by using both accuracy and risk (as is standard in financial and statistical forecasting), in contrast to prior work where risk of prediction has been overlooked.

preprint, 2020, arXiv

The Wisdom of the Network: How Adaptive Networks Promote Collective Intelligence

Social networks continuously change as new ties are created and existing ones fade. It is widely noted that our social embedding exerts a strong influence on what information we receive and how we form beliefs and make decisions. However, most empirical studies on the role of social networks in collective intelligence have overlooked the dynamic nature of social networks and its role in fostering adaptive collective intelligence. It remains unknown (1) how network structures adapt to the attributes of individuals, and (2) whether this adaptation promotes the accuracy of individual and collective decisions. Here, we answer these questions through a series of behavioral experiments and supporting simulations. Our results reveal that social network plasticity, in the presence of feedback, can adapt to biased and changing information environments and produce collective estimates that are more accurate than those of the best-performing member. We explore two mechanisms that explain these results: (1) a global adaptation mechanism where the structural connectivity of the network itself changes such that it amplifies the estimates of high-performing members within the group; (2) a local adaptation mechanism where accurate individuals are more resistant to social influence, and therefore their initial belief is weighted disproportionately in the collective estimate. Thereby, our findings substantiate the role of social network plasticity and feedback as adaptive mechanisms for refining individual and collective judgments.

preprint, 2020, arXiv

Wallet Attestations for Virtual Asset Service Providers and Crypto-Assets Insurance

The emerging virtual asset service providers (VASP) industry currently faces a number of challenges related to the Travel Rule, notably pertaining to customer personal information, account number and cryptographic key information. VASPs will be handling virtual assets of different forms, where each may be bound to different private-public key pairs on the blockchain. As such, VASPs also face the additional problem of managing their own keys and the customer keys that may reside in a customer wallet. The use of attestation technologies as applied to wallet systems may provide VASPs with suitable evidence relevant to the Travel Rule regarding cryptographic key information and their operational state. Additionally, wallet attestations may provide crypto-asset insurers with strong evidence regarding the key management aspects of a wallet device, thereby providing the insurance industry with measurable levels of assurance that can become the basis for insurers to perform risk assessment on crypto-assets bound to keys in wallets, both enterprise-grade wallets and consumer-grade wallets.

preprint, 2016, arXiv

Bots as Virtual Confederates: Design and Ethics

The use of bots as virtual confederates in online field experiments holds extreme promise as a new methodological tool in computational social science. However, this potential tool comes with inherent ethical challenges. Informed consent can be difficult to obtain in many cases, and the use of confederates necessarily implies the use of deception. In this work we outline a design space for bots as virtual confederates, and we propose a set of guidelines for meeting the status quo for ethical experimentation. We draw upon examples from prior work in the CSCW community and the broader social science literature for illustration. While a handful of prior researchers have used bots in online experimentation, our work is meant to inspire future work in this area and raise awareness of the associated ethical issues.

preprint, 2016, arXiv

Human collective intelligence as distributed Bayesian inference

Collective intelligence is believed to underlie the remarkable success of human society. The formation of accurate shared beliefs is one of the key components of human collective intelligence. How are accurate shared beliefs formed in groups of fallible individuals? Answering this question requires a multiscale analysis. We must understand both the individual decision mechanisms people use, and the properties and dynamics of those mechanisms in the aggregate. As of yet, mathematical tools for such an approach have been lacking. To address this gap, we introduce a new analytical framework: We propose that groups arrive at accurate shared beliefs via distributed Bayesian inference. Distributed inference occurs through information processing at the individual level, and yields rational belief formation at the group level. We instantiate this framework in a new model of human social decision-making, which we validate using a dataset we collected of over 50,000 users of an online social trading platform where investors mimic each other's trades using real money in foreign exchange and other asset markets. We find that in this setting people use a decision mechanism in which popularity is treated as a prior distribution for which decisions are best to make. This mechanism is boundedly rational at the individual level, but we prove that in the aggregate it implements a type of approximate "Thompson sampling", a well-known and highly effective single-agent Bayesian machine learning algorithm for sequential decision-making. The perspective of distributed Bayesian inference therefore reveals how collective rationality emerges from the boundedly rational decision mechanisms people use.
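
For reference, the single-agent algorithm named in the abstract can be sketched as follows. This is the standard Beta-Bernoulli form of Thompson sampling, not the paper's social decision model. The arm success rates, round count, and function name are illustrative assumptions.

```python
import random


def thompson_sampling(arm_means, rounds, seed=0):
    """Beta-Bernoulli Thompson sampling; returns the pull count per arm."""
    rng = random.Random(seed)
    k = len(arm_means)
    successes = [0] * k
    failures = [0] * k
    pulls = [0] * k
    for _ in range(rounds):
        # Draw one plausible success rate per arm from its Beta posterior
        # (uniform Beta(1, 1) prior), then act greedily on the draws.
        draws = [
            rng.betavariate(successes[a] + 1, failures[a] + 1)
            for a in range(k)
        ]
        arm = max(range(k), key=lambda a: draws[a])
        if rng.random() < arm_means[arm]:
            successes[arm] += 1
        else:
            failures[arm] += 1
        pulls[arm] += 1
    return pulls


# Illustrative success rates; the last arm is the best one.
ts_counts = thompson_sampling([0.2, 0.5, 0.8], rounds=2000)
print(ts_counts)
```

Each arm is chosen with the probability that it is best under the current posterior. Loosely, the abstract's claim is that a population treating popularity as a prior approximates this behaviour in the aggregate.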

preprint, 2016, arXiv

Inferring Population Preferences via Mixtures of Spatial Voting Models

Understanding political phenomena requires measuring the political preferences of society. We introduce a model based on mixtures of spatial voting models that infers the underlying distribution of political preferences of voters with only voting records of the population and political positions of candidates in an election. Beyond offering a cost-effective alternative to surveys, this method projects the political preferences of voters and candidates into a shared latent preference space. This projection allows us to directly compare the preferences of the two groups, which is desirable for political science but difficult with traditional survey methods. After validating the aggregated-level inferences of this model against results of related work and on simple prediction tasks, we apply the model to better understand the phenomenon of political polarization in the Texas, New York, and Ohio electorates. Taken at face value, inferences drawn from our model indicate that the electorates in these states may be less bimodal than the distribution of candidates, but that the electorates are comparatively more extreme in their variance. We conclude with a discussion of limitations of our method and potential future directions for research.

preprint, 2016, arXiv

Modeling Human Ad Hoc Coordination

Whether in groups of humans or groups of computer agents, collaboration is most effective between individuals who have the ability to coordinate on a joint strategy for collective action. However, in general a rational actor will only intend to coordinate if that actor believes the other group members have the same intention. This circular dependence makes rational coordination difficult in uncertain environments if communication between actors is unreliable and no prior agreements have been made. An important normative question with regard to coordination in these ad hoc settings is therefore how one can come to believe that other actors will coordinate, and with regard to systems involving humans, an important empirical question is how humans arrive at these expectations. We introduce an exact algorithm for computing the infinitely recursive hierarchy of graded beliefs required for rational coordination in uncertain environments, and we introduce a novel mechanism for multiagent coordination that uses it. Our algorithm is valid in any environment with a finite state space, and extensions to certain countably infinite state spaces are likely possible. We test our mechanism for multiagent coordination as a model for human decisions in a simple coordination game using existing experimental data. We then explore via simulations whether modeling humans in this way may improve human-agent collaboration.

preprint2016arXiv

Optimal Dynamic Coverage Infrastructure for Large-Scale Fleets of Reconnaissance UAVs

The current state of the art in UAV activation relies solely on human operators for the design and adaptation of the drones' flying routes. Furthermore, this is done today on an individual level (one vehicle per operator), with the exception of a handful of new systems that comprise a small number of self-organizing swarms, manually guided by a human operator. Drone-based monitoring is of great importance in a variety of civilian domains, such as road safety, homeland security, and even environmental control. In its military aspect, efficiently detecting evading targets with a fleet of unmanned drones has an ever-increasing impact on the ability of modern armies to engage in warfare. This is true of both traditional symmetric conflicts among armies and asymmetric ones. Be it a speeding driver, a polluting trailer or a covert convoy, the basic challenge remains the same: how can the detection probability be maximized using as few drones as possible? In this work we propose a novel approach for the optimization of large-scale swarms of reconnaissance drones, capable of producing on-demand optimal coverage strategies for any given search scenario. Given an estimated cost of the threat's potential damages, as well as the types of monitoring drones available and their comparative performance, our proposed method generates an analytically provable strategy, stating the optimal number and types of drones to be deployed in order to cost-efficiently monitor a pre-defined region for targets maneuvering along a given road network. We demonstrate our model using a unique dataset of the Israeli transportation network, on which different drone deployment schemes are evaluated.

preprint2016arXiv

The Role of Reciprocity and Directionality of Friendship Ties in Promoting Behavioral Change

Friendship is a fundamental characteristic of human beings and is usually assumed to be reciprocal in nature. Despite this common expectation, in reality not all friendships are reciprocal, nor are they created equal. Here, we show that reciprocated friendships are more intimate and substantially different from those that are not. We examine the role of reciprocal ties in inducing more effective peer pressure in a cooperative setting and find that the directionality of friendship ties can significantly limit the ability to persuade others to act. Specifically, we observe greater behavioral change and more effective peer influence when subjects shared reciprocal ties with their peers than when they shared unilateral ones. Moreover, through spreading-process simulation, we find that although unilateral ties diffuse behaviors across communities, reciprocal ties play a more important role in the early stages of the diffusion process.

preprint2016arXiv

Vaccination and Complex Social Dynamics

Vaccination and outbreak monitoring are essential tools for preventing and minimizing outbreaks of infectious diseases. Targeted strategies, where the individuals most important for monitoring or preventing outbreaks are selected for intervention, offer a possibility to significantly improve these measures. Although targeted strategies carry strong potential, identifying optimal target groups remains a challenge. Here we consider the problem of identifying target groups based on digital communication networks (telecommunication, online social media) in order to predict and contain an infectious disease spreading on a real-world person-to-person network of more than 500 individuals. We show that target groups for efficient outbreak monitoring can be determined based on both telecommunication and online social network information. In the case of vaccination, information about the digital communication networks improves efficacy for short-range disease transmission but, surprisingly, performance is severely reduced in the case of long-range transmission. These results are robust with respect to the strategy used to identify targeted individuals and to the time gap between the identification of targets and the intervention. Thus, we demonstrate that data available from telecommunication and online social networks can greatly improve epidemic control measures, but it is important to consider the details of the pathogen spreading mechanism when such policies are applied.

preprint2015arXiv

Beyond Contagion: Reality Mining Reveals Complex Patterns of Social Influence

Contagion, a concept from epidemiology, has long been used to characterize social influence on people's behavior and affective (emotional) states. While it has revealed many useful insights, it is not clear whether the contagion metaphor is sufficient to fully characterize the complex dynamics of psychological states in a social context. Using wearable sensors that capture daily face-to-face interaction, combined with three daily experience sampling surveys, we collected the most comprehensive data set of personality and emotion dynamics for an entire workplace community. From this high-resolution data about actual (rather than self-reported) face-to-face interaction, a complex picture emerges in which contagion (which can be seen as the adaptation of behavioral responses to the behavior of other people) cannot fully capture the dynamics of transitory states. We found that social influence has two opposing effects on states: adaptation effects that go beyond mere contagion, and complementarity effects whereby individuals' behaviors tend to complement the behaviors of others. Surprisingly, these effects can exhibit completely different directions depending on the stable personality or emotional dispositions (stable traits) of target individuals. Our findings provide a foundation for richer models of social dynamics, and have implications for organizational engineering and workplace well-being.

preprint2015arXiv

Enigma: Decentralized Computation Platform with Guaranteed Privacy

Enigma is a peer-to-peer network that enables different parties to jointly store and run computations on data while keeping the data completely private. Enigma's computational model is based on a highly optimized version of secure multi-party computation, guaranteed by a verifiable secret-sharing scheme. For storage, we use a modified distributed hash table for holding secret-shared data. An external blockchain is utilized as the controller of the network: it manages access control and identities, and serves as a tamper-proof log of events. Security deposits and fees incentivize operation, correctness and fairness of the system. Similar to Bitcoin, Enigma removes the need for a trusted third party, enabling autonomous control of personal data. For the first time, users are able to share their data with cryptographic guarantees regarding their privacy.
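
As a rough illustration of the secret-sharing idea the abstract refers to, the sketch below uses plain additive secret sharing over a prime field so that several parties can compute a sum without any of them seeing the individual inputs. This is not Enigma's protocol: the abstract describes a verifiable secret-sharing scheme, and the verification step is omitted here. The modulus `PRIME` and the function names `share`, `reconstruct` and `secure_sum` are assumptions chosen for the example.

```python
import secrets

# Field modulus: an illustrative choice for this sketch, not taken from the paper.
PRIME = 2**61 - 1


def share(secret: int, n_parties: int) -> list[int]:
    """Split `secret` into n additive shares that sum to it modulo PRIME."""
    shares = [secrets.randbelow(PRIME) for _ in range(n_parties - 1)]
    last = (secret - sum(shares)) % PRIME
    return shares + [last]


def reconstruct(shares: list[int]) -> int:
    """Recombine all shares; fewer than all of them look uniformly random."""
    return sum(shares) % PRIME


def secure_sum(inputs: list[int], n_parties: int) -> int:
    """Sum private inputs: each party adds only the shares it holds."""
    all_shares = [share(x, n_parties) for x in inputs]
    # Party j holds the j-th share of every input and adds them locally.
    partial = [sum(s[j] for s in all_shares) % PRIME for j in range(n_parties)]
    # Only the partial sums are combined, never the original inputs.
    return reconstruct(partial)
```

For example, `secure_sum([10, 20, 30], n_parties=3)` recovers the total of the three inputs, provided the inputs and their sum are smaller than `PRIME`. Addition is the easy case for additive shares; multiplication and verification need additional machinery that this sketch does not attempt.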

preprint2014arXiv

Once Upon a Crime: Towards Crime Prediction from Demographics and Mobile Data

In this paper, we present a novel approach to predict crime in a geographic space from multiple data sources, in particular mobile phone and demographic data. The main contribution of the proposed approach lies in using aggregated and anonymized human behavioral data derived from mobile network activity to tackle the crime prediction problem. While previous research efforts have used either background historical knowledge or offenders' profiling, our findings support the hypothesis that aggregated human behavioral data captured from the mobile network infrastructure, in combination with basic demographic information, can be used to predict crime. In our experimental results with real crime data from London we obtain an accuracy of almost 70% when predicting whether a specific area in the city will be a crime hotspot or not. Moreover, we provide a discussion of the implications of our findings for data-driven crime analysis.

preprint2014arXiv

Privacy for Personal Neuroinformatics

Human brain activity collected in the form of electroencephalography (EEG), even with a low number of sensors, is an extremely rich signal. Traces collected from multiple channels and with high sampling rates capture many important aspects of participants' brain activity and can be used as a unique personal identifier. The motivation for sharing EEG signals is significant, as a means to understand the relation between brain activity and well-being, or for communication with medical services. As the equipment for such data collection becomes more available and widely used, the opportunities for using the data are growing; at the same time, however, inherent privacy risks are mounting. The same raw EEG signal can be used, for example, to diagnose mental diseases, find traces of epilepsy, and decode personality traits. The current practice of obtaining participants' informed consent for the use of the data either prevents reuse of the raw signal or does not truly respect participants' right to privacy, because the same raw data is reused for purposes much different from those originally consented to. Here we propose an integration of a personal neuroinformatics system, Smartphone Brain Scanner, with a general privacy framework, openPDS. We show how raw high-dimensionality data can be collected on a mobile device, uploaded to a server, and subsequently operated on and accessed by applications or researchers, without disclosing the raw signal. The extracted features of the raw signal, called answers, are of significantly lower dimensionality and provide the full utility of the data in a given context, without the risk of disclosing the sensitive raw signal. Such an architecture significantly mitigates a very serious privacy risk related to raw EEG recordings floating around and being used and reused for various purposes.

preprint2014arXiv

Privacy in Sensor-Driven Human Data Collection: A Guide for Practitioners

In recent years, the amount of information collected about human beings has increased dramatically. This development has been partially driven by individuals posting and storing data about themselves and friends using online social networks or collecting their data for self-tracking purposes (quantified-self movement). Across the sciences, researchers conduct studies collecting data with an unprecedented resolution and scale. Using computational power combined with mathematical models, such rich datasets can be mined to infer underlying patterns, thereby providing insights into human nature. Much of the data collected is sensitive. It is private in the sense that most individuals would feel uncomfortable sharing their collected personal data publicly. For this reason, the need for solutions to ensure the privacy of the individuals generating data has grown alongside the data collection efforts. Out of all the massive data collection efforts, this paper focuses on efforts directly instrumenting human behavior, and notes that, in many cases, the privacy of participants is not sufficiently addressed. For example, study purposes are often not explicit, informed consent is ill-defined, and security and sharing protocols are only partially disclosed. This paper provides a survey of the work related to addressing privacy issues in research studies that collect detailed sensor data on human behavior. Reflections on the key problems and recommendations for future work are included. We hope the overview of the privacy-related practices in massive data collection studies can be used as a frame of reference for practitioners in the field. Although focused on data collection in an academic context, we believe that many of the challenges and solutions we identify are also relevant and useful for other domains where massive data collection takes place, including businesses and governments.

preprint2013arXiv

Urban characteristics attributable to density-driven tie formation

Motivated by empirical evidence on the interplay between geography, population density and societal interaction, we propose a generative process for the evolution of social structure in cities. Our analytical and simulation results predict super-linear scaling of both social tie density and information flow as a function of the population. We demonstrate that our model provides a robust and accurate fit for the dependency of city characteristics on city size, ranging from individual-level dyadic interactions (number of acquaintances, volume of communication) to population-level variables (contagious disease rates, patenting activity, economic productivity and crime) without the need to appeal to modularity, specialization, or hierarchy.

preprint2012arXiv

Automatic Prediction Of Small Group Performance In Information Sharing Tasks

In this paper, we describe a novel approach, based on Markov jump processes, to model small group conversational dynamics and to predict small group performance. More precisely, we estimate conversational events such as turn-taking, backchannels, and turn transitions at the micro-level (1-minute windows), and then we bridge the micro-level behavior and the macro-level performance. We tested our approach with a cooperative task, the Information Sharing task, and we verified the relevance of micro-level interaction dynamics in determining good group performance (e.g., a higher rate of speaking turns and more balanced participation among group members).

preprint2012arXiv

Graph-Coupled HMMs for Modeling the Spread of Infection

We develop Graph-Coupled Hidden Markov Models (GCHMMs) for modeling the spread of infectious disease locally within a social network. Unlike most previous research in epidemiology, which typically models the spread of infection at the level of entire populations, we successfully leverage mobile phone data collected from 84 people over an extended period of time to model the spread of infection on an individual level. Our model, the GCHMM, is an extension of widely-used Coupled Hidden Markov Models (CHMMs), which allow dependencies between state transitions across multiple Hidden Markov Models (HMMs), to situations in which those dependencies are captured through the structure of a graph, or to social networks that may change over time. The benefit of making infection predictions on an individual level is enormous, as it allows people to receive more personalized and relevant health advice.

preprint2012arXiv

Modeling Dynamical Influence in Human Interaction Patterns

How can we model influence between individuals in a social system, even when the network of interactions is unknown? In this article, we review the literature on the "influence model," which utilizes independent time series to estimate how much the state of one actor affects the state of another actor in the system. We extend this model to incorporate dynamical parameters that allow us to infer how influence changes over time, and we provide three examples of how this model can be applied to simulated and real data. The results show that the model can recover known estimates of influence, it generates results that are consistent with other measures of social networks, and it allows us to uncover important shifts in the way states may be transmitted between actors at different points in time.

preprint2011arXiv

Composite Social Network for Predicting Mobile Apps Installation

We have carefully instrumented a large portion of the population living in a university graduate dormitory by giving participants Android smart phones running our sensing software. In this paper, we propose the novel problem of predicting mobile application (known as "apps") installation using social networks and explain its challenges. Modern smart phones, like the ones used in our study, are able to collect different social networks using built-in sensors (e.g., Bluetooth proximity network, call log network). While this information is accessible to app market makers such as the iPhone AppStore, it has not yet been studied how app market makers can use this information for marketing research and strategy development. We develop a simple computational model to better predict app installation by using a composite network computed from the different networks sensed by phones. Our model also captures individual variance and exogenous factors in app adoption. We show the importance of considering all these factors in predicting app installations, and we observe the surprising result that app installation is indeed predictable. We also show that our model achieves the best results compared with generic approaches: our results are four times better than a random guess, and predict almost 45% of all apps users install with almost 45% precision (F1 score = 0.43).
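
To make the closing figures of this abstract easier to read, here is a minimal sketch of how an F1 score is computed from precision and recall (the share of installed apps that are predicted). The function name `f1_score` is chosen for this example, and the inputs in the usage note are illustrative rather than the paper's exact values.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall; defined as 0 when both are 0."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)
```

When precision and recall are equal, F1 equals that shared value, so a reported F1 of 0.43 is consistent with precision and recall both sitting a little below the "almost 45%" quoted in the abstract.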

preprint2011arXiv

Incremental Learning with Accuracy Prediction of Social and Individual Properties from Mobile-Phone Data

Mobile phones are quickly becoming the primary source for social, behavioral, and environmental sensing and data collection. Today's smartphones are equipped with a growing number of sensors and accessible data types that enable the collection of literally dozens of signals related to the phone, its user, and its environment. A great deal of research effort in academia and industry goes into mining this raw data for higher level sense-making, such as understanding user context, inferring social networks, learning individual features, predicting outcomes, and so on. In this work we investigate the properties of learning and inference of real-world data collected via mobile phones over time. In particular, we look at the dynamic learning process over time, and how the ability to predict individual parameters and social links is incrementally enhanced with the accumulation of additional data. To do this, we use the Friends and Family dataset, which contains rich data signals gathered from the smartphones of 140 adult members of a young-family residential community for over a year, and is one of the most comprehensive mobile phone datasets gathered in academia to date. We develop several models that predict social and individual properties from sensed mobile phone data, including detection of life-partners, ethnicity, and whether a person is a student or not. Then, for this set of diverse learning tasks, we investigate how the prediction accuracy evolves over time, as new data is collected. Finally, based on gained insights, we propose a method for advance prediction of the maximal learning accuracy possible for the learning task at hand, based on an initial set of measurements. This has practical implications, such as informing the design of mobile data collection campaigns, or evaluating analysis strategies.

preprint2011arXiv

Social Networks and Spin Glasses

The networks formed from the links between telephones observed in a month's call detail records (CDRs) in the UK are analyzed, looking for the characteristics thought to identify a communications network or a social network. Some novel methods are employed. We find similarities to both types of network. We conclude that, just as analogies to spin glasses have proved fruitful for optimization of large scale practical problems, there will be opportunities to exploit a statistical mechanics of the formation and dynamics of social networks in today's electronically connected world.

preprint2011arXiv

Trends Prediction Using Social Diffusion Models

The ability to predict trends in social media has grown rapidly in importance over the past few years, with the growing dominance of social media in our everyday life. Whereas many works focus on the detection of anomalies in networks, there exists little theoretical work on predicting the likelihood that anomalous network patterns will spread globally and become "trends". In this work we present an analytic model of the social diffusion dynamics of spreading network patterns. Our proposed method is based on information diffusion models, and is capable of predicting future trends based on the analysis of past social interactions between the community's members. We present an analytic lower bound for the probability that emerging trends will successfully spread through the network. We demonstrate our model using two comprehensive social datasets: the "Friends and Family" experiment that was held at MIT for over a year, where the complete activity of 140 users was analyzed, and a financial dataset containing the complete activities of over 1.5 million members of the "eToro" social trading community.

preprint2010arXiv

Modeling Corporate Epidemiology

The corporate response to illness is currently an ad hoc, subjective process that has little basis in data on how disease actually spreads in the workplace. Additionally, many studies have shown that productivity is not an individual factor but a social one: in any study on epidemic responses this social factor has to be taken into account. The barrier to addressing this problem has been the lack of data on the interaction and mobility patterns of people in the workplace. We have created a wearable Sociometric Badge that senses interactions between individuals using an infra-red (IR) transceiver and proximity using a radio transmitter. Using the data from the Sociometric Badges, we are able to simulate diseases spreading through face-to-face interactions with realistic epidemiological parameters. In this paper we construct a curve trading off productivity with epidemic potential. We are able to take into account impacts on productivity that arise from social factors, such as interaction diversity and density, which studies that take an individual approach ignore. We also propose new organizational responses to diseases that take into account behavioral patterns that are associated with a more virulent disease spread. This is advantageous because it will allow companies to decide appropriate responses based on the organizational context of a disease outbreak.

preprint2010arXiv

Patterns of Individual Shopping Behavior

Much of economic theory is built on observations of aggregate, rather than individual, behavior. Here, we present novel findings on human shopping patterns at the resolution of a single purchase. Our results suggest that much of our seemingly elective activity is actually driven by simple routines. While the interleaving of shopping events creates randomness at the small scale, on the whole consumer behavior is largely predictable. We also examine income-dependent differences in how people shop, and find that wealthy individuals are more likely to bundle shopping trips. These results validate previous work on mobility from cell phone data, while describing the unpredictability of behavior at higher resolution.

preprint2010arXiv

Stealing Reality

In this paper we discuss the threat of malware targeted at extracting information about the relationships in a real-world social network as well as characteristic information about the individuals in the network, which we dub Stealing Reality. We present Stealing Reality, explain why it differs from traditional types of network attacks, and discuss why its impact is significantly more dangerous than that of other attacks. We also present our initial analysis and results regarding the form that an SR attack might take, with the goal of promoting the discussion of defending against such an attack, or even just detecting the fact that one has already occurred.

preprint2010arXiv

Time Critical Social Mobilization: The DARPA Network Challenge Winning Strategy

It is now commonplace to see the Web as a platform that can harness the collective abilities of large numbers of people to accomplish tasks with unprecedented speed, accuracy and scale. To push this idea to its limit, DARPA launched its Network Challenge, which aimed to "explore the roles the Internet and social networking play in the timely communication, wide-area team-building, and urgent mobilization required to solve broad-scope, time-critical problems." The challenge required teams to provide coordinates of ten red weather balloons placed at different locations in the continental United States. This large-scale mobilization required the ability to spread information about the tasks widely and quickly, and to incentivize individuals to act. We report on the winning team's strategy, which utilized a novel recursive incentive mechanism to find all balloons in under nine hours. We analyze the theoretical properties of the mechanism, and present data about its performance in the challenge.
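
The abstract does not spell out the recursive incentive mechanism, so the following is only a sketch of one way such a scheme can work: the person who finds a balloon receives a base reward, and each person further up the referral chain receives half of what the person below them received. The halving schedule, the default amount `finder_reward` and the function name `recursive_payouts` are assumptions for illustration, not details confirmed by the text above.

```python
def recursive_payouts(chain: list[str], finder_reward: float = 2000.0) -> dict[str, float]:
    """Pay the finder, then halve the reward at each step up the referral chain.

    `chain` lists people from the finder upward:
    [finder, who invited the finder, who invited that person, ...].
    """
    payouts: dict[str, float] = {}
    reward = finder_reward
    for person in chain:
        payouts[person] = reward
        reward /= 2  # each inviter earns half of what their invitee earned
    return payouts


def total_cost(payouts: dict[str, float]) -> float:
    """Total paid out for one find; always below twice the finder's reward."""
    return sum(payouts.values())
```

Under these assumptions, a chain such as `["dave", "carol", "bob", "alice"]` pays out along a geometric series, so the total cost per balloon stays below twice the finder's reward no matter how long the chain grows. That bound lets the organizer fix a budget in advance while still rewarding people for recruiting others.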
