Source author record

Jon Kleinberg

Jon Kleinberg appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

ResearcherUnclaimed source record

Social and Information Networks physics.soc-ph cs.CY Computer Science and Game Theory Machine Learning Data Structures and Algorithms Artificial Intelligence Discrete Mathematics Computation and Language Multiagent Systems Applications Computational Complexity Digital Libraries econ.TH Human-Computer Interaction math.CO math.PR Methodology Neural and Evolutionary Computing

Catalog footprint

What is connected

47works

19topics

4close collaborators

Actions

Connect this record

Open graph Browse works

Inspect adjacent papers, topics, institutions and collaborators without losing the researcher page.

Building this map preview

BZPEER is loading the nearby papers, people, topics and institutions for this page.

preprint2026arXiv

Mistake-Bounded Language Generation

We investigate the learning task of language generation in the limit, but shift focus from the traditional time-of-last-mistake metric of a generator's success to a new notion of "mistake-bounded generation." While existing results for language generation in the limit focus on guaranteeing eventual consistency, they are blind to the cumulative error incurred during the learning process. We address this by shifting the goal to minimizing the total number of invalid elements output by a generation algorithm. We establish a formal reduction to the Learning from Correct Demonstrations framework of Joshi et al. (2025), enabling a general recipe for deriving mistake bounds via weighted update rules. For finite classes, we provide an algorithm that simultaneously achieves an optimal last-mistake time of $\mathsf{Cdim}(L)$ and a mistake bound of $\lfloor \log_2 |L| \rfloor$, whereas for the non-uniform setting of countably infinite streams of languages, we prove a fundamental trade-off: achieving logarithmic mistakes $O(\log i)$ necessarily precludes convergence guarantees established in prior work. Finally, we show that our framework can be extended to accommodate noisy adversaries and guarantee mistake bounds that scale with the adversary's suboptimality.

preprint2023arXiv

Combinatorial Characterizations and Impossibilities for Higher-order Homophily

Homophily is the seemingly ubiquitous tendency for people to connect and interact with other individuals who are similar to them. This is a well-documented principle and is fundamental for how society organizes. Although many social interactions occur in groups, homophily has traditionally been measured using a graph model, which only accounts for pairwise interactions involving two individuals. Here, we develop a framework using hypergraphs to quantify homophily from group interactions. This reveals natural patterns of group homophily that appear with gender in scientific collaboration and political affiliation in legislative bill co-sponsorship, and also reveals distinctive gender distributions in group photographs, all of which cannot be fully captured by pairwise measures. At the same time, we show that seemingly natural ways to define group homophily are combinatorially impossible. This reveals important pitfalls to avoid when defining and interpreting notions of group homophily, as higher-order homophily patterns are governed by combinatorial constraints that are independent of human behavior but are easily overlooked.

preprint2022arXiv

Allocating Stimulus Checks in Times of Crisis

We study the problem of allocating bailouts (stimulus, subsidy allocations) to people participating in a financial network subject to income shocks. We build on the financial clearing framework of Eisenberg and Noe that allows the incorporation of a bailout policy that is based on discrete bailouts motivated by the types of stimulus checks people receive around the world as part of COVID-19 economical relief plans. We show that optimally allocating such bailouts on a financial network in order to maximize a variety of social welfare objectives of this form is a computationally intractable problem. We develop approximation algorithms to optimize these objectives and establish guarantees for their approximation rations. Then, we incorporate multiple fairness constraints in the optimization problems and establish relative bounds on the solutions with versus without these constraints. Finally, we apply our methodology to a variety of data, both in the context of a system of large financial institutions with real-world data, as well as in a realistic societal context with financial interactions between people and businesses for which we use semi-artificial data derived from mobility patterns. Our results suggest that the algorithms we develop and study have reasonable results in practice and outperform other network-based heuristics. We argue that the presented problem through the societal-level lens could assist policymakers in making informed decisions on issuing subsidies.

preprint2022arXiv

Core-periphery Models for Hypergraphs

We introduce a random hypergraph model for core-periphery structure. By leveraging our model's sufficient statistics, we develop a novel statistical inference algorithm that is able to scale to large hypergraphs with runtime that is practically linear wrt. the number of nodes in the graph after a preprocessing step that is almost linear in the number of hyperedges, as well as a scalable sampling algorithm. Our inference algorithm is capable of learning embeddings that correspond to the reputation (rank) of a node within the hypergraph. We also give theoretical bounds on the size of the core of hypergraphs generated by our model. We experiment with hypergraph data that range to $\sim 10^5$ hyperedges mined from the Microsoft Academic Graph, Stack Exchange, and GitHub and show that our model outperforms baselines wrt. producing good fits.

preprint2022arXiv

Detecting Individual Decision-Making Style: Exploring Behavioral Stylometry in Chess

The advent of machine learning models that surpass human decision-making ability in complex domains has initiated a movement towards building AI systems that interact with humans. Many building blocks are essential for this activity, with a central one being the algorithmic characterization of human behavior. While much of the existing work focuses on aggregate human behavior, an important long-range goal is to develop behavioral models that specialize to individual people and can differentiate among them. To formalize this process, we study the problem of behavioral stylometry, in which the task is to identify a decision-maker from their decisions alone. We present a transformer-based approach to behavioral stylometry in the context of chess, where one attempts to identify the player who played a set of games. Our method operates in a few-shot classification framework, and can correctly identify a player from among thousands of candidate players with 98% accuracy given only 100 labeled games. Even when trained on amateur play, our method generalises to out-of-distribution samples of Grandmaster players, despite the dramatic differences between amateur and world-class players. Finally, we consider more broadly what our resulting embeddings reveal about human style in chess, as well as the potential ethical implications of powerful methods for identifying individuals from behavioral data.

preprint2022arXiv

Four Years of FAccT: A Reflexive, Mixed-Methods Analysis of Research Contributions, Shortcomings, and Future Prospects

Fairness, Accountability, and Transparency (FAccT) for socio-technical systems has been a thriving area of research in recent years. An ACM conference bearing the same name has been the central venue for scholars in this area to come together, provide peer feedback to one another, and publish their work. This reflexive study aims to shed light on FAccT's activities to date and identify major gaps and opportunities for translating contributions into broader positive impact. To this end, we utilize a mixed-methods research design. On the qualitative front, we develop a protocol for reviewing and coding prior FAccT papers, tracing their distribution of topics, methods, datasets, and disciplinary roots. We also design and administer a questionnaire to reflect the voices of FAccT community members and affiliates on a wide range of topics. On the quantitative front, we use the full text and citation network associated with prior FAccT publications to provide further evidence about topics and values represented in FAccT. We organize the findings from our analysis into four main dimensions: the themes present in FAccT scholarship, the values that underpin the work, the impact of the contributions both within academic circles and beyond, and the practices and informal norms of the community that has formed around FAccT. Finally, our work identifies several suggestions on directions for change, as voiced by community members.

preprint2022arXiv

Learning Models of Individual Behavior in Chess

AI systems that can capture human-like behavior are becoming increasingly useful in situations where humans may want to learn from these systems, collaborate with them, or engage with them as partners for an extended duration. In order to develop human-oriented AI systems, the problem of predicting human actions -- as opposed to predicting optimal actions -- has received considerable attention. Existing work has focused on capturing human behavior in an aggregate sense, which potentially limits the benefit any particular individual could gain from interaction with these systems. We extend this line of work by developing highly accurate predictive models of individual human behavior in chess. Chess is a rich domain for exploring human-AI interaction because it combines a unique set of properties: AI systems achieved superhuman performance many years ago, and yet humans still interact with them closely, both as opponents and as preparation tools, and there is an enormous corpus of recorded data on individual player games. Starting with Maia, an open-source version of AlphaZero trained on a population of human players, we demonstrate that we can significantly improve prediction accuracy of a particular player's moves by applying a series of fine-tuning methods. Furthermore, our personalized models can be used to perform stylometry -- predicting who made a given set of moves -- indicating that they capture human decision-making at an individual level. Our work demonstrates a way to bring AI systems into better alignment with the behavior of individual people, which could lead to large improvements in human-AI interaction.

preprint2022arXiv

Mimetic Models: Ethical Implications of AI that Acts Like You

An emerging theme in artificial intelligence research is the creation of models to simulate the decisions and behavior of specific people, in domains including game-playing, text generation, and artistic expression. These models go beyond earlier approaches in the way they are tailored to individuals, and the way they are designed for interaction rather than simply the reproduction of fixed, pre-computed behaviors. We refer to these as mimetic models, and in this paper we develop a framework for characterizing the ethical and social issues raised by their growing availability. Our framework includes a number of distinct scenarios for the use of such models, and considers the impacts on a range of different participants, including the target being modeled, the operator who deploys the model, and the entities that interact with it.

preprint2022arXiv

On the Effect of Triadic Closure on Network Segregation

The tendency for individuals to form social ties with others who are similar to themselves, known as homophily, is one of the most robust sociological principles. Since this phenomenon can lead to patterns of interactions that segregate people along different demographic dimensions, it can also lead to inequalities in access to information, resources, and opportunities. As we consider potential interventions that might alleviate the effects of segregation, we face the challenge that homophily constitutes a pervasive and organic force that is difficult to push back against. Designing effective interventions can therefore benefit from identifying counterbalancing social processes that might be harnessed to work in opposition to segregation. In this work, we show that triadic closure -- another common phenomenon that posits that individuals with a mutual connection are more likely to be connected to one another -- can be one such process. In doing so, we challenge a long-held belief that triadic closure and homophily work in tandem. By analyzing several fundamental network models using popular integration measures, we demonstrate the desegregating potential of triadic closure. We further empirically investigate this effect on real-world dynamic networks, surfacing observations that mirror our theoretical findings. We leverage these insights to discuss simple interventions that can help reduce segregation in settings that exhibit an interplay between triadic closure and homophily. We conclude with a discussion on qualitative implications for the design of interventions in settings where individuals arrive in an online fashion, and the designer can influence the initial set of connections.

preprint2022arXiv

Ordered Submodularity and its Applications to Diversifying Recommendations

A fundamental task underlying many important optimization problems, from influence maximization to sensor placement to content recommendation, is to select the optimal group of $k$ items from a larger set. Submodularity has been very effective in allowing approximation algorithms for such subset selection problems. However, in several applications, we are interested not only in the elements of a set, but also the order in which they appear, breaking the assumption that all selected items receive equal consideration. One such category of applications involves the presentation of search results, product recommendations, news articles, and other content, due to the well-documented phenomenon that humans pay greater attention to higher-ranked items. As a result, optimization in content presentation for diversity, user coverage, calibration, or other objectives more accurately represents a sequence selection problem, to which traditional submodularity approximation results no longer apply. Although extensions of submodularity to sequences have been proposed, none is designed to model settings where items contribute based on their position in a ranked list, and hence they are not able to express these types of optimization problems. In this paper, we aim to address this modeling gap. Here, we propose a new formalism of ordered submodularity that captures these ordering problems in content presentation, and more generally a category of optimization problems over ranked sequences in which different list positions contribute differently to the objective function. We analyze the natural ordered analogue of the greedy algorithm and show that it provides a $2$-approximation. We also show that this bound is tight, establishing that our new framework is conceptually and quantitatively distinct from previous formalisms of set and sequence submodularity.

preprint2021arXiv

Algorithmic Monoculture and Social Welfare

As algorithms are increasingly applied to screen applicants for high-stakes decisions in employment, lending, and other domains, concerns have been raised about the effects of algorithmic monoculture, in which many decision-makers all rely on the same algorithm. This concern invokes analogies to agriculture, where a monocultural system runs the risk of severe harm from unexpected shocks. Here we show that the dangers of algorithmic monoculture run much deeper, in that monocultural convergence on a single algorithm by a group of decision-making agents, even when the algorithm is more accurate for any one agent in isolation, can reduce the overall quality of the decisions being made by the full collection of agents. Unexpected shocks are therefore not needed to expose the risks of monoculture; it can hurt accuracy even under "normal" operations, and even for algorithms that are more accurate when used by only a single decision-maker. Our results rely on minimal assumptions, and involve the development of a probabilistic framework for analyzing systems that use multiple noisy estimates of a set of alternatives.

preprint2021arXiv

Allocating Opportunities in a Dynamic Model of Intergenerational Mobility

Opportunities such as higher education can promote intergenerational mobility, leading individuals to achieve levels of socioeconomic status above that of their parents. We develop a dynamic model for allocating such opportunities in a society that exhibits bottlenecks in mobility; the problem of optimal allocation reflects a trade-off between the benefits conferred by the opportunities in the current generation and the potential to elevate the socioeconomic status of recipients, shaping the composition of future generations in ways that can benefit further from the opportunities. We show how optimal allocations in our model arise as solutions to continuous optimization problems over multiple generations, and we find in general that these optimal solutions can favor recipients of low socioeconomic status over slightly higher-performing individuals of high socioeconomic status -- a form of socioeconomic affirmative action that the society in our model discovers in the pursuit of purely payoff-maximizing goals. We characterize how the structure of the model can lead to either temporary or persistent affirmative action, and we consider extensions of the model with more complex processes modulating the movement between different levels of socioeconomic status.

preprint2021arXiv

Random Graphs with Prescribed $K$-Core Sequences: A New Null Model for Network Analysis

In the analysis of large-scale network data, a fundamental operation is the comparison of observed phenomena to the predictions provided by null models: when we find an interesting structure in a family of real networks, it is important to ask whether this structure is also likely to arise in random networks with similar characteristics to the real ones. A long-standing challenge in network analysis has been the relative scarcity of reasonable null models for networks; arguably the most common such model has been the configuration model, which starts with a graph $G$ and produces a random graph with the same node degrees as $G$. This leads to a very weak form of null model, since fixing the node degrees does not preserve many of the crucial properties of the network, including the structure of its subgraphs. Guided by this challenge, we propose a new family of network null models that operate on the $k$-core decomposition. For a graph $G$, the $k$-core is its maximal subgraph of minimum degree $k$; and the core number of a node $v$ in $G$ is the largest $k$ such that $v$ belongs to the $k$-core of $G$. We provide the first efficient sampling algorithm to solve the following basic combinatorial problem: given a graph $G$, produce a random graph sampled nearly uniformly from among all graphs with the same sequence of core numbers as $G$. This opens the opportunity to compare observed networks $G$ with random graphs that exhibit the same core numbers, a comparison that preserves aspects of the structure of $G$ that are not captured by more local measures like the degree sequence. We illustrate the power of this core-based null model on some fundamental tasks in network analysis, including the enumeration of networks motifs.

preprint2020arXiv

Adversarial Perturbations of Opinion Dynamics in Networks

We study the connections between network structure, opinion dynamics, and an adversary's power to artificially induce disagreements. We approach these questions by extending models of opinion formation in the social sciences to represent scenarios, familiar from recent events, in which external actors seek to destabilize communities through sophisticated information warfare tactics via fake news and bots. In many instances, the intrinsic goals of these efforts are not necessarily to shift the overall sentiment of the network, but rather to induce discord. These perturbations diffuse via opinion dynamics on the underlying network, through mechanisms that have been analyzed and abstracted through work in computer science and the social sciences. We investigate the properties of such attacks, considering optimal strategies both for the adversary seeking to create disagreement and for the entities tasked with defending the network from attack. We show that for different formulations of these types of objectives, different regimes of the spectral structure of the network will limit the adversary's capacity to sow discord; this enables us to qualitatively describe which networks are most vulnerable against these perturbations. We then consider the algorithmic task of a network defender to mitigate these sorts of adversarial attacks by insulating nodes heterogeneously; we show that, by considering the geometry of this problem, this optimization task can be efficiently solved via convex programming. Finally, we generalize these results to allow for two network structures, where the opinion dynamics process and the measurement of disagreement become uncoupled, and determine how the adversary's power changes; for instance, this may arise when opinion dynamics are controlled an online community via social media, while disagreement is measured along "real-world" connections.

preprint2020arXiv

Aligning Superhuman AI with Human Behavior: Chess as a Model System

As artificial intelligence becomes increasingly intelligent---in some cases, achieving superhuman performance---there is growing potential for humans to learn from and collaborate with algorithms. However, the ways in which AI systems approach problems are often different from the ways people do, and thus may be uninterpretable and hard to learn from. A crucial step in bridging this gap between human and artificial intelligence is modeling the granular actions that constitute human behavior, rather than simply matching aggregate human performance. We pursue this goal in a model system with a long history in artificial intelligence: chess. The aggregate performance of a chess player unfolds as they make decisions over the course of a game. The hundreds of millions of games played online by players at every skill level form a rich source of data in which these decisions, and their exact context, are recorded in minute detail. Applying existing chess engines to this data, including an open-source implementation of AlphaZero, we find that they do not predict human moves well. We develop and introduce Maia, a customized version of Alpha-Zero trained on human chess games, that predicts human moves at a much higher accuracy than existing engines, and can achieve maximum accuracy when predicting decisions made by players at a specific skill level in a tuneable way. For a dual task of predicting whether a human will make a large mistake on the next move, we develop a deep neural network that significantly outperforms competitive baselines. Taken together, our results suggest that there is substantial promise in designing artificial intelligence systems with human collaboration in mind by first accurately modeling granular human decision-making.

preprint2020arXiv

Frozen Binomials on the Web: Word Ordering and Language Conventions in Online Text

There is inherent information captured in the order in which we write words in a list. The orderings of binomials --- lists of two words separated by `and' or `or' --- has been studied for more than a century. These binomials are common across many areas of speech, in both formal and informal text. In the last century, numerous explanations have been given to describe what order people use for these binomials, from differences in semantics to differences in phonology. These rules describe primarily `frozen' binomials that exist in exactly one ordering and have lacked large-scale trials to determine efficacy. Online text provides a unique opportunity to study these lists in the context of informal text at a very large scale. In this work, we expand the view of binomials to include a large-scale analysis of both frozen and non-frozen binomials in a quantitative way. Using this data, we then demonstrate that most previously proposed rules are ineffective at predicting binomial ordering. By tracking the order of these binomials across time and communities we are able to establish additional, unexplored dimensions central to these predictions. Expanding beyond the question of individual binomials, we also explore the global structure of binomials in various communities, establishing a new model for these lists and analyzing this structure for non-frozen and frozen binomials. Additionally, novel analysis of trinomials --- lists of length three --- suggests that none of the binomials analysis applies in these cases. Finally, we demonstrate how large data sets gleaned from the web can be used in conjunction with older theories to expand and improve on old questions.

preprint2020arXiv

Hypergraph Cuts with General Splitting Functions

The minimum $s$-$t$ cut problem in graphs is one of the most fundamental problems in combinatorial optimization, and graph cuts underlie algorithms throughout discrete mathematics, theoretical computer science, operations research, and data science. While graphs are a standard model for pairwise relationships, hypergraphs provide the flexibility to model multi-way relationships, and are now a standard model for complex data and systems. However, when generalizing from graphs to hypergraphs, the notion of a "cut hyperedge" is less clear, as a hyperedge's nodes can be split in several ways. Here, we develop a framework for hypergraph cuts by considering the problem of separating two terminal nodes in a hypergraph in a way that minimizes a sum of penalties at split hyperedges. In our setup, different ways of splitting the same hyperedge have different penalties, and the penalty is encoded by what we call a splitting function. Our framework opens a rich space on the foundations of hypergraph cuts. We first identify a natural class of cardinality-based hyperedge splitting functions that depend only on the number of nodes on each side of the split. In this case, we show that the general hypergraph $s$-$t$ cut problem can be reduced to a tractable graph $s$-$t$ cut problem if and only if the splitting functions are submodular. We also identify a wide regime of non-submodular splitting functions for which the problem is NP-hard. We also analyze extensions to multiway cuts with at least three terminal nodes and identify a natural class of splitting functions for which the problem can be reduced in an approximation-preserving way to the node-weighted multiway cut problem in graphs, again subject to a submodularity property. Finally, we outline several open questions on general hypergraph cut problems.

preprint2020arXiv

Minimizing Localized Ratio Cut Objectives in Hypergraphs

Hypergraphs are a useful abstraction for modeling multiway relationships in data, and hypergraph clustering is the task of detecting groups of closely related nodes in such data. Graph clustering has been studied extensively, and there are numerous methods for detecting small, localized clusters without having to explore an entire input graph. However, there are only a few specialized approaches for localized clustering in hypergraphs. Here we present a framework for local hypergraph clustering based on minimizing localized ratio cut objectives. Our framework takes an input set of reference nodes in a hypergraph and solves a sequence of hypergraph minimum $s$-$t$ cut problems in order to identify a nearby well-connected cluster of nodes that overlaps substantially with the input set. Our methods extend graph-based techniques but are significantly more general and have new output quality guarantees. First, our methods can minimize new generalized notions of hypergraph cuts, which depend on specific configurations of nodes within each hyperedge, rather than just on the number of cut hyperedges. Second, our framework has several attractive theoretical properties in terms of output cluster quality. Most importantly, our algorithm is strongly-local, meaning that its runtime depends only on the size of the input set, and does not need to explore the entire hypergraph to find good local clusters. We use our methodology to effectively identify clusters in hypergraphs of real-world data with millions of nodes, millions of hyperedges, and large average hyperedge size with runtimes ranging between a few seconds and a few minutes.

preprint2020arXiv

Roles for Computing in Social Change

A recent normative turn in computer science has brought concerns about fairness, bias, and accountability to the core of the field. Yet recent scholarship has warned that much of this technical work treats problematic features of the status quo as fixed, and fails to address deeper patterns of injustice and inequality. While acknowledging these critiques, we posit that computational research has valuable roles to play in addressing social problems -- roles whose value can be recognized even from a perspective that aspires toward fundamental social change. In this paper, we articulate four such roles, through an analysis that considers the opportunities as well as the significant risks inherent in such work. Computing research can serve as a diagnostic, helping us to understand and measure social problems with precision and clarity. As a formalizer, computing shapes how social problems are explicitly defined --- changing how those problems, and possible responses to them, are understood. Computing serves as rebuttal when it illuminates the boundaries of what is possible through technical means. And computing acts as synecdoche when it makes long-standing social problems newly salient in the public eye. We offer these paths forward as modalities that leverage the particular strengths of computational work in the service of social change, without overclaiming computing's capacity to solve social problems on its own.

preprint2016arXiv

Assessing Human Error Against a Benchmark of Perfection

An increasing number of domains are providing us with detailed trace data on human decisions in settings where we can evaluate the quality of these decisions via an algorithm. Motivated by this development, an emerging line of work has begun to consider whether we can characterize and predict the kinds of decisions where people are likely to make errors. To investigate what a general framework for human error prediction might look like, we focus on a model system with a rich history in the behavioral sciences: the decisions made by chess players as they select moves in a game. We carry out our analysis at a large scale, employing datasets with several million recorded games, and using chess tablebases to acquire a form of ground truth for a subset of chess positions that have been completely solved by computers but remain challenging even for the best players in the world. We organize our analysis around three categories of features that we argue are present in most settings where the analysis of human error is applicable: the skill of the decision-maker, the time available to make the decision, and the inherent difficulty of the decision. We identify rich structure in all three of these categories of features, and find strong evidence that in our domain, features describing the inherent difficulty of an instance are significantly more powerful than features based on skill or time.

preprint2016arXiv

Do Cascades Recur?

Cascades of information-sharing are a primary mechanism by which content reaches its audience on social media, and an active line of research has studied how such cascades, which form as content is reshared from person to person, develop and subside. In this paper, we perform a large-scale analysis of cascades on Facebook over significantly longer time scales, and find that a more complex picture emerges, in which many large cascades recur, exhibiting multiple bursts of popularity with periods of quiescence in between. We characterize recurrence by measuring the time elapsed between bursts, their overlap and proximity in the social network, and the diversity in the demographics of individuals participating in each peak. We discover that content virality, as revealed by its initial popularity, is a main driver of recurrence, with the availability of multiple copies of that content helping to spark new bursts. Still, beyond a certain popularity of content, the rate of recurrence drops as cascades start exhausting the population of interested individuals. We reproduce these observed patterns in a simple model of content recurrence simulated on a real social network. Using only characteristics of a cascade's initial burst, we demonstrate strong performance in predicting whether it will recur in the future.

preprint2016arXiv

Inherent Trade-Offs in the Fair Determination of Risk Scores

Recent discussion in the public sphere about algorithmic classification has involved tension between competing notions of what it means for a probabilistic classification to be fair to different groups. We formalize three fairness conditions that lie at the heart of these debates, and we prove that except in highly constrained special cases, there is no method that can satisfy these three conditions simultaneously. Moreover, even satisfying all three conditions approximately requires that the data lie in an approximate version of one of the constrained special cases identified by our theorem. These results suggest some of the ways in which key notions of fairness are incompatible with each other, and hence provide a framework for thinking about the trade-offs between them.

preprint2016arXiv

Planning Problems for Sophisticated Agents with Present Bias

Present bias, the tendency to weigh costs and benefits incurred in the present too heavily, is one of the most widespread human behavioral biases. It has also been the subject of extensive study in the behavioral economics literature. While the simplest models assume that the agents are naive, reasoning about the future without taking their bias into account, there is considerable evidence that people often behave in ways that are sophisticated with respect to present bias, making plans based on the belief that they will be present-biased in the future. For example, committing to a course of action to reduce future opportunities for procrastination or overconsumption are instances of sophisticated behavior in everyday life. Models of sophisticated behavior have lacked an underlying formalism that allows one to reason over the full space of multi-step tasks that a sophisticated agent might face. This has made it correspondingly difficult to make comparative or worst-case statements about the performance of sophisticated agents in arbitrary scenarios. In this paper, we incorporate the notion of sophistication into a graph-theoretic model that we used in recent work for modeling naive agents. This new synthesis of two formalisms - sophistication and graph-theoretic planning - uncovers a rich structure that wasn't apparent in the earlier behavioral economics work on this problem. In particular, our graph-theoretic model makes two kinds of new results possible. First, we give tight worst-case bounds on the performance of sophisticated agents in arbitrary multi-step tasks relative to the optimal plan. Second, the flexibility of our formalism makes it possible to identify new phenomena that had not been seen in prior literature: these include a surprising non-monotonic property in the use of rewards to motivate sophisticated agents and a framework for reasoning about commitment devices.

preprint2016arXiv

Social Networks Under Stress

Social network research has begun to take advantage of fine-grained communications regarding coordination, decision-making, and knowledge sharing. These studies, however, have not generally analyzed how external events are associated with a social network's structure and communicative properties. Here, we study how external events are associated with a network's change in structure and communications. Analyzing a complete dataset of millions of instant messages among the decision-makers in a large hedge fund and their network of outside contacts, we investigate the link between price shocks, network structure, and change in the affect and cognition of decision-makers embedded in the network. When price shocks occur the communication network tends not to display structural changes associated with adaptiveness. Rather, the network "turtles up". It displays a propensity for higher clustering, strong tie interaction, and an intensification of insider vs. outsider communication. Further, we find changes in network structure predict shifts in cognitive and affective processes, execution of new transactions, and local optimality of transactions better than prices, revealing the important predictive relationship between network structure and collective behavior within a social network.

preprint2016arXiv

Survey of Expressivity in Deep Neural Networks

We survey results on neural network expressivity described in "On the Expressive Power of Deep Neural Networks". The paper motivates and develops three natural measures of expressiveness, which all display an exponential dependence on the depth of the network. In fact, all of these measures are related to a fourth quantity, trajectory length. This quantity grows exponentially in the depth of the network, and is responsible for the depth sensitivity observed. These results translate to consequences for networks during and after training. They suggest that parameters earlier in a network have greater influence on its expressive power -- in particular, given a layer, its influence on expressivity is determined by the remaining depth of the network after that layer. This is verified with experiments on MNIST and CIFAR-10. We also explore the effect of training on the input-output map, and find that it trades off between the stability and expressivity.

preprint2016arXiv

The Status Gradient of Trends in Social Media

An active line of research has studied the detection and representation of trends in social media content. There is still relatively little understanding, however, of methods to characterize the early adopters of these trends: who picks up on these trends at different points in time, and what is their role in the system? We develop a framework for analyzing the population of users who participate in trending topics over the course of these topics' lifecycles. Central to our analysis is the notion of a "status gradient", describing how users of different activity levels adopt a trend at different points in time. Across multiple datasets, we find that this methodology reveals key differences in the nature of the early adopters in different domains.

preprint2015arXiv

Coordination and Efficiency in Decentralized Collaboration

Environments for decentralized on-line collaboration are now widespread on the Web, underpinning open-source efforts, knowledge creation sites including Wikipedia, and other experiments in joint production. When a distributed group works together in such a setting, the mechanisms they use for coordination can play an important role in the effectiveness of the group's performance. Here we consider the trade-offs inherent in coordination in these on-line settings, balancing the benefits to collaboration with the cost in effort that could be spent in other ways. We consider two diverse domains that each contain a wide range of collaborations taking place simultaneously -- Wikipedia and GitHub -- allowing us to study how coordination varies across different projects. We analyze trade-offs in coordination along two main dimensions, finding similar effects in both our domains of study: first we show that, in aggregate, high-status projects on these sites manage the coordination trade-off at a different level than typical projects; and second, we show that projects use a different balance of coordination when they are "crowded," with relatively small size but many participants. We also develop a stylized theoretical model for the cost-benefit trade-off inherent in coordination and show that it qualitatively matches the trade-offs we observe between crowdedness and coordination.

preprint2015arXiv

Selection and Influence in Cultural Dynamics

One of the fundamental principles driving diversity or homogeneity in domains such as cultural differentiation, political affiliation, and product adoption is the tension between two forces: influence (the tendency of people to become similar to others they interact with) and selection (the tendency to be affected most by the behavior of others who are already similar). Influence tends to promote homogeneity within a society, while selection frequently causes fragmentation. When both forces act simultaneously, it becomes an interesting question to analyze which societal outcomes should be expected. To study this issue more formally, we analyze a natural stylized model built upon active lines of work in political opinion formation, cultural diversity, and language evolution. We assume that the population is partitioned into "types" according to some traits (such as language spoken or political affiliation). While all types of people interact with one another, only people with sufficiently similar types can possibly influence one another. The "similarity" is captured by a graph on types in which individuals of the same or adjacent types can influence one another. We achieve an essentially complete characterization of (stable) equilibrium outcomes and prove convergence from all starting states. We also consider generalizations of this model.

preprint2015arXiv

The Lifecycles of Apps in a Social Ecosystem

Apps are emerging as an important form of on-line content, and they combine aspects of Web usage in interesting ways --- they exhibit a rich temporal structure of user adoption and long-term engagement, and they exist in a broader social ecosystem that helps drive these patterns of adoption and engagement. It has been difficult, however, to study apps in their natural setting since this requires a simultaneous analysis of a large set of popular apps and the underlying social network they inhabit. In this work we address this challenge through an analysis of the collection of apps on Facebook Login, developing a novel framework for analyzing both temporal and social properties. At the temporal level, we develop a retention model that represents a user's tendency to return to an app using a very small parameter set. At the social level, we organize the space of apps along two fundamental axes --- popularity and sociality --- and we show how a user's probability of adopting an app depends both on properties of the local network structure and on the match between the user's attributes, his or her friends' attributes, and the dominant attributes within the app's user population. We also develop models that show the importance of different feature sets with strong performance in predicting app success.

preprint2014arXiv

Can Cascades be Predicted?

On many social networking web sites such as Facebook and Twitter, resharing or reposting functionality allows users to share others' content with their own friends or followers. As content is reshared from user to user, large cascades of reshares can form. While a growing body of research has focused on analyzing and characterizing such cascades, a recent, parallel line of work has argued that the future trajectory of a cascade may be inherently unpredictable. In this work, we develop a framework for addressing cascade prediction problems. On a large sample of photo reshare cascades on Facebook, we find strong performance in predicting whether a cascade will continue to grow in the future. We find that the relative growth of a cascade becomes more predictable as we observe more of its reshares, that temporal and structural features are key predictors of cascade size, and that initially, breadth, rather than depth in a cascade is a better indicator of larger cascades. This prediction performance is robust in the sense that multiple distinct classes of features all achieve similar performance. We also discover that temporal features are predictive of a cascade's eventual shape. Observing independent cascades of the same content, we find that while these cascades differ greatly in size, we are still able to predict which ends up the largest.

preprint2014arXiv

Dynamic Models of Reputation and Competition in Job-Market Matching

A fundamental decision faced by a firm hiring employees - and a familiar one to anyone who has dealt with the academic job market, for example - is deciding what caliber of candidates to pursue. Should the firm try to increase its reputation by making offers to higher-quality candidates, despite the risk that the candidates might reject the offers and leave the firm empty-handed? Or should it concentrate on weaker candidates who are more likely to accept the offer? The question acquires an added level of complexity once we take into account the effect one hiring cycle has on the next: hiring better employees in the current cycle increases the firm's reputation, which in turn increases its attractiveness for higher-quality candidates in the next hiring cycle. These considerations introduce an interesting temporal dynamic aspect to the rich line of research on matching models for job markets, in which long-range planning and evolving reputational effects enter into the strategic decisions made by competing firms. We develop a model based on two competing firms to try capturing as cleanly as possible the elements that we believe constitute the strategic tension at the core of the problem: the trade-off between short-term recruiting success and long-range reputation-building; the inefficiency that results from underemployment of people who are not ranked highest; and the influence of earlier accidental outcomes on long-term reputations. Our model exhibits all these phenomena in a stylized setting, governed by a parameter q that captures the difference in strength between the two top candidates in each hiring cycle. We show that when q is relatively low the efficiency of the job market is improved by long-range reputational effects, but when q is relatively high, taking future reputations into account can sometimes reduce the efficiency.

preprint2014arXiv

Engaging with Massive Online Courses

The Web has enabled one of the most visible recent developments in education---the deployment of massive open online courses. With their global reach and often staggering enrollments, MOOCs have the potential to become a major new mechanism for learning. Despite this early promise, however, MOOCs are still relatively unexplored and poorly understood. In a MOOC, each student's complete interaction with the course materials takes place on the Web, thus providing a record of learner activity of unprecedented scale and resolution. In this work, we use such trace data to develop a conceptual framework for understanding how users currently engage with MOOCs. We develop a taxonomy of individual behavior, examine the different behavioral patterns of high- and low-achieving students, and investigate how forum participation relates to other parts of the course. We also report on a large-scale deployment of badges as incentives for engagement in a MOOC, including randomized experiments in which the presentation of badges was varied across sub-populations. We find that making badges more salient produced increases in forum engagement.

preprint2014arXiv

Time-Inconsistent Planning: A Computational Problem in Behavioral Economics

In many settings, people exhibit behavior that is inconsistent across time --- we allocate a block of time to get work done and then procrastinate, or put effort into a project and then later fail to complete it. An active line of research in behavioral economics and related fields has developed and analyzed models for this type of time-inconsistent behavior. Here we propose a graph-theoretic model of tasks and goals, in which dependencies among actions are represented by a directed graph, and a time-inconsistent agent constructs a path through this graph. We first show how instances of this path-finding problem on different input graphs can reconstruct a wide range of qualitative phenomena observed in the literature on time-inconsistency, including procrastination, abandonment of long-range tasks, and the benefits of reduced sets of choices. We then explore a set of analyses that quantify over the set of all graphs; among other results, we find that in any graph, there can be only polynomially many distinct forms of time-inconsistent behavior; and any graph in which a time-inconsistent agent incurs significantly more cost than an optimal agent must contain a large "procrastination" structure as a minor. Finally, we use this graph-theoretic model to explore ways in which tasks can be designed to help motivate agents to reach designated goals.

preprint2013arXiv

Characterizing and curating conversation threads: Expansion, focus, volume, re-entry

Discussion threads form a central part of the experience on many Web sites, including social networking sites such as Facebook and Google Plus and knowledge creation sites such as Wikipedia. To help users manage the challenge of allocating their attention among the discussions that are relevant to them, there has been a growing need for the algorithmic curation of on-line conversations --- the development of automated methods to select a subset of discussions to present to a user. Here we consider two key sub-problems inherent in conversational curation: length prediction --- predicting the number of comments a discussion thread will receive --- and the novel task of re-entry prediction --- predicting whether a user who has participated in a thread will later contribute another comment to it. The first of these sub-problems arises in estimating how interesting a thread is, in the sense of generating a lot of conversation; the second can help determine whether users should be kept notified of the progress of a thread to which they have already contributed. We develop and evaluate a range of approaches for these tasks, based on an analysis of the network structure and arrival pattern among the participants, as well as a novel dichotomy in the structure of long threads. We find that for both tasks, learning-based approaches using these sources of information yield improvements for all the performance metrics we used.

preprint2013arXiv

Graph cluster randomization: network exposure to multiple universes

A/B testing is a standard approach for evaluating the effect of online experiments; the goal is to estimate the `average treatment effect' of a new feature or condition by exposing a sample of the overall population to it. A drawback with A/B testing is that it is poorly suited for experiments involving social interference, when the treatment of individuals spills over to neighboring individuals along an underlying social network. In this work, we propose a novel methodology using graph clustering to analyze average treatment effects under social interference. To begin, we characterize graph-theoretic conditions under which individuals can be considered to be `network exposed' to an experiment. We then show how graph cluster randomization admits an efficient exact algorithm to compute the probabilities for each vertex being network exposed under several of these exposure conditions. Using these probabilities as inverse weights, a Horvitz-Thompson estimator can then provide an effect estimate that is unbiased, provided that the exposure model has been properly specified. Given an estimator that is unbiased, we focus on minimizing the variance. First, we develop simple sufficient conditions for the variance of the estimator to be asymptotically small in n, the size of the graph. However, for general randomization schemes, this variance can be lower bounded by an exponential function of the degrees of a graph. In contrast, we show that if a graph satisfies a restricted-growth condition on the growth rate of neighborhoods, then there exists a natural clustering algorithm, based on vertex neighborhoods, for which the variance of the estimator can be upper bounded by a linear function of the degrees. Thus we show that proper cluster randomization can lead to exponentially lower estimator variance when experimentally measuring average treatment effects under interference.

preprint2013arXiv

On Discrete Preferences and Coordination

An active line of research has considered games played on networks in which payoffs depend on both a player's individual decision and also the decisions of her neighbors. Such games have been used to model issues including the formation of opinions and the adoption of technology. A basic question that has remained largely open in this area is to consider games where the strategies available to the players come from a fixed, discrete set, and where players may have different intrinsic preferences among the possible strategies. It is natural to model the tension among these different preferences by positing a distance function on the strategy set that determines a notion of "similarity" among strategies; a player's payoff is determined by the distance from her chosen strategy to her preferred strategy and to the strategies chosen by her network neighbors. Even when there are only two strategies available, this framework already leads to natural open questions about a version of the classical Battle of the Sexes problem played on a graph. We develop a set of techniques for analyzing this class of games, which we refer to as discrete preference games. We parametrize the games by the relative extent to which a player takes into account the effect of her preferred strategy and the effect of her neighbors' strategies, allowing us to interpolate between network coordination games and unilateral decision-making. When these two effects are balanced, we show that the price of stability is equal to 1 for any discrete preference game in which the distance function on the strategies is a tree metric; as a special case, this includes the Battle of the Sexes on a graph. We also show that trees form the maximal family of metrics for which the price of stability is 1, and produce a collection of metrics on which the price of stability converges to a tight bound of 2.

preprint2013arXiv

Romantic Partnerships and the Dispersion of Social Ties: A Network Analysis of Relationship Status on Facebook

A crucial task in the analysis of on-line social-networking systems is to identify important people --- those linked by strong social ties --- within an individual's network neighborhood. Here we investigate this question for a particular category of strong ties, those involving spouses or romantic partners. We organize our analysis around a basic question: given all the connections among a person's friends, can you recognize his or her romantic partner from the network structure alone? Using data from a large sample of Facebook users, we find that this task can be accomplished with high accuracy, but doing so requires the development of a new measure of tie strength that we term `dispersion' --- the extent to which two people's mutual friends are not themselves well-connected. The results offer methods for identifying types of structurally significant people in on-line applications, and suggest a potential expansion of existing theories of tie strength.

preprint2013arXiv

Subgraph Frequencies: Mapping the Empirical and Extremal Geography of Large Graph Collections

A growing set of on-line applications are generating data that can be viewed as very large collections of small, dense social graphs -- these range from sets of social groups, events, or collaboration projects to the vast collection of graph neighborhoods in large social networks. A natural question is how to usefully define a domain-independent coordinate system for such a collection of graphs, so that the set of possible structures can be compactly represented and understood within a common space. In this work, we draw on the theory of graph homomorphisms to formulate and analyze such a representation, based on computing the frequencies of small induced subgraphs within each graph. We find that the space of subgraph frequencies is governed both by its combinatorial properties, based on extremal results that constrain all graphs, as well as by its empirical properties, manifested in the way that real social graphs appear to lie near a simple one-dimensional curve through this space. We develop flexible frameworks for studying each of these aspects. For capturing empirical properties, we characterize a simple stochastic generative model, a single-parameter extension of Erdos-Renyi random graphs, whose stationary distribution over subgraphs closely tracks the concentration of the real social graph families. For the extremal properties, we develop a tractable linear program for bounding the feasible space of subgraph frequencies by harnessing a toolkit of known extremal graph theory. Together, these two complementary frameworks shed light on a fundamental question pertaining to social graphs: what properties of social graphs are 'social' properties and what properties are 'graph' properties? We conclude with a brief demonstration of how the coordinate system we examine can also be used to perform classification tasks, distinguishing between social graphs of different origins.

preprint2012arXiv

Echoes of power: Language effects and power differences in social interaction

Understanding social interaction within groups is key to analyzing online communities. Most current work focuses on structural properties: who talks to whom, and how such interactions form larger network structures. The interactions themselves, however, generally take place in the form of natural language --- either spoken or written --- and one could reasonably suppose that signals manifested in language might also provide information about roles, status, and other aspects of the group's dynamics. To date, however, finding such domain-independent language-based signals has been a challenge. Here, we show that in group discussions power differentials between participants are subtly revealed by how much one individual immediately echoes the linguistic style of the person they are responding to. Starting from this observation, we propose an analysis framework based on linguistic coordination that can be used to shed light on power relationships and that works consistently across multiple types of power --- including a more "static" form of power based on status differences, and a more "situational" form of power in which one individual experiences a type of dependence on another. Using this framework, we study how conversational behavior can reveal power relationships in two very different settings: discussions among Wikipedians and arguments before the U.S. Supreme Court.

preprint2012arXiv

How Bad is Forming Your Own Opinion?

The question of how people form their opinion has fascinated economists and sociologists for quite some time. In many of the models, a group of people in a social network, each holding a numerical opinion, arrive at a shared opinion through repeated averaging with their neighbors in the network. Motivated by the observation that consensus is rarely reached in real opinion dynamics, we study a related sociological model in which individuals' intrinsic beliefs counterbalance the averaging process and yield a diversity of opinions. By interpreting the repeated averaging as best-response dynamics in an underlying game with natural payoffs, and the limit of the process as an equilibrium, we are able to study the cost of disagreement in these models relative to a social optimum. We provide a tight bound on the cost at equilibrium relative to the optimum; our analysis draws a connection between these agreement models and extremal problems that lead to generalized eigenvalues. We also consider a natural network design problem in this setting: which links can we add to the underlying network to reduce the cost of disagreement at equilibrium?

preprint2012arXiv

You had me at hello: How phrasing affects memorability

Understanding the ways in which information achieves widespread public awareness is a research question of significant interest. We consider whether, and how, the way in which the information is phrased --- the choice of words and sentence structure --- can affect this process. To this end, we develop an analysis framework and build a corpus of movie quotes, annotated with memorability information, in which we are able to control for both the speaker and the setting of the quotes. We find that there are significant differences between memorable and non-memorable quotes in several key dimensions, even after controlling for situational and contextual factors. One is lexical distinctiveness: in aggregate, memorable quotes use less common word choices, but at the same time are built upon a scaffolding of common syntactic patterns. Another is that memorable quotes tend to be more general in ways that make them easy to apply in new contexts --- that is, more portable. We also show how the concept of "memorable language" can be extended across domains.

preprint2011arXiv

Voting with Limited Information and Many Alternatives

The traditional axiomatic approach to voting is motivated by the problem of reconciling differences in subjective preferences. In contrast, a dominant line of work in the theory of voting over the past 15 years has considered a different kind of scenario, also fundamental to voting, in which there is a genuinely "best" outcome that voters would agree on if they only had enough information. This type of scenario has its roots in the classical Condorcet Jury Theorem; it includes cases such as jurors in a criminal trial who all want to reach the correct verdict but disagree in their inferences from the available evidence, or a corporate board of directors who all want to improve the company's revenue, but who have different information that favors different options. This style of voting leads to a natural set of questions: each voter has a {\em private signal} that provides probabilistic information about which option is best, and a central question is whether a simple plurality voting system, which tabulates votes for different options, can cause the group decision to arrive at the correct option. We show that plurality voting is powerful enough to achieve this: there is a way for voters to map their signals into votes for options in such a way that --- with sufficiently many voters --- the correct option receives the greatest number of votes with high probability. We show further, however, that any process for achieving this is inherently expensive in the number of voters it requires: succeeding in identifying the correct option with probability at least $1 - η$ requires $Ω(n^3 ε^{-2} \log η^{-1})$ voters, where $n$ is the number of options and $ε$ is a distributional measure of the minimum difference between the options.

preprint2010arXiv

Governance in Social Media: A case study of the Wikipedia promotion process

Social media sites are often guided by a core group of committed users engaged in various forms of governance. A crucial aspect of this type of governance is deliberation, in which such a group reaches decisions on issues of importance to the site. Despite its crucial --- though subtle --- role in how a number of prominent social media sites function, there has been relatively little investigation of the deliberative aspects of social media governance. Here we explore this issue, investigating a particular deliberative process that is extensive, public, and recorded: the promotion of Wikipedia admins, which is determined by elections that engage committed members of the Wikipedia community. We find that the group decision-making at the heart of this process exhibits several fundamental forms of relative assessment. First we observe that the chance that a voter will support a candidate is strongly dependent on the relationship between characteristics of the voter and the candidate. Second we investigate how both individual voter decisions and overall election outcomes can be based on models that take into account the sequential, public nature of the voting.

preprint2010arXiv

Information-Sharing and Privacy in Social Networks

We present a new model for reasoning about the way information is shared among friends in a social network, and the resulting ways in which it spreads. Our model formalizes the intuition that revealing personal information in social settings involves a trade-off between the benefits of sharing information with friends, and the risks that additional gossiping will propagate it to people with whom one is not on friendly terms. We study the behavior of rational agents in such a situation, and we characterize the existence and computability of stable information-sharing networks, in which agents do not have an incentive to change the partners with whom they share information. We analyze the implications of these stable networks for social welfare, and the resulting fragmentation of the social network.

preprint2010arXiv

Predicting Positive and Negative Links in Online Social Networks

We study online social networks in which relationships can be either positive (indicating relations such as friendship) or negative (indicating relations such as opposition or antagonism). Such a mix of positive and negative links arise in a variety of online settings; we study datasets from Epinions, Slashdot and Wikipedia. We find that the signs of links in the underlying social networks can be predicted with high accuracy, using models that generalize across this diverse range of sites. These models provide insight into some of the fundamental principles that drive the formation of signed links in networks, shedding light on theories of balance and status from social psychology; they also suggest social computing applications by which the attitude of one user toward another can be estimated from evidence provided by their relationships with other members of the surrounding social network.

preprint2010arXiv

Signed Networks in Social Media

Relations between users on social media sites often reflect a mixture of positive (friendly) and negative (antagonistic) interactions. In contrast to the bulk of research on social networks that has focused almost exclusively on positive interpretations of links between people, we study how the interplay between positive and negative relationships affects the structure of on-line social networks. We connect our analyses to theories of signed networks from social psychology. We find that the classical theory of structural balance tends to capture certain common patterns of interaction, but that it is also at odds with some of the fundamental phenomena we observe --- particularly related to the evolving, directed nature of these on-line networks. We then develop an alternate theory of status that better explains the observed edge signs and provides insights into the underlying social mechanisms. Our work provides one of the first large-scale evaluations of theories of signed networks using on-line datasets, as well as providing a perspective for reasoning about social media sites.

preprint2010arXiv

The Directed Closure Process in Hybrid Social-Information Networks, with an Analysis of Link Formation on Twitter

It has often been taken as a working assumption that directed links in information networks are frequently formed by "short-cutting" a two-step path between the source and the destination -- a kind of implicit "link copying" analogous to the process of triadic closure in social networks. Despite the role of this assumption in theoretical models such as preferential attachment, it has received very little direct empirical investigation. Here we develop a formalization and methodology for studying this type of directed closure process, and we provide evidence for its important role in the formation of links on Twitter. We then analyze a sequence of models designed to capture the structural phenomena related to directed closure that we observe in the Twitter data.

Jon Kleinberg

What is connected

Connect this record

See the researcher in context

Building this map preview

47 published item(s)

Mistake-Bounded Language Generation

Combinatorial Characterizations and Impossibilities for Higher-order Homophily

Allocating Stimulus Checks in Times of Crisis

Core-periphery Models for Hypergraphs

Detecting Individual Decision-Making Style: Exploring Behavioral Stylometry in Chess

Four Years of FAccT: A Reflexive, Mixed-Methods Analysis of Research Contributions, Shortcomings, and Future Prospects

Learning Models of Individual Behavior in Chess

Mimetic Models: Ethical Implications of AI that Acts Like You

On the Effect of Triadic Closure on Network Segregation

Ordered Submodularity and its Applications to Diversifying Recommendations

Algorithmic Monoculture and Social Welfare

Allocating Opportunities in a Dynamic Model of Intergenerational Mobility

Random Graphs with Prescribed $K$-Core Sequences: A New Null Model for Network Analysis

Adversarial Perturbations of Opinion Dynamics in Networks

Aligning Superhuman AI with Human Behavior: Chess as a Model System

Frozen Binomials on the Web: Word Ordering and Language Conventions in Online Text

Hypergraph Cuts with General Splitting Functions

Minimizing Localized Ratio Cut Objectives in Hypergraphs

Roles for Computing in Social Change

Assessing Human Error Against a Benchmark of Perfection

Do Cascades Recur?

Inherent Trade-Offs in the Fair Determination of Risk Scores

Planning Problems for Sophisticated Agents with Present Bias

Social Networks Under Stress

Survey of Expressivity in Deep Neural Networks

The Status Gradient of Trends in Social Media

Coordination and Efficiency in Decentralized Collaboration

Selection and Influence in Cultural Dynamics

The Lifecycles of Apps in a Social Ecosystem

Can Cascades be Predicted?

Dynamic Models of Reputation and Competition in Job-Market Matching

Engaging with Massive Online Courses

Time-Inconsistent Planning: A Computational Problem in Behavioral Economics

Characterizing and curating conversation threads: Expansion, focus, volume, re-entry

Graph cluster randomization: network exposure to multiple universes

On Discrete Preferences and Coordination

Romantic Partnerships and the Dispersion of Social Ties: A Network Analysis of Relationship Status on Facebook

Subgraph Frequencies: Mapping the Empirical and Extremal Geography of Large Graph Collections

Echoes of power: Language effects and power differences in social interaction

How Bad is Forming Your Own Opinion?

You had me at hello: How phrasing affects memorability

Voting with Limited Information and Many Alternatives

Governance in Social Media: A case study of the Wikipedia promotion process

Information-Sharing and Privacy in Social Networks

Predicting Positive and Negative Links in Online Social Networks

Signed Networks in Social Media

The Directed Closure Process in Hybrid Social-Information Networks, with an Analysis of Link Formation on Twitter