Source author record

James A. Evans

James A. Evans appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher · Unclaimed source record

Digital Libraries, Social and Information Networks, Artificial Intelligence, cs.CY, physics.soc-ph, Applications, Computation and Language, econ.GN, Human-Computer Interaction, Information Retrieval, Machine Learning, math-ph, math.MP, Multiagent Systems, Populations and Evolution, q-fin.EC

Catalog footprint

What is connected

8 works

16 topics

4 close collaborators

preprint · 2026 · arXiv

Missing vs. Unused Knowledge Hypothesis for Language Model Bottlenecks in Patent Understanding

While large language models (LLMs) excel at factual recall, the real challenge lies in knowledge application. A gap persists between their ability to answer complex questions and their effectiveness in performing tasks that require that knowledge. We investigate this gap using a patent classification problem that requires deep conceptual understanding to distinguish semantically similar but objectively different patents written in dense, strategic technical language. We find that LLMs often struggle with this distinction. To diagnose the source of these failures, we introduce a framework that decomposes model errors into two categories: missing knowledge and unused knowledge. Our method prompts models to generate clarifying questions and compares three settings -- raw performance, self-answered questions that activate internal knowledge, and externally provided answers that supply missing knowledge (if any). We show that most errors stem from failures to deploy existing knowledge rather than from true knowledge gaps. We also examine how models differ in constructing task-specific question-answer databases. Smaller models tend to generate simpler questions that they, and other models, can retrieve and use effectively, whereas larger models produce more complex questions that are less effective, suggesting complementary strengths across model scales. Together, our findings highlight that shifting evaluation from static fact recall to dynamic knowledge application offers a more informative view of model capabilities.
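The three-setting comparison described in this abstract can be read as a per-example error decomposition. The following is a minimal sketch of that reading, not the authors' implementation: the function names, the label strings and the "unresolved" bucket are assumptions introduced here for illustration, and each example is assumed to come with a correctness flag for the raw, self-answered and externally answered settings.

```python
from collections import Counter

def classify_error(raw_ok, self_ok, ext_ok):
    """Label one example from its correctness under the three settings.

    raw_ok:  correct with no clarifying questions
    self_ok: correct after the model answers its own questions
    ext_ok:  correct after answers are supplied externally
    """
    if raw_ok:
        return "correct"
    if self_ok:
        # The model already held the knowledge; self-answering activated it.
        return "unused_knowledge"
    if ext_ok:
        # Only externally supplied answers fixed the error.
        return "missing_knowledge"
    # Hypothetical bucket for errors that neither setting repairs.
    return "unresolved"

def decompose(results):
    """Count labels over an iterable of (raw_ok, self_ok, ext_ok) tuples."""
    return Counter(classify_error(*r) for r in results)
```

For example, decompose([(False, True, True), (False, False, True)]) counts one unused-knowledge error and one missing-knowledge error. Under this reading, the abstract's main finding corresponds to the first count dominating the second.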

preprint · 2023 · arXiv

Disrupted Routines Anticipate Musical Exploration

Prior research suggests that taste preferences relate to personality traits, values, shifts in mood, and immigration destination, but an understanding of everyday listening patterns and of the function music plays in life has remained elusive, despite speculations that musical nostalgia may compensate for local disruption. Using more than a hundred million streams of 4 million songs by tens of thousands of international listeners from a global music service catering to local tastes, here we show that breaches in personal routine are systematically associated with personal musical exploration. As people visited new cities and countries, their preferences diversified, converging towards their destinations. As people experienced COVID-19 lock-downs, and then again when they experienced reopenings, their preferences diversified further.

preprint · 2021 · arXiv

Policy-Aware Mobility Model Explains the Growth of COVID-19 in Cities

With the continued spread of coronavirus, the task of forecasting distinctive COVID-19 growth curves in different cities, which remain inadequately explained by standard epidemiological models, is critical for medical supply and treatment. Predictions must take into account non-pharmaceutical interventions to slow the spread of coronavirus, including stay-at-home orders, social distancing, quarantine and compulsory mask-wearing, leading to reductions in intra-city mobility and viral transmission. Moreover, recent work associating coronavirus with human mobility and detailed movement data suggests the need to consider urban mobility in disease forecasts. Here we show that by incorporating intra-city mobility and policy adoption into a novel metapopulation SEIR model, we can accurately predict complex COVID-19 growth patterns in U.S. cities ($R^2$ = 0.990). Estimated mobility change due to policy interventions is consistent with empirical observation from Apple Mobility Trends Reports (Pearson's R = 0.872), suggesting the utility of model-based predictions where data are limited. Our model also reproduces urban "superspreading", where a few neighborhoods account for most secondary infections across urban space, arising from uneven neighborhood populations and heightened intra-city churn in popular neighborhoods. Therefore, our model can facilitate location-aware mobility reduction policy that more effectively mitigates disease transmission at similar social cost. Finally, we demonstrate our model can serve as a fine-grained analytic and simulation framework that informs the design of rational non-pharmaceutical intervention policies.
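For readers unfamiliar with SEIR dynamics, the sketch below shows a textbook single-population SEIR model integrated with forward Euler steps. It is not the paper's metapopulation model: the mobility factor that scales the transmission rate is a simplifying assumption made here to illustrate how reduced mobility can flatten a growth curve, and all parameter names and values are hypothetical.

```python
def simulate_seir(n, i0, beta, sigma, gamma, mobility, days, dt=0.1):
    """Simulate a single-population SEIR model and return daily states.

    n:        total population
    i0:       initially infectious individuals
    beta:     baseline transmission rate per day
    sigma:    rate of leaving the exposed state (1 / incubation period)
    gamma:    recovery rate (1 / infectious period)
    mobility: assumed multiplier in [0, 1] applied to beta
    """
    s, e, i, r = float(n - i0), 0.0, float(i0), 0.0
    steps_per_day = int(round(1 / dt))
    history = []
    for _ in range(days):
        for _ in range(steps_per_day):
            new_exposed = beta * mobility * s * i / n * dt
            new_infectious = sigma * e * dt
            new_recovered = gamma * i * dt
            s -= new_exposed
            e += new_exposed - new_infectious
            i += new_infectious - new_recovered
            r += new_recovered
        history.append((s, e, i, r))  # state at the end of each day
    return history
```

Comparing max(state[2] for state in history) for mobility=1.0 and mobility=0.5 with the other parameters held fixed shows the lower infectious peak under reduced mobility. A metapopulation version would track one such compartment set per neighborhood and couple them through movement between neighborhoods.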

preprint · 2020 · arXiv

Human Evaluation of Interpretability: The Case of AI-Generated Music Knowledge

Interpretability of machine learning models has gained more and more attention among researchers in the artificial intelligence (AI) and human-computer interaction (HCI) communities. Most existing work focuses on decision making, whereas we consider knowledge discovery. In particular, we focus on evaluating AI-discovered knowledge/rules in the arts and humanities. From a specific scenario, we present an experimental procedure to collect and assess human-generated verbal interpretations of AI-generated music theory/rules rendered as sophisticated symbolic/numeric objects. Our goal is to reveal both the possibilities and the challenges in such a process of decoding expressive messages from AI sources. We treat this as a first step towards 1) better design of AI representations that are human interpretable and 2) a general methodology to evaluate interpretability of AI-discovered knowledge representations.

preprint · 2020 · arXiv

Too many cooks: Bayesian inference for coordinating multi-agent collaboration

Collaboration requires agents to coordinate their behavior on the fly, sometimes cooperating to solve a single task together and other times dividing it up into sub-tasks to work on in parallel. Underlying the human ability to collaborate is theory-of-mind, the ability to infer the hidden mental states that drive others to act. Here, we develop Bayesian Delegation, a decentralized multi-agent learning mechanism with these abilities. Bayesian Delegation enables agents to rapidly infer the hidden intentions of others by inverse planning. We test Bayesian Delegation in a suite of multi-agent Markov decision processes inspired by cooking problems. On these tasks, agents with Bayesian Delegation coordinate both their high-level plans (e.g. what sub-task they should work on) and their low-level actions (e.g. avoiding getting in each other's way). In a self-play evaluation, Bayesian Delegation outperforms alternative algorithms. Bayesian Delegation is also a capable ad-hoc collaborator and successfully coordinates with other agent types even in the absence of prior experience. Finally, in a behavioral experiment, we show that Bayesian Delegation makes inferences similar to human observers about the intent of others. Together, these results demonstrate the power of Bayesian Delegation for decentralized multi-agent collaboration.
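The idea of inferring another agent's intent by inverse planning can be illustrated with a small Bayesian update. The sketch below is not the Bayesian Delegation algorithm itself: it assumes a toy one-dimensional world in which each candidate sub-task is a goal position, and it assumes a Boltzmann-rational action model with a hypothetical rationality parameter beta.

```python
import math

ACTIONS = (-1, 0, 1)  # step left, stay, step right

def action_likelihood(pos, action, goal, beta=2.0):
    """P(action | goal): actions that end closer to the goal are more likely."""
    weights = {a: math.exp(-beta * abs((pos + a) - goal)) for a in ACTIONS}
    return weights[action] / sum(weights.values())

def update_posterior(prior, pos, action, beta=2.0):
    """Bayes rule over goals after observing one action taken at position pos."""
    unnormalized = {
        goal: p * action_likelihood(pos, action, goal, beta)
        for goal, p in prior.items()
    }
    total = sum(unnormalized.values())
    return {goal: w / total for goal, w in unnormalized.items()}
```

Starting from a uniform prior such as {0: 0.5, 10: 0.5} and observing a step to the right from position 5, update_posterior shifts most of the probability mass to the goal at 10. Repeating the update over a sequence of observed actions gives a running estimate of which sub-task the other agent is pursuing.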

preprint · 2019 · arXiv

Quantifying dynamics of failure across science, startups, and security

Human achievements are often preceded by repeated attempts that initially fail, yet little is known about the mechanisms governing the dynamics of failure. Here, building on the rich literature on innovation, human dynamics and learning, we develop a simple one-parameter model that mimics how successful future attempts build on those past. Analytically solving this model reveals a phase transition that separates dynamics of failure into regions of stagnation or progression, predicting that near the critical threshold, agents who share similar characteristics and learning strategies may experience fundamentally different outcomes following failures. Below the critical point, we see those who explore disjoint opportunities without a pattern of improvement, and above it, those who exploit incremental refinements to systematically advance toward success. The model makes several empirically testable predictions, demonstrating that those who eventually succeed and those who do not may be initially similar, yet are characterized by fundamentally distinct failure dynamics in terms of the efficiency and quality of each subsequent attempt. We collected large-scale data from three disparate domains, tracing repeated attempts by (i) NIH investigators to fund their research, (ii) innovators to successfully exit their startup ventures, and (iii) terrorist organizations to post casualties in violent attacks, finding broadly consistent empirical support across all three domains. Together, our findings unveil identifiable yet previously unknown early signals that allow us to identify failure dynamics that will lead to ultimate victory or defeat. Given the ubiquitous nature of failures and the paucity of quantitative approaches to understand them, these results represent a crucial step toward deeper understanding of the complex dynamics beneath failures, the essential prerequisites for success.

preprint · 2015 · arXiv

Full two-scale asymptotic expansion and higher-order constitutive laws in the homogenisation of the system of Maxwell equations

For the system of Maxwell equations of electromagnetism in an $l$-periodic composite medium of overall size $L$ ($0<l<L<\infty$), in the low-frequency quasistatic approximation, we develop an electromagnetic version of strain-gradient theories, where the magnetic field is not a function of the magnetic induction alone but also of its spatial gradients, and the electric field depends not only on the displacement but also on displacement gradients. Following the work (Smyshlyaev, V.P., Cherednichenko, K.D., 2000. On rigorous derivation of strain gradient effects in the overall behaviour of periodic heterogeneous media, ${\mathit J.\ Mech.\ Phys.\ Solids\ }{\mathbf{48}},$ $1325-1357$), we develop a combination of variational and asymptotic approaches to the multiscale analysis of the Maxwell system. We provide rigorous convergence estimates of higher order of smallness with respect to the inverse of the "scale separation parameter" $L/l.$ Using a special "ensemble averaging" procedure for a family of periodic problems, we derive an infinite-order version of the classical homogenised system of Maxwell equations.

preprint · 2013 · arXiv

Tradition and Innovation in Scientists' Research Strategies

What factors affect a scientist's choice of research problem? Qualitative research in the history, philosophy, and sociology of science suggests that this choice is shaped by an "essential tension" between the professional demand for productivity and a conflicting drive toward risky innovation. We examine this tension empirically in the context of biomedical chemistry. We use complex networks to represent the evolving state of scientific knowledge, as expressed in publications. We then define research strategies relative to these networks. Scientists can introduce novel chemicals or chemical relationships--or delve deeper into known ones. They can consolidate existing knowledge clusters, or bridge distant ones. Analyzing such choices in aggregate, we find that the distribution of strategies remains remarkably stable, even as chemical knowledge grows dramatically. High-risk strategies, which explore new chemical relationships, are less prevalent in the literature, reflecting a growing focus on established knowledge at the expense of new opportunities. Research following a risky strategy is more likely to be ignored but also more likely to achieve high impact and recognition. While the outcome of a risky strategy has a higher expected reward than the outcome of a conservative strategy, the additional reward is insufficient to compensate for the additional risk. By studying the winners of 137 different prizes in biomedicine and chemistry, we show that the occasional "gamble" for extraordinary impact is the most plausible explanation for observed levels of risk-taking. Our empirical demonstration and unpacking of the "essential tension" suggests policy interventions that may foster more innovative research.