Source author record

Marcelo G. Mattar

Marcelo G. Mattar appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher · Unclaimed source record

Neurons and Cognition, Artificial Intelligence, Machine Learning, Computation and Language

Catalog footprint

What is connected

9 works

4 topics

4 close collaborators

Preprint · 2026 · arXiv

Extracting Search Trees from LLM Reasoning Traces Reveals Myopic Planning

Large language models (LLMs), especially reasoning models, generate extended chain-of-thought (CoT) reasoning that often contains explicit deliberation over future outcomes. Yet whether this deliberation constitutes genuine planning, how it is structured, and what aspects of it drive performance remain poorly understood. In this work, we introduce a new method to characterize LLM planning by extracting and quantifying search trees from reasoning traces in the four-in-a-row board game. By fitting computational models on the extracted search trees, we characterize how plans are structured and how they influence move decisions. We find that LLMs' search is shallower than humans', and that performance is predicted by search breadth rather than depth. Most strikingly, although LLMs expand deep nodes in their traces, their move choices are best explained by a myopic model that ignores those nodes entirely. A causal intervention study where we selectively prune CoT paragraphs further suggests that move selection is driven predominantly by shallow rather than deep nodes. These patterns contrast with human planning, where performance is driven primarily by deep search. Together, our findings reveal a key difference between LLM and human planning: while human expertise is driven by deeper search, LLMs do not act on deep lookahead. This dissociation offers targeted guidance for aligning LLM and human planning. More broadly, our framework provides a generalizable approach for interpreting the structure of LLM planning across strategic domains.

Preprint · 2026 · arXiv

Post-training makes large language models less human-like

Large language models (LLMs) are increasingly used as surrogates for human participants, but it remains unclear which models best capture human behavior and why. To address this, we introduce Psych-201, a novel dataset that enables us to measure behavioral alignment at scale. We find that post-training -- the stage that turns base models into useful assistants -- consistently reduces alignment with human behavior across model families, sizes, and objectives. Moreover, this misalignment widens in newer model generations even as base models continue to improve. Finally, we find that persona-induction -- a popular technique for eliciting human-like behavior by conditioning models on participant-specific information -- does not improve predictions at the level of individuals. Taken together, our results suggest that the very processes that are currently employed to turn LLMs into useful assistants also make them less accurate models of human behavior.

Preprint · 2026 · arXiv

Reason to Play: Behavioral and Brain Alignment Between Frontier LRMs and Human Game Learners

Humans rapidly learn abstract knowledge when encountering novel environments and flexibly deploy this knowledge to guide efficient and intelligent action. Can modern AI systems learn and plan in a similar way? We study this question using a dataset of complex human gameplay with concurrent fMRI recordings, in which participants learn novel video games that require rule discovery, hypothesis revision, and multi-step planning. We jointly evaluate models by their ability to play the games, match human learning behavior, and predict brain activity during the same task, comparing a suite of frontier Large Reasoning Models (LRMs) against model-free and model-based deep reinforcement learning agents and a Bayesian theory-based agent. We find that frontier LRMs most closely match human behavioral patterns during game discovery and predict brain activity an order of magnitude better than both reinforcement learning alternatives across cortical and subcortical regions, with effects robust to permutation controls. Through targeted manipulations, we further show that brain alignment reflects the model's in-context representation of the game state rather than its downstream planning or reasoning. Our results establish LRMs as compelling computational accounts of human learning and decision making in complex, naturalistic environments. Project page with interactive replays: https://botcs.github.io/reason-to-play/

Preprint · 2026 · arXiv

The Position Curse: LLMs Struggle to Locate the Last Few Items in a List

Modern large language models (LLMs) can find a needle in a haystack (locating a single relevant fact buried among hundreds of thousands of irrelevant tokens) with near-saturated accuracy, yet fail to retrieve the last few items in a short list. We call this failure the Position Curse. For instance, even in a two-line code snippet, Claude Opus 4.6 misidentifies the second-to-last line most of the time. To characterize this failure, we evaluated two complementary queries: given a position in a sequence (of letters or words), retrieve the corresponding item; and given an item, return its position. Each position is specified as a forward or backward offset from an anchor, either an endpoint of the list (its start or end) or another item in the list. Across both open-source and frontier closed-source models, backward retrieval substantially lags forward retrieval. To test whether this capability can be rescued by post-training, we constructed PosBench, a position-focused training dataset. LoRA fine-tuning improves both forward and backward retrieval and generalizes to a held-out code-understanding benchmark (PyIndex), yet absolute performance remains far from saturated. As LLM coding agents increasingly operate over large codebases where precise indexing becomes essential for code understanding and editing, position-based retrieval emerges as a key capability for future pretraining objectives and model design.
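The two query types described above can be made concrete with a small sketch. This is an illustration of the task only, assuming 1-based positions and unique list items; the function names `retrieve_item` and `retrieve_position` are hypothetical and do not come from PosBench or PyIndex.

```python
def retrieve_item(items, offset, direction, anchor=None):
    """Position -> item query.

    With no anchor, `offset` counts from an endpoint of the list
    (1 = first item going forward, 1 = last item going backward).
    With an anchor item, `offset` counts steps after it (forward)
    or before it (backward). Assumes items are unique.
    """
    if direction not in ("forward", "backward"):
        raise ValueError("direction must be 'forward' or 'backward'")
    if anchor is None:
        index = offset - 1 if direction == "forward" else len(items) - offset
    else:
        base = items.index(anchor)
        index = base + offset if direction == "forward" else base - offset
    if not 0 <= index < len(items):
        raise IndexError("position falls outside the list")
    return items[index]


def retrieve_position(items, item, direction):
    """Item -> position query, counted from the start or from the end."""
    if direction not in ("forward", "backward"):
        raise ValueError("direction must be 'forward' or 'backward'")
    index = items.index(item)
    return index + 1 if direction == "forward" else len(items) - index


letters = ["a", "b", "c", "d", "e", "f"]
second_to_last = retrieve_item(letters, 2, "backward")
before_d = retrieve_item(letters, 1, "backward", anchor="d")
```

Here `second_to_last` is the backward-offset case that the abstract reports as the harder one for models, and `before_d` uses another list item as the anchor instead of an endpoint.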

Preprint · 2022 · arXiv

Predecessor Features

Any reinforcement learning system must be able to identify which past events contributed to observed outcomes, a problem known as credit assignment. A common solution to this problem is to use an eligibility trace to assign credit to a recency-weighted set of experienced events. However, in many realistic tasks, the set of recently experienced events is only one of the many possible action events that could have preceded the current outcome. This suggests that reinforcement learning can be made more efficient by allowing credit assignment to any viable preceding state, rather than only those most recently experienced. Accordingly, we examine "Predecessor Features", the fully bootstrapped version of van Hasselt's "Expected Trace", an algorithm that achieves this richer form of credit assignment. By maintaining a representation that approximates the expected sum of past occupancies, this algorithm allows temporal difference (TD) errors to be propagated accurately to a larger number of predecessor states than conventional methods, greatly improving learning speed. The algorithm can also be naturally extended from tabular state representations to feature representations, allowing for increased performance on a wide range of environments. We demonstrate several use cases for Predecessor Features and compare its performance with other approaches.
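A minimal tabular sketch of the idea, not the authors' implementation: each state keeps a learned estimate of the expected discounted sum of past state occupancies, updated by bootstrapping from the previous state's estimate, and every TD error is spread over that estimate instead of over a sampled eligibility trace. The deterministic chain environment, the function name `learn_chain_values`, and all hyperparameter values are assumptions made for illustration.

```python
def learn_chain_values(n_states=5, episodes=2000, alpha=0.1, beta=0.1,
                       gamma=0.9, lam=0.9):
    """Value learning on a left-to-right chain with reward 1 at the end.

    pred[s][j] estimates the expected discounted past occupancy of
    state j at the moment the agent is in state s (assumed tabular
    form of the predecessor representation).
    """
    values = [0.0] * n_states
    pred = [[0.0] * n_states for _ in range(n_states)]

    for _ in range(episodes):
        prev = None
        for s in range(n_states):
            # Bootstrapped update: current occupancy plus the decayed
            # estimate stored for the state we just came from.
            for j in range(n_states):
                carried = gamma * lam * pred[prev][j] if prev is not None else 0.0
                target = (1.0 if j == s else 0.0) + carried
                pred[s][j] += beta * (target - pred[s][j])

            # TD error for leaving state s (the last state terminates).
            terminal = s == n_states - 1
            reward = 1.0 if terminal else 0.0
            next_value = 0.0 if terminal else values[s + 1]
            td_error = reward + gamma * next_value - values[s]

            # Credit every predecessor in proportion to its expected occupancy.
            for j in range(n_states):
                values[j] += alpha * td_error * pred[s][j]
            prev = s

    return values, pred


values, pred = learn_chain_values()
```

In this deterministic chain the learned rows of `pred` coincide with ordinary accumulating traces, so the sketch only shows the mechanics; the benefit claimed in the abstract arises when several different histories can lead to the same state.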

Preprint · 2020 · arXiv

Learning differentially reorganizes brain activity and connectivity

Human learning is a complex process in which future behavior is altered via the reorganization of brain activity and connectivity. It remains unknown whether activity and connectivity differentially reorganize during learning, and, if so, how that differential reorganization tracks stages of learning across distinct brain areas. Here, we address this gap in knowledge by measuring brain activity and functional connectivity in a longitudinal fMRI experiment in which healthy adult human participants learn the values of novel objects over the course of four days. An increasing similarity in activity or functional connectivity across subjects during learning reflects reorganization toward a common functional architecture. We assessed the presence of reorganization in activity and connectivity both during value learning and during the resting state, allowing us to differentiate common elicited processes from intrinsic processes. We found a complex and dynamic reorganization of brain connectivity and activity (as a function of time, space, and performance) that occurs while subjects learn. Spatially localized brain activity reorganizes across the brain to a common functional architecture early in learning, and this reorganization tracks early learning performance. In contrast, spatially distributed connectivity reorganizes across the brain to a common functional architecture as training progresses, and this reorganization tracks later learning performance. Particularly good performance is associated with a sticky connectivity that persists into the resting state. Broadly, our work uncovers distinct principles of reorganization in activity and connectivity at different phases of value learning, which inform the ongoing study of learning processes more generally.

Preprint · 2016 · arXiv

Brain Network Architecture: Implications for Human Learning

Human learning is a complex phenomenon that requires adaptive processes across a range of temporal and spatial scales. While our understanding of those processes at single scales has increased exponentially over the last few years, a mechanistic understanding of the entire phenomenon has remained elusive. We propose that progress has been stymied by the lack of a quantitative framework that can account for the full range of neurophysiological and behavioral dynamics both across scales in the systems and also across different types of learning. We posit that network neuroscience offers promise in meeting this challenge. Built on the mathematical fields of complex systems science and graph theory, network neuroscience embraces the interconnected and hierarchical nature of human learning, offering insights into the emergent properties of adaptability. In this review, we discuss the utility of network neuroscience as a tool to build a quantitative framework in which to study human learning, which seeks to explain the full chain of events in the brain from sensory input to motor output, being both biologically plausible and able to make predictions about how an intervention at a single level of the chain may cause alterations in another level of the chain. We close by laying out important remaining challenges in network neuroscience in explicitly bridging spatial scales at which neurophysiological processes occur, and underscore the utility of such a quantitative framework for education and therapy.

Preprint · 2016 · arXiv

Structural Pathways Supporting Swift Acquisition of New Visuo-Motor Skills

Human skill learning requires fine-scale coordination of distributed networks of brain regions that are directly linked to one another by white matter tracts to allow for effective information transmission. Yet how individual differences in these anatomical pathways may impact individual differences in learning remains far from understood. Here, we test the hypothesis that individual differences in the organization of structural networks supporting task performance predict individual differences in the rate at which humans learn a visuo-motor skill. Over the course of 6 weeks, twenty-two healthy adult subjects practiced a discrete sequence production task, where they learned a sequence of finger movements based on discrete visual cues. We collected structural imaging data during four MRI scanning sessions spaced approximately two weeks apart, and using deterministic tractography, structural networks were generated for each participant to identify streamlines that connect cortical and sub-cortical brain regions. We observed that increased white matter connectivity linking early visual (but not motor) regions was associated with a faster learning rate. Moreover, we observed that the strength of multi-edge paths between motor and visual modules was also correlated with learning rate, supporting the role of polysynaptic connections in successful skill acquisition. Our results demonstrate that the combination of diffusion imaging and tractography-based connectivity can be used to predict future individual differences in learning capacity, particularly when combined with methods from network science and graph theory.

Preprint · 2016 · arXiv

The Energy Landscape Underpinning Module Dynamics in the Human Brain Connectome

Human brain dynamics can be profitably viewed through the lens of statistical mechanics, where neurophysiological activity evolves around and between local attractors representing preferred mental states. Many physically-inspired models of these dynamics define the state of the brain based on instantaneous measurements of regional activity. Yet, recent work in network neuroscience has provided initial evidence that the brain might also be well-characterized by time-varying states composed of locally coherent activity or functional modules. Here we study this network-based notion of brain state to understand how functional modules dynamically interact with one another to perform cognitive functions. We estimate the functional relationships between regions of interest (ROIs) by fitting a pair-wise maximum entropy model to each ROI's pattern of allegiance to functional modules. Local minima in this model represent attractor states characterized by specific patterns of modular structure. The clustering of local minima highlights three classes of ROIs with similar patterns of allegiance to community states. Visual, attention, sensorimotor, and subcortical ROIs tend to form a single functional community. The remaining ROIs tend to form a putative executive control community or a putative default mode and salience community. We simulate the brain's dynamic transitions between these community states using a Markov Chain Monte Carlo random walk. We observe that simulated transition probabilities between basins resemble empirically observed transitions between community allegiance states in resting state fMRI data. These results collectively offer a view of the brain as a dynamical system that transitions between basins of attraction characterized by coherent activity in small groups of brain regions, and that the strength of these attractors depends on the cognitive computations being performed.
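To make the notion of attractor states concrete, the sketch below uses a generic pairwise maximum entropy (Ising-type) energy over binary states and finds its local minima by brute force: states whose energy is strictly lower than that of every state one bit-flip away. The binary 0/1 encoding, the function names, and the toy parameters `h` and `J` are all assumptions for illustration; the abstract does not specify how module allegiance is encoded, and no fitted values are reproduced here.

```python
from itertools import product


def energy(state, h, J):
    """Energy of a binary state under a pairwise maximum entropy model.

    E(state) = -sum_i h[i] * state[i] - sum_{i<j} J[i][j] * state[i] * state[j]
    Lower energy corresponds to higher model probability.
    """
    n = len(state)
    field_term = sum(h[i] * state[i] for i in range(n))
    pair_term = sum(J[i][j] * state[i] * state[j]
                    for i in range(n) for j in range(i + 1, n))
    return -field_term - pair_term


def local_minima(h, J):
    """Enumerate all states and keep those lower in energy than every
    single-flip neighbor. Exhaustive, so only practical for small n."""
    n = len(h)
    minima = []
    for state in product((0, 1), repeat=n):
        e = energy(state, h, J)
        neighbors = (state[:i] + (1 - state[i],) + state[i + 1:]
                     for i in range(n))
        if all(energy(nb, h, J) > e for nb in neighbors):
            minima.append(state)
    return minima


# Hypothetical toy parameters: two units that each prefer to be off
# but are strongly coupled, which yields two basins of attraction.
h = [-1.0, -1.0]
J = [[0.0, 3.0], [3.0, 0.0]]
minima = local_minima(h, J)
```

A random walk between such minima, as in the Markov Chain Monte Carlo simulation mentioned in the abstract, would be layered on top of the same `energy` function; it is left out here to keep the sketch short.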

Institution

Affiliation not imported yet

This author record came from a source that does not expose affiliation metadata. Once the author claims the profile or we enrich the record from another provider, this section will link to the concrete institution.

Topic footprint