Researcher profile

Ling Tang

Ling Tang contributes to research discovery and scholarly infrastructure.

Researcher · Affiliation not imported · Open to collaborate

Trust snapshot

Quick read

Trust 21 - Emerging · Verification L1 · Unclaimed author
6 works
0 followers
6 topics
4 close collaborators

Published work

6 published items

preprint · 2026 · arXiv

Attributing Emergence in Million-Agent Systems

Large language models (LLMs) can simulate human-like reasoning and decision-making in individual agents. LLM-powered multi-agent systems (MAS) combine such agents to simulate population-scale social phenomena such as polarization, information cascades, and market panics. Such studies require attributing macro emergence to individual agents, but existing axiomatic methods scale combinatorially in $N$ and have been confined to $N \lesssim 10^3$, while the phenomena they explain occur at $N \geq 10^6$. We address this gap by adapting Aumann--Shapley path-integral attribution to LLM-powered MAS at million-agent scale; the resulting method satisfies all four axioms and runs four to five orders of magnitude faster than sampled Shapley on the same hardware. We use this method to test the scale gap empirically: across 14 days of public Bluesky data ($1{,}671{,}587$ active users), we compute the attribution both at full scale and on the visibility-biased $N = 10^2$ convenience sample used by small-scale studies, and the two disagree structurally. At full scale the long tail and middle tier jointly carry the majority of the attribution; the biased small panel attributes almost everything to a few high-follower accounts. We then prove that under any nonlinear macro indicator the disagreement cannot be reduced by post-hoc rescaling: an Attribution Scaling Bias theorem shows that no global rescaling factor can reconcile small-scale and full-scale attribution. Full-scale attribution is therefore not a methodological choice but a theoretical requirement for any nonlinear macro indicator.
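
The abstract does not spell out the attribution formula, so the following is only a minimal sketch of the textbook Aumann-Shapley path integral, $\phi_i = x_i \int_0^1 \partial_i f(t x)\,dt$, with a zero baseline and a midpoint-rule approximation. The toy indicator, the `WEIGHTS` values, and every function name are hypothetical illustrations, not the authors' implementation.

```python
from typing import Callable, List

Vector = List[float]

# Hypothetical per-agent weights for a toy nonlinear macro indicator.
WEIGHTS: Vector = [0.5, 1.0, 2.0]


def indicator(x: Vector) -> float:
    """Toy nonlinear macro indicator f(x) = (sum_i w_i * x_i)^2 (an assumption)."""
    s = sum(w * xi for w, xi in zip(WEIGHTS, x))
    return s * s


def indicator_grad(x: Vector) -> Vector:
    """Analytic gradient of the toy indicator: d f / d x_i = 2 * (w . x) * w_i."""
    s = sum(w * xi for w, xi in zip(WEIGHTS, x))
    return [2.0 * s * w for w in WEIGHTS]


def aumann_shapley(grad: Callable[[Vector], Vector],
                   x: Vector, steps: int = 100) -> Vector:
    """Approximate phi_i = x_i * integral_0^1 (d f / d x_i)(t * x) dt.

    The integral runs along the straight path from the zero baseline to x
    and is approximated with the midpoint rule using `steps` sub-intervals.
    """
    acc = [0.0] * len(x)
    for k in range(steps):
        t = (k + 0.5) / steps          # midpoint of the k-th sub-interval
        g = grad([t * xi for xi in x])
        for i, gi in enumerate(g):
            acc[i] += gi
    return [xi * a / steps for xi, a in zip(x, acc)]
```

For `x = [1.0, 2.0, 3.0]` the attributions are approximately `[4.25, 17.0, 51.0]`, which sum to `indicator(x) = 72.25`; this is the efficiency property that axiomatic attribution methods are expected to satisfy. Each path step costs one gradient evaluation regardless of the number of agents, whereas exact Shapley values enumerate coalitions.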

preprint · 2026 · arXiv

What Do EEG Foundation Models Capture from Human Brain Signals?

Clinical electroencephalogram (EEG) analysis rests on a hand-crafted feature catalog refined over decades, e.g., band power, connectivity, complexity, and more. Modern EEG foundation models bypass this catalog, learn directly from raw signals via self-supervised pretraining, and match or outperform feature-engineered baselines on most clinical benchmarks. Whether the two representations align is an open question, which we decompose into three sub-questions: what does the model learn, what does the model use, and how much can be explained. We answer them with layer-wise ridge probing, LEACE-style cross-covariance subspace erasure, and a transparent classifier benchmarked against a random-feature baseline. The audit covers three foundation models (CSBrain, CBraMod, LaBraM), five clinical tasks (MDD, Stress, ISRUC-Sleep, TUSL, Siena), and a 6-family, 63-feature lexicon. Of the $945$ (model, task, feature) units, $648$ ($68.6\%$) are representation-causal (RC) and $199$ ($21.1\%$) are encoded-only. Across tasks, $50$ features qualify as universal candidates with strong support (all three architectures RC) in two or more tasks. Frequency-domain features dominate, but the other five families each contribute substantial causal mass. Confirmed features recover, on average, $79.3\%$ of the foundation model's advantage over the random baseline, with a clean task gradient (MDD $\approx 0.99$ down to Stress $\approx 0.56$): tasks near ceiling are almost fully recovered by the lexicon, while harder tasks leave a non-trivial residual that pinpoints a concrete target for future concept discovery.

preprint · 2022 · arXiv

A deep machine learning potential for atomistic simulation of Fe-Si-O systems under Earth's outer core conditions

Using artificial neural-network machine learning (ANN-ML) to generate interatomic potentials has been demonstrated to be a promising approach to the long-standing trade-off between accuracy and efficiency in molecular dynamics (MD) simulations. Here, taking the Fe-Si-O system as a prototype, we show that accurate and transferable ANN-ML potentials can be developed for reliable MD simulations of materials at the high-pressure and high-temperature conditions of the Earth's outer core. The ANN-ML potential for the Fe-Si-O system is trained by fitting to the energies and forces of related binary and ternary liquid structures at high pressures and temperatures, obtained from first-principles calculations based on density functional theory (DFT). We show that the generated ANN-ML potential describes well the structure and dynamics of the liquid phases of this complex system. The efficient ANN-ML potential with DFT accuracy provides a promising scheme for accurate atomistic simulations of the structure and dynamics of the complex Fe-Si-O system in the Earth's outer core.

preprint · 2022 · arXiv

Batch Normalization Is Blind to the First and Second Derivatives of the Loss

In this paper, we theoretically analyze the effects of the batch normalization (BN) operation on the back-propagation of the first and second derivatives of the loss. Using a Taylor series expansion of the loss function, we prove that the BN operation blocks the influence of the first-order term and most of the influence of the second-order term of the loss. We also find that this problem is caused by the standardization phase of the BN operation. Experimental results verify our theoretical conclusions, and we find that the BN operation significantly affects feature representations in specific tasks where the losses of different samples share similar analytic formulas.
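
One way to build intuition for this claim, under assumptions that go beyond the abstract, is a small numerical check. Suppose every sample shares the same Taylor coefficients, a gradient `g` and a Hessian diagonal `h` (both hypothetical here), and that standardization uses the biased batch variance with no epsilon term. The sketch below is not the paper's proof; it only shows that the batch sums of the first-order term and of the diagonal second-order term then do not depend on the input batch.

```python
import math
from typing import List

Matrix = List[List[float]]


def standardize(batch: Matrix) -> Matrix:
    """Standardize each feature column to zero mean and unit (biased) variance.

    Assumes every column has non-zero variance; the small epsilon used in
    practical BN layers is omitted to keep the identities exact.
    """
    n, d = len(batch), len(batch[0])
    out = [[0.0] * d for _ in range(n)]
    for j in range(d):
        col = [row[j] for row in batch]
        mean = sum(col) / n
        std = math.sqrt(sum((v - mean) ** 2 for v in col) / n)
        for i in range(n):
            out[i][j] = (batch[i][j] - mean) / std
    return out


def first_order_sum(y: Matrix, g: List[float]) -> float:
    """Batch sum of the first-order Taylor term, sum_i g . y_i."""
    return sum(gj * yij for row in y for gj, yij in zip(g, row))


def diagonal_second_order_sum(y: Matrix, h: List[float]) -> float:
    """Batch sum of the diagonal second-order term, sum_i sum_j h_j * y_ij^2."""
    return sum(hj * yij * yij for row in y for hj, yij in zip(h, row))
```

Because each standardized column sums to zero and has unit variance, `first_order_sum` is zero and `diagonal_second_order_sum` equals the batch size times `sum(h)` for any input batch, up to floating-point error. Terms that are constant in the input cannot produce a gradient with respect to the features before standardization. Off-diagonal second-order terms are not constant in general, consistent with the abstract's wording that most, not all, of the second-order influence is blocked.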

preprint · 2022 · arXiv

Greedy randomized sampling nonlinear Kaczmarz methods

The nonlinear Kaczmarz method was recently proposed for solving systems of nonlinear equations. In this paper, we first discuss two greedy selection rules, i.e., the maximum residual and maximum distance rules, for the nonlinear Kaczmarz iteration. Then, based on them, two kinds of greedy randomized sampling methods are presented. Further, we also devise four corresponding greedy randomized block methods, i.e., the multiple samples-based methods. The linear convergence in expectation of all the proposed methods is proved. Numerical results show that, in some applications including the Brown almost linear function and the generalized linear model, the greedy selection rules give faster convergence rates than the random ones, and the block methods outperform the single sample-based ones.
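
As a rough illustration of the maximum residual rule, the sketch below applies the standard nonlinear Kaczmarz update $x \leftarrow x - \frac{f_i(x)}{\|\nabla f_i(x)\|^2} \nabla f_i(x)$ to the equation with the largest $|f_i(x)|$. This is the purely deterministic greedy rule, not the randomized sampling or block variants the paper proposes, and the function name and test system are hypothetical.

```python
from typing import Callable, List, Sequence, Tuple

Vector = List[float]
Func = Callable[[Vector], float]
Grad = Callable[[Vector], Vector]


def nonlinear_kaczmarz_max_residual(funcs: Sequence[Func],
                                    grads: Sequence[Grad],
                                    x0: Vector,
                                    tol: float = 1e-10,
                                    max_iter: int = 1000) -> Tuple[Vector, int]:
    """Solve f_i(x) = 0 for all i with the greedy maximum residual rule.

    Returns the final iterate and the number of updates performed.
    """
    x = list(x0)
    for it in range(max_iter):
        residuals = [f(x) for f in funcs]
        # Greedy selection: the equation with the largest absolute residual.
        i = max(range(len(residuals)), key=lambda k: abs(residuals[k]))
        if abs(residuals[i]) < tol:
            return x, it
        g = grads[i](x)
        norm_sq = sum(c * c for c in g)
        if norm_sq == 0.0:
            return x, it  # no usable direction for the selected equation
        step = residuals[i] / norm_sq
        x = [xj - step * gj for xj, gj in zip(x, g)]
    return x, max_iter


# Toy system (an assumption for illustration): a circle of radius 2 and the
# line x = y, which intersect at (sqrt(2), sqrt(2)) in the positive quadrant.
FUNCS = [lambda v: v[0] ** 2 + v[1] ** 2 - 4.0, lambda v: v[0] - v[1]]
GRADS = [lambda v: [2.0 * v[0], 2.0 * v[1]], lambda v: [1.0, -1.0]]
```

Calling `nonlinear_kaczmarz_max_residual(FUNCS, GRADS, [2.0, 0.5])` drives both residuals below the tolerance. The maximum distance rule mentioned in the abstract would instead select the index maximizing $|f_i(x)| / \|\nabla f_i(x)\|$, which only changes the `key` used in the selection step.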

preprint · 2022 · arXiv

Sketch-and-project methods for tensor linear systems

For tensor linear systems with respect to the popular t-product, we first present the sketch-and-project method and its adaptive variants. Their Fourier domain versions are also investigated. Then, considering that the existing sketching tensors and sampling schemes have some limitations, we propose two improved strategies. Convergence analyses for all of these methods are provided. We compare our methods with existing ones on synthetic and real data. Numerical results show that they perform well in terms of the number of iterations and running time.