Researcher profile

Kunming Zhang

Kunming Zhang contributes to research discovery and scholarly infrastructure.

Researcher
Affiliation not imported
Open to collaborate

Trust snapshot

Quick read

Trust 13 - Unverified
Verification L1
Unclaimed author
2 works
0 followers
4 topics
4 close collaborators

Published work

2 published items

Preprint, 2026, arXiv

BandPilot: Towards Performance- and Contention-Aware GPU Dispatching in AI Clusters

Modern multi-tenant AI clusters are increasingly communication-bound, driven by high-volume and multi-round GPU-to-GPU collective communication. Consequently, the GPU dispatcher's choice of a physical GPU subset for each tenant largely determines the job's effective collective bandwidth and thus its performance ceiling. Existing dispatchers predominantly rely on static, topology-aware heuristics that prioritize GPU resource compactness, assuming that minimizing physical distance maximizes communication bandwidth. However, we reveal that this assumption often fails due to complex system-level bottlenecks, such as non-linear NIC saturation and inter-node link heterogeneity. This paper presents BandPilot, a performance- and contention-aware GPU dispatching primitive that optimizes effective collective bandwidth for multi-tenant AI clusters. Specifically, BandPilot learns a data-efficient bandwidth model from sparse NCCL measurements via a hierarchical design. Guided by the model, a fast hybrid search combines an equilibrium-driven constructor with a pruned elimination search to navigate the combinatorial allocation space in real time. To account for multi-tenant interference, BandPilot virtually merges a candidate allocation with co-located cross-host jobs to conservatively estimate shared bottleneck capacity and predict contention-degraded bandwidth. Across a 32-GPU H100 cluster and heterogeneous simulations, BandPilot achieves 92-97% bandwidth efficiency relative to the best-found reference, improving average efficiency by 20-40% over topology-compactness heuristics.

Preprint, 2025, arXiv

Atomic-scale spin sensing of a 2D $d$-wave altermagnet via helical tunneling

Altermagnetism simultaneously possesses nonrelativistic spin responses and zero net magnetization, thus combining advantages of ferromagnetism and antiferromagnetism. This superiority originates from its unique dual feature, i.e., opposite-magnetic sublattices in real space and alternating spin polarization in momentum space enforced by the same crystal symmetry. Therefore, the determination of an altermagnetic order and its unique spin response inherently necessitates atomic-scale spin-resolved measurements in real and momentum spaces, an experimental milestone yet to be achieved. Here, by using the helical edge (hinge) modes of a higher-order topological insulator as the spin sensor, we realize spin-resolved scanning tunneling microscopy, which enables us to pin down the dual-space feature of a layered $d$-wave altermagnet, KV$_2$Se$_2$O. In real space, atomic-registered mapping demonstrates the checkerboard antiferromagnetic order together with density-wave lattice modulation, and in momentum space, spin-resolved spectroscopic imaging provides a direct visualization of $d$-wave spin splitting of the band structure. Critically, using this new topology-guaranteed spin filter, we directly reveal the unidirectional, spin-polarized quasiparticle excitations originating from the crystal symmetry-paired X and Y valleys around opposite magnetic sublattices simultaneously, the unique spin response for $d$-wave altermagnetism. Our experiments establish a solid basis for the exploration and utilization of altermagnetism in layered materials and further facilitate access to atomic-scale spin sensing and manipulation of 2D quantum materials.