Researcher profile

Tong Gao

Tong Gao contributes to research discovery and scholarly infrastructure.

Researcher | Affiliation not imported | Open to collaborate

Trust snapshot

Trust 21 - Emerging | Verification L1 | Unclaimed author
6 works
0 followers
7 topics
4 close collaborators

Published work

6 published items

preprint | 2026 | arXiv

OpenCompass: A Universal Evaluation Platform for Large Language Models

In recent years, the field of artificial intelligence has undergone a paradigm shift from task-specific small-scale models to general-purpose large language models (LLMs). With the rapid iteration of LLMs, objective, quantitative, and comprehensive evaluation of their capabilities has become a critical link in advancing technological development. Mainstream evaluation methods based on static benchmark datasets currently face challenges such as the diversity of task types, inconsistent evaluation criteria, and fragmentation of data and processing workflows, making it difficult to conduct cross-domain and large-scale model evaluation efficiently. To address these issues, this paper proposes and open-sources OpenCompass, a one-stop, scalable, general-purpose LLM evaluation platform that supports high concurrency. Adhering to the design philosophy of modularization and component decoupling, the platform offers three core advantages: high compatibility, flexibility, and high concurrency. The core architecture of OpenCompass comprises five key components: the Configuration System, Task Partitioning Module, Execution and Scheduling Module, Task Execution Unit, and Result Visualization Module. Its workflow provides rule-based, LLM-as-a-Judge, and cascaded evaluators to adapt to the requirements of different task scenarios. Supporting mainstream benchmark datasets across multiple domains, including knowledge, reasoning, computation, science, language, and code, the platform offers a unified and efficient LLM evaluation tool for both academia and industry, facilitating the accurate identification of strengths and weaknesses of LLMs as well as their subsequent optimization.
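
To illustrate the three evaluator styles named in the abstract, here is a minimal Python sketch of a cascaded evaluator. It is not the OpenCompass API: the function names, the return format, and the cascade order (rule-based check first, judge only on a miss) are assumptions made for illustration, and `toy_judge` is a hypothetical stand-in for a real LLM-as-a-Judge call.

```python
import re
from typing import Callable


def rule_based_eval(prediction: str, reference: str) -> bool:
    """Rule-based check: exact match after normalising case and whitespace."""
    def norm(s: str) -> str:
        return re.sub(r"\s+", " ", s.strip().lower())
    return norm(prediction) == norm(reference)


def toy_judge(prediction: str, reference: str) -> bool:
    """Hypothetical stand-in for an LLM judge (no model is called here).

    Accepts a prediction if the reference appears in it as a whole word.
    """
    pattern = rf"\b{re.escape(reference.lower())}\b"
    return re.search(pattern, prediction.lower()) is not None


def cascaded_eval(prediction: str, reference: str,
                  judge: Callable[[str, str], bool]) -> dict:
    """Assumed cascade: try the cheap rule first, fall back to the judge."""
    if rule_based_eval(prediction, reference):
        return {"correct": True, "stage": "rule"}
    return {"correct": judge(prediction, reference), "stage": "judge"}


exact = cascaded_eval("  Paris ", "paris", toy_judge)
verbose = cascaded_eval("The answer is Paris.", "Paris", toy_judge)
wrong = cascaded_eval("London", "Paris", toy_judge)
```

In this sketch the expensive judge is consulted only when the cheap rule cannot confirm the answer, which is one plausible motivation for a cascade.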

preprint | 2022 | arXiv

Hydrodynamic instabilities and collective dynamics in activity-balanced pusher-puller mixtures

Microorganisms living in microfluidic environments often form multi-species swarms, where they can leverage collective motions to achieve enhanced transport and spreading. Nevertheless, there is a general lack of physical understanding of the origins of the multiscale unstable dynamics observed within these systems. Here, we build a computational model to study binary suspensions of rear- and front-actuated microswimmers, the so-called "pusher" and "puller" particles, respectively, that have different populations and swimming speeds. We perform direct particle simulations to reveal that collective system dynamics are possible even in the scenario of an "activity-balanced" mixture, which produces near-zero mean extra stress. We first construct a continuum kinetic model to describe the initial transient period when the system is near uniform isotropy and then perform linear stability analysis to reveal the system's finite-wavelength hydrodynamic instabilities, in contrast with the long-wavelength instabilities of pure pusher/puller suspensions. Then, we carry out slender-body discrete particle simulations to resolve both the short-time instabilities and the long-time dynamics, which feature non-trivial density fluctuations and spatially correlated motions, distinct from those of single-species suspensions.

preprint | 2022 | arXiv

The planar thermal Hall conductivity in the Kitaev magnet α-RuCl3

We report detailed measurements of the Onsager-like planar thermal Hall conductivity $\kappa_{xy}$ in $\alpha$-RuCl$_3$, a spin-liquid candidate of topical interest. With the thermal current ${\bf J}_{\rm Q}$ and magnetic field ${\bf B} \parallel {\bf a}$ (zigzag axis), the observed $\kappa_{xy}/T$ varies strongly with temperature $T$ (1-10 K). The results are well described by bosonic edge excitations which evolve to topological magnons at large $B$. Fits to $\kappa_{xy}/T$ yield a Chern number $\sim 1$ and a band energy $\omega_1 \sim 1$ meV, in agreement with sharp modes seen in electron spin-resonance experiments. The bosonic character is incompatible with half-quantization of $\kappa_{xy}/T$.

preprint | 2022 | arXiv

Towards Automated Error Analysis: Learning to Characterize Errors

Characterizing the patterns of errors that a system makes helps researchers focus future development on increasing its accuracy and robustness. We propose a novel form of "meta learning" that automatically learns interpretable rules that characterize the types of errors that a system makes, and demonstrate these rules' ability to help understand and improve two NLP systems. Our approach works by collecting error cases on validation data, extracting meta-features describing these samples, and finally learning rules that characterize errors using these features. We apply our approach to ViLBERT, for Visual Question Answering, and RoBERTa, for Common Sense Question Answering. Our system learns interpretable rules that provide insights into systemic errors these systems make on the given tasks. Using these insights, we are also able to "close the loop" and modestly improve the performance of these systems.
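
The three steps described in this abstract (collect error cases, extract meta-features, learn rules) can be sketched in a few lines of Python. This is a deliberately simple stand-in, not the authors' method: it only scores one-condition rules of the form "feature == value" by their error rate. The function name `learn_error_rules`, the meta-feature names, and the toy validation samples are all hypothetical.

```python
from collections import defaultdict


def learn_error_rules(samples, min_support=2):
    """Rank one-condition rules by how strongly they predict an error.

    samples: list of (meta_features: dict, is_error: bool) pairs collected
    on validation data. Returns tuples (feature, value, error_rate, support)
    sorted by error rate, then support, both descending.
    """
    counts = defaultdict(lambda: [0, 0])  # (feature, value) -> [errors, total]
    for features, is_error in samples:
        for key, value in features.items():
            counts[(key, value)][1] += 1
            if is_error:
                counts[(key, value)][0] += 1
    rules = [
        (key, value, errors / total, total)
        for (key, value), (errors, total) in counts.items()
        if total >= min_support  # ignore rules backed by too few samples
    ]
    return sorted(rules, key=lambda r: (-r[2], -r[3]))


# Hypothetical validation samples: meta-features plus whether the model erred.
samples = [
    ({"question_type": "count", "long_question": False}, True),
    ({"question_type": "count", "long_question": True}, True),
    ({"question_type": "yes_no", "long_question": False}, False),
    ({"question_type": "yes_no", "long_question": True}, False),
    ({"question_type": "count", "long_question": False}, False),
    ({"question_type": "yes_no", "long_question": True}, False),
]

rules = learn_error_rules(samples)
top_feature, top_value, top_rate, top_support = rules[0]
```

On this toy data the top-ranked rule is `question_type == "count"`, which covers three samples, two of them errors. A real system would likely use richer meta-features and multi-condition rules.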

preprint | 2019 | arXiv

High mobility in a van der Waals layered antiferromagnetic metal

Magnetic van der Waals (vdW) materials have been heavily pursued for fundamental physics as well as for device design. Despite the rapid advances, so far magnetic vdW materials are mainly insulating or semiconducting, and none of them possesses a high electronic mobility - a property that is rare in layered vdW materials in general. The realization of a magnetic high-mobility vdW material would open the possibility for novel magnetic twistronic or spintronic devices. Here we report very high carrier mobility in the layered vdW antiferromagnet GdTe$_3$. The electron mobility is beyond 60,000 cm$^2$ V$^{-1}$ s$^{-1}$, which is the highest among all known layered magnetic materials, to the best of our knowledge. Among all known vdW materials, the mobility of bulk GdTe$_3$ is comparable to that of black phosphorus, and is only surpassed by graphite. By mechanical exfoliation, we further demonstrate that GdTe$_3$ can be exfoliated to ultrathin flakes of three monolayers, and that the magnetic order and relatively high mobility are retained in approximately 20-nm-thin flakes.

preprint | 2019 | arXiv

Weak-field induced nonmagnetic state in a Co-based honeycomb

Layered honeycomb magnets are of interest as potential realizations of the Kitaev quantum spin liquid (KQSL), a quantum state with long-range spin entanglement and an exactly solvable Hamiltonian. Conventional magnetically ordered states are present for all currently known candidate materials, however, because non-Kitaev terms in the Hamiltonians obscure the Kitaev physics. Current experimental studies of the KQSL are focused on 4d- or 5d-transition-metal-based honeycombs, in which strong spin-orbit coupling can be expected, yielding Kitaev interactions that dominate in an applied magnetic field. In contrast, for 3d-based layered honeycomb magnets, spin-orbit coupling is weak and thus Kitaev physics should be substantially less accessible. Here we report our studies on BaCo$_2$(AsO$_4$)$_2$, for which we find that the magnetic order associated with the non-Kitaev interactions can be fully suppressed by a relatively low magnetic field, yielding a non-magnetic material and implying the presence of strong magnetic frustration and weak non-Kitaev interactions.