Researcher profile

Yingyi Huang

Yingyi Huang contributes to research discovery and scholarly infrastructure.

Researcher | Affiliation not imported | Open to collaborate

Trust snapshot

Quick read

Trust 13 - Unverified | Verification L1 | Unclaimed author
2 works
0 followers
3 topics
4 close collaborators

Published work

2 published items

Preprint | 2026 | arXiv

FlashInfer-Bench: Building the Virtuous Cycle for AI-driven LLM Systems

Recent advances show that large language models (LLMs) can act as autonomous agents capable of generating GPU kernels, but integrating these AI-generated kernels into real-world inference systems remains challenging. FlashInfer-Bench addresses this gap by establishing a standardized, closed-loop framework that connects kernel generation, benchmarking, and deployment. At its core, FlashInfer Trace provides a unified schema describing kernel definitions, workloads, implementations, and evaluations, enabling consistent communication between agents and systems. Built on real serving traces, FlashInfer-Bench includes a curated dataset, a robust correctness- and performance-aware benchmarking framework, a public leaderboard to track LLM agents' GPU programming capabilities, and a dynamic substitution mechanism (apply()) that seamlessly injects the best-performing kernels into production LLM engines such as SGLang and vLLM. Using FlashInfer-Bench, we further evaluate the performance and limitations of LLM agents, compare the trade-offs among different GPU programming languages, and provide insights for future agent design. FlashInfer-Bench thus establishes a practical, reproducible pathway for continuously improving AI-generated kernels and deploying them into large-scale LLM inference.

Preprint | 2019 | arXiv

Scalable Majorana vortex modes in iron-based superconductors

A vortex in an s-wave superconductor with a surface Dirac cone can trap a Majorana bound state with zero energy leading to a zero-bias peak (ZBP) of tunneling conductance. The iron-based superconductor FeTe$_x$Se$_{1-x}$ is one of the material candidates hosting these Majorana vortex modes. It has been observed by recent scanning tunneling spectroscopy measurement that the fraction of vortex cores possessing ZBPs decreases with increasing magnetic field on the surface of this iron-based superconductor. We construct a three-dimensional tight-binding model simulating the physics of over a hundred Majorana vortex modes in FeTe$_x$Se$_{1-x}$ with realistic physical parameters. Our simulation shows that the Majorana hybridization and disordered vortex distribution can explain the decreasing fraction of the ZBPs observed in the experiment. Furthermore, we find the statistics of the energy peaks off zero energy in our simulation with the Majorana physics in agreement with the analyzed peak statistics in the vortex cores from the experiment. This agreement and the explanation of the decreasing ZBP fraction lead to an important indication of scalable Majorana vortex modes in the iron-based superconductor. Thus, FeTe$_x$Se$_{1-x}$ can be one promising platform possessing scalable Majorana qubits for quantum computing. In addition, we further show the interplay of the ZBP presence and the vortex locations qualitatively agrees with our additional experimental observation and predict the universal spin signature of the hybridized multiple Majorana vortex modes.