Source author record

Yuanzhe Li

Yuanzhe Li appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher · Unclaimed source record

Artificial Intelligence · Distributed, Parallel, and Cluster Computing · Machine Learning · physics.optics

Catalog footprint

What is connected

2 works

4 topics

4 close collaborators

Preprint · 2026 · arXiv

AIConfigurator: Lightning-Fast Configuration Optimization for Multi-Framework LLM Serving

Optimizing Large Language Model (LLM) inference in production systems is increasingly difficult due to dynamic workloads, stringent latency/throughput targets, and a rapidly expanding configuration space. This complexity spans not only distributed parallelism strategies (tensor/pipeline/expert) but also intricate framework-specific runtime parameters, such as CUDA graph enablement, the available KV-cache memory fraction, and maximum token capacity, all of which drastically affect performance. The diversity of modern inference frameworks (e.g., TRT-LLM, vLLM, SGLang), each employing distinct kernels and execution policies, makes manual tuning both framework-specific and computationally prohibitive. We present AIConfigurator, a unified performance-modeling system that enables rapid, framework-agnostic inference configuration search without requiring GPU-based profiling. AIConfigurator combines (1) a methodology that decomposes inference into analytically modelable primitives (GEMM, attention, communication, and memory operations) while capturing framework-specific scheduling dynamics; (2) a calibrated kernel-level performance database for these primitives across a wide range of hardware platforms and popular open-weights models (GPT-OSS, Qwen, DeepSeek, Llama, Mistral); and (3) an abstraction layer that automatically resolves optimal launch parameters for the target backend, seamlessly integrating into production-grade orchestration systems. Evaluation on production LLM serving workloads demonstrates that AIConfigurator identifies superior serving configurations that improve performance by up to 40% for dense models (e.g., Qwen3-32B) and 50% for MoE architectures (e.g., DeepSeek-V3), while completing searches within 30 seconds on average. This enables the rapid exploration of vast design spaces, from cluster topology down to engine-specific flags.
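To make the idea of searching configurations with an analytical model rather than GPU profiling more concrete, here is a minimal sketch. It is not AIConfigurator's implementation or API: the roofline-style `gemm_time`, the ring all-reduce cost in `allreduce_time`, the two-GEMM layer in `decode_step_time`, and all hardware numbers are assumptions chosen only for illustration.

```python
from dataclasses import dataclass
from itertools import product


@dataclass(frozen=True)
class Hardware:
    # All three figures are hypothetical placeholders, not measured values.
    peak_flops: float      # FLOP/s per GPU
    mem_bandwidth: float   # bytes/s per GPU
    link_bandwidth: float  # bytes/s between GPUs


def gemm_time(m, n, k, hw, bytes_per_elem=2):
    """Roofline estimate: the slower of compute time and memory-traffic time."""
    flops = 2.0 * m * n * k
    traffic = bytes_per_elem * (m * k + k * n + m * n)
    return max(flops / hw.peak_flops, traffic / hw.mem_bandwidth)


def allreduce_time(num_bytes, tp, hw):
    """Ring all-reduce cost model; zero when there is a single GPU."""
    if tp == 1:
        return 0.0
    return 2.0 * (tp - 1) / tp * num_bytes / hw.link_bandwidth


def decode_step_time(batch, tp, hidden, layers, hw):
    """Toy per-step latency: two sharded GEMMs plus one all-reduce per layer."""
    shard = 4 * hidden // tp
    per_layer = (
        gemm_time(batch, shard, hidden, hw)
        + gemm_time(batch, hidden, shard, hw)
        + allreduce_time(2 * batch * hidden, tp, hw)
    )
    return layers * per_layer


def search(hw, hidden, layers, max_step_seconds,
           tps=(1, 2, 4, 8), batches=(1, 8, 32, 128)):
    """Return (tokens/s per GPU, tp, batch, step time) for the best feasible config."""
    best = None
    for tp, batch in product(tps, batches):
        t = decode_step_time(batch, tp, hidden, layers, hw)
        if t > max_step_seconds:
            continue  # violates the latency target
        per_gpu = batch / t / tp
        if best is None or per_gpu > best[0]:
            best = (per_gpu, tp, batch, t)
    return best


if __name__ == "__main__":
    hw = Hardware(peak_flops=1e15, mem_bandwidth=3e12, link_bandwidth=4e11)
    print(search(hw, hidden=4096, layers=32, max_step_seconds=0.05))
```

Because every candidate is scored by closed-form arithmetic, the whole grid is evaluated without touching a GPU. A calibrated kernel database, as the abstract describes, would stand in for the roofline estimate used here.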

Preprint · 2026 · arXiv

Programmable calculus operations in electromagnetic space using space-time-coding metasurface

With the rapid advancement of metasurfaces and the increasing demand for programmable metasurfaces to simplify information systems, wave-based computation using metasurfaces has emerged as an attractive research topic. To facilitate mathematical operations in electromagnetic (EM) space, here we propose a space-time coding metasurface (STCM) system capable of directly performing calculus operations on the spatial energy distributions of EM waves. By exploiting the harmonic characteristics induced by time-varying coding, the responses of meta-atoms at specific harmonics can be flexibly controlled, which enables the metasurface system to address more complex tasks. Owing to its programmability, the STCM can dynamically switch functions in real time to accommodate different calculus tasks. To fully leverage the capability of the STCM, we not only present the space-time coding sequences for differentiation and integration of EM waves, but also develop and numerically simulate space-time coding sequences that can independently and simultaneously implement different calculus operations on the same incident EM waves. To experimentally validate the feasibility of the EM calculus operations, proof-of-concept experiments are conducted using a programmable 2-bit STCM. Good agreement among theory, numerical simulations, and experiments confirms the feasibility of performing calculus operations in EM space and demonstrates the broad application prospects of STCMs in EM wave manipulation, wireless communications, and signal processing.
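The harmonic control mentioned in the abstract can be illustrated with the standard Fourier-series view of a periodic, piecewise-constant time-coding sequence. The sketch below is not taken from the paper: it assumes a unit-amplitude reflection coefficient that takes one phase per time slot, and the function name `harmonic_coefficient` is hypothetical.

```python
import cmath
import math


def harmonic_coefficient(phases, m):
    """Fourier coefficient of harmonic m for one meta-atom.

    `phases` holds the reflection phase (radians) in each of L equal time
    slots of one modulation period; amplitude is assumed to be 1.
    a_m = (1/L) * sinc(pi*m/L) * sum_n exp(j*phase_n) * exp(-j*pi*m*(2n-1)/L)
    """
    L = len(phases)
    x = math.pi * m / L
    sinc = 1.0 if m == 0 else math.sin(x) / x
    total = 0j
    for n, phase in enumerate(phases, start=1):
        gamma = cmath.exp(1j * phase)  # slot reflection coefficient
        total += gamma * cmath.exp(-1j * math.pi * m * (2 * n - 1) / L)
    return total * sinc / L


if __name__ == "__main__":
    # A 1-bit sequence alternating between 0 and pi suppresses the carrier
    # (m = 0) and moves energy into the odd harmonics.
    sequence = [0.0, math.pi]
    for m in (-1, 0, 1):
        print(m, abs(harmonic_coefficient(sequence, m)))
```

Under these assumptions, choosing a different sequence for each meta-atom sets the amplitude and phase seen at a given harmonic, which is the degree of freedom a space-time coding design would exploit.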
