Researcher profile

Yuming Huang

Yuming Huang contributes to research discovery and scholarly infrastructure.

ResearcherAffiliation not importedOpen to collaborate

Trust snapshot

Quick read

Trust 19 - UnverifiedVerification L1Unclaimed author
5works
0followers
6topics
4close collaborators

Actions

Decide how to stay connected

Follow researcher0

Identity and collaboration

How to connect with this researcher

Claiming links this public author record to a researcher profile and unlocks direct collaboration workflows.

Log in to claim

Direct collaboration

Open a focused conversation when the fit is right

Claim this author entity first to unlock direct invitations.

Research graph

See the researcher in context

Open full explorer

Inspect adjacent work, topics, institutions and collaborators without jumping out to a separate graph page.

Building this graph slice

BZPEER is loading the nearby papers, people, topics and institutions for this page.

Published work

5 published item(s)

preprint2026arXiv

High-Q AlN microresonators for nonlinear near-infrared and near-visible photonics

High Q-factors of microresonators are crucial for nonlinear integrated photonics, as many nonlinear dynamics have quadratic or even cubic dependence on Q-factors. The unique material properties make AlN microresonators invaluable for microcomb generation, Raman lasing and visible integrated photonics. However, the loss level of AlN falls behind other integrated platforms. By optimizing the fabrication, we demonstrate record Q-factors of 5.4$\times$10$^6$ and 2.2$\times$10$^6$ for AlN microresonators in the near-infrared and near-visible, respectively. Polarized-mode-interaction was used to create anomalous dispersion to support bright AlN Dirac solitons. Measurement of polarization-dependent spectra reveals the polarization hybridization of the Dirac soliton. In a microresonator with normal dispersion, Raman assisted four-wave-mixing (RFWM) was observed to initiate platicon formation, adding an approach to generate normal dispersion microcombs. A design of width-varying waveguides was used to ensure both efficient coupling and high Q-factor for racetrack microresonators at 780 nm. The microresonator was pumped to generate near-visble Raman laser at 820 nm with a fundamental linewidth narrower than 220 Hz. Our work unlocks new opportunities for integrated AlN photonics by improving Q-factors and uncovering nonlinear dynamics in AlN microresonators.

preprint2026arXiv

Reduced-order Neural Modeling with Differentiable Simulation for High-Detail Tactile Perception

Tactile perception is key to dexterous manipulation, yet simulating high-resolution elastomer deformation remains computationally prohibitive. Finite element methods (FEM) deliver high fidelity but demand costly remeshing, while Material Point Methods (MPM) suffer from heavy particle-memory tradeoffs. We propose a {reduced-order neural simulation framework} that couples coarse-grained MPM dynamics with an implicit neural decoder to reconstruct sub-particle tactile details from compact latent states. The framework learns a continuous deformation manifold from paired high- and low-resolution simulations, enabling physically consistent, differentiable inference. Compared to the TacIPC, our method achieves over 65\% faster simulation and {40\% lower memory usage}, while maintaining better geometric fidelity. In tactile rendering and 3D surface reconstruction, our methods further improve accuracy by 25\% and produce realistic depth images and surface mesh within a faster inference speed. These results demonstrate that the proposed reduced-order neural model enables high-detail, physically grounded tactile simulation with substantial efficiency gains for robotic interaction and optimization.

preprint2022arXiv

Cost-Effective Algorithms for Average-Case Interactive Graph Search

Interactive graph search (IGS) uses human intelligence to locate the target node in hierarchy, which can be applied for image classification, product categorization and searching a database. Specifically, IGS aims to categorize an object from a given category hierarchy via several rounds of interactive queries. In each round of query, the search algorithm picks a category and receives a boolean answer on whether the object is under the chosen category. The main efficiency goal asks for the minimum number of queries to identify the correct hierarchical category for the object. In this paper, we study the average-case interactive graph search (AIGS) problem that aims to minimize the expected number of queries when the objects follow a probability distribution. We propose a greedy search policy that splits the candidate categories as evenly as possible with respect to the probability weights, which offers an approximation guarantee of $O(\log n)$ for AIGS given the category hierarchy is a directed acyclic graph (DAG), where $n$ is the total number of categories. Meanwhile, if the input hierarchy is a tree, we show that a constant approximation factor of $(1+\sqrt{5})/2$ can be achieved. Furthermore, we present efficient implementations of the greedy policy, namely GreedyTree and GreedyDAG, that can quickly categorize the object in practice. Extensive experiments in real-world scenarios are carried out to demonstrate the superiority of our proposed methods.

preprint2022arXiv

INTERN: A New Learning Paradigm Towards General Vision

Enormous waves of technological innovations over the past several years, marked by the advances in AI technologies, are profoundly reshaping the industry and the society. However, down the road, a key challenge awaits us, that is, our capability of meeting rapidly-growing scenario-specific demands is severely limited by the cost of acquiring a commensurate amount of training data. This difficult situation is in essence due to limitations of the mainstream learning paradigm: we need to train a new model for each new scenario, based on a large quantity of well-annotated data and commonly from scratch. In tackling this fundamental problem, we move beyond and develop a new learning paradigm named INTERN. By learning with supervisory signals from multiple sources in multiple stages, the model being trained will develop strong generalizability. We evaluate our model on 26 well-known datasets that cover four categories of tasks in computer vision. In most cases, our models, adapted with only 10% of the training data in the target domain, outperform the counterparts trained with the full set of data, often by a significant margin. This is an important step towards a promising prospect where such a model with general vision capability can dramatically reduce our reliance on data, thus expediting the adoption of AI technologies. Furthermore, revolving around our new paradigm, we also introduce a new data system, a new architecture, and a new benchmark, which, together, form a general vision ecosystem to support its future development in an open and inclusive manner. See project website at https://opengvlab.shlab.org.cn .

preprint2022arXiv

MaskRange: A Mask-classification Model for Range-view based LiDAR Segmentation

Range-view based LiDAR segmentation methods are attractive for practical applications due to their direct inheritance from efficient 2D CNN architectures. In literature, most range-view based methods follow the per-pixel classification paradigm. Recently, in the image segmentation domain, another paradigm formulates segmentation as a mask-classification problem and has achieved remarkable performance. This raises an interesting question: can the mask-classification paradigm benefit the range-view based LiDAR segmentation and achieve better performance than the counterpart per-pixel paradigm? To answer this question, we propose a unified mask-classification model, MaskRange, for the range-view based LiDAR semantic and panoptic segmentation. Along with the new paradigm, we also propose a novel data augmentation method to deal with overfitting, context-reliance, and class-imbalance problems. Extensive experiments are conducted on the SemanticKITTI benchmark. Among all published range-view based methods, our MaskRange achieves state-of-the-art performance with $66.10$ mIoU on semantic segmentation and promising results with $53.10$ PQ on panoptic segmentation with high efficiency. Our code will be released.