Source author record

Yu Zhong

Yu Zhong appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

ResearcherUnclaimed source record

Computer Vision Artificial Intelligence Computation and Language cond-mat.mtrl-sci eess.SP eess.SY Information Retrieval Machine Learning physics.comp-ph physics.optics Robotics Systems and Control

Catalog footprint

What is connected

9works

12topics

4close collaborators

Actions

Connect this record

Open graph Browse works

Inspect adjacent papers, topics, institutions and collaborators without losing the researcher page.

Building this map preview

BZPEER is loading the nearby papers, people, topics and institutions for this page.

preprint2026arXiv

Masgent: An AI-assisted Materials Simulation Agent

Density functional theory (DFT) and machine learning potentials (MLPs) are essential for predicting and understanding materials properties, yet preparing, executing, and analyzing these simulations typically requires extensive scripting, multi-step procedures, and significant high-performance computing (HPC) expertise. These challenges hinder reproducibility and slow down discovery. Here, we introduce Masgent, an AI-assisted materials simulation agent that unifies structure manipulation, automated VASP input generation, DFT workflow construction and analysis, fast MLP-based simulations, and lightweight machine learning (ML) utilities within a single platform. Powered by large language models (LLMs), Masgent enables researchers to perform complex simulation tasks through natural-language interaction, eliminating most manual scripting and reducing setup time from hours to seconds. By standardizing protocols and integrating advanced simulation and data-driven tools, Masgent lowers the barrier to performing state-of-the-art computational methodologies, enabling faster hypothesis testing, pre-screening, and exploratory research for both new and experienced practitioners.

preprint2026arXiv

The RoboSense Challenge: Sense Anything, Navigate Anywhere, Adapt Across Platforms

Autonomous systems are increasingly deployed in open and dynamic environments -- from city streets to aerial and indoor spaces -- where perception models must remain reliable under sensor noise, environmental variation, and platform shifts. However, even state-of-the-art methods often degrade under unseen conditions, highlighting the need for robust and generalizable robot sensing. The RoboSense 2025 Challenge is designed to advance robustness and adaptability in robot perception across diverse sensing scenarios. It unifies five complementary research tracks spanning language-grounded decision making, socially compliant navigation, sensor configuration generalization, cross-view and cross-modal correspondence, and cross-platform 3D perception. Together, these tasks form a comprehensive benchmark for evaluating real-world sensing reliability under domain shifts, sensor failures, and platform discrepancies. RoboSense 2025 provides standardized datasets, baseline models, and unified evaluation protocols, enabling large-scale and reproducible comparison of robust perception methods. The challenge attracted 143 teams from 85 institutions across 16 countries, reflecting broad community engagement. By consolidating insights from 23 winning solutions, this report highlights emerging methodological trends, shared design principles, and open challenges across all tracks, marking a step toward building robots that can sense reliably, act robustly, and adapt across platforms in real-world environments.

preprint2026arXiv

Unified Personalized Understanding, Generating and Editing

Unified large multimodal models (LMMs) have achieved remarkable progress in general-purpose multimodal understanding and generation. However, they still operate under a ``one-size-fits-all'' paradigm and struggle to model user-specific concepts (e.g., generate a photo of \texttt{<maeve>}) in a consistent and controllable manner. Existing personalization methods typically rely on external retrieval, which is inefficient and poorly integrated into unified multimodal pipelines. Recent personalized unified models introduce learnable soft prompts to encode concept information, yet they either couple understanding and generation or depend on complex multi-stage training, leading to cross-task interference and ultimately to fuzzy or misaligned personalized knowledge. We present \textbf{OmniPersona}, an end-to-end personalization framework for unified LMMs that, for the first time, integrates personalized understanding, generation, and image editing within a single architecture. OmniPersona introduces structurally decoupled concept tokens, allocating dedicated subspaces for different tasks to minimize interference, and incorporates an explicit knowledge replay mechanism that propagates personalized attribute knowledge across tasks, enabling consistent personalized behavior. To systematically evaluate unified personalization, we propose \textbf{\texttt{OmniPBench}}, extending the public UnifyBench concept set with personalized editing tasks and cross-task evaluation protocols integrating understanding, generation, and editing. Experimental results demonstrate that OmniPersona delivers competitive and robust performance across diverse personalization tasks. We hope OmniPersona will serve as a strong baseline and spur further research on controllable, unified personalization.

preprint2022arXiv

Measurement of minute volumes of chiral molecules using in-fiber polarimetry

We report an opto-fluidic method that enables to efficiently measure the enantiomeric excess of chiral molecules at low concentration. The approach is to monitor the optical activity induced by a Kagome-lattice hollow-core photonic crystal fiber filled with a sub-ul volume of chiral compound. The technique also allows monitoring the enzymatic racemization of R mandelic acid.

preprint2022arXiv

Multi-Scaling Differential Contraction Integral Method for Inverse Scattering Problems with Inhomogeneous Media

Practical applications of microwave imaging often require the solution of inverse scattering problems with inhomogeneous backgrounds. Towards this end, a novel inversion strategy, which combines the multi-scaling (MS) regularization scheme and the Difference Contraction Integral Equation (DCIE) formulation, is proposed. Such an integrated approach mitigates the non-linearity and the ill-posedness of the problem to obtain reliable high-resolution reconstructions of the unknown scattering profiles. The arising algorithmic implementation, denoted as MS-DCIE, does not require the computation of the Green's function of the inhomogeneous background, thus it provides an efficient and effective way to deal with complex scenarios. The performance of the MS-DCIE are assessed by means of numerical and experimental tests, in comparison with competitive state-of-the-art inversion strategies, as well.

preprint2021arXiv

Exposing Semantic Segmentation Failures via Maximum Discrepancy Competition

Semantic segmentation is an extensively studied task in computer vision, with numerous methods proposed every year. Thanks to the advent of deep learning in semantic segmentation, the performance on existing benchmarks is close to saturation. A natural question then arises: Does the superior performance on the closed (and frequently re-used) test sets transfer to the open visual world with unconstrained variations? In this paper, we take steps toward answering the question by exposing failures of existing semantic segmentation methods in the open visual world under the constraint of very limited human labeling effort. Inspired by previous research on model falsification, we start from an arbitrarily large image set, and automatically sample a small image set by MAximizing the Discrepancy (MAD) between two segmentation methods. The selected images have the greatest potential in falsifying either (or both) of the two methods. We also explicitly enforce several conditions to diversify the exposed failures, corresponding to different underlying root causes. A segmentation method, whose failures are more difficult to be exposed in the MAD competition, is considered better. We conduct a thorough MAD diagnosis of ten PASCAL VOC semantic segmentation algorithms. With detailed analysis of experimental results, we point out strengths and weaknesses of the competing algorithms, as well as potential research directions for further advancement in semantic segmentation. The codes are publicly available at \url{https://github.com/QTJiebin/MAD_Segmentation}.

preprint2020arXiv

Social Biases in NLP Models as Barriers for Persons with Disabilities

Building equitable and inclusive NLP technologies demands consideration of whether and how social attitudes are represented in ML models. In particular, representations encoded in models often inadvertently perpetuate undesirable social biases from the data on which they are trained. In this paper, we present evidence of such undesirable biases towards mentions of disability in two different English language models: toxicity prediction and sentiment analysis. Next, we demonstrate that the neural embeddings that are the critical first step in most NLP pipelines similarly contain undesirable biases towards mentions of disability. We end by highlighting topical biases in the discourse about disability which may contribute to the observed model biases; for instance, gun violence, homelessness, and drug addiction are over-represented in texts discussing mental illness.

preprint2016arXiv

Enlightening Deep Neural Networks with Knowledge of Confounding Factors

Deep learning techniques have demonstrated significant capacity in modeling some of the most challenging real world problems of high complexity. Despite the popularity of deep models, we still strive to better understand the underlying mechanism that drives their success. Motivated by observations that neurons in trained deep nets predict attributes indirectly related to the training tasks, we recognize that a deep network learns representations more general than the task at hand to disentangle impacts of multiple confounding factors governing the data, in order to isolate the effects of the concerning factors and optimize a given objective. Consequently, we propose a general framework to augment training of deep models with information on auxiliary explanatory data variables, in an effort to boost this disentanglement and train deep networks that comprehend the data interactions and distributions more accurately, and thus improve their generalizability. We incorporate information on prominent auxiliary explanatory factors of the data population into existing architectures as secondary objective/loss blocks that take inputs from hidden layers during training. Once trained, these secondary circuits can be removed to leave a model with the same architecture as the original, but more generalizable and discerning thanks to its comprehension of data interactions. Since pose is one of the most dominant confounding factors for object recognition, we apply this principle to instantiate a pose-aware deep convolutional neural network and demonstrate that auxiliary pose information indeed improves the classification accuracy in our experiments on SAR target classification tasks.

preprint2015arXiv

Efficient Similarity Indexing and Searching in High Dimensions

Efficient indexing and searching of high dimensional data has been an area of active research due to the growing exploitation of high dimensional data and the vulnerability of traditional search methods to the curse of dimensionality. This paper presents a new approach for fast and effective searching and indexing of high dimensional features using random partitions of the feature space. Experiments on both handwritten digits and 3-D shape descriptors have shown the proposed algorithm to be highly effective and efficient in indexing and searching real data sets of several hundred dimensions. We also compare its performance to that of the state-of-the-art locality sensitive hashing algorithm.

Yu Zhong

What is connected

Connect this record

See the researcher in context

Building this map preview

9 published item(s)

Masgent: An AI-assisted Materials Simulation Agent

The RoboSense Challenge: Sense Anything, Navigate Anywhere, Adapt Across Platforms

Unified Personalized Understanding, Generating and Editing

Measurement of minute volumes of chiral molecules using in-fiber polarimetry

Multi-Scaling Differential Contraction Integral Method for Inverse Scattering Problems with Inhomogeneous Media

Exposing Semantic Segmentation Failures via Maximum Discrepancy Competition

Social Biases in NLP Models as Barriers for Persons with Disabilities

Enlightening Deep Neural Networks with Knowledge of Confounding Factors

Efficient Similarity Indexing and Searching in High Dimensions