Researcher profile

Liang Shi

Liang Shi contributes to research discovery and scholarly infrastructure.

ResearcherAffiliation not importedOpen to collaborate

Trust snapshot

Quick read

Trust 19 - UnverifiedVerification L1Unclaimed author
5works
0followers
8topics
4close collaborators

Actions

Decide how to stay connected

Follow researcher0

Identity and collaboration

How to connect with this researcher

Claiming links this public author record to a researcher profile and unlocks direct collaboration workflows.

Log in to claim

Direct collaboration

Open a focused conversation when the fit is right

Claim this author entity first to unlock direct invitations.

Research graph

See the researcher in context

Open full explorer

Inspect adjacent work, topics, institutions and collaborators without jumping out to a separate graph page.

Building this graph slice

BZPEER is loading the nearby papers, people, topics and institutions for this page.

Published work

5 published item(s)

preprint2026arXiv

RoSHAP: A Distributional Framework and Robust Metric for Stable Feature Attribution

Feature attribution analysis is critical for interpreting machine learning models and supporting reliable data-driven decisions. However, feature attribution measures often exhibit stochastic variation: different train--test splits, random seeds, or model-fitting procedures can produce substantially different attribution values and feature rankings. This paper proposes a framework for incorporating stochastic nature of feature attribution and a robust attribution metric, RoSHAP, for stable feature ranking based on the SHAP metric. The proposed framework models the distribution of feature attribution scores and estimates it through bootstrap resampling and kernel density estimation. We show that, under mild regularity conditions, the aggregated feature attribution score is asymptotically Gaussian, which greatly reduces the computational cost of distribution estimation. The RoSHAP summarizes the distribution of SHAP into a robust feature-ranking criterion that simultaneously rewards features that are active, strong, and stable. Through simulations and real-data experiments, the proposed framework and RoSHAP outperform standard single-run attribution measures in identifying signal features. In addition, models built using RoSHAP-selected features achieve predictive performance comparable to full-feature models while using substantially fewer predictors. The proposed RoSHAP approach improves the stability and interpretability of machine learning models, enabling reliable and consistent insights for analysis.

preprint2024arXiv

Computational Discovery of Microstructured Composites with Optimal Stiffness-Toughness Trade-Offs

The conflict between stiffness and toughness is a fundamental problem in engineering materials design. However, the systematic discovery of microstructured composites with optimal stiffness-toughness trade-offs has never been demonstrated, hindered by the discrepancies between simulation and reality and the lack of data-efficient exploration of the entire Pareto front. We introduce a generalizable pipeline that integrates physical experiments, numerical simulations, and artificial neural networks to address both challenges. Without any prescribed expert knowledge of material design, our approach implements a nested-loop proposal-validation workflow to bridge the simulation-to-reality gap and discover microstructured composites that are stiff and tough with high sample efficiency. Further analysis of Pareto-optimal designs allows us to automatically identify existing toughness enhancement mechanisms, which were previously discovered through trial-and-error or biomimicry. On a broader scale, our method provides a blueprint for computational design in various research areas beyond solid mechanics, such as polymer chemistry, fluid dynamics, meteorology, and robotics.

preprint2022arXiv

LDP-IDS: Local Differential Privacy for Infinite Data Streams

Streaming data collection is essential to real-time data analytics in various IoTs and mobile device-based systems, which, however, may expose end users' privacy. Local differential privacy (LDP) is a promising solution to privacy-preserving data collection and analysis. However, existing few LDP studies over streams are either applicable to finite streams only or suffering from insufficient protection. This paper investigates this problem by proposing LDP-IDS, a novel $w$-event LDP paradigm to provide practical privacy guarantee for infinite streams at users end, and adapting the popular budget division framework in centralized differential privacy (CDP). By constructing a unified error analysi for LDP, we first develop two adatpive budget division-based LDP methods for LDP-IDS that can enhance data utility via leveraging the non-deterministic sparsity in streams. Beyond that, we further propose a novel population division framework that can not only avoid the high sensitivity of LDP noise to budget division but also require significantly less communication. Based on the framework, we also present two adaptive population division methods for LDP-IDS with theoretical analysis. We conduct extensive experiments on synthetic and real-world datasets to evaluate the effectiveness and efficiency pf our proposed frameworks and methods. Experimental results demonstrate that, despite the effectiveness of the adaptive budget division methods, the proposed population division framework and methods can further achieve much higher effectiveness and efficiency.

preprint2020arXiv

nsCouette -- A high-performance code for direct numerical simulations of turbulent Taylor-Couette flow

We present nsCouette, a highly scalable software tool to solve the Navier-Stokes equations for incompressible fluid flow between differentially heated and independently rotating, concentric cylinders. It is based on a pseudospectral spatial discretization and dynamic time-stepping. It is implemented in modern Fortran with a hybrid MPI-OpenMP parallelization scheme and thus designed to compute turbulent flows at high Reynolds and Rayleigh numbers. An additional GPU implementation (C-CUDA) for intermediate problem sizes and a basic version for turbulent pipe flow (nsPipe) are also provided.

preprint2020arXiv

Unified Multi-Criteria Chinese Word Segmentation with BERT

Multi-Criteria Chinese Word Segmentation (MCCWS) aims at finding word boundaries in a Chinese sentence composed of continuous characters while multiple segmentation criteria exist. The unified framework has been widely used in MCCWS and shows its effectiveness. Besides, the pre-trained BERT language model has been also introduced into the MCCWS task in a multi-task learning framework. In this paper, we combine the superiority of the unified framework and pretrained language model, and propose a unified MCCWS model based on BERT. Moreover, we augment the unified BERT-based MCCWS model with the bigram features and an auxiliary criterion classification task. Experiments on eight datasets with diverse criteria demonstrate that our methods could achieve new state-of-the-art results for MCCWS.