Trust snapshot

Quick read

Trust 21 - Emerging
Verification L1
Unclaimed author
19 works
0 followers
23 topics
4 close collaborators


Published work

19 published item(s)

preprint 2026 arXiv

3D Primitives are a Spatial Language for VLMs

Vision-language models (VLMs) exhibit a striking paradox: they can generate executable code that reconstructs a 3D scene from geometric primitives with correct object counts, classes, and approximate positions, yet the same models fail at simpler spatial questions on the same image. We show that 3D geometric primitives (cubes, spheres, cylinders, expressed in executable code) serve as a powerful intermediate representation for spatial understanding, and exploit this through three contributions. First, we introduce SpatialBabel, a benchmark evaluating fourteen VLMs on primitive-based 3D scene reconstruction across six scene-code languages (programming languages and declarative formats for 3D primitive scenes), revealing that a single model's object-detection F1 can vary by up to $5.7\times$ across languages. Second, we propose Code-CoT (Code Chain-of-Thought), a training-free inference strategy that routes spatial reasoning through primitive-based code generation. Code-CoT lifts the SpatialBabel-QA-Score by up to $+6.4\%$ on primitive scenes and real-photo CV-Bench-3D accuracy by $+5.0\%$ for VLMs with strong coding capabilities. Third, we propose S$^3$-FT (Self-Supervised Spatial Fine-Tuning), which self-supervisedly distills primitive spatial knowledge into general visual reasoning by parsing the model's own Three.js primitive-reconstructions into structured annotations and fine-tuning on the result, with no human labels and no teacher model. Training on primitive images alone, S$^3$-FT improves Qwen3-VL-8B by $+4.6\%$ to $+8.6\%$ on SpatialBabel-Primitive-QA, $+9.7\%$ on CV-Bench-2D, and $+17\%$ on HallusionBench; the recipe transfers across model families. These results establish geometric primitives in code as both a diagnostic and a transferable spatial vocabulary for VLMs. We will release all artifacts upon publication.

preprint 2026 arXiv

An Adaptive Online Smoother with Closed-Form Solutions and Information-Theoretic Lag Selection for Conditional Gaussian Nonlinear Systems

Data assimilation (DA) combines partial observations with dynamical models to improve state estimation. Filter-based DA uses only past and present data and is the prerequisite for real-time forecasts. Smoother-based DA exploits both past and future observations. It aims to fill in missing data, provide more accurate estimations, and develop high-quality datasets. However, the standard smoothing procedure requires using all historical state estimations, which is storage-demanding, especially for high-dimensional systems. This paper develops an adaptive-lag online smoother for a large class of complex dynamical systems with strong nonlinear and non-Gaussian features, which has important applications to many real-world problems. The adaptive lag allows the utilization of observations only within a nearby window, thus reducing computational complexity and storage needs. Online lag adjustment is essential for tackling turbulent systems, where temporal autocorrelation varies significantly over time due to intermittency, extreme events, and nonlinearity. Based on the uncertainty reduction in the estimated state, an information criterion is developed to systematically determine the adaptive lag. Notably, the mathematical structure of these systems facilitates the use of closed analytic formulae to calculate the online smoother and adaptive lag, avoiding empirical tunings as in ensemble-based DA methods. The adaptive online smoother is applied to studying three important scientific problems. First, it helps detect online causal relationships between state variables. Second, the advantage of reduced computational storage expenditure is illustrated via Lagrangian DA, a high-dimensional nonlinear problem. Finally, the adaptive smoother advances online parameter estimation with partial observations, emphasizing the role of the observed extreme events in accelerating convergence.

preprint 2022 arXiv

A Physics-Informed Data-Driven Algorithm for Ensemble Forecast of Complex Turbulent Systems

A new ensemble forecast algorithm, named the physics-informed data-driven algorithm with conditional Gaussian statistics (PIDD-CG), is developed to predict the time evolution of the probability density functions (PDFs) of complex turbulent systems with partial observations. The PIDD-CG algorithm integrates a unique multiscale statistical closure model with an extremely efficient nonlinear data assimilation scheme to represent the PDF as a mixture of conditional statistics, which overcomes the curse of dimensionality for high-dimensional systems. The multiscale features in the time evolution of each conditional statistics ensemble member are effectively captured by an appropriate combination of physics-informed analytic formulae and recurrent neural networks. An information metric is adopted as the loss function for the latter to more accurately calibrate the key turbulent signals with strong fluctuations. The proposed algorithm succeeds in forecasting both the transient and statistical equilibrium non-Gaussian PDFs of strongly turbulent systems with intermittency, regime switching and extreme events.

preprint 2022 arXiv

An Efficient Routing Protocol for Quantum Key Distribution Networks

Quantum key distribution (QKD) can provide point-to-point information-theoretic secure key services for two connected users. The development of QKD networks needs more focus from the scientific community in order to broaden the service scale of QKD technology to deliver end-to-end secure key services. Some recent efforts have been made to develop secure communication protocols based on QKD. However, due to the limited key generation capability of QKD devices, high quantum secure key utilization is the major concern for QKD networks. Since traditional routing techniques do not account for the state of quantum secure keys on links, applying them directly in QKD networks will result in underutilization of quantum secure keys. Therefore, an efficient routing protocol for QKD networks, especially for large-scale QKD networks, is urgently needed. In this study, an efficient routing protocol based on optimized link-state routing, namely QOLSR, is proposed for QKD networks. QOLSR considerably improves quantum key utilization in QKD networks through link-state awareness and path optimization. Simulation results demonstrate the validity and efficiency of the proposed QOLSR routing protocol. Most importantly, the benefit becomes even more apparent as communication traffic grows.

preprint 2022 arXiv

LASSO-Based Multiple-Line Outage Identification In Partially Observable Power Systems

Phasor measurement units (PMUs) create ample real-time monitoring opportunities for modern power systems. Among them, line outage detection and identification remains a crucial but challenging task. Current works on outage identification succeed with full PMU deployment and single-line outages. Performance, however, degrades for multiple-line outages with partial system observability. We propose a novel framework for multiple-line outage identification using partial nodal voltage measurements. Using the alternating current (AC) power flow model, phase angle signatures of outages are extracted and used to group lines into minimal diagnosable clusters. Identification is then formulated as an underdetermined sparse regression problem solved by lasso. Tested on the IEEE 39-bus system with 25% and 50% PMU coverage, the proposed identification method is 93% and 80% accurate for single- and double-line outages. Our study suggests that the AC power flow is better at capturing outage patterns and that sacrificing some precision could yield substantial improvement in identification accuracy. These findings could contribute to the development of future control schemes that help power systems resist and recover from outage disruptions in real time.
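The "underdetermined sparse regression problem solved by lasso" mentioned above can be illustrated with a generic sketch. This is not the authors' implementation: the signature matrix `A`, the outage pattern `x_true`, the penalty `lam`, and the choice of a plain proximal-gradient (ISTA) solver are all assumptions made for illustration, and the data are synthetic.

```python
import numpy as np

def soft_threshold(v, t):
    """Elementwise soft-thresholding, the proximal operator of t * ||.||_1."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def lasso_ista(A, y, lam, n_iter=5000):
    """Minimise 0.5 * ||A x - y||^2 + lam * ||x||_1 by proximal gradient (ISTA)."""
    step = 1.0 / np.linalg.norm(A, 2) ** 2  # 1 / Lipschitz constant of the gradient
    x = np.zeros(A.shape[1])
    for _ in range(n_iter):
        grad = A.T @ (A @ x - y)
        x = soft_threshold(x - step * grad, step * lam)
    return x

# Hypothetical toy problem: 20 measurements, 40 candidate lines, 2 outaged lines.
rng = np.random.default_rng(0)
A = rng.standard_normal((20, 40))   # made-up signature matrix (more unknowns than equations)
x_true = np.zeros(40)
x_true[[5, 17]] = [1.5, -2.0]       # made-up sparse outage pattern
y = A @ x_true                      # noiseless synthetic measurements

x_hat = lasso_ista(A, y, lam=0.1)
# Indices of the two largest-magnitude coefficients in the lasso estimate
top_two = sorted(np.argsort(np.abs(x_hat))[-2:].tolist())
```

The L1 penalty is what makes the underdetermined system tractable here: among the many vectors consistent with the measurements, it favours those with few nonzero entries, which matches the assumption that only a few lines are out at once.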

preprint 2022 arXiv

MC-UNet Multi-module Concatenation based on U-shape Network for Retinal Blood Vessels Segmentation

Accurate segmentation of the blood vessels of the retina is an important step in the clinical diagnosis of ophthalmic diseases. Many deep learning frameworks have been proposed for retinal blood vessel segmentation. However, the complex vascular structure and uncertain pathological features still make blood vessel segmentation very challenging. In this paper, a novel U-shaped network named Multi-module Concatenation, based on atrous convolution and multi-kernel pooling, is put forward for retinal vessel segmentation. The proposed network retains three layers of the essential structure of U-Net, in which atrous convolution blocks combined with multi-kernel pooling blocks are designed to obtain more contextual information. The spatial attention module is concatenated with the dense atrous convolution module and the multi-kernel pooling module to form a multi-module concatenation. Different dilation rates are selected by cascading to acquire a larger receptive field in atrous convolution. Comparative experiments are conducted on three public retinal datasets: DRIVE, STARE and CHASE_DB1. The results show that the proposed method is effective, especially for microvessels. The code will be released at https://github.com/Rebeccala/MC-UNet

preprint 2022 arXiv

Superfloe Parameterization with Physics Constraints for Uncertainty Quantification of Sea Ice Floes

The discrete element method (DEM) is providing a new modeling approach for describing sea ice dynamics. It exploits particle-based methods to characterize the physical quantities of each sea ice floe along its trajectory under Lagrangian coordinates. One major challenge in applying the DEM models is the heavy computational cost when the number of floes becomes large. In this paper, an efficient Lagrangian parameterization algorithm is developed, which aims at reducing the computational cost of simulating the DEM models while preserving the key features of the sea ice. The new parameterization takes advantage of a small number of artificial ice floes, named the superfloes, to effectively approximate a considerable number of the floes, where the parameterization scheme satisfies several important physics constraints. The physics constraints guarantee the superfloe parameterized system will have similar short-term dynamical behavior as the full system. These constraints also allow the superfloe parameterized system to accurately quantify the long-range uncertainty, especially the non-Gaussian statistical features, of the full system. In addition, the superfloe parameterization facilitates a systematic noise inflation strategy that significantly advances an ensemble-based data assimilation algorithm for recovering the unobserved ocean field underneath the sea ice. Such a new noise inflation method avoids ad hoc tunings as in many traditional algorithms and is computationally extremely efficient. Numerical experiments based on an idealized DEM model with multiscale features illustrate the success of the superfloe parameterization in quantifying the uncertainty and assimilating both the sea ice and the associated ocean field.

preprint 2021 arXiv

An Efficient and Statistically Accurate Lagrangian Data Assimilation Algorithm with Applications to Discrete Element Sea Ice Models

Lagrangian data assimilation of complex nonlinear turbulent flows is an important but computationally challenging topic. In this article, an efficient data-driven statistically accurate reduced-order modeling algorithm is developed that significantly accelerates the computational efficiency of Lagrangian data assimilation. The algorithm starts with a Fourier transform of the high-dimensional flow field, which is followed by an effective model reduction that retains only a small subset of the Fourier coefficients corresponding to the energetic modes. Then a linear stochastic model is developed to approximate the nonlinear dynamics of each Fourier coefficient. Effective additive and multiplicative noise processes are incorporated to characterize the modes that exhibit Gaussian and non-Gaussian statistics, respectively. All the parameters in the reduced order system, including the multiplicative noise coefficients, are determined systematically via closed analytic formulae. These linear stochastic models succeed in forecasting the uncertainty and facilitate an extremely rapid data assimilation scheme. The new Lagrangian data assimilation is then applied to observations of sea ice floe trajectories that are driven by atmospheric winds and turbulent ocean currents. It is shown that observing only about $30$ non-interacting floes in a $200$km$\times200$km domain is sufficient to recover the key multi-scale features of the ocean currents. The additional observations of the floe angular displacements are found to be suitable supplements to the center-of-mass positions for improving the data assimilation skill. In addition, the observed large and small floes are more useful in recovering the large- and small-scale features of the ocean, respectively. The Fourier domain data assimilation also succeeds in recovering the ocean features in the areas where cloud cover obscures the observations.
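The first two steps described above (a Fourier transform of the flow field, then a reduction that keeps only the energetic modes) can be sketched as follows. This is an illustrative sketch only: the test field, the grid size, and the function name `truncate_to_energetic_modes` are hypothetical, and the paper's stochastic models for each retained coefficient are not reproduced.

```python
import numpy as np

def truncate_to_energetic_modes(field, n_keep):
    """Keep the n_keep Fourier coefficients with the largest energy |c|^2; zero the rest."""
    coeffs = np.fft.fft2(field)
    energy = np.abs(coeffs) ** 2
    keep = np.argsort(energy, axis=None)[-n_keep:]  # flat indices of the most energetic modes
    mask = np.zeros(coeffs.size, dtype=bool)
    mask[keep] = True
    reduced = np.where(mask.reshape(coeffs.shape), coeffs, 0.0)
    return reduced, np.fft.ifft2(reduced).real

# Hypothetical doubly periodic test field: two large-scale waves plus weak noise
n = 32
x = np.linspace(0.0, 2.0 * np.pi, n, endpoint=False)
X, Y = np.meshgrid(x, x, indexing="ij")
rng = np.random.default_rng(1)
field = np.sin(X) + 0.5 * np.cos(2.0 * Y) + 0.01 * rng.standard_normal((n, n))

reduced, approx = truncate_to_energetic_modes(field, n_keep=4)
rel_err = np.linalg.norm(field - approx) / np.linalg.norm(field)
```

In this toy field the four retained coefficients are the conjugate pairs of the two waves, so 4 of the 1024 coefficients carry almost all of the energy; the reduced-order idea rests on real flow fields having a similarly concentrated spectrum.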

preprint 2021 arXiv

Conditional Gaussian Nonlinear System: a Fast Preconditioner and a Cheap Surrogate Model For Complex Nonlinear Systems

Developing suitable approximate models for analyzing and simulating complex nonlinear systems is practically important. This paper aims at exploring the skill of a rich class of nonlinear stochastic models, known as the conditional Gaussian nonlinear system (CGNS), as both a cheap surrogate model and a fast preconditioner for facilitating many computationally challenging tasks. The CGNS preserves the underlying physics to a large extent and can reproduce intermittency, extreme events and other non-Gaussian features in many complex systems arising from practical applications. Three interrelated topics are studied. First, the closed analytic formulae of solving the conditional statistics provide an efficient and accurate data assimilation scheme. It is shown that the data assimilation skill of a suitable CGNS approximate forecast model outweighs that by applying an ensemble method even to the perfect model with strong nonlinearity, where the latter suffers from filter divergence. Second, the CGNS allows the development of a fast algorithm for simultaneously estimating the parameters and the unobserved variables with uncertainty quantification in the presence of only partial observations. Utilizing an appropriate CGNS as a preconditioner significantly reduces the computational cost in accurately estimating the parameters in the original complex system. Finally, the CGNS advances rapid and statistically accurate algorithms for computing the probability density function and sampling the trajectories of the unobserved state variables. These fast algorithms facilitate the development of an efficient and accurate data-driven method for predicting the linear response of the original system with respect to parameter perturbations based on a suitable CGNS preconditioner.

preprint 2021 arXiv

Shock trace prediction by reduced models for a viscous stochastic Burgers equation

Viscous shocks are a particular type of extreme event in nonlinear multiscale systems, and their representation requires small scales. Model reduction can thus play an important role in reducing the computational cost for an efficient prediction of shocks. Yet, reduced models typically aim to approximate large-scale dominating dynamics, which do not resolve the small scales by design. To resolve this representation barrier, we introduce a new qualitative characterization of the space-time locations of shocks, named the "shock trace", via a space-time indicator function based on an empirical resolution-adaptive threshold. Unlike the exact shocks, the shock traces can be captured within the representation capacity of the large scales, which facilitates the forecast of the timing and locations of the shocks utilizing reduced models. Within the context of a viscous stochastic Burgers equation, we show that a data-driven reduced model, in the form of nonlinear autoregression (NAR) time series models, can accurately predict the random shock traces, with relatively low rates of false predictions. The NAR model significantly outperforms the corresponding Galerkin truncated model in the scenario of either noiseless or noisy observations. The results illustrate the importance of the data-driven closure terms in the NAR model, which account for the effects of the unresolved small-scale dynamics on the resolved ones due to nonlinear interactions.

preprint 2020 arXiv

A Control Chart Approach to Power System Line Outage Detection Under Transient Dynamics

Online transmission line outage detection over the entire network enables timely corrective action to be taken, which prevents a local event from cascading into a large-scale blackout. Line outage detection aims to detect an outage as soon as possible after it happens. Traditional methods either do not consider the transient dynamics following an outage or require a full Phasor Measurement Unit (PMU) deployment. Using voltage phase angle data collected from a limited number of PMUs, we propose a real-time dynamic outage detection scheme based on the alternating current (AC) power flow model and statistical change detection theory. The proposed method can capture system dynamics since it retains the time-variant and nonlinear nature of the power system. The method is computationally efficient and scales to large and realistic networks. Extensive simulation studies on the IEEE 39-bus and 2383-bus systems demonstrate the effectiveness of the proposed method.
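As a generic illustration of the statistical change detection idea mentioned above, the sketch below runs a two-sided CUSUM chart over a stream of residuals. CUSUM is chosen here only as a familiar example of a control chart; the abstract does not say which chart the authors use, and the residual stream, `slack`, and `threshold` values are hypothetical.

```python
import numpy as np

def cusum_alarm(samples, target_mean, slack, threshold):
    """Return the first index where the two-sided CUSUM statistic exceeds threshold, else None."""
    upper = 0.0  # accumulates evidence of an upward mean shift
    lower = 0.0  # accumulates evidence of a downward mean shift
    for i, value in enumerate(samples):
        upper = max(0.0, upper + (value - target_mean) - slack)
        lower = max(0.0, lower - (value - target_mean) - slack)
        if upper > threshold or lower > threshold:
            return i
    return None

# Hypothetical residuals: in control for 200 samples, then a mean shift of 1.0
rng = np.random.default_rng(2)
residuals = np.concatenate([
    0.2 * rng.standard_normal(200),
    1.0 + 0.2 * rng.standard_normal(100),
])
alarm = cusum_alarm(residuals, target_mean=0.0, slack=0.5, threshold=2.0)
```

The slack term keeps the statistic near zero while the process is in control, and the threshold trades detection delay against false alarms; both would need tuning against real PMU data.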

preprint 2020 arXiv

Conditional Kernel Density Estimation Considering Autocorrelation for Renewable Energy Probabilistic Modeling

Renewable energy is essential for energy security and global warming mitigation. However, power generation from renewable energy sources is uncertain due to volatile weather conditions and complex equipment operations. To improve equipment operation efficiency, it is important to understand and characterize the uncertainty in renewable power generation. In this paper, we propose a conditional kernel density estimation method to model the distribution of equipment power output given any weather conditions. It explicitly accounts for the temporal dependence in the data stream and uses an iterative procedure to reduce the bias in kernel density estimation. Compared with the existing literature, our approach is especially useful for equipment condition monitoring or short-term renewable energy forecasting, where the data dependence plays a more significant role. We demonstrate our method and compare it with alternatives through real applications.

preprint 2020 arXiv

Energy and Information Management of Electric Vehicular Network: A Survey

The connected vehicle paradigm empowers vehicles with the capability to communicate with neighboring vehicles and infrastructure, shifting the role of vehicles from a transportation tool to an intelligent service platform. Meanwhile, the transportation electrification pushes forward the electric vehicle (EV) commercialization to reduce the greenhouse gas emission by petroleum combustion. The unstoppable trends of connected vehicle and EVs transform the traditional vehicular system to an electric vehicular network (EVN), a clean, mobile, and safe system. However, due to the mobility and heterogeneity of the EVN, improper management of the network could result in charging overload and data congestion. Thus, energy and information management of the EVN should be carefully studied. In this paper, we provide a comprehensive survey on the deployment and management of EVN considering all three aspects of energy flow, data communication, and computation. We first introduce the management framework of EVN. Then, research works on the EV aggregator (AG) deployment are reviewed to provide energy and information infrastructure for the EVN. Based on the deployed AGs, we present the research work review on EV scheduling that includes both charging and vehicle-to-grid (V2G) scheduling. Moreover, related works on information communication and computing are surveyed under each scenario. Finally, we discuss open research issues in the EVN.

preprint 2020 arXiv

Information Relaxation and A Duality-Driven Algorithm for Stochastic Dynamic Programs

We use the technique of information relaxation to develop a duality-driven iterative approach to obtaining and improving confidence interval estimates for the true value of finite-horizon stochastic dynamic programming problems. We show that the sequence of dual value estimates yielded from the proposed approach in principle monotonically converges to the true value function in a finite number of dual iterations. Aiming to overcome the curse of dimensionality in various applications, we also introduce a regression-based Monte Carlo algorithm for implementation. The new approach can be used not only to assess the quality of heuristic policies, but also to improve them if we find that their duality gap is large. We obtain the convergence rate of our Monte Carlo method in terms of the amounts of both basis functions and the sampled states. Finally, we demonstrate the effectiveness of our method in an optimal order execution problem with market friction and in an inventory management problem in the presence of lost sale and lead time. Both examples are well known in the literature to be difficult to solve for optimality. The experiments show that our method can significantly improve the heuristics suggested in the literature and obtain new policies with a satisfactory performance guarantee.

preprint 2020 arXiv

Optimal Placement of Limited PMUs for Transmission Line Outage Detection and Identification

Phasor Measurement Unit (PMU) technology is increasingly used for real-time monitoring applications, especially line outage detection and identification (D&I) in the power system. Current outage D&I schemes assume either a full PMU deployment or a partial deployment with fixed PMU placement. However, the placement of the PMUs has a fundamental impact on the effectiveness of the D&I scheme. Building on a dynamic relationship between the substation voltage phase angle and active power, we formulate the optimal PMU placement problem for outage D&I as an optimization problem readily solvable by any heuristic algorithm. We tested the formulation using a genetic algorithm and simulated outages of the IEEE 39-bus system. The optimal placement found produces better D&I results for single-line outages than randomly scattered, tree-like, and degree-based placements.

preprint 2020 arXiv

Phase I analysis of hidden operating status for wind turbine

Data-driven methods based on Supervisory Control and Data Acquisition (SCADA) have become a recent trend in wind turbine condition monitoring. However, SCADA data are known to be of low quality due to low sampling frequency and complex turbine working dynamics. In this work, we focus on the phase I analysis of SCADA data to better understand turbines' operating status. As one of the most important characterizations, the power curve is used as a benchmark to represent normal performance. A powerful distribution-free control chart is applied after the power generation is adjusted by an accurate power curve model, which explicitly takes into account the known factors that can affect turbines' performance. Informative out-of-control segments have been revealed in real field case studies. This phase I analysis can help improve wind turbine monitoring, reliability, and maintenance for a smarter wind energy system.

preprint 2020 arXiv

Porous carbon nanowire array for highly sensitive, biocompatible, reproducible surface-enhanced Raman spectroscopy

Surface-enhanced Raman spectroscopy (SERS) is a powerful tool for vibrational spectroscopy as it provides several orders of magnitude higher sensitivity than inherently weak spontaneous Raman scattering by exciting localized surface plasmon resonance (LSPR) on metal substrates. However, SERS is not very reliable, especially for use in life sciences, since it sacrifices reproducibility and biocompatibility due to its strong dependence on "hot spots" and large photothermal heat generation. Here we report a metal-free (i.e., LSPR-free), topologically tailored nanostructure composed of porous carbon nanowires in an array as a SERS substrate that addresses the decades-old problem. Specifically, it offers not only high signal enhancement due to its strong broadband charge-transfer resonance, but also extraordinarily high reproducibility or substrate-to-substrate, spot-to-spot, sample-to-sample, and time-to-time consistency in SERS spectrum due to the absence of hot spots and high compatibility to biomolecules due to its fluorescence quenching and negligible denaturation capabilities. These excellent properties make SERS suitable for practical use in diverse biomedical applications.

preprint 2020 arXiv

Statistical design considerations for trials that study multiple indications

Breakthroughs in cancer biology have defined new research programs emphasizing the development of therapies that target specific pathways in tumor cells. Innovations in clinical trial design have followed with master protocols defined by inclusive eligibility criteria and evaluations of multiple therapies and/or histologies. Consequently, characterization of subpopulation heterogeneity has become central to the formulation and selection of a study design. However, this transition to master protocols has led to challenges in identifying the optimal trial design and proper calibration of hyperparameters. We often evaluate a range of null and alternative scenarios; however, there has been little guidance on how to synthesize the potentially disparate recommendations for what may be optimal. This may lead to the selection of suboptimal designs and statistical methods that do not fully accommodate the subpopulation heterogeneity. This article proposes novel optimization criteria for calibrating and evaluating candidate statistical designs of master protocols in the presence of potential treatment effect heterogeneity among enrolled patient subpopulations. The framework is applied to demonstrate the statistical properties of conventional study designs when treatments offer heterogeneous benefit, as well as to identify optimal designs devised to monitor the potential for heterogeneity among patients with differing clinical indications using Bayesian modeling.

preprint 2020 arXiv

Throttling Process of Rotating Bardeen AdS Black Holes

In this paper, we study the throttling process of the rotating Bardeen-AdS black hole in a systematic way. In the extended phase space, the mass of the black hole should be viewed as enthalpy. We derive the Joule-Thomson coefficient $\mu$ explicitly and, using numerical methods, depict the inversion curves and isenthalpic curves for different values of the parameters $J$ and $g$. It is found that there is only a minimum inversion temperature but no maximum inversion temperature, and that the ratio between the minimum inversion temperature $T_{min}$ and the critical temperature $T_C$ is slightly greater than $1/2$ and increases with the nonlinear parameter $g$. Furthermore, the shapes of the isenthalpic curves are similar to those in most cases studied before. We find that the inversion point exists as long as the mass is no less than a certain value $M_{min}$. The effect of the parameter $g$ on the throttling process is also discussed.
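For orientation, the standard thermodynamic definition behind the quantities named above can be written out. This is the textbook definition, not a formula quoted from the paper: the choice to hold $J$ and $g$ fixed alongside the mass is an assumption, and the symbol $T_i$ for the inversion temperature is introduced here for illustration.

```latex
% Throttling is an isenthalpic process; in the extended phase space the
% black hole mass M plays the role of the enthalpy, so M is held fixed.
% Holding J and g fixed as well is an assumption of this sketch.
\mu = \left( \frac{\partial T}{\partial P} \right)_{M,\,J,\,g}

% Sign convention: \mu > 0 means cooling and \mu < 0 means heating as the
% pressure drops. The inversion curve T_i(P) is the locus where
\mu\big|_{T = T_i} = 0 .
```

The isenthalpic curves mentioned in the abstract are then curves of constant $M$ in the $T$-$P$ plane, and the inversion curve passes through their maxima, separating the cooling region from the heating region.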