Researcher profile

Zhaoyu Wang

Zhaoyu Wang contributes to research discovery and scholarly infrastructure.

Researcher | Affiliation not imported | Open to collaborate

Trust snapshot

Quick read

Trust 21 - Emerging | Verification L1 | Unclaimed author
19 works
0 followers
13 topics
4 close collaborators

Published work

19 published items

preprint | 2022 | arXiv

A Two-layer Approach for Estimating Behind-the-Meter PV Generation Using Smart Meter Data

As the cost of residential solar systems decreases, rooftop photovoltaic (PV) generation has been widely integrated into distribution systems. Most rooftop PV systems are installed behind-the-meter (BTM), i.e., only the net demand is metered, while the native demand and PV generation are not separately recorded. Under this condition, the PV generation and native demand are invisible to utilities, which brings challenges for optimal distribution system operation and expansion. In this paper, we propose a novel two-layer approach to disaggregate the unknown PV generation and native demand from the known hourly net demand data recorded by smart meters: 1) At the aggregate level, the proposed approach separates the total PV generation and native demand time series from the total net demand time series for customers with PVs. 2) At the customer level, the separated aggregate-level PV generation is allocated to individual PVs. These two layers leverage the spatial correlations of native demand and PV generation, respectively. One primary advantage of our proposed approach is that it is more independent and practical than previous works because it does not require PV array parameters, meteorological data, or previously recorded solar power exemplars. We have verified our proposed approach using real native demand and PV generation data.

preprint | 2022 | arXiv

An Extended Halo-based Group/Cluster finder: application to the DESI legacy imaging surveys DR8

We extend the halo-based group finder developed by Yang et al. (2005) to use data simultaneously with either photometric or spectroscopic redshifts. A mock galaxy redshift survey constructed from a high-resolution N-body simulation is used to evaluate the performance of this extended group finder. For galaxies with magnitude ${\rm z}\le 21$ and redshift $0<z\le 1.0$ in the DESI legacy imaging surveys (the Legacy Surveys), our group finder successfully identifies more than 60% of the members in about 90% of halos with mass $\gtrsim 10^{12.5}\,h^{-1}M_\odot$. Detected groups with mass $\gtrsim 10^{12.0}\,h^{-1}M_\odot$ have a purity (the fraction of true groups) greater than 90%. The halo mass assigned to each group has an uncertainty of about 0.2 dex at the high mass end $\gtrsim 10^{13.5}\,h^{-1}M_\odot$ and 0.40 dex at the low mass end. Groups with more than 10 members have a redshift accuracy of $\sim 0.008$. We apply this group finder to the Legacy Surveys DR8 and find 5.2 million groups with at least 3 members. About 387,000 of these groups have at least 10 members. The resulting catalog, containing 3D coordinates, richness, halo masses, and total group luminosities, is made publicly available.

preprint | 2022 | arXiv

Data-Driven Affinely Adjustable Robust Volt/VAr Control

This paper proposes a data-driven affinely adjustable robust Volt/VAr control (AARVVC) scheme, which modulates the smart inverter reactive power as an affine function of its active power, based on the voltage sensitivities with respect to real/reactive power injections. To achieve a fast and accurate estimation of voltage sensitivities, we propose a data-driven method based on a deep neural network (DNN), together with a rule-based bus-selection process using the bidirectional search method. Our method only uses the operating statuses of selected buses as inputs to the DNN, thus significantly improving the training efficiency and reducing information redundancy. Finally, a distributed consensus-based solution, based on the alternating direction method of multipliers (ADMM), for the AARVVC is applied to decide the inverter reactive power adjustment rule with respect to its active power. Only limited information exchange is required between each local agent and the central agent to obtain the slope of the reactive power adjustment rule, and there is no need for the central agent to solve any (sub)optimization problems. Numerical results on the modified IEEE-123 bus system validate the effectiveness and superiority of the proposed data-driven AARVVC method.
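
The affine adjustment rule mentioned in this abstract can be illustrated with a minimal sketch. It assumes a hypothetical base setpoint `q0`, a hypothetical slope `slope`, and an inverter apparent-power rating `s_rated`; none of these names or values come from the paper, and the capability clip is an added assumption rather than part of the AARVVC formulation.

```python
import math


def affine_reactive_power(p, q0, slope, s_rated):
    """Reactive power setpoint as an affine function of active power p.

    q0, slope and s_rated are hypothetical illustration parameters.
    The result is clipped to the reactive capability that remains
    once the active power p is accounted for (an added assumption).
    """
    # Affine rule: base setpoint plus a slope times the active power.
    q = q0 + slope * p
    # Remaining reactive capability under the apparent-power rating.
    q_max = math.sqrt(max(s_rated ** 2 - p ** 2, 0.0))
    # Keep the setpoint inside [-q_max, q_max].
    return max(-q_max, min(q, q_max))
```

With `q0 = 0.1`, `slope = -0.5` and `p = 0.6` (all in per unit, rating 1.0), the rule gives about -0.2, which lies inside the remaining capability of 0.8, so no clipping occurs.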

preprint | 2022 | arXiv

Learning Latent Interactions for Event classification via Graph Neural Networks and PMU Data

Phasor measurement units (PMUs) are being widely installed on power systems, providing a unique opportunity to enhance wide-area situational awareness. One essential application is the use of PMU data for real-time event identification. However, how to take full advantage of all PMU data in event identification is still an open problem. Thus, we propose a novel method that performs event identification by mining interaction graphs among different PMUs. The proposed interaction graph inference method follows an entirely data-driven manner without knowing the physical topology. Moreover, unlike previous works that treat interactive learning and event identification as two different stages, our method learns interactions jointly with the identification task, thereby improving the accuracy of graph learning and ensuring seamless integration between the two stages. In addition, to capture multi-scale event patterns, a dilated inception-based method is investigated to perform feature extraction of PMU data. To test the proposed data-driven approach, a large real-world dataset from tens of PMU sources and the corresponding event logs have been utilized in this work. Numerical results validate that our method has higher classification accuracy than previous methods.

preprint | 2022 | arXiv

Online Voltage Control for Unbalanced Distribution Networks Using Projected Newton Method

This paper proposes an online voltage control strategy of distributed energy resources (DERs), based on the projected Newton method (PNM), for unbalanced distribution networks. The optimal Volt/VAr control (VVC) problem is formulated as an optimization program, with the goal of maintaining the voltage profile across the network by coordinating the VAr outputs of DERs. To overcome the slow convergence rate of conventional gradient-based methods, a PNM-based solution algorithm is developed to solve this VVC problem. It utilizes a non-diagonal symmetric positive definite matrix, developed from the Hessian matrix of the objective, to scale the gradient, and thus a fast convergence performance can be expected in this Newton-like algorithm. Moreover, taking advantage of the instantaneous feedback of voltage measurements, the online implementation of the PNM-based VVC is further designed to deal with fast system variations. In this online PNM-based VVC scheme, each bus agent communicates the instantaneous voltage measurements to the central agent, and the central agent communicates the VAr output commands of DERs back to each bus agent. The fast convergence performance of PNM results in a stronger capability to track the system variations in real time. Finally, numerical case studies are performed to validate the effectiveness, superiority, and scalability of the proposed method.

preprint | 2022 | arXiv

The complex mKdV equation with step-like initial data: Large time asymptotic analysis

In this paper, we study large-time asymptotics for the complex modified Korteweg-de Vries equation \begin{equation} u_t + \frac{1}{2}u_{xxx}+3|u|^2 u_x=0, \end{equation} with the step-like initial data \begin{equation} u(x,0)=u_0(x)= \begin{cases} 0, & {x \ge 0,}\\ A e^{iBx}, &{x < 0.} \end{cases} \end{equation} It is shown that the step-like initial problem can be described by a matrix Riemann-Hilbert problem. We apply the steepest descent method to obtain different large-time asymptotics in the Zakharov-Manakov region, a plane wave region and a slow decay region.

preprint | 2022 | arXiv

The defocusing NLS equation with nonzero background: Large-time asymptotics in the solitonless region

We consider the Cauchy problem for the defocusing nonlinear Schrödinger (NLS) equation with a nonzero background \begin{align} &iq_t+q_{xx}-2(|q|^2-1)q=0, \nonumber\\ &q(x,0)=q_0(x), \quad \lim_{x \to \pm \infty}q_0(x)=\pm 1. \end{align} Recently, for the space-time region $|x/(2t)|<1$, which is a solitonic region without stationary phase points on the jump contour, Cuccagna and Jenkins presented the asymptotic stability of the $N$-soliton solutions for the NLS equation by using the $\bar{\partial}$ generalization of the Deift-Zhou nonlinear steepest descent method. Their large-time asymptotic expansion takes the form \begin{align} q(x,t)= T(\infty)^{-2} q^{sol,N}(x,t) + \mathcal{O}(t^{-1}),\label{res1} \end{align} whose leading term is an $N$-soliton and whose second term $\mathcal{O}(t^{-1})$ is a residual error from a $\overline\partial$-equation. In this paper, we are interested in the large-time asymptotics in the space-time region $|x/(2t)|>1$, which is outside the soliton region but in which two stationary points appear on the jump contour $\mathbb{R}$. We find an asymptotic expansion that differs from (\ref{res1}), \begin{align} q(x,t)= e^{-i\alpha(\infty)} \left(1 +t^{-1/2} h(x,t) \right)+\mathcal{O}\left(t^{-3/4}\right),\label{res2} \end{align} whose leading term is a nonzero background, whose second term of order $t^{-1/2}$ comes from the continuous spectrum, and whose third term $\mathcal{O}(t^{-3/4})$ is a residual error from a $\overline\partial$-equation. The two asymptotic results (\ref{res1}) and (\ref{res2}) imply that the region $|x/(2t)|<1$ considered by Cuccagna and Jenkins is a fast decaying soliton solution region, while the region $|x/(2t)|>1$ considered by us is a slow decaying nonzero background region.

preprint | 2022 | arXiv

Tractable Data Enriched Distributionally Robust Chance-Constrained CVR

This paper proposes a tractable distributionally robust chance-constrained conservation voltage reduction (DRCC-CVR) method with an enriched data-based ambiguity set for unbalanced three-phase distribution systems. The increasing penetration of distributed renewable energy not only brings clean power but also challenges the voltage regulation and energy-saving performance of CVR by introducing high uncertainties to distribution systems. In most cases, the conventional robust optimization methods for CVR only provide conservative solutions. To better consider the impacts of load and PV generation uncertainties on CVR implementation in distribution systems and provide less conservative solutions, this paper develops a data-based DRCC-CVR model with a tractable reformulation and a data enrichment method. Even though the uncertainties of load and photovoltaic (PV) generation can be captured by data, the availability of smart meters (SMs) and micro-phasor measurement units (PMUs) is restricted by the cost budget. The limited data access may hinder the performance of the proposed DRCC-CVR. Thus, we further present a data enrichment method to statistically recover the high-resolution load and PV generation data from low-resolution data with Gaussian Process Regression (GPR) and Markov Chain (MC) models, which can be used to construct a data-based moment ambiguity set of uncertainty distributions for the proposed DRCC-CVR. Finally, the nonlinear power flow and voltage-dependent load models and the DRCC with moment-based ambiguity set are reformulated to be computationally tractable and tested on a real distribution feeder in the Midwest U.S. to validate the effectiveness and robustness of the proposed method.

preprint | 2021 | arXiv

Distribution Grid Modeling Using Smart Meter Data

The knowledge of distribution grid models, including topologies and line impedances, is essential to grid monitoring, control and protection. However, this information is often unavailable, incomplete or outdated. The increasing deployment of smart meters (SMs) provides a unique opportunity to address this issue. This paper proposes a two-stage data-driven framework for distribution grid modeling using SM data. In the first stage, we propose to identify the topology via reconstructing a weighted Laplacian matrix of distribution networks, which is mathematically proven to be robust against moderately heterogeneous R/X profiles. In the second stage, we develop nonlinear least absolute deviations (LAD) and least squares (LS) regression models to estimate line impedances of single branches based on a nonlinear inverse power flow, which is then embedded within a bottom-up sweep algorithm to achieve the identification across the network in a branch-wise manner. Because the estimation models are inherently non-convex programs and NP-hard, we specially address their tractable convex relaxations and verify the exactness. In addition, we design a conductor library to significantly narrow down the solution space. Numerical results on the modified IEEE 13-bus, 37-bus and 69-bus test feeders validate the effectiveness of the proposed methods.

preprint | 2020 | arXiv

A Data-Driven Customer Segmentation Strategy Based on Contribution to System Peak Demand

Advanced metering infrastructure (AMI) enables utilities to obtain granular energy consumption data, which offers a unique opportunity to design customer segmentation strategies based on their impact on various operational metrics in distribution grids. However, performing utility-scale segmentation for unobservable customers with only monthly billing information remains a challenging problem. To address this challenge, we propose a new metric, the coincident monthly peak contribution (CMPC), that quantifies the contribution of individual customers to system peak demand. Furthermore, a novel multi-stage machine learning-based segmentation method is developed that estimates CMPC for customers without smart meters (SMs). First, a clustering technique is used to build a databank containing typical daily load patterns in different seasons using the SM data of observable customers. Next, to associate unobservable customers with the discovered typical load profiles, a classification approach is leveraged to compute the likelihood of daily consumption patterns for different unobservable households. In the third stage, a weighted clusterwise regression (WCR) model is utilized to estimate the CMPC of unobservable customers using their monthly billing data and the outcomes of the classification module. The proposed segmentation methodology has been tested and verified using real utility data.

preprint | 2020 | arXiv

A Data-Driven Game-Theoretic Approach for Behind-the-Meter PV Generation Disaggregation

Rooftop solar photovoltaic (PV) power generator is a widely used distributed energy resource (DER) in distribution systems. Currently, the majority of PVs are installed behind-the-meter (BTM), where only customers' net demand is recorded by smart meters. Disaggregating BTM PV generation from net demand is critical to utilities for enhancing grid-edge observability. In this paper, a data-driven approach is proposed for BTM PV generation disaggregation using solar and demand exemplars. First, a data clustering procedure is developed to construct a library of candidate load/solar exemplars. To handle the volatility of BTM resources, a novel game-theoretic learning process is proposed to adaptively generate optimal composite exemplars using the constructed library of candidate exemplars, through repeated evaluation of disaggregation residuals. Finally, the composite native demand and solar exemplars are employed to disaggregate solar generation from net demand using a semi-supervised source separator. The proposed methodology has been verified using real smart meter data and feeder models.

preprint | 2020 | arXiv

A Time-Series Distribution Test System Based on Real Utility Data

In this paper, we provide a time-series distribution test system. This test system is a fully observable distribution grid in the Midwest U.S. with smart meters (SMs) installed at all end users. Our goal is to share a real U.S. distribution grid model without modification. This grid model is comprehensive and representative since it consists of both overhead lines and underground cables, and it has standard distribution grid components such as capacitor banks, line switches, substation transformers with load tap changers and secondary distribution transformers. A distinctive feature of this grid model is that it has one year of smart meter measurements at all nodes, thus bridging the gap between existing test feeders and quasi-static time-series based distribution system analysis.

preprint | 2020 | arXiv

Bayesian estimates of transmission line outage rates that consider line dependencies

Transmission line outage rates are fundamental to power system reliability analysis. Line outages are infrequent, occurring only about once a year, so outage data are limited. We propose a Bayesian hierarchical model that leverages line dependencies to better estimate outage rates of individual transmission lines from limited outage data. The Bayesian estimates have a lower standard deviation than estimating the outage rates simply by dividing the number of outages by the number of years of data, especially when the number of outages is small. The Bayesian model produces more accurate individual line outage rates, as well as estimates of the uncertainty of these rates. Better estimates of line outage rates can improve system risk assessment, outage prediction, and maintenance scheduling.
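
The contrast between the naive estimate (outages divided by years) and a Bayesian estimate can be sketched with a much simpler model than the one in the paper. The sketch below assumes a single line with Poisson outage counts and a fixed Gamma prior on its rate; the prior parameters `prior_shape` and `prior_rate` are hypothetical, and the hierarchical structure that lets the paper's model learn from dependencies between lines is deliberately left out.

```python
import math


def naive_rate(outages, years):
    """Naive estimate: observed outages divided by years of data."""
    return outages / years


def gamma_poisson_posterior(outages, years, prior_shape, prior_rate):
    """Posterior mean and standard deviation of an outage rate.

    Assumes outages ~ Poisson(rate * years) and rate ~ Gamma(prior_shape,
    prior_rate), so the posterior is Gamma(prior_shape + outages,
    prior_rate + years). The prior parameters are illustrative only.
    """
    shape = prior_shape + outages
    rate = prior_rate + years
    mean = shape / rate
    std = math.sqrt(shape) / rate
    return mean, std
```

For a line with no outages in 5 years, the naive estimate is exactly 0, while a Gamma(2, 2) prior (prior mean of one outage per year) gives a posterior mean of 2/7, about 0.29 outages per year, together with an explicit uncertainty estimate.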

preprint | 2020 | arXiv

CONNA: Addressing Name Disambiguation on The Fly

Name disambiguation is a key and also a very tough problem in many online systems such as social search and academic search. Despite considerable research, a critical issue that has not been systematically studied is disambiguation on the fly -- completing the disambiguation in real time. This is very challenging, as the disambiguation algorithm must be accurate, efficient, and error-tolerant. In this paper, we propose a novel framework -- CONNA -- to train a matching component and a decision component jointly via reinforcement learning. The matching component is responsible for finding the top matched candidate for the given paper, and the decision component is responsible for deciding whether to assign the top matched person or create a new person. The two components are intertwined and can be bootstrapped via joint training. Empirically, we evaluate CONNA on two name disambiguation datasets. Experimental results show that the proposed framework can achieve a 1.21%-19.84% improvement in F1-score using joint training of the matching and the decision components. The proposed CONNA has been successfully deployed on AMiner -- a large online academic search system.

preprint | 2020 | arXiv

Critical edge behavior in the singularly perturbed Pollaczek-Jacobi type unitary ensemble

In this paper, we study the strong asymptotics for the orthogonal polynomials and universality associated with the singularly perturbed Pollaczek-Jacobi type weight $$w_{p_J2}(x,t)=e^{-\frac{t}{x(1-x)}}x^\alpha(1-x)^\beta, $$ where $t \ge 0$, $\alpha>0$, $\beta>0$ and $x \in [0,1].$ Our main results include two aspects. I. Strong asymptotics: We obtain the strong asymptotic expansions for the monic Pollaczek-Jacobi type orthogonal polynomials in the interval $(0,1)$ and outside the interval, in $\mathbb{C}\backslash (0,1)$, respectively. Due to the effect of $\frac{t}{x(1-x)}$ for varying $t$, different asymptotic behaviors at the hard edges $0$ and $1$ are found with different scaling schemes. Specifically, the uniform asymptotic behavior can be expressed by an Airy function in the neighborhood of the point $1$ as $\zeta= 2n^2t \to \infty, n\to \infty$, while it is given by a Bessel function as $\zeta\to 0, n \to \infty$. II. Universality: We calculate the limit of the eigenvalue correlation kernel in the bulk of the spectrum and at both sides of the hard edge, which involves a $\psi$-function associated with a particular Painlevé III equation near $x=\pm 1$. Further, we also prove that the $\psi$-function can be approximated by a Bessel kernel as $\zeta\to 0$, compared with an Airy kernel as $\zeta\to \infty$. Our analysis is based on the Deift-Zhou nonlinear steepest descent method for the Riemann-Hilbert problems.

preprint | 2020 | arXiv

High-Fidelity Large-Signal Order Reduction Approach for Composite Load Model

With the increasing penetration of electronic loads and distributed energy resources (DERs), conventional load models cannot capture their dynamics. Therefore, a new comprehensive composite load model has been developed by the Western Electricity Coordinating Council (WECC). However, this model is a complex high-order nonlinear system with a multi-time-scale property, which poses challenges for stability analysis and computational burden in large-scale simulations. In order to reduce the computational burden while preserving the accuracy of the original model, this paper proposes a generic high-fidelity order reduction approach and then applies it to the WECC composite load model. First, we develop a large-signal order reduction (LSOR) method using singular perturbation theory. In this method, the fast dynamics are integrated into the slow dynamics to preserve the transient characteristics of the fast dynamics. Then, we propose the necessary conditions for accurate order reduction and embed them into the LSOR to improve and guarantee the accuracy of the reduced-order model. Finally, we develop the reduced-order WECC composite load model using the proposed algorithm. Simulation results show that the reduced-order large-signal model significantly alleviates the computational burden while maintaining dynamic responses similar to those of the original composite load model.

preprint | 2020 | arXiv

Learning-Based Real-Time Event Identification Using Rich Real PMU Data

A large-scale deployment of phasor measurement units (PMUs) that reveal the inherent physical laws of power systems from a data perspective enables an enhanced awareness of power system operation. However, the high-granularity and non-stationary nature of PMU time series and imperfect data quality could bring great technical challenges to real-time system event identification. To address these issues, this paper proposes a two-stage learning-based framework. At the first stage, a Markov transition field (MTF) algorithm is exploited to extract the latent data features by encoding temporal dependency and transition statistics of PMU data in graphs. Then, a spatial pyramid pooling (SPP)-aided convolutional neural network (CNN) is established to efficiently and accurately identify operation events. The proposed method fully builds on and is also tested on a large real dataset from several tens of PMU sources (and the corresponding event logs), located across the U.S., with a time span of two consecutive years. The numerical results validate that our method has high identification accuracy while showing good robustness against poor data quality.
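
The Markov transition field (MTF) encoding named in this abstract can be sketched in a few lines. This is a generic MTF construction under stated assumptions, not the paper's implementation: the series is quantized into `n_bins` quantile bins (a hypothetical parameter), a first-order transition matrix is estimated between consecutive samples, and entry (i, j) of the field is the transition probability from the bin at time i to the bin at time j.

```python
from bisect import bisect_right


def markov_transition_field(series, n_bins=4):
    """Encode a 1-D time series as a Markov transition field.

    n_bins and the quantile binning scheme are illustrative choices.
    Returns an n-by-n list of lists for a series of length n.
    """
    xs = list(series)
    n = len(xs)
    ordered = sorted(xs)
    # Interior quantile edges so that bins are roughly equally populated.
    edges = [ordered[(k * n) // n_bins] for k in range(1, n_bins)]
    # State index in [0, n_bins - 1] for every sample.
    states = [bisect_right(edges, x) for x in xs]

    # Count first-order transitions between consecutive samples.
    counts = [[0.0] * n_bins for _ in range(n_bins)]
    for a, b in zip(states, states[1:]):
        counts[a][b] += 1.0

    # Row-normalize; rows with no outgoing transitions stay at zero.
    trans = []
    for row in counts:
        total = sum(row)
        trans.append([c / total if total > 0 else 0.0 for c in row])

    # Field entry (i, j): probability of moving from state_i to state_j.
    return [[trans[si][sj] for sj in states] for si in states]
```

The resulting square matrix can be treated as an image, which is what makes it a natural input for a convolutional network such as the CNN mentioned in the abstract.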

preprint | 2020 | arXiv

Populating HI gas in dark matter halos: I. method

We combine data from the Sloan Digital Sky Survey (SDSS) and the Arecibo Legacy Fast ALFA Survey (ALFALFA) to establish an empirical model for the HI gas content within dark matter halos. A cross-match between our SDSS DR7 galaxy group sample and the ALFALFA HI sources provides a catalog of 16,520 HI-galaxy pairs within 14,270 galaxy groups (halos). Using these matched pairs, we model the HI gas mass distributions within halos using two components: 1) in situ galaxy relations that involve the HI masses, colors $({\rm g-r})$ and stellar masses; 2) an ex situ dependence of the HI mass on the halo mass/environment. We find that if we solely use galaxy-associated scaling relations to predict the HI gas distribution (solely component 1), the number of HI detections is significantly over-predicted with respect to the ALFALFA observations. We introduce a concept for the survival of the HI masses/members within halos of different masses, labelled the 'efficiency' factor, in order to describe the probability that a halo retains its HI detections. Taking the above consideration into account, we construct a 'halo based HI mass model' which not only predicts the HI masses of galaxies, but also yields similar number, stellar, halo mass and satellite fraction distributions to the HI detections retrieved from observational data.

preprint | 2020 | arXiv

Statistical Modeling of Networked Solar Resources for Assessing and Mitigating Risk of Interdependent Inverter Tripping Events in Distribution Grids

It is speculated that higher penetration of inverter-based distributed photovoltaic (PV) power generators can increase the risk of tripping events due to voltage fluctuations. To quantify this risk, utilities need to solve the interactive equations of tripping events for networked PVs in real time. However, these equations are non-differentiable, nonlinear, and exponentially complex, and thus cannot be used as a tractable basis for solar curtailment prediction and mitigation. Furthermore, load/PV power values might not be available in real time due to limited grid observability, which further complicates tripping event prediction. To address these challenges, we have employed Chebyshev's inequality to obtain an alternative probabilistic model for quantifying the risk of tripping for networked PVs. The proposed model enables operators to estimate the probability of interdependent inverter tripping events using only PV/load statistics and in a scalable manner. Furthermore, by integrating this probabilistic model into an optimization framework, countermeasures are designed to mitigate massive interdependent tripping events. Since the proposed model is parameterized using only the statistical characteristics of nodal active/reactive powers, it is especially beneficial in practical systems, which have limited real-time observability. Numerical experiments have been performed employing real data and feeder models to verify the performance of the proposed technique.
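
How Chebyshev's inequality turns statistics into a risk bound can be shown with a single-node sketch. This is not the paper's networked model: it only assumes that the mean and standard deviation of one nodal voltage are known, and it uses a hypothetical per-unit trip band of 0.95 to 1.05. Since leaving the band requires a deviation from the mean of at least the smaller margin d, the probability is at most (sigma / d)^2.

```python
def chebyshev_violation_bound(mean_v, std_v, v_min=0.95, v_max=1.05):
    """Upper bound on the probability that a voltage leaves [v_min, v_max].

    Uses Chebyshev's inequality P(|V - mean| >= d) <= (std / d)^2 with d
    equal to the distance from the mean to the nearest band limit. The
    default band limits are hypothetical per-unit trip thresholds.
    """
    margin = min(mean_v - v_min, v_max - mean_v)
    if margin <= 0:
        # Mean on or outside the band: the inequality gives no information.
        return 1.0
    # A probability bound is never larger than one.
    return min(1.0, (std_v / margin) ** 2)
```

With a mean of 1.0 per unit and a standard deviation of 0.01, the bound is about 0.04. The bound needs no distributional assumption beyond finite variance, which is why it suits settings where only statistics of nodal powers or voltages are available; the price is that it can be quite conservative.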