Researcher profile

Yucheng Dong

Yucheng Dong contributes to research discovery and scholarly infrastructure.

Researcher | Affiliation not imported | Open to collaborate

Trust snapshot

Quick read

Trust 15 - Unverified
Verification L1
Unclaimed author
3 works
0 followers
6 topics
4 close collaborators

Published work

3 published items

preprint | 2021 | arXiv

Exploring the space-time pattern of log-transformed infectious count of COVID-19: a clustering-segmented autoregressive sigmoid model

By the end of April 20, 2020, only a few new COVID-19 cases remained in China, whereas the rest of the world was seeing increases in the number of new cases. An efficient statistical model of COVID-19 spread is therefore of great importance to the global fight against the virus. We propose a clustering-segmented autoregressive sigmoid (CSAS) model to explore the space-time pattern of the log-transformed infectious count. The CSAS model incorporates four key characteristics: unknown clusters, change points, stretched S-curves, and autoregressive terms. These are used to understand how the outbreak spreads in time and in space, to understand how the spread is affected by epidemic control strategies, and to apply the model to updated data from an extended period of time. We propose a nonparametric graph-based clustering method for discovering dissimilarity of the curve time series in space, and we justify it theoretically under mild and easily verified conditions. We also propose a very strict purity score that penalizes overestimation of the number of clusters. Simulations show that our nonparametric graph-based clustering method is faster and more accurate than the parametric clustering method regardless of the size of the data sets. We provide a Bayesian information criterion (BIC) to identify multiple change points and calculate a confidence interval for a mean response. By applying the CSAS model to the collected data, we can explain the differences between prevention and control policies in China and selected countries.

preprint | 2020 | arXiv

Distributed Linguistic Representations in Decision Making: Taxonomy, Key Elements and Applications, and Challenges in Data Science and Explainable Artificial Intelligence

Distributed linguistic representations are powerful tools for modelling the uncertainty and complexity of preference information in linguistic decision making. To provide a comprehensive perspective on the development of distributed linguistic representations in decision making, we present a taxonomy of existing distributed linguistic representations. We then review the key elements of distributed linguistic information processing in decision making, including distance measurement, aggregation methods, distributed linguistic preference relations, and distributed linguistic multiple attribute decision making models. Finally, we discuss ongoing challenges and future research directions from the perspective of data science and explainable artificial intelligence.

preprint | 2020 | arXiv

On scenario construction for stochastic shortest path problems in real road networks

Stochastic shortest path computations are often performed under very strict time constraints, so computational efficiency is critical. A major determinant of the CPU time is the number of scenarios used. We demonstrate that by carefully choosing the scenario generation method, the quality of the computations can be improved substantially over random sampling for a given number of scenarios. We study a real case from a California freeway network with 438 road links and 24 5-minute time periods, implying 10,512 random speed variables, correlated in time and space, leading to a total of 55,245,816 distinct correlations. We find that (1) the scenario generation method generates unbiased scenarios and strongly outperforms random sampling in terms of stability (i.e., relative difference and variance) whichever origin-destination pair and objective function is used; (2) to achieve a certain accuracy, the number of scenarios required for scenario generation is much lower than that for random sampling, typically about 6-10 times lower for a stability level of 1%; and (3) different origin-destination pairs and different objective functions could require different numbers of scenarios to achieve a specified stability.
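
The variable and correlation counts quoted in this abstract can be checked with a few lines of arithmetic. The sketch below assumes one random speed variable per road link and time period, and counts each unordered pair of distinct variables once. The variable names are illustrative and do not come from the paper.

```python
# Counts taken from the abstract above.
road_links = 438
time_periods = 24  # 5-minute periods

# Assumption: one random speed variable per (link, period) pair.
speed_variables = road_links * time_periods

# Distinct pairwise correlations: unordered pairs of different variables,
# i.e. n * (n - 1) / 2 for n variables.
distinct_correlations = speed_variables * (speed_variables - 1) // 2

print(speed_variables)        # 10512
print(distinct_correlations)  # 55245816
```

Both results match the figures stated in the abstract (10,512 variables and 55,245,816 distinct correlations).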