Trust snapshot

Quick read

Trust 21 - Emerging
Verification L1
Unclaimed author
23 works
0 followers
17 topics
4 close collaborators

Published work

23 published item(s)

preprint, 2026, arXiv

DepthPilot: From Controllability to Interpretability in Colonoscopy Video Generation

Controllable medical video generation has achieved remarkable progress, but it still lacks interpretability, which requires the alignment of generated contents with physical priors and faithful clinical manifestations. To push the boundaries from mere controllability to interpretability, we propose DepthPilot, the first interpretable framework for colonoscopy video generation. This work takes a step toward trustworthy generation through two synergistic paradigms. To achieve explicit geometric grounding, DepthPilot devises a prior distribution alignment strategy, injecting depth constraints into the diffusion backbone via parameter-efficient fine-tuning to ensure anatomical fidelity. To enhance intrinsic nonlinear modeling under these geometric constraints, DepthPilot employs an adaptive spline denoising module, replacing fixed linear weights with learnable spline functions to capture complex spatio-temporal dynamics. Extensive evaluations across three public datasets and in-house clinical data confirm DepthPilot's robust ability to produce physically consistent videos. It achieves FID scores below 15 across all benchmarks and ranks first in clinician assessments, bridging the gap between "visually realistic" and "clinically interpretable". Moreover, DepthPilot-generated videos are expected to enable reliable 3D reconstruction, facilitating surgical navigation and blind region identification, and serve as a foundation toward the colorectal world model.

preprint, 2026, arXiv

MLB: A Scenario-Driven Benchmark for Evaluating Large Language Models in Clinical Applications

The proliferation of Large Language Models (LLMs) presents transformative potential for healthcare, yet practical deployment is hindered by the absence of frameworks that assess real-world clinical utility. Existing benchmarks test static knowledge, failing to capture the dynamic, application-oriented capabilities required in clinical practice. To bridge this gap, we introduce the Medical LLM Benchmark (MLB), a comprehensive benchmark evaluating LLMs on both foundational knowledge and scenario-based reasoning. MLB is structured around five core dimensions: Medical Knowledge (MedKQA), Safety and Ethics (MedSE), Medical Record Understanding (MedRU), Smart Services (SmartServ), and Smart Healthcare (SmartCare). The benchmark integrates 22 datasets (17 newly curated) from diverse Chinese clinical sources, covering 64 clinical specialties. Its design features a rigorous curation pipeline involving 300 licensed physicians. In addition, we provide a scalable evaluation methodology centered on a specialized judge model trained via Supervised Fine-Tuning (SFT) on expert annotations. Our comprehensive evaluation of 10 leading models reveals a critical translational gap: while the top-ranked model, Kimi-K2-Instruct (77.3% accuracy overall), excels in structured tasks like information extraction (87.8% accuracy in MedRU), performance plummets in patient-facing scenarios (61.3% in SmartServ). Moreover, the exceptional safety score (90.6% in MedSE) of the much smaller Baichuan-M2-32B highlights that targeted training is equally critical. Our specialized judge model, trained via SFT on a 19k expert-annotated medical dataset, achieves 92.1% accuracy, an F1-score of 94.37%, and a Cohen's Kappa of 81.3% for human-AI consistency, validating a reproducible and expert-aligned evaluation protocol. MLB thus provides a rigorous framework to guide the development of clinically viable LLMs.
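
The human-AI consistency figure in this abstract is reported as Cohen's kappa. As a minimal sketch of how that statistic is computed from two raters' labels, the snippet below uses the textbook definition. The function name and the `human` / `judge` label lists are invented for illustration; they are not MLB data or the authors' evaluation code.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two equal-length sequences of categorical labels."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("need two non-empty label sequences of equal length")
    n = len(rater_a)
    # Observed agreement: fraction of items on which both raters agree.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement: product of each rater's marginal label frequencies.
    counts_a = Counter(rater_a)
    counts_b = Counter(rater_b)
    expected = sum(counts_a[label] * counts_b[label] for label in counts_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used one identical label throughout
    return (observed - expected) / (1.0 - expected)


# Hypothetical labels: a human expert vs. an automated judge on 8 answers.
human = ["pass", "pass", "fail", "pass", "fail", "fail", "pass", "pass"]
judge = ["pass", "pass", "fail", "fail", "fail", "fail", "pass", "pass"]
kappa = cohens_kappa(human, judge)
```

Kappa discounts the agreement that two raters would reach by chance, which is why it is typically lower than raw accuracy on the same labels.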

preprint, 2026, arXiv

Prompt-Induced Over-Generation as Denial-of-Service: A Black-Box Attack-Side Benchmark

Large Language Models (LLMs) can be driven into over-generation, emitting thousands of tokens before producing an end-of-sequence (EOS) token. This degrades answer quality, inflates latency and cost, and can be weaponized as a denial-of-service (DoS) attack. Recent work has begun to study DoS-style prompt attacks, but typically focuses on a single attack algorithm or assumes white-box access, without an attack-side benchmark that compares prompt-based attackers in a black-box, query-only regime with a known tokenizer. We introduce such a benchmark and study two prompt-only attackers. The first is an Evolutionary Over-Generation Prompt Search (EOGen) that searches the token space for prefixes that suppress EOS and induce long continuations. The second is a goal-conditioned reinforcement learning attacker (RL-GOAL) that trains a network to generate prefixes conditioned on a target length. To characterize behavior, we introduce Over-Generation Factor (OGF): the ratio of produced tokens to a model's context window, along with stall and latency summaries. EOGen discovers short-prefix attacks that raise Phi-3 to OGF = 1.39 +/- 1.14 (Success@>=2: 25.2%); RL-GOAL nearly doubles severity to OGF = 2.70 +/- 1.43 (Success@>=2: 64.3%) and drives budget-hit non-termination in 46% of trials.
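
The Over-Generation Factor defined above is a simple ratio, so a short sketch may help make the reported numbers concrete. The snippet assumes that "Success@>=2" means the fraction of trials whose OGF is at least 2, which is an interpretation rather than something the abstract states. The context window size and trial lengths are hypothetical values, not results from the benchmark.

```python
from statistics import mean


def over_generation_factor(produced_tokens, context_window):
    """OGF: produced tokens divided by the model's context window."""
    if context_window <= 0:
        raise ValueError("context_window must be positive")
    return produced_tokens / context_window


def success_rate(ogf_values, threshold=2.0):
    """Fraction of trials with OGF >= threshold (assumed meaning of Success@>=2)."""
    return sum(v >= threshold for v in ogf_values) / len(ogf_values)


CONTEXT_WINDOW = 4096                       # hypothetical context window
trial_lengths = [2048, 8192, 12288, 4096]   # hypothetical tokens produced per trial

ogfs = [over_generation_factor(n, CONTEXT_WINDOW) for n in trial_lengths]
mean_ogf = mean(ogfs)
success_at_2 = success_rate(ogfs)
```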

preprint, 2024, arXiv

USFM: A Universal Ultrasound Foundation Model Generalized to Tasks and Organs towards Label Efficient Image Analysis

Inadequate generality across different organs and tasks constrains the application of ultrasound (US) image analysis methods in smart healthcare. Building a universal US foundation model holds the potential to address these issues. Nevertheless, the development of such foundation models encounters intrinsic challenges in US analysis, i.e., insufficient databases, low quality, and ineffective features. In this paper, we present a universal US foundation model, named USFM, generalized to diverse tasks and organs towards label-efficient US image analysis. First, a large-scale Multi-organ, Multi-center, and Multi-device US database was built, containing over two million US images. Organ-balanced sampling was employed for unbiased learning. USFM is then pre-trained on this database in a self-supervised manner. To extract effective features from low-quality US images, we propose a spatial-frequency dual masked image modeling method. A spatial noise addition-recovery approach was designed to learn meaningful US information robustly, while a novel frequency band-stop masking learning approach was also employed to extract complex, implicit grayscale distribution and textural variations. Extensive experiments were conducted on the various tasks of segmentation, classification, and image enhancement from diverse organs and diseases. Comparisons with representative US image analysis models illustrate the universality and effectiveness of USFM. The label efficiency experiments suggest that USFM obtains robust performance with only 20% annotation, laying the groundwork for the rapid development of US models in clinical practice.

preprint, 2022, arXiv

An Online Joint Optimization-Estimation Architecture for Distribution Networks

In this paper, we propose an optimal control-estimation architecture for distribution networks, which jointly solves the optimal power flow (OPF) problem and static state estimation (SE) problem through an online gradient-based feedback algorithm. The main objective is to enable a fast and timely interaction between the optimal controllers and state estimators with limited sensor measurements. First, convergence and optimality of the proposed algorithm are analytically established. Then, the proposed gradient-based algorithm is modified by introducing statistical information of the inherent estimation and linearization errors for an improved and robust performance of the online control decisions. Overall, the proposed method eliminates the traditional separation of control and operation, where control and estimation usually operate at distinct layers and different time-scales. Hence, it enables a computationally affordable, efficient and robust online operational framework for distribution networks under time-varying settings.

preprint, 2022, arXiv

Cosmological Implications of Axion-Matter Couplings

Axions and other light particles appear ubiquitously in physics beyond the Standard Model, with a variety of possible couplings to ordinary matter. Cosmology offers a unique probe of these particles as they can thermalize in the hot environment of the early universe for any such coupling. For sub-MeV particles, their entropy must leave a measurable cosmological signal, usually via the effective number of relativistic particles, $N_\mathrm{eff}$. In this paper, we will revisit the cosmological constraints on the couplings of axions and other pseudo-Nambu-Goldstone bosons to Standard Model fermions from thermalization below the electroweak scale, where these couplings are marginal and give contributions to the radiation density of $\Delta N_\mathrm{eff} > 0.027$. We update the calculation of the production rates to eliminate unnecessary approximations and find that the cosmological bounds on these interactions are complementary to astrophysical constraints, e.g. from supernova SN 1987A. We additionally provide quantitative explanations for these bounds and their relationship.

preprint, 2022, arXiv

Emotion Recognition From Gait Analyses: Current Research and Future Directions

Human gait is an everyday motion that not only reflects mobility but can also be used to identify the walker, by either human observers or computers. Recent studies reveal that gait even conveys information about the walker's emotion. Individuals in different emotional states may show different gait patterns. The mapping between various emotions and gait patterns provides a new source for automated emotion recognition. Compared to traditional emotion detection biometrics, such as facial expression, speech and physiological parameters, gait is remotely observable, more difficult to imitate, and requires less cooperation from the subject. These advantages make gait a promising source for emotion detection. This article reviews current research on gait-based emotion detection, particularly on how gait parameters can be affected by different emotional states and how emotional states can be recognized through distinct gait patterns. We focus on the detailed methods and techniques applied in the whole process of emotion recognition: data collection, preprocessing, and classification. Finally, we discuss possible future developments of efficient and effective gait-based emotion recognition using state-of-the-art techniques in intelligent computation and big data.

preprint, 2022, arXiv

Incorporate Day-ahead Robustness and Real-time Incentives for Electricity Market Design

In this paper, we propose a two-stage electricity market framework to explore the participation of distributed energy resources (DERs) in a day-ahead (DA) market and a real-time (RT) market. The objective is to determine the optimal bidding strategies of the aggregated DERs in the DA market and generate online incentive signals for DER-owners to optimize the social welfare, taking into account network operational constraints. Distributionally robust optimization is used to explicitly incorporate data-based statistical information of renewable forecasts into the supply/demand decisions in the DA market. We evaluate the conservativeness of bidding strategies distinguished by different risk aversion settings. In the RT market, a bi-level time-varying optimization problem is proposed to design online incentive signals that trade off the RT imbalance penalty for distribution system operators (DSOs) against the costs of individual DER-owners. This enables tracking their optimal dispatch to provide fast balancing services in the presence of time-varying network states while satisfying the voltage regulation requirement. Simulation results on both the DA wholesale market and the RT balancing market demonstrate the necessity of this two-stage design, as well as its robustness to uncertainties, convergence performance, tracking ability, and the feasibility of the resulting network operations.

preprint, 2022, arXiv

Multiple Ancillary Services Provision by Distributed Energy Resources in Active Distribution Networks

The electric power system is currently experiencing radical changes stemming from the increasing share of renewable energy resources and the consequent decommissioning of conventional power plants based on synchronous generators. Since the principal providers of ancillary services are being phased out, new flexibility and reserve providers are needed. The proliferation of Distributed Energy Resources (DERs) in modern distribution networks has opened new possibilities for distribution system operators, enabling them to fill the market gap by harnessing the DER flexibility. This paper introduces a novel centralized MPC-based controller that enables the concurrent provision of voltage support, primary and secondary frequency control by adjusting the setpoints of a heterogeneous group of DERs in active distribution grids. The input-multirate control framework is used to accommodate the distinct timescales and provision requirements of each ancillary service and to ensure that the available resources are properly allocated. Furthermore, an efficient way for incorporating network constraints in the formulation is proposed, where network decomposition is applied to a linear power flow formulation together with network reduction. In addition, different timescale dynamics of the employed DERs and their capability curves are included. The performance of the proposed controller is evaluated on several case studies via dynamic simulations of the IEEE 33-bus system.

preprint, 2022, arXiv

Optimal Power Flow with State Estimation In the Loop for Distribution Networks

We propose a framework for integrating optimal power flow (OPF) with state estimation (SE) in the loop for distribution networks. Our approach combines a primal-dual gradient-based OPF solver with an SE feedback loop based on a limited set of sensors for system monitoring, instead of assuming exact knowledge of all states. The estimation algorithm reduces uncertainty on unmeasured grid states based on a few appropriate online state measurements and noisy "pseudo-measurements". We analyze the convergence of the proposed algorithm and quantify the statistical estimation errors based on a weighted least squares (WLS) estimator. The numerical results on a 4521-node network demonstrate that this approach can scale to extremely large networks and provide robustness to both large pseudo-measurement variability and inherent sensor measurement noise.

preprint, 2022, arXiv

Optimal Pump Control for Water Distribution Networks via Data-based Distributional Robustness

In this paper, we propose a data-based methodology to solve a multi-period stochastic optimal water flow (OWF) problem for water distribution networks (WDNs). The framework explicitly considers the pump schedule and water network head level with limited information on demand forecast errors for an extended period simulation. The objective is to determine the optimal feedback decisions of network-connected components, such as nominal pump schedules, tank head levels, and reserve policies, which specify device reactions to forecast errors to accommodate fluctuating water demand. Instead of assuming that the uncertainties across the water network are generated by a prescribed distribution, we consider ambiguity sets of distributions centered at an empirical distribution, which is based directly on a finite training data set. We use a distance-based ambiguity set with the Wasserstein metric to quantify the distance between the real unknown data-generating distribution and the empirical distribution. This allows our multi-period OWF framework to trade off system performance and inherent sampling errors in the training dataset. Case studies on a three-tank water distribution network systematically illustrate the tradeoff between pump operational cost, risks of constraint violation, and out-of-sample performance.
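
For readers unfamiliar with the construction, a Wasserstein ambiguity set centered at an empirical distribution is usually written as follows. This is the standard textbook form rather than the paper's exact formulation; the symbols $\hat{P}_N$, $\xi_i$, $\varepsilon$, and $d_W$ are notation assumed here for illustration.

```latex
\hat{P}_N = \frac{1}{N} \sum_{i=1}^{N} \delta_{\xi_i},
\qquad
\mathcal{P}_{\varepsilon} = \left\{ Q \;:\; d_W\!\left(Q, \hat{P}_N\right) \le \varepsilon \right\}
```

Here $\xi_1, \dots, \xi_N$ are the training samples, $\delta_{\xi_i}$ is a point mass at sample $\xi_i$, $d_W$ is the Wasserstein distance, and the radius $\varepsilon \ge 0$ controls how conservative the resulting decisions are: $\varepsilon = 0$ trusts the empirical distribution completely, while a larger radius guards against sampling error.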

preprint, 2022, arXiv

Sparse Structure Design for Stochastic Linear Systems via a Linear Matrix Inequality Approach

In this paper, we propose a sparsity-promoting feedback control design for stochastic linear systems with multiplicative noise. The objective is to identify a sparse control architecture that optimizes the closed-loop performance while stabilizing the system in the mean-square sense. The proposed approach approximates the nonconvex combinatorial optimization problem by minimizing various matrix norms subject to the Linear Matrix Inequality (LMI) stability condition. We present two design problems to reduce the number of actuators via the static state-feedback and a low-dimensional output. A regularized linear quadratic regulator with multiplicative noise (LQRm) optimal control problem and its convex relaxation are presented to demonstrate the tradeoff between the suboptimal closed-loop performance and the sparsity degree of control structure. Case studies on power grids for wide-area frequency control show that the proposed sparsity-promoting control can considerably reduce the number of actuators without significant loss in system performance. The sparse control architecture is robust to substantial system-level disturbances while achieving mean-square stability.

preprint, 2022, arXiv

Tractable Data Enriched Distributionally Robust Chance-Constrained CVR

This paper proposes a tractable distributionally robust chance-constrained conservation voltage reduction (DRCC-CVR) method with an enriched data-based ambiguity set for unbalanced three-phase distribution systems. The increasing penetration of distributed renewable energy not only brings clean power but also challenges the voltage regulation and energy-saving performance of CVR by introducing high uncertainties to distribution systems. In most cases, the conventional robust optimization methods for CVR only provide conservative solutions. To better consider the impacts of load and PV generation uncertainties on CVR implementation in distribution systems and provide less conservative solutions, this paper develops a data-based DRCC-CVR model with a tractable reformulation and a data enrichment method. Even though the uncertainties of load and photovoltaic (PV) generation can be captured by data, the availability of smart meters (SMs) and micro-phasor measurement units (PMUs) is restricted by budget. The limited data access may hinder the performance of the proposed DRCC-CVR. Thus, we further present a data enrichment method to statistically recover the high-resolution load and PV generation data from low-resolution data with Gaussian Process Regression (GPR) and Markov Chain (MC) models, which can be used to construct a data-based moment ambiguity set of uncertainty distributions for the proposed DRCC-CVR. Finally, the nonlinear power flow and voltage-dependent load models and the DRCC with a moment-based ambiguity set are reformulated to be computationally tractable and tested on a real distribution feeder in the Midwest U.S. to validate the effectiveness and robustness of the proposed method.

preprint, 2021, arXiv

Casting voids in nickel superalloy and the mechanical behaviour under room temperature tensile deformation

The microstructure of a second-generation nickel-base superalloy is studied using X-ray computed tomography (XCT) and scanning electron microscopy (SEM). The as-cast material contains 0.15 (±0.001) vol% voids, which are distributed in the inter-dendritic region. The volume fraction of the voids increases to 0.21 (±0.001) vol% after tensile deformation. Surface observations show evidence of dislocation emissions from the void surface, a mechanism that possibly facilitates the expansion of the voids and contributes to the increased void volume fraction. Phenomenological parameters such as stress triaxiality, often believed to control void growth, are investigated through crystal plasticity simulation and compared with data reported in the literature. The results indicate a weak correlation between stress triaxiality and void growth, but this may be due to the lack of data at higher levels of plastic deformation, which is limited by the ductility of the material. The distribution of the stress triaxiality field within the sample is heterogeneous and the peak of the triaxiality field is a function of the ratio between notch diameter and sample width. A smaller notch diameter to sample width ratio tends to distribute the triaxiality peaks towards the centre of the sample but also leads to higher strain localisation, an effect that results in early sample failure.

preprint, 2020, arXiv

Algebraic 3D Graphic Statics: reciprocal constructions

The recently developed 3D graphic statics (3DGS) lacks a rigorous mathematical definition relating the geometrical and topological properties of the reciprocal polyhedral diagrams as well as a precise method for the geometric construction of these diagrams. This paper provides a fundamental algebraic formulation for 3DGS by developing equilibrium equations around the edges of the primal diagram and satisfying the equations by the closeness of the polygons constructed by the edges of the corresponding faces in the dual/reciprocal diagram. The research provides multiple numerical methods for solving the equilibrium equations and explains the advantage of using each technique. The approach of this paper can be used for compression-and-tension combined form-finding and analysis as it allows constructing both the form and force diagram based on the interpretation of the input diagram. Besides, the paper expands on the geometric/static degrees of (in)determinacies of the diagrams using the algebraic formulation and shows how these properties can be used for the constrained manipulation of the polyhedrons in an interactive environment without breaking the reciprocity between the two.

preprint, 2020, arXiv

Conceptual Design and Preliminary Results of a VR-based Radiation Safety Training System for Interventional Radiologists

Recent studies have reported an increased risk of developing brain and neck tumors, as well as cataracts, in practitioners in interventional radiology (IR). Occupational radiation protection in IR has been a top concern for regulatory agencies and professional societies. To help minimize occupational radiation exposure in IR, we conceptualized a virtual reality (VR) based radiation safety training system to help operators understand complex radiation fields and to avoid high radiation areas through game-like interactive simulations. The preliminary development of the system has yielded results suggesting that the training system can calculate and report the radiation exposure after each training session based on a database precalculated from computational phantoms and Monte Carlo simulations and the position information provided in real time by the MS HoloLens headset worn by the trainee. In addition, real-time dose rate and cumulative dose will be displayed to the trainee by the MS HoloLens to help them adjust their practice. This paper presents the conceptual design of the overall hardware and software, as well as preliminary results on combining the MS HoloLens headset with complex 3D X-ray field spatial distribution data to create a mixed reality environment for safety training purposes in IR.

preprint, 2020, arXiv

CosmoVAE: Variational Autoencoder for CMB Image Inpainting

Cosmic microwave background radiation (CMB) is critical to the understanding of the early universe and precise estimation of cosmological constants. Due to the contamination of thermal dust noise in the galaxy, the CMB map, which is an image on the two-dimensional sphere, has missing observations, mainly concentrated in the equatorial region. The noise of the CMB map has a significant impact on the estimation precision for cosmological parameters. Inpainting the CMB map can effectively reduce the uncertainty of parametric estimation. In this paper, we propose a deep learning-based variational autoencoder, CosmoVAE, to restore the missing observations of the CMB map. The input and output of CosmoVAE are square images. To generate training, validation, and test data sets, we segment the full-sky CMB map into many small images by Cartesian projection. CosmoVAE assigns physical quantities to the parameters of the VAE network by using the angular power spectrum of the Gaussian random field as latent variables. CosmoVAE adopts a new loss function to improve the learning performance of the model, which consists of $\ell_1$ reconstruction loss, Kullback-Leibler divergence between the posterior distribution of the encoder network and the prior distribution of latent variables, perceptual loss, and a total-variation regularizer. The proposed model achieves state-of-the-art performance for Planck Commander 2018 CMB map inpainting.

preprint, 2020, arXiv

Depth Edge Guided CNNs for Sparse Depth Upsampling

Guided sparse depth upsampling aims to upsample an irregularly sampled sparse depth map when an aligned high-resolution color image is given as guidance. Many neural networks have been designed for this task. However, they often ignore the structural difference between the depth and the color image, resulting in obvious artifacts such as texture copy and depth blur in the upsampled depth. Inspired by the normalized convolution operation, we propose a guided convolutional layer to recover dense depth from a sparse and irregular depth image with a depth edge image as guidance. Our novel guided network can prevent the depth value from crossing the depth edge to facilitate upsampling. We further design a convolutional network based on the proposed convolutional layer to combine the advantages of different algorithms and achieve better performance. We conduct comprehensive experiments to verify our method on real-world indoor and synthetic outdoor datasets. Our method produces strong results. It outperforms state-of-the-art methods on the Virtual KITTI dataset and the Middlebury dataset. It also presents strong generalization capability under different 3D point densities and various lighting and weather conditions.

preprint, 2020, arXiv

Gradient-Based Multi-Area Distribution System State Estimation

The increasing distributed and renewable energy resources and controllable devices in distribution systems make fast distribution system state estimation (DSSE) crucial in system monitoring and control. We consider a large multi-phase distribution system and formulate DSSE as a weighted least squares (WLS) problem. We divide the large distribution system into smaller areas of subtree structure, and by jointly exploring the linearized power flow model and the network topology, we propose a gradient-based multi-area algorithm to exactly and efficiently solve the WLS problem. The proposed algorithm enables distributed and parallel computation of the state estimation problem without compromising any performance. Numerical results on a 4,521-node test feeder show that the designed algorithm features fast convergence and accurate estimation results. Comparison with traditional Gauss-Newton method shows that the proposed method has much better performance in distribution systems with a limited amount of reliable measurement. The real-time implementation of the algorithm tracks time-varying system states with high accuracy.
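
As background for the WLS formulation mentioned in this abstract, the generic weighted least squares problem for a linear (or linearized) measurement model, and a plain gradient step on it, can be written as below. This is a textbook sketch, not the paper's multi-area algorithm; the measurement vector $z$, measurement matrix $H$, symmetric positive-definite weight matrix $W$, and step size $\alpha$ are symbols assumed here.

```latex
\hat{x} = \arg\min_{x} \; J(x), \qquad J(x) = (z - Hx)^{\top} W (z - Hx)

\nabla_x J(x) = -2\, H^{\top} W (z - Hx)

x^{k+1} = x^{k} - \alpha \nabla_x J(x^{k}) = x^{k} + 2\alpha\, H^{\top} W \left(z - H x^{k}\right)
```

When $H^{\top} W H$ is invertible, the minimizer also has the closed form $\hat{x} = (H^{\top} W H)^{-1} H^{\top} W z$. Gradient iterations avoid forming and inverting that matrix, which is one reason gradient-based schemes are attractive for large networks.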

preprint, 2020, arXiv

Integrating Crowdsourcing and Active Learning for Classification of Work-Life Events from Tweets

Social media, especially Twitter, is being increasingly used for research with predictive analytics. In social media studies, natural language processing (NLP) techniques are used in conjunction with expert-based, manual and qualitative analyses. However, social media data are unstructured and must undergo complex manipulation for research use. Manual annotation is the most resource- and time-consuming step, because multiple expert raters have to reach consensus on every item, but it is essential for creating gold-standard datasets for training NLP-based machine learning classifiers. To reduce the burden of manual annotation while maintaining its reliability, we devised a crowdsourcing pipeline combined with active learning strategies. We demonstrated its effectiveness through a case study that identifies job loss events from individual tweets. We used the Amazon Mechanical Turk platform to recruit annotators from the Internet and designed a number of quality control measures to assure annotation accuracy. We evaluated 4 different active learning strategies (i.e., least confident, entropy, vote entropy, and Kullback-Leibler divergence). The active learning strategies aim at reducing the number of tweets needed to reach a desired performance of automated classification. Results show that crowdsourcing is useful to create high-quality annotations and active learning helps in reducing the number of required tweets, although there was no substantial difference among the strategies tested.
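
Two of the strategies named in this abstract, least confident and entropy, are standard uncertainty-sampling scores. The following is a minimal sketch of how such scores rank an unlabeled pool. The function names, tweet identifiers, and class probabilities are hypothetical and are not taken from the study's pipeline.

```python
import math


def least_confident_score(probs):
    """Uncertainty as one minus the probability of the most likely class."""
    return 1.0 - max(probs)


def entropy_score(probs):
    """Shannon entropy (in nats) of the predicted class distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0)


def select_for_annotation(pool, score_fn, k):
    """Return the ids of the k most uncertain items in the pool."""
    ranked = sorted(pool, key=lambda item: score_fn(item[1]), reverse=True)
    return [item_id for item_id, _ in ranked[:k]]


# Hypothetical pool: (tweet id, classifier probabilities for [job loss, other]).
pool = [
    ("t1", [0.95, 0.05]),
    ("t2", [0.55, 0.45]),
    ("t3", [0.70, 0.30]),
]

picked_lc = select_for_annotation(pool, least_confident_score, 2)
picked_ent = select_for_annotation(pool, entropy_score, 2)
```

For a two-class problem both scores rank items identically, since each increases as the prediction approaches 50/50; they can diverge once there are three or more classes.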

preprint, 2020, arXiv

Solving Optimal Power Flow for Distribution Networks with State Estimation Feedback

Conventional optimal power flow (OPF) solvers assume full observability of the involved system states. However, in practice, there is a lack of reliable system monitoring devices in the distribution networks. To close the gap between the theoretic algorithm design and practical implementation, this work proposes to solve the OPF problems based on the state estimation (SE) feedback for the distribution networks where only a part of the involved system states are physically measured. The SE feedback increases the observability of the under-measured system and provides more accurate system states monitoring when the measurements are noisy. We analytically investigate the convergence of the proposed algorithm. The numerical results demonstrate that the proposed approach is more robust to large pseudo measurement variability and inherent sensor noise in comparison to the other frameworks without SE feedback.

preprint, 2020, arXiv

Stochastic Dynamic Programming for Wind Farm Power Maximization

Wind farms can increase annual energy production (AEP) with advanced control algorithms by coordinating the set points of individual turbine controllers across the farm. However, it remains a significant challenge to achieve performance improvements in practice because of the difficulty of utilizing models that capture pertinent complex aerodynamic phenomena while remaining amenable to control design. We formulate a multi-stage stochastic optimal control problem for wind farm power maximization and show that it can be solved analytically via dynamic programming. In particular, our model incorporates state- and input-dependent multiplicative noise whose distributions capture stochastic wind fluctuations. The optimal control policies and value functions explicitly incorporate the moments of these distributions, establishing a connection between wind flow data and optimal feedback control. We illustrate the results with numerical experiments that demonstrate the advantages of our approach over existing methods based on deterministic models.

preprint, 2020, arXiv

Survey on Visual Analysis of Event Sequence Data

Event sequence data record series of discrete events in the time order of occurrence. They are commonly observed in a variety of applications ranging from electronic health records to network logs, and are characteristically large-scale, high-dimensional, and heterogeneous. This high complexity makes it difficult for analysts to manually explore event sequence data and find patterns, resulting in an ever-increasing need for computational and perceptual aids from visual analytics techniques to extract and communicate insights from event sequence datasets. In this paper, we review the state-of-the-art visual analytics approaches, characterize them with our proposed design space, and categorize them based on analytical tasks and applications. From our review of relevant literature, we have also identified several remaining research challenges and future research opportunities.