Researcher profile

Yu Ding

Yu Ding contributes to research discovery and scholarly infrastructure.

Researcher · Affiliation not imported · Open to collaborate

Trust snapshot

Quick read

Trust 21 - Emerging · Verification L1 · Unclaimed author
13 works
0 followers
8 topics
4 close collaborators

Published work

13 published items

preprint · 2022 · arXiv

A Spatio-temporal Track Association Algorithm Based on Marine Vessel Automatic Identification System Data

Tracking multiple moving objects in real-time in a dynamic threat environment is an important element in national security and surveillance systems. It helps pinpoint and distinguish potential candidates posing threats from other normal objects and monitor the anomalous trajectories until intervention. To locate the anomalous pattern of movements, one needs to have an accurate data association algorithm that can associate the sequential observations of locations and motion with the underlying moving objects, and therefore, build the trajectories of the objects as the objects are moving. In this work, we develop a spatio-temporal approach for tracking maritime vessels as the vessel's location and motion observations are collected by an Automatic Identification System. The proposed approach is developed as an effort to address a data association challenge in which the number of vessels as well as the vessel identification are purposely withheld and time gaps are created in the datasets to mimic the real-life operational complexities under a threat environment. Three training datasets and five test sets are provided in the challenge and a set of quantitative performance metrics is devised by the data challenge organizer for evaluating and comparing resulting methods developed by participants. When our proposed track association algorithm is applied to the five test sets, the algorithm achieves very competitive performance.

preprint · 2022 · arXiv

Augmented Equivariant Attention Networks for Microscopy Image Reconstruction

It is time-consuming and expensive to take high-quality or high-resolution electron microscopy (EM) and fluorescence microscopy (FM) images. Taking these images can even be invasive to samples and may damage certain subtleties in the samples after long or intense exposures, often necessary for achieving high-quality or high resolution in the first place. Advances in deep learning enable us to perform image-to-image transformation tasks for various types of microscopy image reconstruction, computationally producing high-quality images from the physically acquired low-quality ones. When training image-to-image transformation models on pairs of experimentally acquired microscopy images, prior models suffer from performance loss due to their inability to capture inter-image dependencies and common features shared among images. Existing methods that take advantage of shared features in image classification tasks cannot be properly applied to image reconstruction tasks because they fail to preserve the equivariance property under spatial permutations, something essential in image-to-image transformation. To address these limitations, we propose the augmented equivariant attention networks (AEANets) with better capability to capture inter-image dependencies, while preserving the equivariance property. The proposed AEANets capture inter-image dependencies and shared features via two augmentations on the attention mechanism, which are the shared references and the batch-aware attention during training. We theoretically derive the equivariance property of the proposed augmented attention model and experimentally demonstrate its consistent superiority in both quantitative and visual results over the baseline methods.

preprint · 2022 · arXiv

Hypothesis Tests with Functional Data for Surface Quality Change Detection in Surface Finishing Processes

This work is concerned with providing a principled decision process for stopping or tool-changing in a surface finishing process. The decision process is supposed to work for products of non-flat geometry. The solution is based on conducting hypothesis testing on the bearing area curves from two consecutive stages of a surface finishing process. In each stage, the bearing area curves, which are in fact the nonparametric quantile curves representing the surface roughness, are extracted from surface profile measurements at a number of sampling locations on the surface of the products. The hypothesis test of these curves informs the decision makers whether there is a change in surface quality induced by the current finishing action. When such change is detected, the current action is deemed effective and should thus continue, while when no change is detected, the effectiveness of the current action is then called into question, signaling possibly some change in the course of action. Application of the hypothesis testing-based decision procedure to both spherical and flat surfaces demonstrates the effectiveness and benefit of the proposed method and confirms its geometry-agnostic nature.

preprint · 2022 · arXiv

Transformer-based Multimodal Information Fusion for Facial Expression Analysis

Human affective behavior analysis has received much attention in human-computer interaction (HCI). In this paper, we introduce our submission to the CVPR 2022 Competition on Affective Behavior Analysis in-the-wild (ABAW). To fully exploit affective knowledge from multiple views, we utilize the multimodal features of spoken words, speech prosody, and facial expression, which are extracted from the video clips in the Aff-Wild2 dataset. Based on these features, we propose a unified transformer-based multimodal framework for Action Unit detection and also expression recognition. Specifically, the static vision feature is first encoded from the current frame image. At the same time, we clip its adjacent frames by a sliding window and extract three kinds of multimodal features from the sequence of images, audio, and text. Then, we introduce a transformer-based fusion module that integrates the static vision features and the dynamic multimodal features. The cross-attention module in the fusion module makes the output integrated features focus on the crucial parts that facilitate the downstream detection tasks. We also leverage some data balancing techniques, data augmentation techniques, and postprocessing methods to further improve the model performance. In the official test of ABAW3 Competition, our model ranks first in the EXPR and AU tracks. The extensive quantitative evaluations, as well as ablation studies on the Aff-Wild2 dataset, prove the effectiveness of our proposed method.

preprint · 2021 · arXiv

A Graph-Theoretic Approach for Spatial Filtering and Its Impact on Mixed-type Spatial Pattern Recognition in Wafer Bin Maps

Statistical quality control in semiconductor manufacturing hinges on effective diagnostics of wafer bin maps, wherein a key challenge is to detect how defective chips tend to spatially cluster on a wafer--a problem known as spatial pattern recognition. Recently, there has been a growing interest in mixed-type spatial pattern recognition--when multiple defect patterns, of different shapes, co-exist on the same wafer. Mixed-type spatial pattern recognition entails two central tasks: (1) spatial filtering, to distinguish systematic patterns from random noise; and (2) spatial clustering, to group filtered patterns into distinct defect types. Observing that spatial filtering is instrumental to high-quality mixed-type pattern recognition, we propose to use a graph-theoretic method, called adjacency-clustering, which leverages spatial dependence among adjacent defective chips to effectively filter the raw wafer maps. Tested on real-world data and compared against a state-of-the-art approach, our proposed method achieves at least 46% gain in terms of internal cluster validation quality (i.e., validation without external class labels), and about 5% gain in terms of Normalized Mutual Information--an external cluster validation metric based on external class labels. Interestingly, the margin of improvement appears to be a function of the pattern complexity, with larger gains achieved for more complex-shaped patterns.
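To make the idea of spatial filtering concrete, here is a minimal sketch in Python. It is not the adjacency-clustering method from the abstract, whose formulation is not given here. It is a much simpler stand-in that keeps a defective chip only when it belongs to a large enough group of adjacent defective chips. The function name `filter_isolated_defects`, the 8-neighbor adjacency rule, and the `min_cluster_size` threshold are all assumptions made for illustration.

```python
from collections import deque
from typing import List


def filter_isolated_defects(wafer: List[List[int]],
                            min_cluster_size: int = 3) -> List[List[int]]:
    """Keep defective chips (1) only if their 8-connected cluster is large enough.

    Hypothetical stand-in for a spatial filter: small clusters are treated
    as random noise and cleared, larger clusters are kept as systematic.
    """
    rows, cols = len(wafer), len(wafer[0])
    seen = [[False] * cols for _ in range(rows)]
    filtered = [[0] * cols for _ in range(rows)]

    for r in range(rows):
        for c in range(cols):
            if wafer[r][c] != 1 or seen[r][c]:
                continue
            # Breadth-first search collects one cluster of adjacent defects.
            cluster = []
            queue = deque([(r, c)])
            seen[r][c] = True
            while queue:
                y, x = queue.popleft()
                cluster.append((y, x))
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        ny, nx = y + dy, x + dx
                        if (0 <= ny < rows and 0 <= nx < cols
                                and wafer[ny][nx] == 1 and not seen[ny][nx]):
                            seen[ny][nx] = True
                            queue.append((ny, nx))
            # Only clusters at or above the threshold survive the filter.
            if len(cluster) >= min_cluster_size:
                for y, x in cluster:
                    filtered[y][x] = 1
    return filtered
```

On a small map with one three-chip cluster and two isolated defects, only the cluster survives. A fixed size threshold is the crudest possible rule and is used here only to show what "filtering by spatial dependence among adjacent chips" can look like in code.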

preprint · 2021 · arXiv

Gaussian process aided function comparison using noisy scattered data

This work proposes a nonparametric method to compare the underlying mean functions given two noisy datasets. The motivation for the work stems from an application of comparing wind turbine power curves. Comparing wind turbine data presents new problems, namely the need to identify the regions of difference in the input space and to quantify the extent of difference that is statistically significant. Our proposed method, referred to as funGP, estimates the underlying functions for different data samples using Gaussian process models. We build a confidence band using the probability law of the estimated function differences under the null hypothesis. Then, the confidence band is used for the hypothesis test as well as for identifying the regions of difference. This identification of difference regions is a distinct feature, as existing methods tend to conduct an overall hypothesis test stating whether two functions are different. Understanding the difference regions can lead to further practical insights and help devise better control and maintenance strategies for wind turbines. The merit of funGP is demonstrated by using three simulation studies and four real wind turbine datasets.

preprint · 2021 · arXiv

Neighborhood Structure Assisted Non-negative Matrix Factorization and its Application in Unsupervised Point-wise Anomaly Detection

Dimensionality reduction is considered an important step for ensuring competitive performance in unsupervised learning such as anomaly detection. Non-negative matrix factorization (NMF) is a popular and widely used method to accomplish this goal. But NMF does not have the provision to include the neighborhood structure information and, as a result, may fail to provide satisfactory performance in the presence of nonlinear manifold structure. To address that shortcoming, we propose to consider and incorporate the neighborhood structural similarity information within the NMF framework by modeling the data through a minimum spanning tree (MST). We label the resulting method as the neighborhood structure assisted NMF. We further devise both offline and online algorithmic versions of the proposed method. Empirical comparisons using twenty benchmark datasets as well as an industrial dataset extracted from a hydropower plant demonstrate the superiority of the neighborhood structure assisted NMF and support our claim of merit. A closer comparison of the formulation and properties of the neighborhood structure assisted NMF with those of other recent, enhanced versions of NMF reveals that inclusion of the neighborhood structure information using the MST plays a key role in attaining the enhanced performance in anomaly detection.
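As a rough illustration of what modeling the data through a minimum spanning tree can mean, the sketch below builds a tree over a set of points with Prim's algorithm and turns its edges into a symmetric neighborhood matrix. The abstract does not say how that information enters the NMF objective, so this covers only the tree-building step. The function names and the Euclidean distance are assumptions, not details from the paper.

```python
import math
from typing import List, Sequence, Tuple


def minimum_spanning_tree(
    points: Sequence[Sequence[float]],
) -> List[Tuple[int, int, float]]:
    """Prim's algorithm on the complete Euclidean graph over `points`.

    Returns the tree as a list of (i, j, distance) edges.
    """
    n = len(points)
    if n == 0:
        return []
    in_tree = [False] * n
    best_dist = [math.inf] * n   # cheapest known link from the tree to each point
    best_from = [-1] * n         # tree vertex that provides that link
    best_dist[0] = 0.0
    edges: List[Tuple[int, int, float]] = []

    for _ in range(n):
        # Add the point that is currently closest to the growing tree.
        u = min((i for i in range(n) if not in_tree[i]),
                key=lambda i: best_dist[i])
        in_tree[u] = True
        if best_from[u] >= 0:
            edges.append((best_from[u], u, best_dist[u]))
        # Update the cheapest links for points still outside the tree.
        for v in range(n):
            if not in_tree[v]:
                d = math.dist(points[u], points[v])
                if d < best_dist[v]:
                    best_dist[v] = d
                    best_from[v] = u
    return edges


def neighborhood_matrix(points: Sequence[Sequence[float]]) -> List[List[int]]:
    """Symmetric 0/1 matrix marking pairs of points joined by an MST edge."""
    n = len(points)
    adj = [[0] * n for _ in range(n)]
    for i, j, _ in minimum_spanning_tree(points):
        adj[i][j] = adj[j][i] = 1
    return adj
```

A tree over n points always has n - 1 edges, so the resulting matrix is sparse. For large datasets a library routine such as SciPy's sparse-graph minimum spanning tree would be a more practical choice than this quadratic-time loop.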

preprint · 2020 · arXiv

Building and Maintaining a Third-Party Library Supply Chain for Productive and Secure SGX Enclave Development

The big data industry is facing new challenges as concerns about privacy leakage soar. One of the remedies to privacy breach incidents is to encapsulate computations over sensitive data within hardware-assisted Trusted Execution Environments (TEE). Such TEE-powered software is called a secure enclave. Secure enclaves hold various advantages over competing privacy-preserving computation solutions. However, enclaves are much more challenging to build compared with ordinary software. The reason is that the development of TEE software must follow a restrictive programming model to make effective use of strong memory encryption and segregation enforced by hardware. These constraints transitively apply to all third-party dependencies of the software. If these dependencies do not officially support TEE hardware, TEE developers have to spend additional engineering effort in porting them. High development and maintenance cost is one of the major obstacles against adopting TEE-based privacy protection solutions in production. In this paper, we present our experience and achievements with regard to constructing and continuously maintaining a third-party library supply chain for TEE developers. In particular, we port a large collection of Rust third-party libraries into Intel SGX, one of the most mature trusted computing platforms. Our supply chain accepts upstream patches in a timely manner with SGX-specific security auditing. We have been able to maintain the SGX ports of 159 open-source Rust libraries with reasonable operational costs. Our work can effectively reduce the engineering cost of developing SGX enclaves for privacy-preserving data processing and exchange.

preprint · 2020 · arXiv

Data-Mining Element Charges in Inorganic Materials

Oxidation states are well-established in chemical science teaching and research. We data-mine more than 168,000 crystallographic reports to find an optimal allocation of oxidation states to each element. In doing so we uncover discrepancies between textbook chemistry and reported charge states observed in materials. We go on to show how the oxidation states we recommend can significantly facilitate materials discovery and heuristic design of novel inorganic compounds.

preprint · 2020 · arXiv

Effective Super-Resolution Method for Paired Electron Microscopic Images

This paper is concerned with investigating super-resolution algorithms and solutions for handling electron microscopic images. We note two main aspects differentiating the problem discussed here from those considered in the literature. The first difference is that, in the electron imaging setting, we have a pair of physical high-resolution and low-resolution images, rather than a physical image with its downsampled counterpart. The high-resolution image covers about 25% of the view field of the low-resolution image, and the objective is to enhance the area of the low-resolution image where there is no high-resolution counterpart. The second difference is that the physics behind electron imaging is different from that of optical (visible light) photos. The implication is that super-resolution models trained on optical photos are not effective when applied to electron images. Focusing on these unique properties, we devise a global and local registration method to match the high- and low-resolution image patches and explore training strategies for applying deep learning super-resolution methods to the paired electron images. We also present a simple, non-local-mean approach as an alternative. This alternative performs as a close runner-up to the deep learning approaches, but it takes less time to train and has a simpler model structure.

preprint · 2020 · arXiv

FReeNet: Multi-Identity Face Reenactment

This paper presents a novel multi-identity face reenactment framework, named FReeNet, to transfer facial expressions from an arbitrary source face to a target face with a shared model. The proposed FReeNet consists of two parts: Unified Landmark Converter (ULC) and Geometry-aware Generator (GAG). The ULC adopts an encoder-decoder architecture to efficiently convert expression in a latent landmark space, which significantly narrows the gap of the face contour between source and target identities. The GAG leverages the converted landmark to reenact the photorealistic image with a reference image of the target person. Moreover, a new triplet perceptual loss is proposed to force the GAG module to learn appearance and geometry information simultaneously, which also enriches facial details of the reenacted images. Further experiments demonstrate the superiority of our approach for generating photorealistic and expression-alike faces, as well as the flexibility for transferring facial expressions between identities.

preprint · 2020 · arXiv

Multi-label Relation Modeling in Facial Action Units Detection

This paper describes an approach to facial action unit detection. The involved action units (AU) include AU1 (Inner Brow Raiser), AU2 (Outer Brow Raiser), AU4 (Brow Lowerer), AU6 (Cheek Raiser), AU12 (Lip Corner Puller), AU15 (Lip Corner Depressor), AU20 (Lip Stretcher), and AU25 (Lips Part). Our work relies on the dataset released by the FG-2020 Competition: Affective Behavior Analysis In-the-Wild (ABAW). The proposed method consists of data preprocessing, feature extraction, and AU classification. The data preprocessing includes the detection of face texture and landmarks. The static texture features and dynamic landmark features are extracted through neural networks and then fused as the latent feature representation. Finally, the fused feature is taken as the initial hidden state of a recurrent neural network (RNN) with a trainable lookup AU table. The output of the RNN is the AU classification result. The detection performance is evaluated with 0.5 × accuracy + 0.5 × F1. Our method achieves 0.56 on the validation data specified by the organization committee.
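The evaluation score quoted in this abstract, 0.5 × accuracy + 0.5 × F1, can be computed as in the sketch below. The abstract does not say how accuracy and F1 are aggregated across action units, so two assumptions are made here: accuracy is pooled over all frame and AU pairs, and F1 is the unweighted mean of the per-AU F1 scores. The function names are hypothetical.

```python
from typing import List, Sequence


def f1_binary(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """F1 score for a single action unit with 0/1 labels."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    denom = 2 * tp + fp + fn
    # With no positives in either labels or predictions, report 0.0.
    return 2 * tp / denom if denom else 0.0


def au_score(y_true: List[List[int]], y_pred: List[List[int]]) -> float:
    """Compute 0.5 * accuracy + 0.5 * F1 on (frames x AUs) 0/1 matrices.

    Assumptions (not stated in the abstract): accuracy is pooled over every
    frame/AU pair, and F1 is the unweighted mean of per-AU F1 scores.
    """
    n_frames, n_aus = len(y_true), len(y_true[0])
    correct = sum(
        t == p
        for row_t, row_p in zip(y_true, y_pred)
        for t, p in zip(row_t, row_p)
    )
    accuracy = correct / (n_frames * n_aus)
    per_au_f1 = [
        f1_binary([row[j] for row in y_true], [row[j] for row in y_pred])
        for j in range(n_aus)
    ]
    mean_f1 = sum(per_au_f1) / n_aus
    return 0.5 * accuracy + 0.5 * mean_f1
```

Mixing accuracy with F1 keeps the score from being dominated by the many frames in which a given action unit is simply absent, which plain accuracy would reward.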

preprint · 2020 · arXiv

Towards Memory Safe Python Enclave for Security Sensitive Computation

Intel Software Guard eXtensions (SGX), a hardware-supported trusted execution environment (TEE), is designed to protect security-sensitive applications. However, since enclave applications are developed with memory-unsafe languages such as C/C++, traditional memory corruption is not eliminated in SGX. Rust-SGX is the first toolkit providing enclave developers with a memory-safe language. However, Rust is considered a systems language and has become the right choice for concurrent applications and web browsers. Applications in many domains such as Big Data, Machine Learning, Robotics, and Computer Vision are more commonly developed in the Python programming language. Therefore, Python application developers cannot benefit from secure enclaves like Intel SGX and Rust-SGX. To fill this gap, we propose Python-SGX, which is a memory-safe SGX SDK providing enclave developers a memory-safe Python development environment. The key idea is to enable memory-safe Python language in SGX by solving the following key challenges: (1) defining a memory-safe Python interpreter, (2) replacing unsafe elements of the Python interpreter with safe ones, (3) achieving comparable performance to non-enclave Python applications, and (4) not introducing any unsafe new code or libraries into SGX. We propose to build Python-SGX with PyPy, a Python interpreter written in RPython, which is a subset of Python, and tame unsafe parts in PyPy by formal verification, security hardening, and a memory-safe language. We have implemented Python-SGX and tested it with a series of benchmark programs. Our evaluation results show that Python-SGX does not cause significant overhead.