Researcher profile

Jianjun Shi

Jianjun Shi contributes to research discovery and scholarly infrastructure.

Researcher
Affiliation not imported
Open to collaborate

Trust snapshot

Quick read

Trust 21 - Emerging
Verification L1
Unclaimed author
9 works
0 followers
7 topics
4 close collaborators

Published work

9 published items

preprint, 2022, arXiv

ANTLER: Bayesian Nonlinear Tensor Learning and Modeler for Unstructured, Varying-Size Point Cloud Data

Unstructured point clouds with varying sizes are increasingly acquired in a variety of environments through laser triangulation or Light Detection and Ranging (LiDAR). Predicting a scalar response based on unstructured point clouds is a common problem that arises in a wide variety of applications. The current literature relies on several pre-processing steps such as structured subsampling and feature extraction to analyze the point cloud data. Those techniques lead to quantization artifacts and do not consider the relationship between the regression response and the point cloud during pre-processing. Therefore, we propose a general and holistic "Bayesian Nonlinear Tensor Learning and Modeler" (ANTLER) to model the relationship of unstructured, varying-size point cloud data with a scalar or multivariate response. The proposed ANTLER simultaneously optimizes a nonlinear tensor dimensionality reduction and a nonlinear regression model with a 3D point cloud input and a scalar or multivariate response. ANTLER can handle the complex data representation, high dimensionality, and inconsistent size of the 3D point cloud data.

preprint, 2022, arXiv

Compressed Smooth Sparse Decomposition

Image-based anomaly detection systems are of vital importance in various manufacturing applications. The resolution and acquisition rate of such systems have increased significantly in recent years with the rapid development of image sensing technology. This enables the detection of tiny defects in real time. However, such high resolution and acquisition rates not only slow down image processing algorithms but also increase data storage and transmission costs. To tackle this problem, we propose a fast and data-efficient method with a theoretical performance guarantee that is suitable for sparse anomaly detection in images with a smooth background (smooth plus sparse signal). The proposed method, named Compressed Smooth Sparse Decomposition (CSSD), is a one-step method that unifies the compressive image acquisition and decomposition-based image processing techniques. To further enhance its performance in a high-dimensional scenario, a Kronecker Compressed Smooth Sparse Decomposition (KronCSSD) method is proposed. Compared to traditional smooth and sparse decomposition algorithms, significant transmission cost reduction and computational speed boost can be achieved with negligible performance loss. Simulation examples and several case studies in various applications illustrate the effectiveness of the proposed framework.

preprint, 2022, arXiv

Convex Relaxation for Optimal Fixture Layout Design

This paper proposes a general fixture layout design framework that directly integrates the system equation with the convex relaxation method. Because the optimal fixture design problem is a large-scale combinatorial optimization problem, we relax it to a convex semidefinite programming (SDP) problem by adopting sparse learning and SDP relaxation techniques. The relaxed problem can be solved efficiently by existing convex optimization algorithms and thus generates a near-optimal fixture layout. A real case study in the half-to-half fuselage assembly process indicates the superiority of our proposed algorithm compared to the current industry practice and state-of-the-art methods.

preprint, 2022, arXiv

Synthetic Defect Generation for Display Front-of-Screen Quality Inspection: A Survey

Display front-of-screen (FOS) quality inspection is essential for the mass production of displays in the manufacturing process. However, severely imbalanced data, especially the limited number of defect samples, has been a long-standing problem that hinders the successful application of deep learning algorithms. Synthetic defect data generation can help address this issue. This paper reviews the state-of-the-art synthetic data generation methods and the evaluation metrics that can potentially be applied to display FOS quality inspection tasks.

preprint, 2020, arXiv

Active Learning for Gaussian Process Considering Uncertainties with Application to Shape Control of Composite Fuselage

In the machine learning domain, active learning is an iterative data selection algorithm for maximizing information acquisition and improving model performance with limited training samples. It is especially useful for industrial applications where training samples are expensive, time-consuming, or difficult to obtain. Existing methods mainly focus on active learning for classification, and a few methods are designed for regression such as linear regression or Gaussian processes. Uncertainties from measurement errors and intrinsic input noise inevitably exist in the experimental data, which further affects the modeling performance. The existing active learning methods do not incorporate these uncertainties for Gaussian processes. In this paper, we propose two new active learning algorithms for the Gaussian process with uncertainties: a variance-based weighted active learning algorithm and a D-optimal weighted active learning algorithm. Through numerical study, we show that the proposed approach can incorporate the impact of uncertainties and achieve better prediction performance. This approach has been applied to improving the predictive modeling for automatic shape control of composite fuselage.
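
As a rough illustration of the general idea behind variance-based active learning for a Gaussian process, the sketch below repeatedly queries the candidate input with the largest posterior variance. It is not the weighted algorithm proposed in the paper: it ignores input noise and uses no weighting, and the RBF kernel, the fixed `length_scale`, the `noise_var` value, and all function names are assumptions made for this example.

```python
import numpy as np

def rbf_kernel(a, b, length_scale=0.2):
    # Squared-exponential kernel between two 1-D arrays of inputs.
    d2 = (a[:, None] - b[None, :]) ** 2
    return np.exp(-0.5 * d2 / length_scale**2)

def posterior_variance(x_train, x_cand, noise_var=1e-2):
    # GP posterior variance at the candidates. With fixed hyperparameters
    # it depends only on the input locations, not on observed responses.
    K = rbf_kernel(x_train, x_train) + noise_var * np.eye(len(x_train))
    Ks = rbf_kernel(x_cand, x_train)
    solved = np.linalg.solve(K, Ks.T)
    return 1.0 - np.sum(Ks * solved.T, axis=1)

def select_by_variance(x_init, x_cand, n_queries, noise_var=1e-2):
    # Greedy selection: add the most uncertain candidate, then recompute.
    x_train = np.array(x_init, dtype=float)
    picked = []
    for _ in range(n_queries):
        var = posterior_variance(x_train, x_cand, noise_var)
        idx = int(np.argmax(var))
        picked.append(idx)
        x_train = np.append(x_train, x_cand[idx])
    return picked

candidates = np.linspace(0.0, 1.0, 51)
print(select_by_variance([0.5], candidates, n_queries=4))
```

Starting from a single design point at 0.5, the greedy rule tends to spread queries across the input range, because each new observation lowers the variance in its neighborhood.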

preprint, 2020, arXiv

An Augmented Regression Model for Tensors with Missing Values

Heterogeneous but complementary sources of data provide an unprecedented opportunity for developing accurate statistical models of systems. Although the existing methods have shown promising results, they are mostly applicable to situations where the system output is measured in its complete form. In reality, however, it may not be feasible to obtain the complete output measurement of a system, which results in observations that contain missing values. This paper introduces a general framework that integrates tensor regression with tensor completion and proposes an efficient optimization framework that alternates between two steps for parameter estimation. Through multiple simulations and a case study, we evaluate the performance of the proposed method. The results indicate the superiority of the proposed method in comparison to a benchmark.

preprint, 2020, arXiv

Real-time Data-driven Quality Assessment for Continuous Manufacturing of Carbon Nanotube Buckypaper

Carbon nanotube (CNT) thin sheet, or buckypaper, has shown great potential as a multifunctional platform material due to its desirable properties, including its lightweight nature, high mechanical properties, and good conductivity. However, its mass adoption and application by industry have run into significant bottlenecks because of large variability and uncertainty in quality during fabrication. There is an urgent demand to produce high-quality, high-performance buckypaper at an industrial scale. Raman spectroscopy provides detailed nanostructure information within seconds, and the obtained spectra can be decomposed into multiple effects associated with diverse quality characteristics of buckypaper. However, the decomposed effects are high-dimensional, and a systematic quantification method for buckypaper quality assessment has been lacking. In this paper, we propose a real-time data-driven quality assessment method, which fills the gap in quantifying quality for continuous manufacturing processes of CNT buckypaper. The composite indices derived from the proposed method are developed by analyzing in-line Raman spectroscopy sensing data. Weighted cross-correlation and maximum margin clustering are used to fuse the fixed effects into an inconsistency index to monitor the long-term mean shift of the process and to fuse the normal effects into a uniformity index to monitor the within-sample normality. Those individual quality indices are then combined into a composite index to reflect the overall quality of buckypaper. A case study indicates that our proposed approach can determine the quality rank for ten samples, and can provide quantitative quality indices for single-walled carbon nanotube buckypaper after acid processing or functionalization. The quality assessment results are consistent with evaluations from experienced engineers.

preprint, 2019, arXiv

Effective Model Calibration via Sensible Variable Identification and Adjustment, with Application to Composite Fuselage Simulation

Estimation of model parameters of computer simulators, also known as calibration, is an important topic in many engineering applications. In this paper, we consider the calibration of computer model parameters with the help of engineering design knowledge. We introduce the concept of sensible (calibration) variables. Sensible variables are model parameters which are sensitive in the engineering modeling, and whose optimal values differ from the engineering design values. We propose an effective calibration method to identify and adjust the sensible variables with limited physical experimental data. The methodology is applied to a composite fuselage simulation problem.

preprint, 2018, arXiv

A novel approach for fusion of heterogeneous sources of data

With advancements in sensor technology, heterogeneous sets of data, containing samples of scalars, waveform signals, images, or even structured point clouds, are becoming increasingly common. A statistical model that represents the behavior of the underlying system based upon such a heterogeneous set of data can be used in monitoring, control, and optimization of the system. Unfortunately, available methods only focus on scalar and curve data and do not provide a general framework that can integrate different sources of data to construct a model. This paper poses the problem of estimating a process output, measured by a scalar, a curve, an image, or a point cloud, from a set of heterogeneous process variables such as scalar process settings, sensor readings, and images. We introduce a general approach in which each set of input data (predictor) as well as the output measurements are represented by tensors. We formulate a linear regression model between the input and output tensors and estimate the parameters by minimizing a least-squares loss function. In order to avoid overfitting and to reduce the number of parameters to be estimated, we decompose the model parameters using several bases, spanning the input and output spaces. Next, we learn both the bases and their spanning coefficients when minimizing the loss function using an alternating least squares (ALS) algorithm. We show that such a minimization has a closed-form solution in each iteration and can be computed very efficiently. Through several simulation and case studies, we evaluate the performance of the proposed method. The results reveal the advantage of the proposed method over some benchmarks in the literature in terms of the mean square prediction error.
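
To give a feel for why each ALS iteration has a closed-form solution, here is a minimal sketch on a much simpler matrix analogue, not the tensor model of the paper. It assumes a regression Y ≈ X U Vᵀ with a low-rank coefficient matrix; the function name, the factor names `U` and `V`, and the synthetic data are all hypothetical.

```python
import numpy as np

def als_low_rank_regression(X, Y, rank, n_iter=20, seed=0):
    # Fit Y ~ X @ U @ V.T by alternating two ordinary least-squares solves.
    rng = np.random.default_rng(seed)
    V = rng.standard_normal((Y.shape[1], rank))
    losses = []
    for _ in range(n_iter):
        # U-step (V fixed): the normal equations X'X U V'V = X'Y V give
        # U = lstsq(X, Y V (V'V)^-1).
        target = np.linalg.solve(V.T @ V, (Y @ V).T).T
        U = np.linalg.lstsq(X, target, rcond=None)[0]
        # V-step (U fixed): regress Y on the reduced predictors Z = X U.
        Z = X @ U
        V = np.linalg.lstsq(Z, Y, rcond=None)[0].T
        losses.append(float(np.sum((Y - X @ U @ V.T) ** 2)))
    return U, V, losses

# Synthetic example with a rank-2 coefficient matrix (illustrative only).
rng = np.random.default_rng(1)
X = rng.standard_normal((200, 6))
B_true = rng.standard_normal((6, 2)) @ rng.standard_normal((2, 4))
Y = X @ B_true + 0.01 * rng.standard_normal((200, 4))
U, V, losses = als_low_rank_regression(X, Y, rank=2)
print(losses[0], losses[-1])
```

Each half-step is an exact least-squares minimizer with the other factor held fixed, so the loss cannot increase from one sweep to the next, apart from floating-point rounding.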