Source author record

Zhiliang Ying

Zhiliang Ying appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher (unclaimed source record)

Methodology, math.ST, Statistics Theory, Machine Learning, Applications, Artificial Intelligence, Computation, Human-Computer Interaction, q-fin.PR

Catalog footprint

What is connected

21 works

9 topics

4 close collaborators

preprint, 2026, arXiv

Exploratory Hierarchical Factor Analysis with an Application to Psychological Measurement

Hierarchical factor models, which include the bifactor model as a special case, are useful in social and behavioural sciences for measuring hierarchically structured constructs. Specifying a hierarchical factor model involves imposing hierarchically structured zero constraints on a factor loading matrix, which is often challenging. Therefore, an exploratory analysis is needed to learn the hierarchical factor structure from data. Unfortunately, there does not exist an identifiability theory for the learnability of this hierarchical structure, nor a computationally efficient method with provable performance. The method of Schmid-Leiman transformation, which is often regarded as the default method for exploratory hierarchical factor analysis, is flawed and likely to fail. The contribution of this paper is three-fold. First, an identifiability result is established for general hierarchical factor models, which shows that the hierarchical factor structure is learnable under mild regularity conditions. Second, a computationally efficient divide-and-conquer approach is proposed for learning the hierarchical factor structure. Finally, asymptotic theory is established for the proposed method, showing that it can consistently recover the true hierarchical factor structure as the sample size grows to infinity. The power of the proposed method is shown via simulation studies and a real data application to a personality test. The computation code for the proposed method is publicly available at https://github.com/EmetSelch97/EHFA/.

preprint, 2020, arXiv

ProcData: An R Package for Process Data Analysis

Process data refer to data recorded in the log files of computer-based items. These data, represented as timestamped action sequences, keep track of respondents' response processes of solving the items. Process data analysis aims at enhancing educational assessment accuracy and serving other assessment purposes by utilizing the rich information contained in response processes. The R package ProcData presented in this article is designed to provide tools for processing, describing, and analyzing process data. We define an S3 class "proc" for organizing process data and extend generic methods summary and print for class "proc". Two feature extraction methods for process data are implemented in the package for compressing information in the irregular response processes into regular numeric vectors. ProcData also provides functions for fitting and making predictions from a neural-network-based sequence model. These functions call relevant functions in package keras for constructing and training neural networks. In addition, several response process generators and a real dataset of response processes of the climate control item in the 2012 Programme for International Student Assessment are included in the package.

preprint, 2020, arXiv

Subtask Analysis of Process Data Through a Predictive Model

Response process data collected from human-computer interactive items contain rich information about respondents' behavioral patterns and cognitive processes. Their irregular formats as well as their large sizes make standard statistical tools difficult to apply. This paper develops a computationally efficient method for exploratory analysis of such process data. The new approach segments a lengthy individual process into a sequence of short subprocesses to achieve complexity reduction, easy clustering and meaningful interpretation. Each subprocess is considered a subtask. The segmentation is based on sequential action predictability using a parsimonious predictive model combined with the Shannon entropy. Simulation studies are conducted to assess performance of the new methods. We use the process data from PIAAC 2012 to demonstrate how exploratory analysis of process data can be done with the new approach.
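The entropy-based segmentation idea can be illustrated with a minimal sketch. It assumes a predictive model has already produced, for each position, a probability distribution over the next action, and it starts a new segment wherever the Shannon entropy of that distribution exceeds a cutoff. The function names, the `threshold` parameter and the thresholding rule itself are hypothetical choices made for illustration; the abstract does not specify the paper's segmentation rule or predictive model.

```python
import math

def shannon_entropy(probs):
    """Shannon entropy (in nats) of a discrete distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0)

def segment_by_entropy(actions, next_action_probs, threshold):
    """Split an action sequence where the next action was hard to predict.

    next_action_probs[t] is the (assumed) predictive distribution for
    actions[t] given the earlier actions; entry 0 is never inspected.
    The fixed threshold is an illustrative assumption.
    """
    if not actions:
        return []
    segments = [[actions[0]]]
    for t in range(1, len(actions)):
        # High entropy means low predictability: treat it as a subtask boundary.
        if shannon_entropy(next_action_probs[t]) > threshold:
            segments.append([])
        segments[-1].append(actions[t])
    return segments

# Toy usage: only the fifth action is preceded by an uncertain prediction.
actions = ["start", "open", "read", "close", "open", "read"]
probs = [[1.0], [0.9, 0.1], [0.9, 0.1], [0.9, 0.1], [0.25] * 4, [0.9, 0.1]]
subtasks = segment_by_entropy(actions, probs, threshold=1.0)
```

In this toy input the uniform distribution over four actions has entropy log 4, so the sequence is cut once, before the second "open".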

preprint, 2020, arXiv

Unfolding-Model-Based Visualization: Theory, Method and Applications

Multidimensional unfolding methods are widely used for visualizing item response data. Such methods project respondents and items simultaneously onto a low-dimensional Euclidean space, in which respondents and items are represented by ideal points, with person-person, item-item, and person-item similarities being captured by the Euclidean distances between the points. In this paper, we study the visualization of multidimensional unfolding from a statistical perspective. We cast multidimensional unfolding into an estimation problem, where the respondent and item ideal points are treated as parameters to be estimated. An estimator is then proposed for the simultaneous estimation of these parameters. Asymptotic theory is provided for the recovery of the ideal points, shedding light on the validity of model-based visualization. An alternating projected gradient descent algorithm is proposed for the parameter estimation. We provide two illustrative examples, one on users' movie ratings and the other on Senate roll call voting.

preprint, 2016, arXiv

A Fused Latent and Graphical Model for Multivariate Binary Data

We consider modeling, inference, and computation for analyzing multivariate binary data. We propose a new model that consists of a low dimensional latent variable component and a sparse graphical component. Our study is motivated by analysis of item response data in cognitive assessment and has applications to many disciplines where item response data are collected. Standard approaches to item response data in cognitive assessment adopt the multidimensional item response theory (IRT) models. However, human cognition is typically a complicated process and thus may not be adequately described by just a few factors. Consequently, a low-dimensional latent factor model, such as the multidimensional IRT models, is often insufficient to capture the structure of the data. The proposed model adds a sparse graphical component that captures the remaining ad hoc dependence. It reduces to a multidimensional IRT model when the graphical component becomes degenerate. Model selection and parameter estimation are carried out simultaneously through construction of a pseudo-likelihood function and properly chosen penalty terms. The convexity of the pseudo-likelihood function allows us to develop an efficient algorithm, while the penalty terms generate a low-dimensional latent component and a sparse graphical structure. Desirable theoretical properties are established under suitable regularity conditions. The method is applied to the revised Eysenck's personality questionnaire, revealing its usefulness in item analysis. Simulation results are reported that show the new method works well in practical situations.

preprint, 2016, arXiv

Chernoff Index for Cox Test of Separate Parametric Families

The asymptotic efficiency of a generalized likelihood ratio test proposed by Cox is studied under the large deviations framework for error probabilities developed by Chernoff. In particular, two separate parametric families of hypotheses are considered [Cox, 1961, 1962]. The significance level is set such that the maximal type I and type II error probabilities for the generalized likelihood ratio test decay exponentially fast with the same rate. We derive the analytic form of such a rate that is also known as the Chernoff index [Chernoff, 1952], a relative efficiency measure when there is no preference between the null and the alternative hypotheses. We further extend the analysis to approximate error probabilities when the two families are not completely separated. Discussions are provided concerning the implications of the present result on model selection.

preprint, 2013, arXiv

Bootstrapping a Change-Point Cox Model for Survival Data

This paper investigates the (in)consistency of various bootstrap methods for making inference on a change-point in time in the Cox model with right censored survival data. A criterion is established for the consistency of any bootstrap method. It is shown that the usual nonparametric bootstrap is inconsistent for the maximum partial likelihood estimation of the change-point. A new model-based bootstrap approach is proposed and its consistency established. Simulation studies are carried out to assess the performance of various bootstrap schemes.

preprint, 2013, arXiv

Functional and Parametric Estimation in a Semi- and Nonparametric Model with Application to Mass-Spectrometry Data

Motivated by modeling and analysis of mass-spectrometry data, a semi- and nonparametric model is proposed that consists of a linear parametric component for individual location and scale and a nonparametric regression function for the common shape. A multi-step approach is developed that simultaneously estimates the parametric components and the nonparametric function. Under certain regularity conditions, it is shown that the resulting estimators are consistent and asymptotically normal for the parametric part and achieve the optimal rate of convergence for the nonparametric part when the bandwidth is suitably chosen. Simulation results are presented to demonstrate the effectiveness and finite-sample performance of the method. The method is also applied to a SELDI-TOF mass spectrometry data set from a study of liver cancer patients.

preprint, 2013, arXiv

Least Product Relative Error Estimation

A least product relative error criterion is proposed for multiplicative regression models. It is invariant under scale transformation of the outcome and covariates. In addition, the objective function is smooth and convex, resulting in a simple and uniquely defined estimator of the regression parameter. It is shown that the estimator is asymptotically normal and that the simple plugging-in variance estimation is valid. Simulation results confirm that the proposed method performs well. An application to body fat calculation is presented to illustrate the new method.
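As a rough illustration of why the objective is smooth and convex, the sketch below assumes the criterion multiplies the two relative errors |y - pred|/y and |y - pred|/pred, which simplifies algebraically to y/pred + pred/y - 2. It uses a hypothetical scalar model y = exp(beta * x) * error with positive outcomes, and fits it by Newton's method. The function names and the choice of Newton's method are illustrative assumptions, not the authors' implementation.

```python
import math

def lpre_loss(beta, xs, ys):
    """Product-of-relative-errors loss for the scalar model y = exp(beta * x) * error."""
    total = 0.0
    for x, y in zip(xs, ys):
        pred = math.exp(beta * x)
        # (|y - pred| / y) * (|y - pred| / pred) equals y/pred + pred/y - 2.
        total += y / pred + pred / y - 2.0
    return total

def fit_lpre(xs, ys, beta=0.0, iters=50, tol=1e-12):
    """Minimize lpre_loss by Newton's method.

    Assumes every y is positive and at least one x is non-zero,
    so that the second derivative is strictly positive.
    """
    for _ in range(iters):
        grad = 0.0
        hess = 0.0
        for x, y in zip(xs, ys):
            pred = math.exp(beta * x)
            grad += x * (pred / y - y / pred)       # first derivative in beta
            hess += x * x * (pred / y + y / pred)   # second derivative, always > 0
        step = grad / hess
        beta -= step
        if abs(step) < tol:
            break
    return beta

# Toy usage with noise-free data generated from beta = 0.7.
xs = [0.5, 1.0, 1.5, 2.0]
ys = [math.exp(0.7 * x) for x in xs]
beta_hat = fit_lpre(xs, ys)
```

Because each summand is a sum of two exponentials in beta, the second derivative is positive wherever some x is non-zero, which is what makes the minimizer unique in this sketch.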

preprint, 2013, arXiv

Likelihood Adaptively Modified Penalties

A new family of penalty functions, adaptive to likelihood, is introduced for model selection in general regression models. It arises naturally through assuming certain types of prior distribution on the regression parameters. To study stability properties of the penalized maximum likelihood estimator, two types of asymptotic stability are defined. Theoretical properties, including the parameter estimation consistency, model selection consistency, and asymptotic stability, are established under suitable regularity conditions. An efficient coordinate-descent algorithm is proposed. Simulation results and real data analysis show that the proposed method has competitive performance in comparison with existing ones.

preprint, 2013, arXiv

Non-identifiability, equivalence classes, and attribute-specific classification in Q-matrix based Cognitive Diagnosis Models

There has been growing interest in recent years in Q-matrix based cognitive diagnosis models. Parameter estimation and respondent classification under these models may suffer due to identifiability issues. Non-identifiability can be described by a partition separating attribute profiles into groups of those with identical likelihoods. Marginal identifiability concerns the identifiability of individual attributes. Maximum likelihood estimation of the proportion of respondents within each equivalence class is consistent, making possible a new measure of assessment quality reporting the proportion of respondents for whom each individual attribute is marginally identifiable. Arising from this is a new posterior-based classification method adjusting for non-identifiability.

preprint, 2013, arXiv

Oracle inequalities for the lasso in the Cox model

We study the absolute penalized maximum partial likelihood estimator in sparse, high-dimensional Cox proportional hazards regression models where the number of time-dependent covariates can be larger than the sample size. We establish oracle inequalities based on natural extensions of the compatibility and cone invertibility factors of the Hessian matrix at the true regression coefficients. Similar results based on an extension of the restricted eigenvalue can also be proved by our method. However, the presented oracle inequalities are sharper since the compatibility and cone invertibility factors are always greater than the corresponding restricted eigenvalue. In the Cox regression model, the Hessian matrix is based on time-dependent covariates in censored risk sets, so that the compatibility and cone invertibility factors, and the restricted eigenvalue as well, are random variables even when they are evaluated for the Hessian at the true regression coefficients. Under mild conditions, we prove that these quantities are bounded from below by positive constants for time-dependent covariates, including cases where the number of covariates is of greater order than the sample size. Consequently, the compatibility and cone invertibility factors can be treated as positive constants in our oracle inequalities.

preprint, 2013, arXiv

Sequential Analysis of Cox Model under Response Dependent Allocation

Sellke and Siegmund (1983) developed the Brownian approximation to the Cox partial likelihood score as a process of calendar time, laying the foundation for group sequential analysis of survival studies. We extend their results to cover situations in which treatment allocations may depend on observed outcomes. The new development makes use of the entry time and calendar time along with the corresponding $\sigma$-filtrations to handle the natural information accumulation. Large sample properties are established under suitable regularity conditions.

preprint, 2013, arXiv

Statistical Inference on Transformation Models: a Self-induced Smoothing Approach

This paper deals with a general class of transformation models that contains many important semiparametric regression models as special cases. It develops a self-induced smoothing for the maximum rank correlation estimator, resulting in simultaneous point and variance estimation. The self-induced smoothing does not require bandwidth selection, yet provides the right amount of smoothness so that the estimator is asymptotically normal with mean zero (unbiased) and variance-covariance matrix consistently estimated by the usual sandwich-type estimator. An iterative algorithm is given for the variance estimation and shown to numerically converge to a consistent limiting variance estimator. The approach is applied to a data set involving survival times of primary biliary cirrhosis patients. Simulation results are reported, showing that the new method performs well under a variety of scenarios.

preprint, 2013, arXiv

Theory of self-learning $Q$-matrix

Cognitive assessment is a growing area in psychological and educational measurement, where tests are given to assess mastery/deficiency of attributes or skills. A key issue is the correct identification of attributes associated with items in a test. In this paper, we set up a mathematical framework under which theoretical properties may be discussed. We establish sufficient conditions to ensure that the attributes required by each item are learnable from the data.

preprint, 2012, arXiv

Focus of Attention for Linear Predictors

We present a method to stop the evaluation of a prediction process when the result of the full evaluation is obvious. This trait is highly desirable in prediction tasks where a predictor evaluates all its features for every example in large datasets. We observe that some examples are easier to classify than others, a phenomenon which is characterized by the event when most of the features agree on the class of an example. By stopping the feature evaluation when encountering an easy-to-classify example, the predictor can achieve substantial gains in computation. Our method provides a natural attention mechanism for linear predictors where the predictor concentrates most of its computation on hard-to-classify examples and quickly discards easy-to-classify ones. By modifying a linear prediction algorithm such as an SVM or AdaBoost to include our attentive method we prove that the average number of features computed is $O(\sqrt{n \log(1/\sqrt{\delta})})$, where $n$ is the original number of features and $\delta$ is the error rate incurred due to early stopping. We demonstrate the effectiveness of Attentive Prediction on MNIST, Real-sim, Gisette, and synthetic datasets.

preprint, 2012, arXiv

Parameter Estimation using Empirical Likelihood combined with Market Information

During the last decade, Lévy processes with jumps have received increasing popularity for modelling market behaviour for both derivative pricing and risk management purposes. Chan et al. (2009) introduced the use of empirical likelihood methods to estimate the parameters of various diffusion processes via their characteristic functions, which are readily available in most cases. Return series from the market are used for estimation. In addition to the return series, there are many derivatives actively traded in the market whose prices also contain information about parameters of the underlying process. This observation motivates us, in this paper, to combine the return series and the associated derivative prices observed at the market so as to provide a more reflective estimation with respect to the market movement and achieve a gain of efficiency. The usual asymptotic properties, including consistency and asymptotic normality, are established under suitable regularity conditions. Simulation and case studies are performed to demonstrate the feasibility and effectiveness of the proposed method.

preprint, 2011, arXiv

An Empirical Likelihood Approach to Nonparametric Covariate Adjustment in Randomized Clinical Trials

Covariate adjustment is an important tool in the analysis of randomized clinical trials and observational studies. It can be used to increase efficiency and thus power, and to reduce possible bias. While most statistical tests in randomized clinical trials are nonparametric in nature, approaches for covariate adjustment typically rely on specific regression models, such as the linear model for a continuous outcome, the logistic regression model for a dichotomous outcome and the Cox model for survival time. Several recent efforts have focused on model-free covariate adjustment. This paper makes use of the empirical likelihood method and proposes a nonparametric approach to covariate adjustment. A major advantage of the new approach is that it automatically utilizes covariate information in an optimal way without fitting nonparametric regression. The usual asymptotic properties, including the Wilks-type result of convergence to a chi-square distribution for the empirical likelihood ratio based test, and asymptotic normality for the corresponding maximum empirical likelihood estimator, are established. It is also shown that the resulting test is asymptotically most powerful and that the estimator for the treatment effect achieves the semiparametric efficiency bound. The new method is applied to the Global Use of Strategies to Open Occluded Coronary Arteries (GUSTO)-I trial. Extensive simulations are conducted, validating the theoretical findings.

preprint, 2011, arXiv

Learning Item-Attribute Relationship in Q-Matrix Based Diagnostic Classification Models

A recent surge of interest in cognitive assessment has led to the development of novel statistical models for diagnostic classification. Central to many such models is the well-known Q-matrix, which specifies the item-attribute relationship. This paper proposes a principled estimation procedure for the Q-matrix and related model parameters. Desirable theoretic properties are established through large sample analysis. The proposed method also provides a platform under which important statistical issues, such as hypothesis testing and model selection, can be addressed.

preprint, 2011, arXiv

Rapid Learning with Stochastic Focus of Attention

We present a method to stop the evaluation of a decision making process when the result of the full evaluation is obvious. This trait is highly desirable for online margin-based machine learning algorithms where a classifier traditionally evaluates all the features for every example. We observe that some examples are easier to classify than others, a phenomenon which is characterized by the event when most of the features agree on the class of an example. By stopping the feature evaluation when encountering an easy-to-classify example, the learning algorithm can achieve substantial gains in computation. Our method provides a natural attention mechanism for learning algorithms. By modifying Pegasos, a margin-based online learning algorithm, to include our attentive method we lower the number of attributes computed from $n$ to an average of $O(\sqrt{n})$ features without loss in prediction accuracy. We demonstrate the effectiveness of Attentive Pegasos on MNIST data.

preprint, 2010, arXiv

The Attentive Perceptron

We propose a focus of attention mechanism to speed up the Perceptron algorithm. Focus of attention speeds up the Perceptron algorithm by lowering the number of features evaluated throughout training and prediction. Whereas the traditional Perceptron evaluates all the features of each example, the Attentive Perceptron evaluates fewer features for easy-to-classify examples, thereby achieving significant speedups and small losses in prediction accuracy. Focus of attention allows the Attentive Perceptron to stop the evaluation of features at any interim point and filter the example. This creates an attentive filter which concentrates computation at examples that are hard to classify, and quickly filters examples that are easy to classify.
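The early-stopping idea can be sketched in a few lines. This minimal example accumulates the weighted sum one feature at a time and stops once the partial sum clears a fixed cutoff. The function name and the constant `threshold` are hypothetical simplifications; the abstract does not describe the stopping rule the paper actually uses, which may differ substantially.

```python
def attentive_margin(weights, features, threshold):
    """Accumulate w_i * x_i and stop once the partial sum clears the threshold.

    Returns the (possibly partial) margin and the number of features evaluated.
    A constant threshold is an illustrative assumption, not the paper's rule.
    """
    partial = 0.0
    for used, (w, x) in enumerate(zip(weights, features), start=1):
        partial += w * x
        # Once the partial margin is large in magnitude, treat the example as
        # easy and skip the remaining features.
        if abs(partial) >= threshold:
            return partial, used
    return partial, len(weights)

# Toy usage: an "easy" example whose features all agree, and a "hard" one
# whose features keep cancelling each other out.
weights = [1.0] * 10
easy = attentive_margin(weights, [1.0] * 10, threshold=3.0)
hard = attentive_margin(weights, [1.0, -1.0] * 5, threshold=3.0)
```

In this toy input the easy example stops after three features, while the hard example never reaches the cutoff and uses all ten.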
