Researcher profile

Peter J. Bickel

Peter J. Bickel contributes to research discovery and scholarly infrastructure.

Published work

4 published items

Preprint, 2022, arXiv

Measures of independence and functional dependence

We follow up on Shi et al.'s (2020) and Cao's and my (2020) work on the local power of a new test for independence, Chatterjee (2019), and its relation to the local power properties of classical tests. We show quite generally that for testing independence with local alternatives either Chatterjee's rank test has no power, or it may be misleading: the Blum, Kiefer, Rosenblatt, and other omnibus classical rank tests do have some local power in any direction other than those where significant results may be misleading. We also suggest methods of selective inference in independence testing. Chatterjee's statistic, like Rényi's (1959), also identifies functional dependence. We exhibit statistics which have better power properties than Chatterjee's but also identify functional dependence.
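For orientation, here is a minimal Python sketch of the rank coefficient from Chatterjee (2019) that the abstract discusses. It uses the standard no-ties form, xi_n = 1 - 3 * sum|r_{i+1} - r_i| / (n^2 - 1), where r_i is the rank of the Y value paired with the i-th smallest X. The function name `chatterjee_xi` is an illustrative choice, and the sketch assumes all X and Y values are distinct. It does not implement the improved statistics the preprint proposes.

```python
def chatterjee_xi(x, y):
    """Chatterjee's rank coefficient, no-ties form (assumes distinct x and distinct y)."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length, at least 2")
    # Order the pairs by increasing x.
    order = sorted(range(n), key=lambda i: x[i])
    y_by_x = [y[i] for i in order]
    # Rank of each y value among all y values (1 = smallest); valid only without ties.
    rank_of = {value: rank for rank, value in enumerate(sorted(y), start=1)}
    ranks = [rank_of[value] for value in y_by_x]
    # Sum of absolute differences between consecutive ranks.
    total = sum(abs(ranks[i + 1] - ranks[i]) for i in range(n - 1))
    return 1 - 3 * total / (n * n - 1)
```

When Y is a strictly increasing function of X, consecutive ranks differ by exactly 1, so the coefficient equals 1 - 3/(n + 1), which approaches 1 as n grows. This is the sense in which the statistic identifies functional dependence.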

Preprint, 2020, arXiv

An Assumption-Free Exact Test For Fixed-Design Linear Models With Exchangeable Errors

We propose the Cyclic Permutation Test (CPT) to test general linear hypotheses for linear models. This test is non-randomized and valid in finite samples with exact Type I error $\alpha$ for an arbitrary fixed design matrix and arbitrary exchangeable errors, whenever $1 / \alpha$ is an integer and $n / p \ge 1 / \alpha - 1$. The test involves applying the marginal rank test to $1 / \alpha$ linear statistics of the outcome vector, where the coefficient vectors are determined by solving a linear system such that the joint distribution of the linear statistics is invariant with respect to a non-standard cyclic permutation group under the null hypothesis. The power can be further enhanced by solving a secondary non-linear travelling salesman problem, for which the genetic algorithm can find a reasonably good solution. Extensive simulation studies show that the CPT has comparable power to existing tests. When testing for a single contrast of coefficients, an exact confidence interval can be obtained by inverting the test. Furthermore, we provide a selective yet extensive literature review of the century-long efforts on this problem, highlighting the novelty of our test.

Preprint, 2020, arXiv

Generalized Pearson correlation squares for capturing mixtures of bivariate linear dependences

Motivated by the pressing needs for capturing complex but interpretable variable relationships in scientific research, here we generalize the squared Pearson correlation to capture a mixture of linear dependences between two real-valued random variables, with or without an index variable that specifies the line memberships. We construct generalized Pearson correlation squares by focusing on three aspects: the exchangeability of the two variables, the independence of parametric model assumptions, and the availability of population-level parameters. For the computation of the generalized Pearson correlation square from a sample without line-membership specification, we develop a K-lines clustering algorithm, where K, the number of lines, can be chosen in a data-adaptive way. With our defined population-level generalized Pearson correlation squares, we derive the asymptotic distributions of the sample-level statistics to enable efficient statistical inference. Simulation studies verify the theoretical results and compare the generalized Pearson correlation squares with other widely-used association measures in terms of power. Gene expression data analysis demonstrates the effectiveness of the generalized Pearson correlation squares in capturing interpretable gene-gene relationships missed by other measures. We implement the estimation and inference procedures in an R package gR2.

Preprint, 2020, arXiv

Hierarchical community detection by recursive partitioning

The problem of community detection in networks is usually formulated as finding a single partition of the network into some "correct" number of communities. We argue that it is more interpretable and in some regimes more accurate to construct a hierarchical tree of communities instead. This can be done with a simple top-down recursive partitioning algorithm, starting with a single community and separating the nodes into two communities by spectral clustering repeatedly, until a stopping rule suggests there are no further communities. This class of algorithms is model-free, computationally efficient, and requires no tuning other than selecting a stopping rule. We show that there are regimes where this approach outperforms K-way spectral clustering, and propose a natural framework for analyzing the algorithm's theoretical performance, the binary tree stochastic block model. Under this model, we prove that the algorithm correctly recovers the entire community tree under relatively mild assumptions. We apply the algorithm to a gene network based on gene co-occurrence in 1580 research papers on anemia, and identify six clusters of genes in a meaningful hierarchy. We also illustrate the algorithm on a dataset of statistics papers.
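The top-down procedure described in this abstract can be sketched roughly as follows. This is an illustration under stated assumptions, not the authors' implementation. Each split uses the sign of the eigenvector for the second-largest eigenvalue of the adjacency submatrix, which is a simple stand-in for spectral clustering into two groups. The stopping rule (stop when that eigenvalue is at most `threshold`, or when the node set is smaller than `2 * min_size`) is a placeholder chosen for this example, and the names `recursive_bipartition`, `min_size` and `threshold` are hypothetical. The sketch returns only the leaf communities, not the full tree.

```python
import numpy as np

def recursive_bipartition(A, nodes=None, min_size=2, threshold=0.5):
    """Return leaf communities (lists of node indices) from recursive two-way spectral splits.

    A is a symmetric (possibly weighted) adjacency matrix. The stopping rule is a
    placeholder assumption, not the rule studied in the preprint.
    """
    A = np.asarray(A, dtype=float)
    if nodes is None:
        nodes = np.arange(A.shape[0])
    # Too few nodes to split into two groups of at least min_size each.
    if len(nodes) < 2 * min_size:
        return [sorted(nodes.tolist())]
    sub = A[np.ix_(nodes, nodes)]
    vals, vecs = np.linalg.eigh(sub)  # eigenvalues in ascending order
    # Placeholder stopping rule: no strong second eigenvalue means no further split.
    if vals[-2] <= threshold:
        return [sorted(nodes.tolist())]
    # Split by the sign of the eigenvector for the second-largest eigenvalue.
    side = vecs[:, -2] >= 0
    if side.all() or not side.any():
        return [sorted(nodes.tolist())]
    return (recursive_bipartition(A, nodes[side], min_size, threshold)
            + recursive_bipartition(A, nodes[~side], min_size, threshold))
```

On a small weighted graph with two tightly connected blocks and weak links between them, the first split separates the blocks and the placeholder rule then stops inside each block. Real networks need a principled stopping rule in place of the fixed threshold.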