Source author record

Wei Qian

Wei Qian appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.


Machine Learning, Computer Vision, math.PR, math-ph, math.MP, Methodology, math.CV, math.OC, math.ST, q-fin.EC, Robotics, Statistics Theory

Catalog footprint

11 works, 12 topics, 4 close collaborators


preprint, 2024, arXiv

Towards Modeling Uncertainties of Self-explaining Neural Networks via Conformal Prediction

Despite the recent progress in deep neural networks (DNNs), it remains challenging to explain the predictions made by DNNs. Existing explanation methods for DNNs mainly focus on post-hoc explanations where another explanatory model is employed to provide explanations. The fact that post-hoc methods can fail to reveal the actual original reasoning process of DNNs raises the need to build DNNs with built-in interpretability. Motivated by this, many self-explaining neural networks have been proposed to generate not only accurate predictions but also clear and intuitive insights into why a particular decision was made. However, existing self-explaining networks are limited in providing distribution-free uncertainty quantification for the two simultaneously generated prediction outcomes (i.e., a sample's final prediction and its corresponding explanations for interpreting that prediction). Importantly, they also fail to establish a connection between the confidence values assigned to the generated explanations in the interpretation layer and those allocated to the final predictions in the ultimate prediction layer. To tackle the aforementioned challenges, in this paper, we design a novel uncertainty modeling framework for self-explaining networks, which not only demonstrates strong distribution-free uncertainty modeling performance for the generated explanations in the interpretation layer but also excels in producing efficient and effective prediction sets for the final predictions based on the informative high-level basis explanations. We perform the theoretical analysis for the proposed framework. Extensive experimental evaluation demonstrates the effectiveness of the proposed uncertainty framework.
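For readers unfamiliar with the distribution-free prediction sets mentioned in the abstract, the following is a minimal sketch of standard split conformal prediction for classification. It is not the framework proposed in the paper: the nonconformity score (one minus the probability assigned to a label), the function names, and the toy numbers are all assumptions chosen for illustration.

```python
import math

def conformal_quantile(scores, alpha):
    """Return the ceil((n + 1) * (1 - alpha))-th smallest calibration score.

    `scores` are nonconformity scores on a held-out calibration set
    (assumed here: 1 - probability assigned to the true label).
    """
    n = len(scores)
    k = math.ceil((n + 1) * (1 - alpha))
    if k > n:
        # Too few calibration points for this alpha: the set must be trivial.
        return float("inf")
    return sorted(scores)[k - 1]

def prediction_set(probs, qhat):
    """Keep every label whose nonconformity score does not exceed qhat."""
    return [label for label, p in enumerate(probs) if 1 - p <= qhat]

# Hypothetical calibration scores and a target miscoverage level of 25%.
calibration_scores = [0.1, 0.2, 0.3, 0.4, 0.5, 0.7, 0.9]
qhat = conformal_quantile(calibration_scores, alpha=0.25)

confident = prediction_set([0.9, 0.05, 0.05], qhat)  # one clear label
ambiguous = prediction_set([0.5, 0.4, 0.1], qhat)    # two plausible labels
```

Under exchangeability of calibration and test points, sets built this way contain the true label with probability at least 1 - alpha; a less confident model yields larger sets rather than lower coverage.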

preprint, 2023, arXiv

Generalized disconnection exponents

We introduce and compute the generalized disconnection exponents $η_κ(β)$ which depend on $κ\in(0,4]$ and another real parameter $β$, extending the Brownian disconnection exponents (corresponding to $κ=8/3$) computed by Lawler, Schramm and Werner 2001 (conjectured by Duplantier and Kwon 1988). For $κ\in(8/3,4]$, the generalized disconnection exponents have a physical interpretation in terms of planar Brownian loop-soups with intensity $c\in (0,1]$, which allows us to obtain the first prediction of the dimension of multiple points on the cluster boundaries of these loop-soups. In particular, according to our prediction, the dimension of double points on the cluster boundaries is strictly positive for $c\in(0,1)$ and equal to zero for the critical intensity $c=1$, leading to an interesting open question of whether such points exist for the critical loop-soup. Our definition of the exponents is based on a certain general version of radial restriction measures that we construct and study. As an important tool, we introduce a new family of radial SLEs depending on $κ$ and two additional parameters $μ, ν$, that we call radial hypergeometric SLEs. This is a natural but substantial extension of the family of radial SLE$_κ(ρ)$s.

preprint, 2023, arXiv

Multi-Target Landmark Detection with Incomplete Images via Reinforcement Learning and Shape Prior

Medical images are generally acquired with a limited field-of-view (FOV), which can lead to incomplete regions of interest (ROI) and thus poses a great challenge for medical image analysis. This is particularly evident in learning-based multi-target landmark detection, where algorithms can be misled into learning primarily the variation of the background caused by the varying FOV, and so fail to detect the targets. Because they learn a navigation policy instead of predicting targets directly, reinforcement learning (RL)-based methods have the potential to tackle this challenge efficiently. Inspired by this, in this work we propose a multi-agent RL framework for simultaneous multi-target landmark detection. This framework aims to learn from incomplete and/or complete images to form an implicit knowledge of global structure, which is consolidated during the training stage for the detection of targets from either complete or incomplete test images. To further explicitly exploit the global structural information from incomplete images, we propose to embed a shape model into the RL process. With this prior knowledge, the proposed RL model can not only localize dozens of targets simultaneously, but also work effectively and robustly in the presence of incomplete images. We validated the applicability and efficacy of the proposed method on various multi-target detection tasks with incomplete images from clinical practice, using body dual-energy X-ray absorptiometry (DXA), cardiac MRI and head CT datasets. Results showed that our method could predict the whole set of landmarks with incomplete training images with up to 80% missing proportion (average distance error 2.29 cm on body DXA), and could detect unseen landmarks in regions with missing image information outside the FOV of target images (average distance error 6.84 mm on 3D half-head CT).

preprint, 2022, arXiv

Golfer: Trajectory Prediction with Masked Goal Conditioning MnM Network

Transformers have enabled breakthroughs in NLP and computer vision, and have recently begun to show promising performance in trajectory prediction for autonomous vehicles (AVs). How to efficiently model the interactive relationships between the ego agent and other road and dynamic objects remains challenging for the standard attention module. In this work we propose a general Transformer-like architectural module, the MnM network, equipped with novel masked goal conditioning training procedures for AV trajectory prediction. The resulting model, named golfer, achieves state-of-the-art performance, placing 2nd in the 2022 Waymo Open Dataset Motion Prediction Challenge and ranking 1st according to minADE.

preprint, 2022, arXiv

Graph R-CNN: Towards Accurate 3D Object Detection with Semantic-Decorated Local Graph

Two-stage detectors have gained much popularity in 3D object detection. Most two-stage 3D detectors utilize grid points, voxel grids, or sampled keypoints for RoI feature extraction in the second stage. Such methods, however, are inefficient in handling unevenly distributed and sparse outdoor points. This paper addresses this problem in three aspects. 1) Dynamic Point Aggregation. We propose the patch search to quickly search points in a local region for each 3D proposal. The dynamic farthest voxel sampling is then applied to evenly sample the points. In particular, the voxel size varies along the distance to accommodate the uneven distribution of points. 2) RoI-graph Pooling. We build local graphs on the sampled points to better model contextual information and mine point relations through iterative message passing. 3) Visual Features Augmentation. We introduce a simple yet effective fusion strategy to compensate for sparse LiDAR points with limited semantic cues. Based on these modules, we construct our Graph R-CNN as the second stage, which can be applied to existing one-stage detectors to consistently improve the detection performance. Extensive experiments show that Graph R-CNN outperforms the state-of-the-art 3D detection models by a large margin on both KITTI and the Waymo Open Dataset. We also rank first on the KITTI BEV car detection leaderboard. Code will be available at https://github.com/Nightmare-n/GraphRCNN.

preprint, 2022, arXiv

SPATL: Salient Parameter Aggregation and Transfer Learning for Heterogeneous Clients in Federated Learning

Federated learning (FL) facilitates training and deploying AI models on edge devices. Preserving user data privacy in FL introduces several challenges, including expensive communication costs, limited resources, and data heterogeneity. In this paper, we propose SPATL, an FL method that addresses these issues by: (a) introducing a salient parameter selection agent and communicating selected parameters only; (b) splitting a model into a shared encoder and a local predictor, and transferring its knowledge to heterogeneous clients via the locally customized predictor. Additionally, we leverage a gradient control mechanism to further speed up model convergence and increase the robustness of training processes. Experiments demonstrate that SPATL reduces communication overhead, accelerates model inference, and enables stable training processes with better results compared to state-of-the-art methods. Our approach reduces communication cost by up to $86.45\%$, accelerates local inference by reducing FLOPs by up to $39.7\%$ on VGG-11, and requires $7.4 \times$ less communication overhead when training ResNet-20.

preprint, 2020, arXiv

Structures of Spurious Local Minima in $k$-means

$k$-means clustering is a fundamental problem in unsupervised learning. The problem concerns finding a partition of the data points into $k$ clusters such that the within-cluster variation is minimized. Despite its importance and wide applicability, a theoretical understanding of the $k$-means problem has not been completely satisfactory. Existing algorithms with theoretical performance guarantees often rely on sophisticated (sometimes artificial) algorithmic techniques and restricted assumptions on the data. The main challenge lies in the non-convex nature of the problem; in particular, there exist additional local solutions other than the global optimum. Moreover, the simplest and most popular algorithm for $k$-means, namely Lloyd's algorithm, generally converges to such spurious local solutions both in theory and in practice. In this paper, we approach the $k$-means problem from a new perspective, by investigating the structures of these spurious local solutions under a probabilistic generative model with $k$ ground truth clusters. As soon as $k=3$, spurious local minima provably exist, even for well-separated and balanced clusters. One such local minimum puts two centers at one true cluster, and the third center in the middle of the other two true clusters. For general $k$, one local minimum puts multiple centers at a true cluster, and one center in the middle of multiple true clusters. Perhaps surprisingly, we prove that this is essentially the only type of spurious local minima under a separation condition. Our results pertain to the $k$-means formulation for mixtures of Gaussians or bounded distributions. Our theoretical results corroborate existing empirical observations and provide justification for several improved algorithms for $k$-means clustering.
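The configuration described in the abstract for $k=3$ (two centers in one true cluster, one center midway between the other two) can be reproduced on a toy example. The sketch below runs Lloyd's algorithm on a small deterministic one-dimensional dataset; the data, the initializations, and the function names are assumptions for illustration and do not follow the paper's probabilistic generative model.

```python
def lloyd(points, centers, max_iter=100):
    """Plain Lloyd's algorithm in one dimension; stops at a fixed point."""
    centers = list(centers)
    for _ in range(max_iter):
        clusters = [[] for _ in centers]
        for x in points:
            # Assign each point to its nearest center.
            j = min(range(len(centers)), key=lambda i: (x - centers[i]) ** 2)
            clusters[j].append(x)
        # Move each center to the mean of its points (keep it if empty).
        new_centers = [
            sum(c) / len(c) if c else centers[i] for i, c in enumerate(clusters)
        ]
        if new_centers == centers:
            break
        centers = new_centers
    return centers

def cost(points, centers):
    """k-means objective: sum of squared distances to the nearest center."""
    return sum(min((x - c) ** 2 for c in centers) for x in points)

# Three well-separated, balanced clusters around 0, 10 and 20.
points = [c + d for c in (0.0, 10.0, 20.0) for d in (-1.0, -0.5, 0.5, 1.0)]

# Hypothetical bad start: two centers in the first cluster, one between the others.
stuck = lloyd(points, [-0.9, 0.9, 14.0])
# Hypothetical good start: one center near each cluster.
good = lloyd(points, [1.0, 9.0, 21.0])

print(stuck, cost(points, stuck))  # [-0.75, 0.75, 15.0] 205.25
print(good, cost(points, good))    # [0.0, 10.0, 20.0] 7.5
```

From the bad start, Lloyd's algorithm converges after one update and never leaves the two-plus-one configuration, even though its objective value is far above that of the solution with one center per cluster.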

preprint, 2018, arXiv

The law of a point process of Brownian excursions in a domain is determined by the law of its trace

We show the result that is stated in the title of the paper, which has consequences about decomposition of Brownian loop-soup clusters in two dimensions.

preprint, 2017, arXiv

Decomposition of Brownian loop-soup clusters

We study the structure of Brownian loop-soup clusters in two dimensions. Among other things, we obtain the following decomposition of the clusters with critical intensity: When one conditions a loop-soup cluster by its outer boundary $γ$ (which is known to be an SLE(4)-type loop), then the union of all excursions away from $γ$ by all the Brownian loops in the loop-soup that touch $γ$ is distributed exactly like the union of all excursions of a Poisson point process of Brownian excursions in the domain enclosed by $γ$. A related result that we derive and use is that the couplings of the Gaussian Free Field (GFF) with CLE(4) via level-lines (by Miller-Sheffield), of the square of the GFF with loop-soups via occupation times (by Le Jan), and of the CLE(4) with loop-soups via loop-soup clusters (by Sheffield and Werner) can be made to coincide. An instrumental role in our proof of this fact is played by Lupu's description of CLE(4) as limits of discrete loop-soup clusters.

preprint, 2016, arXiv

Insurance Premium Prediction via Gradient Tree-Boosted Tweedie Compound Poisson Models

The Tweedie GLM is a widely used method for predicting insurance premiums. However, the structure of the logarithmic mean is restricted to a linear form in the Tweedie GLM, which can be too rigid for many applications. As a better alternative, we propose a gradient tree-boosting algorithm and apply it to Tweedie compound Poisson models for pure premiums. We use a profile likelihood approach to estimate the index and dispersion parameters. Our method is capable of fitting a flexible nonlinear Tweedie model and capturing complex interactions among predictors. A simulation study confirms the excellent prediction performance of our method. As an application, we apply our method to auto insurance claim data and show that the new method is superior to the existing methods in the sense that it generates more accurate premium predictions, thus helping solve the adverse selection issue. We have implemented our method in a user-friendly R package that also includes a nice visualization tool for interpreting the fitted model.
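As a rough illustration of the quantities a gradient-boosting procedure works with in this setting, the sketch below writes out the standard Tweedie unit deviance for an index parameter $1 < p < 2$ and the gradient of the corresponding loss with respect to the log-mean. This is not the authors' R implementation: the function names are hypothetical, the dispersion parameter is omitted (it only rescales the loss), and no trees are fitted.

```python
import math

def tweedie_unit_deviance(y, mu, p):
    """Tweedie unit deviance for 1 < p < 2, with y >= 0 and mu > 0."""
    return 2.0 * (
        y ** (2 - p) / ((1 - p) * (2 - p))
        - y * mu ** (1 - p) / (1 - p)
        + mu ** (2 - p) / (2 - p)
    )

def tweedie_loss(y, f, p):
    """Negative log-likelihood in f = log(mu), dropping terms free of f
    and the dispersion factor (both are assumptions of this sketch)."""
    return -y * math.exp((1 - p) * f) / (1 - p) + math.exp((2 - p) * f) / (2 - p)

def tweedie_gradient(y, f, p):
    """Derivative of tweedie_loss with respect to f."""
    return -y * math.exp((1 - p) * f) + math.exp((2 - p) * f)

# A boosting step would fit a regression tree to the negative gradients
# (pseudo-residuals) evaluated at the current log-mean predictions.
observed = [0.0, 0.0, 3.0, 12.5]      # hypothetical pure premiums, many zeros
current_f = [math.log(2.0)] * 4       # hypothetical constant initial fit
pseudo_residuals = [-tweedie_gradient(y, f, 1.5) for y, f in zip(observed, current_f)]
```

The gradient vanishes exactly when the predicted mean equals the observation, and zero claims are handled without special cases, which is one reason the compound Poisson form suits pure-premium data.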

preprint, 2015, arXiv

On the Forecast Combination Puzzle

It is often reported in the forecast combination literature that a simple average of candidate forecasts is more robust than sophisticated combining methods. This phenomenon is usually referred to as the "forecast combination puzzle". Motivated by this puzzle, we explore its possible explanations, including estimation error, invalid weighting formulas and model screening. We show that the existing understanding of the puzzle should be complemented by the distinction between different forecast combination scenarios known as combining for adaptation and combining for improvement. Applying combining methods without consideration of the underlying scenario can itself cause the puzzle. Based on our new understanding, both simulations and real data evaluations are conducted to illustrate the causes of the puzzle. We further propose a multi-level AFTER strategy that can integrate the strengths of different combining methods and adapt intelligently to the underlying scenario. In particular, by treating the simple average as a candidate forecast, the proposed strategy is shown to avoid the heavy cost of estimation error and, to a large extent, solve the forecast combination puzzle.
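The idea of treating the simple average as one more candidate can be sketched with a generic exponential reweighting scheme. The code below is a simplified illustration and not the multi-level AFTER strategy of the paper: the squared-error loss, the tuning constant `lam`, and the function name are assumptions.

```python
import math

def combine_with_average(forecasts, actuals, lam=1.0):
    """Sequentially combine candidate forecasts with exponential weights.

    `forecasts[t]` lists the candidate forecasts for period t; the simple
    average of those candidates is appended as an extra candidate. Weights
    for period t depend only on squared errors observed before period t.
    """
    n_candidates = len(forecasts[0]) + 1
    cumulative_loss = [0.0] * n_candidates
    combined = []
    for f_t, y_t in zip(forecasts, actuals):
        candidates = list(f_t) + [sum(f_t) / len(f_t)]
        # Subtract the smallest loss before exponentiating for numerical stability.
        base = min(cumulative_loss)
        weights = [math.exp(-lam * (loss - base)) for loss in cumulative_loss]
        total = sum(weights)
        combined.append(sum(w * c for w, c in zip(weights, candidates)) / total)
        # Update losses only after the combined forecast has been issued.
        cumulative_loss = [
            loss + (c - y_t) ** 2 for loss, c in zip(cumulative_loss, candidates)
        ]
    return combined

# Two hypothetical forecasters; the first one happens to be accurate.
history = combine_with_average([[1.0, 3.0], [1.0, 3.0]], [1.0, 1.0])
```

In the first period all weights are equal, so the combination coincides with the simple average; afterwards the weights shift toward whichever candidate, possibly the average itself, has performed best so far.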
