Source author record

Shang Xu

Shang Xu appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

ResearcherUnclaimed source record

Computer Vision math.FA

Catalog footprint

What is connected

3works

2topics

4close collaborators

Actions

Connect this record

Open graph Browse works

Inspect adjacent papers, topics, institutions and collaborators without losing the researcher page.

Building this map preview

BZPEER is loading the nearby papers, people, topics and institutions for this page.

preprint2022arXiv

Guide Local Feature Matching by Overlap Estimation

Local image feature matching under large appearance, viewpoint, and distance changes is challenging yet important. Conventional methods detect and match tentative local features across the whole images, with heuristic consistency checks to guarantee reliable matches. In this paper, we introduce a novel Overlap Estimation method conditioned on image pairs with TRansformer, named OETR, to constrain local feature matching in the commonly visible region. OETR performs overlap estimation in a two-step process of feature correlation and then overlap regression. As a preprocessing module, OETR can be plugged into any existing local feature detection and matching pipeline, to mitigate potential view angle or scale variance. Intensive experiments show that OETR can boost state-of-the-art local feature matching performance substantially, especially for image pairs with small shared regions. The code will be publicly available at https://github.com/AbyssGaze/OETR.

preprint2022arXiv

Scatter Points in Space: 3D Detection from Multi-view Monocular Images

3D object detection from monocular image(s) is a challenging and long-standing problem of computer vision. To combine information from different perspectives without troublesome 2D instance tracking, recent methods tend to aggregate multiview feature by sampling regular 3D grid densely in space, which is inefficient. In this paper, we attempt to improve multi-view feature aggregation by proposing a learnable keypoints sampling method, which scatters pseudo surface points in 3D space, in order to keep data sparsity. The scattered points augmented by multi-view geometric constraints and visual features are then employed to infer objects location and shape in the scene. To make up the limitations of single frame and model multi-view geometry explicitly, we further propose a surface filter module for noise suppression. Experimental results show that our method achieves significantly better performance than previous works in terms of 3D detection (more than 0.1 AP improvement on some categories of ScanNet). The code will be publicly available.

preprint2020arXiv

Controlled scaling of Hilbert space frames for R^2

A Hilbert space frame on $R^n$ is {\it scalable} if we can scale the vectors to make them a tight frame. There are known classifications of scalable frames. There are two basic questions here which have never been answered in any $R^n$: Given a frame in $R^n$, how do we scale the vectors to minimize the condition number of the frame? I.e. How do we scale the frame to make it as tight as possible? If we are only allowed to use scaling numbers from the interval $[1-ε,1+ε]$, how do we scale the frame to minimize the condition number? We will answer these two questions in $R^2$ to begin the process towards a solution in $R^n$.