Source author record

Tian Fang

Tian Fang appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher, unclaimed source record

Topics: Computer Vision, cond-mat.mtrl-sci, cond-mat.mes-hall, quant-ph

Catalog footprint

What is connected

18 works

4 topics

4 close collaborators

preprint, 2022, arXiv

ASpanFormer: Detector-Free Image Matching with Adaptive Span Transformer

Generating robust and reliable correspondences across images is a fundamental task for a diversity of applications. To capture context at both global and local granularity, we propose ASpanFormer, a Transformer-based detector-free matcher that is built on hierarchical attention structure, adopting a novel attention operation which is capable of adjusting attention span in a self-adaptive manner. To achieve this goal, first, flow maps are regressed in each cross attention phase to locate the center of search region. Next, a sampling grid is generated around the center, whose size, instead of being empirically configured as fixed, is adaptively computed from a pixel uncertainty estimated along with the flow map. Finally, attention is computed across two images within derived regions, referred to as attention span. By these means, we are able to not only maintain long-range dependencies, but also enable fine-grained attention among pixels of high relevance that compensates essential locality and piece-wise smoothness in matching tasks. State-of-the-art accuracy on a wide range of evaluation benchmarks validates the strong matching capability of our method.
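
The adaptive sampling grid described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `adaptive_span_grid`, the `scale` factor, and the choice of a square grid whose half-width is `scale * sigma` are all assumptions made for illustration. The only idea taken from the abstract is that the grid is centered on the regressed flow and that its size grows with the estimated uncertainty.

```python
def adaptive_span_grid(center, sigma, points_per_side=5, scale=1.0):
    """Return a square grid of sample points around `center`.

    center: (x, y) location predicted by the flow map (hypothetical input).
    sigma: per-pixel uncertainty; a larger value widens the search region.
    scale: assumed proportionality factor between sigma and the half-width.
    """
    cx, cy = center
    half = scale * sigma
    if points_per_side == 1:
        return [(cx, cy)]
    step = 2.0 * half / (points_per_side - 1)
    return [
        (cx - half + i * step, cy - half + j * step)
        for j in range(points_per_side)
        for i in range(points_per_side)
    ]


# A confident pixel gets a tight grid, an uncertain one a wide grid.
tight = adaptive_span_grid((10.0, 20.0), sigma=2.0, points_per_side=3)
wide = adaptive_span_grid((10.0, 20.0), sigma=8.0, points_per_side=3)
```

Both grids contain the same number of samples, so in this sketch the cost of attending over the region stays fixed while its spatial extent varies.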

preprint, 2022, arXiv

Critical Regularizations for Neural Surface Reconstruction in the Wild

Neural implicit functions have recently shown promising results on surface reconstructions from multiple views. However, current methods still suffer from excessive time complexity and poor robustness when reconstructing unbounded or complex scenes. In this paper, we present RegSDF, which shows that proper point cloud supervisions and geometry regularizations are sufficient to produce high-quality and robust reconstruction results. Specifically, RegSDF takes an additional oriented point cloud as input, and optimizes a signed distance field and a surface light field within a differentiable rendering framework. We also introduce the two critical regularizations for this optimization. The first one is the Hessian regularization that smoothly diffuses the signed distance values to the entire distance field given noisy and incomplete input. And the second one is the minimal surface regularization that compactly interpolates and extrapolates the missing geometry. Extensive experiments are conducted on DTU, BlendedMVS, and Tanks and Temples datasets. Compared with recent neural surface reconstruction approaches, RegSDF is able to reconstruct surfaces with fine details even for open scenes with complex topologies and unstructured camera trajectories.

preprint, 2022, arXiv

NeILF: Neural Incident Light Field for Physically-based Material Estimation

We present a differentiable rendering framework for material and lighting estimation from multi-view images and a reconstructed geometry. In the framework, we represent scene lightings as the Neural Incident Light Field (NeILF) and material properties as the surface BRDF modelled by multi-layer perceptrons. Compared with recent approaches that approximate scene lightings as the 2D environment map, NeILF is a fully 5D light field that is capable of modelling illuminations of any static scenes. In addition, occlusions and indirect lights can be handled naturally by the NeILF representation without requiring multiple bounces of ray tracing, making it possible to estimate material properties even for scenes with complex lightings and geometries. We also propose a smoothness regularization and a Lambertian assumption to reduce the material-lighting ambiguity during the optimization. Our method strictly follows the physically-based rendering equation, and jointly optimizes material and lighting through the differentiable rendering process. We have intensively evaluated the proposed method on our in-house synthetic dataset, the DTU MVS dataset, and real-world BlendedMVS scenes. Our method is able to outperform previous methods by a significant margin in terms of novel view rendering quality, setting a new state-of-the-art for image-based material and lighting estimation.

preprint, 2020, arXiv

ASLFeat: Learning Local Features of Accurate Shape and Localization

This work focuses on mitigating two limitations in the joint learning of local feature detectors and descriptors. First, the ability to estimate the local shape (scale, orientation, etc.) of feature points is often neglected during dense feature extraction, while the shape-awareness is crucial to acquire stronger geometric invariance. Second, the localization accuracy of detected keypoints is not sufficient to reliably recover camera geometry, which has become the bottleneck in tasks such as 3D reconstruction. In this paper, we present ASLFeat, with three light-weight yet effective modifications to mitigate above issues. First, we resort to deformable convolutional networks to densely estimate and apply local transformation. Second, we take advantage of the inherent feature hierarchy to restore spatial resolution and low-level details for accurate keypoint localization. Finally, we use a peakiness measurement to relate feature responses and derive more indicative detection scores. The effect of each modification is thoroughly studied, and the evaluation is extensively conducted across a variety of practical scenarios. State-of-the-art results are reported that demonstrate the superiority of our methods.

preprint, 2020, arXiv

BlendedMVS: A Large-scale Dataset for Generalized Multi-view Stereo Networks

While deep learning has recently achieved great success on multi-view stereo (MVS), limited training data makes the trained model hard to be generalized to unseen scenarios. Compared with other computer vision tasks, it is rather difficult to collect a large-scale MVS dataset as it requires expensive active scanners and labor-intensive process to obtain ground truth 3D structures. In this paper, we introduce BlendedMVS, a novel large-scale dataset, to provide sufficient training ground truth for learning-based MVS. To create the dataset, we apply a 3D reconstruction pipeline to recover high-quality textured meshes from images of well-selected scenes. Then, we render these mesh models to color images and depth maps. To introduce the ambient lighting information during training, the rendered color images are further blended with the input images to generate the training input. Our dataset contains over 17k high-resolution images covering a variety of scenes, including cities, architectures, sculptures and small objects. Extensive experiments demonstrate that BlendedMVS endows the trained model with significantly better generalization ability compared with other MVS datasets. The dataset and pretrained models are available at https://github.com/YoYo000/BlendedMVS.

preprint, 2020, arXiv

Joint Semantic Segmentation and Boundary Detection using Iterative Pyramid Contexts

In this paper, we present a joint multi-task learning framework for semantic segmentation and boundary detection. The critical component in the framework is the iterative pyramid context module (PCM), which couples two tasks and stores the shared latent semantics to interact between the two tasks. For semantic boundary detection, we propose the novel spatial gradient fusion to suppress nonsemantic edges. As semantic boundary detection is the dual task of semantic segmentation, we introduce a loss function with boundary consistency constraint to improve the boundary pixel accuracy for semantic segmentation. Our extensive experiments demonstrate superior performance over state-of-the-art works, not only in semantic segmentation but also in semantic boundary detection. In particular, a mean IoU score of 81.8% on Cityscapes test set is achieved without using coarse data or any external data for semantic segmentation. For semantic boundary detection, we improve over previous state-of-the-art works by 9.9% in terms of AP and 6.8% in terms of MF(ODS).

preprint, 2020, arXiv

KFNet: Learning Temporal Camera Relocalization using Kalman Filtering

Temporal camera relocalization estimates the pose with respect to each video frame in sequence, as opposed to one-shot relocalization which focuses on a still image. Even though the time dependency has been taken into account, current temporal relocalization methods still generally underperform the state-of-the-art one-shot approaches in terms of accuracy. In this work, we improve the temporal relocalization method by using a network architecture that incorporates Kalman filtering (KFNet) for online camera relocalization. In particular, KFNet extends the scene coordinate regression problem to the time domain in order to recursively establish 2D and 3D correspondences for the pose determination. The network architecture design and the loss formulation are based on Kalman filtering in the context of Bayesian learning. Extensive experiments on multiple relocalization benchmarks demonstrate the high accuracy of KFNet at the top of both one-shot and temporal relocalization approaches. Our codes are released at https://github.com/zlthinker/KFNet.
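
For readers unfamiliar with the filtering step this abstract builds on, here is a minimal sketch of a textbook scalar Kalman measurement update. It is not KFNet's network or loss; the function name `kalman_update` and the example numbers are hypothetical. It only shows how a prediction carried over from the previous frame and a new measurement are fused according to their variances.

```python
def kalman_update(prior_mean, prior_var, measurement, measurement_var):
    """One scalar Kalman measurement update.

    prior_mean, prior_var: state predicted from the previous time step.
    measurement, measurement_var: new observation and its noise variance.
    Returns the posterior mean and variance.
    """
    # The gain is close to 1 when the prior is uncertain relative to the
    # measurement, and close to 0 when the measurement is the noisier one.
    gain = prior_var / (prior_var + measurement_var)
    mean = prior_mean + gain * (measurement - prior_mean)
    var = (1.0 - gain) * prior_var
    return mean, var


# Equal confidence in prior and measurement: the estimate lands halfway
# between them and the variance is halved.
posterior = kalman_update(0.0, 1.0, 2.0, 1.0)
```

In a temporal setting this update would be applied once per frame, with the posterior propagated forward to serve as the next frame's prior.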

preprint, 2020, arXiv

Learning Discriminative Feature with CRF for Unsupervised Video Object Segmentation

In this paper, we introduce a novel network, called discriminative feature network (DFNet), to address the unsupervised video object segmentation task. To capture the inherent correlation among video frames, we learn discriminative features (D-features) from the input images that reveal feature distribution from a global perspective. The D-features are then used to establish correspondence with all features of test image under conditional random field (CRF) formulation, which is leveraged to enforce consistency between pixels. The experiments verify that DFNet outperforms state-of-the-art methods by a large margin with a mean IoU score of 83.4% and ranks first on the DAVIS-2016 leaderboard while using much fewer parameters and achieving much more efficient performance in the inference phase. We further evaluate DFNet on the FBMS dataset and the video saliency dataset ViSal, reaching a new state-of-the-art. To further demonstrate the generalizability of our framework, DFNet is also applied to the image object co-segmentation task. We perform experiments on a challenging dataset PASCAL-VOC and observe the superiority of DFNet. The thorough experiments verify that DFNet is able to capture and mine the underlying relations of images and discover the common foreground objects.

preprint, 2020, arXiv

Learning Stereo Matchability in Disparity Regression Networks

Learning-based stereo matching has recently achieved promising results, yet still suffers difficulties in establishing reliable matches in weakly matchable regions that are textureless, non-Lambertian, or occluded. In this paper, we address this challenge by proposing a stereo matching network that considers pixel-wise matchability. Specifically, the network jointly regresses disparity and matchability maps from 3D probability volume through expectation and entropy operations. Next, a learned attenuation is applied as the robust loss function to alleviate the influence of weakly matchable pixels in the training. Finally, a matchability-aware disparity refinement is introduced to improve the depth inference in weakly matchable regions. The proposed deep stereo matchability (DSM) framework can improve the matching result or accelerate the computation while still guaranteeing the quality. Moreover, the DSM framework is portable to many recent stereo networks. Extensive experiments are conducted on Scene Flow and KITTI stereo datasets to demonstrate the effectiveness of the proposed framework over the state-of-the-art learning-based stereo methods.
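
The expectation and entropy operations mentioned in this abstract can be sketched for a single pixel as follows. This is an illustrative assumption about how such operations are commonly computed, not the paper's code: the helper names and the toy score vectors are hypothetical, and a softmax over matching scores stands in for one column of the probability volume.

```python
import math


def softmax(scores):
    """Convert matching scores into probabilities (numerically stable)."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def disparity_and_entropy(scores):
    """Expectation and entropy over candidate disparities for one pixel.

    scores[d] is a hypothetical matching score for disparity d
    (higher means a better match).
    Returns (expected disparity, entropy). Low entropy means a peaked,
    confident distribution; high entropy flags a weakly matchable pixel.
    """
    probs = softmax(scores)
    expected = sum(d * p for d, p in enumerate(probs))
    entropy = -sum(p * math.log(p) for p in probs if p > 0.0)
    return expected, entropy


# A textured pixel with one clear match versus a textureless pixel
# where every disparity looks equally plausible.
peaked = disparity_and_entropy([0.0, 0.0, 10.0, 0.0])
flat = disparity_and_entropy([1.0, 1.0, 1.0, 1.0])
```

The flat case still yields a disparity value (the midpoint of the range), which is why a separate matchability signal is useful: the entropy reveals that this estimate is unreliable.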

preprint, 2020, arXiv

Self-Supervised Monocular 3D Face Reconstruction by Occlusion-Aware Multi-view Geometry Consistency

Recent learning-based approaches, in which models are trained by single-view images, have shown promising results for monocular 3D face reconstruction, but they suffer from the ill-posed face pose and depth ambiguity issue. In contrast to previous works that only enforce 2D feature constraints, we propose a self-supervised training architecture by leveraging the multi-view geometry consistency, which provides reliable constraints on face pose and depth estimation. We first propose an occlusion-aware view synthesis method to apply multi-view geometry consistency to self-supervised learning. Then we design three novel loss functions for multi-view consistency, including the pixel consistency loss, the depth consistency loss, and the facial landmark-based epipolar loss. Our method is accurate and robust, especially under large variations of expressions, poses, and illumination conditions. Comprehensive experiments on the face alignment and 3D face reconstruction benchmarks have demonstrated superiority over state-of-the-art methods. Our code and model are released at https://github.com/jiaxiangshang/MGCNet.

preprint, 2020, arXiv

Visibility-aware Multi-view Stereo Network

Learning-based multi-view stereo (MVS) methods have demonstrated promising results. However, very few existing networks explicitly take the pixel-wise visibility into consideration, resulting in erroneous cost aggregation from occluded pixels. In this paper, we explicitly infer and integrate the pixel-wise occlusion information in the MVS network via the matching uncertainty estimation. The pair-wise uncertainty map is jointly inferred with the pair-wise depth map, which is further used as weighting guidance during the multi-view cost volume fusion. As such, the adverse influence of occluded pixels is suppressed in the cost fusion. The proposed framework Vis-MVSNet significantly improves depth accuracies in the scenes with severe occlusion. Extensive experiments are performed on DTU, BlendedMVS, and Tanks and Temples datasets to justify the effectiveness of the proposed framework.
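
The idea of using pair-wise uncertainty as weighting guidance during cost fusion can be sketched as below. The weighting function is an assumption: this sketch uses `exp(-uncertainty)` normalized over the source views, which may differ from the exact form used in Vis-MVSNet. The function name `fuse_costs` and the numbers are hypothetical.

```python
import math


def fuse_costs(pair_costs, pair_uncertainties):
    """Fuse per-pair matching costs for one pixel and one depth hypothesis.

    pair_costs[i]: cost computed between the reference view and source view i.
    pair_uncertainties[i]: non-negative uncertainty estimated for that pair.
    Weighting by exp(-uncertainty) is an assumption made for this sketch.
    """
    weights = [math.exp(-u) for u in pair_uncertainties]
    total = sum(weights)
    return sum(w * c for w, c in zip(weights, pair_costs)) / total


# The second source view is likely occluded (high uncertainty), so its
# misleading cost of 0.9 barely affects the fused value.
fused = fuse_costs([0.2, 0.9], [0.1, 5.0])

# With equal uncertainties the fusion reduces to a plain average.
averaged = fuse_costs([0.2, 0.9], [1.0, 1.0])
```

The second call shows the behavior that uncertainty weighting is meant to avoid: without visibility information, the occluded view pulls the fused cost toward its erroneous value.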

preprint, 2013, arXiv

Raman and Photoluminescence Study of Dielectric and Thermal Effects on Atomically Thin MoS2

Atomically thin two-dimensional molybdenum disulfide (MoS2) sheets have attracted much attention due to their potential for future electronic applications. They not only present the best planar electrostatic control in a device, but also lend themselves readily for dielectric engineering. In this work, we experimentally investigated the dielectric effect on the Raman and photoluminescence (PL) spectra of monolayer MoS2 by comparing samples with and without HfO2 on top by atomic layer deposition (ALD). Based on considerations of the thermal, doping, strain and dielectric screening influences, it is found that the red shift in the Raman spectrum largely stems from modulation doping of MoS2 by the ALD HfO2, and the red shift in the PL spectrum is most likely due to strain imparted on MoS2 by HfO2. Our work also suggests that due to the intricate dependence of band structure of monolayer MoS2 on strain, one must be cautious to interpret its Raman and PL spectroscopy.

preprint, 2011, arXiv

High field transport in graphene

In this work, high field carrier transport in two dimensional (2D) graphene is investigated. Analytical models are applied to estimate the saturation currents in graphene, based on the high scattering rate of optical phonon emission. Non-equilibrium (hot) phonon effect was studied by Monte Carlo (MC) simulations. MC simulation confirms that hot phonon effects play a dominant role in current saturation in graphene. Current degradation due to elastic scattering events is much smaller compared to the hot phonon effect. Transient phenomena such as velocity overshoot were also studied using MC simulation. The simulation results show promising potential for graphene to be used in high speed electronic devices by shrinking the channel length below 100 nm if electrostatic control can be exercised in the absence of a band gap.

preprint, 2011, arXiv

Interband absorption in single layer hexagonal boron nitride

A monolayer of hexagonal boron nitride (h-BN), commonly known as "white graphene", is a promising wide bandgap semiconducting material for deep-ultraviolet optoelectronic devices. In this report, the light absorption of a single layer hexagonal boron nitride is calculated using a tight-binding Hamiltonian. The absorption is found to be a monotonically decreasing function of photon energy, in contrast to graphene, where the absorption coefficient is independent of photon energy and characterized by the effective fine-structure constant.

preprint, 2011, arXiv

Unique prospects of graphene-based THz modulators

The modulation depth of 2-D electron gas (2DEG) based THz modulators using AlGaAs/GaAs heterostructures with metal gates is inherently limited to < 30%. The metal gate not only attenuates the THz signal (> 90%) but also severely degrades the modulation depth. The metal losses can be significantly reduced with an alternative material with tunable conductivity. Graphene presents a unique solution to this problem due to its symmetric band structure and extraordinarily high mobility of holes that is comparable to electron mobility in conventional semiconductors. The hole conductivity in graphene can be electrostatically tuned in the graphene-2DEG parallel capacitor configuration, thus more efficiently tuning the THz transmission. In this work, we show that it is possible to achieve a modulation depth of > 90% while simultaneously minimizing signal attenuation to < 5% by tuning the Fermi level at the Dirac point in graphene.

preprint, 2010, arXiv

Anisotropic charge transport in non-polar GaN QW: polarization induced charge and interface roughness scattering

Charge transport in GaN quantum well (QW) devices grown in the non-polar direction has been theoretically investigated. Emergence of an anisotropic line charge scattering mechanism originating as a result of anisotropic rough surface morphology in conjunction with in-plane built-in polarization has been proposed. It is shown that in-plane growth anisotropy leads to large anisotropic carrier transport at low temperatures. At high temperatures, this anisotropy in charge transport is partially washed out by strong isotropic optical phonon scattering in GaN QW.

preprint, 2010, arXiv

Charged basal stacking fault (BSF) scattering in nitride semiconductors

A theory of charge transport in semiconductors in the presence of basal stacking faults is developed. It is shown that the presence of basal stacking faults leads to anisotropy in carrier transport. The theory is applied to carrier transport in non-polar GaN films consisting of a large number of BSFs, and the result is compared with experimental data.

preprint, 2010, arXiv

Effect of high-K dielectrics on charge transport in graphene

The effect of various dielectrics on charge mobility in single layer graphene is investigated. By calculating the remote optical phonon scattering arising from the polar substrates, and combining it with their effect on Coulombic impurity scattering, a comprehensive picture of the effect of dielectrics on charge transport in graphene emerges. It is found that though high-$κ$ dielectrics can strongly reduce Coulombic scattering by dielectric screening, scattering from the surface phonon modes arising from them washes out this advantage. By comparing the room-temperature transport properties with narrow-bandgap III-V semiconductors, strategies to improve the mobility in single layer graphene are outlined.
