Source author record

Wei Liang

Wei Liang appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher, unclaimed source record

Computer Vision, hep-ex, hep-ph, nucl-th, physics.optics, eess.IV, Machine Learning, Methodology, Networking and Internet Architecture, Artificial Intelligence, Computation, math.ST, Neural and Evolutionary Computing, Robotics, Statistics Theory

Catalog footprint

What is connected

19 works

15 topics

4 close collaborators


preprint, 2026, arXiv

Floor Plan-Guided Visual Navigation Incorporating Depth and Directional Cues

Current visual navigation strategies mainly follow an exploration-first and then goal-directed navigation paradigm. This exploratory phase inevitably compromises the overall efficiency of navigation. Recent studies propose leveraging floor plans alongside RGB inputs to guide agents, aiming for rapid navigation without prior exploration or mapping. Key issues persist despite early successes. The modal gap and content misalignment between floor plans and RGB images necessitate an efficient approach to extract the most salient and complementary features from both for reliable navigation. Here, we propose GlocDiff, a novel framework that employs a diffusion-based policy to continuously predict future waypoints. This policy is conditioned on two complementary information streams: (1) local depth cues derived from the current RGB observation, and (2) global directional guidance extracted from the floor plan. The former handles immediate navigation safety by capturing surrounding geometry, while the latter ensures goal-directed efficiency by offering definitive directional cues. Extensive evaluations on the FloNa benchmark demonstrate that GlocDiff achieves superior efficiency and effectiveness. Furthermore, its successful deployment in real-world scenarios underscores its strong potential for broad practical application.

preprint, 2023, arXiv

Diffusion-based Generation, Optimization, and Planning in 3D Scenes

We introduce SceneDiffuser, a conditional generative model for 3D scene understanding. SceneDiffuser provides a unified model for solving scene-conditioned generation, optimization, and planning. In contrast to prior works, SceneDiffuser is intrinsically scene-aware, physics-based, and goal-oriented. With an iterative sampling strategy, SceneDiffuser jointly formulates the scene-aware generation, physics-based optimization, and goal-oriented planning via a diffusion-based denoising process in a fully differentiable fashion. Such a design alleviates the discrepancies among different modules and the posterior collapse of previous scene-conditioned generative models. We evaluate SceneDiffuser with various 3D scene understanding tasks, including human pose and motion generation, dexterous grasp generation, path planning for 3D navigation, and motion planning for robot arms. The results show significant improvements compared with previous models, demonstrating the tremendous potential of SceneDiffuser for the broad community of 3D scene understanding.

preprint, 2023, arXiv

The state-of-the-art 3D anisotropic intracranial hemorrhage segmentation on non-contrast head CT: The INSTANCE challenge

Automatic intracranial hemorrhage segmentation in 3D non-contrast head CT (NCCT) scans is significant in clinical practice. Existing hemorrhage segmentation methods usually ignore the anisotropic nature of NCCT and are evaluated on different in-house datasets with distinct metrics, making it highly challenging to improve segmentation performance and perform objective comparisons among different methods. The INSTANCE 2022 was a grand challenge held in conjunction with the 2022 International Conference on Medical Image Computing and Computer Assisted Intervention (MICCAI). It is intended to resolve the above-mentioned problems and promote the development of both intracranial hemorrhage segmentation and anisotropic data processing. The INSTANCE released a training set of 100 cases with ground-truth and a validation set with 30 cases without ground-truth labels that were available to the participants. A held-out testing set with 70 cases is utilized for the final evaluation and ranking. The methods from different participants are ranked based on four metrics: Dice Similarity Coefficient (DSC), Hausdorff Distance (HD), Relative Volume Difference (RVD) and Normalized Surface Dice (NSD). A total of 13 teams submitted distinct solutions to resolve the challenges, making several baseline models, pre-processing strategies and anisotropic data processing techniques available to future researchers. The winning method achieved an average DSC of 0.6925, demonstrating a significant improvement over our proposed baseline method. To the best of our knowledge, the proposed INSTANCE challenge releases the first intracranial hemorrhage segmentation benchmark, and is also the first challenge intended to resolve the anisotropic problem in 3D medical image segmentation, which provides new alternatives in these research fields.
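
Of the four ranking metrics named above, the two volume-based ones are simple enough to sketch. The following is a minimal illustration, not the challenge's official evaluation code. It assumes binary masks flattened to sequences of 0/1 values; the function names `dice_coefficient` and `relative_volume_difference` are hypothetical, and the RVD is taken here as |V_pred - V_truth| / V_truth, which is a common definition but an assumption about the one the challenge used.

```python
from typing import Sequence


def dice_coefficient(pred: Sequence[int], truth: Sequence[int]) -> float:
    """Dice Similarity Coefficient: 2|A ∩ B| / (|A| + |B|) for binary masks."""
    if len(pred) != len(truth):
        raise ValueError("masks must have the same number of voxels")
    intersection = sum(1 for p, t in zip(pred, truth) if p and t)
    total = sum(1 for p in pred if p) + sum(1 for t in truth if t)
    # Convention chosen for this sketch (an assumption): two empty masks agree perfectly.
    return 1.0 if total == 0 else 2.0 * intersection / total


def relative_volume_difference(pred: Sequence[int], truth: Sequence[int]) -> float:
    """Absolute difference in foreground voxel counts, relative to the ground truth."""
    v_pred = sum(1 for p in pred if p)
    v_truth = sum(1 for t in truth if t)
    if v_truth == 0:
        raise ValueError("ground-truth mask is empty; RVD is undefined")
    return abs(v_pred - v_truth) / v_truth


# Toy flattened masks (illustrative data only, not from the challenge).
pred = [0, 1, 1, 1, 0, 0]
truth = [0, 0, 1, 1, 1, 0]
print(dice_coefficient(pred, truth))
print(relative_volume_difference(pred, truth))
```

The toy masks overlap on two of three foreground voxels each, so the two predictions have equal volume even though they are not identical; this is why overlap and volume metrics are usually reported together.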

preprint, 2022, arXiv

An Efficient Target Detection and Recognition Method in Aerial Remote-sensing Images Based on Multiangle Regions-of-Interest

Recently, deep learning technology has been extensively used in the field of image recognition. However, its main application is the recognition and detection of ordinary pictures and common scenes. It is challenging to effectively and expediently analyze remote-sensing images obtained by the image acquisition systems on unmanned aerial vehicles (UAVs), which includes the identification of the target and calculation of its position. Aerial remote-sensing images have different shooting angles and methods compared with ordinary pictures or images, which makes remote-sensing images play an irreplaceable role in some areas. In this study, a new target detection and recognition method in remote-sensing images is proposed based on a deep convolutional neural network (CNN) for the provision of multilevel information of images in combination with a region proposal network used to generate multiangle regions-of-interest. The proposed method generated results that were much more accurate and precise than those obtained with traditional methods. This demonstrates that the model proposed herein has tremendous potential for application in remote-sensing image recognition.

preprint, 2022, arXiv

Confidence Band Estimation for Survival Random Forests

Survival random forest is a popular machine learning tool for modeling censored survival data. However, there is currently no statistically valid and computationally feasible approach for estimating its confidence band. This paper proposes an unbiased confidence band estimation by extending recent developments in infinite-order incomplete U-statistics. The idea is to estimate the variance-covariance matrix of the cumulative hazard function prediction on a grid of time points. We then generate the confidence band by viewing the cumulative hazard function estimation as a Gaussian process whose distribution can be approximated through simulation. This approach is computationally easy to implement when the subsampling size of a tree is no larger than half of the total training sample size. Numerical studies show that our proposed method accurately estimates the confidence band and achieves the desired coverage rate. We apply this method to the Veterans' Administration lung cancer data.
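
The simulation step described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's procedure: the variance-covariance matrix is taken as already estimated (the U-statistic estimation is not shown), the band is built from the simulated supremum of the standardized Gaussian process, which is one standard construction, and the name `simulated_confidence_band` and all numbers are hypothetical.

```python
import math
import random
from typing import List, Sequence, Tuple


def cholesky(matrix: Sequence[Sequence[float]]) -> List[List[float]]:
    """Lower-triangular Cholesky factor; assumes a symmetric positive-definite matrix."""
    n = len(matrix)
    lower = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(lower[i][k] * lower[j][k] for k in range(j))
            if i == j:
                lower[i][j] = math.sqrt(matrix[i][i] - s)
            else:
                lower[i][j] = (matrix[i][j] - s) / lower[j][j]
    return lower


def simulated_confidence_band(
    estimate: Sequence[float],
    covariance: Sequence[Sequence[float]],
    alpha: float = 0.05,
    draws: int = 5000,
    seed: int = 0,
) -> List[Tuple[float, float]]:
    """Simultaneous band: estimate +/- c * sd, with c the simulated sup-statistic quantile."""
    rng = random.Random(seed)
    n = len(estimate)
    lower = cholesky(covariance)
    sd = [math.sqrt(covariance[i][i]) for i in range(n)]
    sup_stats = []
    for _ in range(draws):
        z = [rng.gauss(0.0, 1.0) for _ in range(n)]
        # Correlated zero-mean Gaussian draw x = L z, so that Cov(x) = covariance.
        x = [sum(lower[i][k] * z[k] for k in range(i + 1)) for i in range(n)]
        sup_stats.append(max(abs(x[i]) / sd[i] for i in range(n)))
    sup_stats.sort()
    # Empirical (1 - alpha) quantile of the supremum statistic.
    critical = sup_stats[min(draws - 1, math.ceil((1 - alpha) * draws) - 1)]
    return [(estimate[i] - critical * sd[i], estimate[i] + critical * sd[i]) for i in range(n)]


# Hypothetical cumulative hazard estimates on a grid of three time points.
chf = [0.1, 0.3, 0.6]
cov = [
    [0.010, 0.008, 0.006],
    [0.008, 0.020, 0.015],
    [0.006, 0.015, 0.040],
]
for low, high in simulated_confidence_band(chf, cov):
    print(round(low, 3), round(high, 3))
```

Because the critical value comes from the supremum over the whole grid, the band is wider than pointwise intervals built with the usual normal quantile, which is the price of simultaneous coverage.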

preprint, 2022, arXiv

Counterfactual Cycle-Consistent Learning for Instruction Following and Generation in Vision-Language Navigation

Since the rise of vision-language navigation (VLN), great progress has been made in instruction following -- building a follower to navigate environments under the guidance of instructions. However, far less attention has been paid to the inverse task: instruction generation -- learning a speaker to generate grounded descriptions for navigation routes. Existing VLN methods train a speaker independently and often treat it as a data augmentation tool to strengthen the follower while ignoring rich cross-task relations. Here we describe an approach that learns the two tasks simultaneously and exploits their intrinsic correlations to boost the training of each: the follower judges whether the speaker-created instruction explains the original navigation route correctly, and vice versa. Without the need for aligned instruction-path pairs, such a cycle-consistent learning scheme is complementary to task-specific training targets defined on labeled data, and can also be applied over unlabeled paths (sampled without paired instructions). Another agent, called the creator, is added to generate counterfactual environments. It greatly changes current scenes yet leaves novel items -- which are vital for the execution of original instructions -- unchanged. Thus more informative training scenes are synthesized and the three agents compose a powerful VLN learning system. Extensive experiments on a standard benchmark show that our approach improves the performance of various follower models and produces accurate navigation instructions.

preprint, 2021, arXiv

Strong decays of the low-lying doubly bottom baryons

In this work, we adopt the $^3P_0$ model to investigate the strong decays of the low-lying doubly bottom baryons in the $j-j$ coupling scheme systematically. In this scheme, we construct the formalism of the $^3P_0$ model under the spectator assumption, and then the heavy diquark symmetry is preserved automatically. Our results show that some of the $λ$-mode $Ξ_{bb}(1P)$ and $Ω_{bb}(1P)$ states are narrow and have good potential to be observed by future experiments. For the low-lying $ρ$-mode and $ρ$-$λ$ hybrid states, the Okubo-Zweig-Iizuka-allowed strong decays are highly suppressed and they should be extremely narrow. Future experiments can test our phenomenological predictions at the quark level.

preprint, 2021, arXiv

Structured Scene Memory for Vision-Language Navigation

Recently, numerous algorithms have been developed to tackle the problem of vision-language navigation (VLN), i.e., requiring an agent to navigate 3D environments by following linguistic instructions. However, current VLN agents simply store their past experiences/observations as latent states in recurrent networks, failing to capture environment layouts and perform long-term planning. To address these limitations, we propose a crucial architecture, called Structured Scene Memory (SSM). It is compartmentalized enough to accurately memorize the percepts during navigation. It also serves as a structured scene representation, which captures and disentangles visual and geometric cues in the environment. SSM has a collect-read controller that adaptively collects information for supporting current decision making and mimics iterative algorithms for long-range reasoning. As SSM provides a complete action space, i.e., all the navigable places on the map, a frontier-exploration based navigation decision making strategy is introduced to enable efficient and global planning. Experimental results on two VLN datasets (i.e., R2R and R4R) show that our method achieves state-of-the-art performance on several metrics.

preprint, 2020, arXiv

Active Visual Information Gathering for Vision-Language Navigation

Vision-language navigation (VLN) is the task of having an agent carry out navigational instructions inside photo-realistic environments. One of the key challenges in VLN is how to conduct robust navigation by mitigating the uncertainty caused by ambiguous instructions and insufficient observation of the environment. Agents trained by current approaches typically suffer from this and would consequently struggle to avoid random and inefficient actions at every step. In contrast, when humans face such a challenge, they can still maintain robust navigation by actively exploring the surroundings to gather more information and thus make more confident navigation decisions. This work draws inspiration from human navigation behavior and endows an agent with an active information gathering ability for a more intelligent vision-language navigation policy. To achieve this, we propose an end-to-end framework for learning an exploration policy that decides i) when and where to explore, ii) what information is worth gathering during exploration, and iii) how to adjust the navigation decision after the exploration. The experimental results show that promising exploration strategies emerge from training, which leads to a significant boost in navigation performance. On the R2R challenge leaderboard, our agent gets promising results in all three VLN settings, i.e., single run, pre-exploration, and beam search.

preprint, 2020, arXiv

Empirical Likelihood Weighted Estimation of Average Treatment Effects

There has been growing attention on how to effectively and objectively use covariate information when the primary goal is to estimate the average treatment effect (ATE) in randomized clinical trials (RCTs). In this paper, we propose an effective weighting approach to extract covariate information based on the empirical likelihood (EL) method. The resulting two-sample empirical likelihood weighted (ELW) estimator includes two classes of weights, which are obtained from a constrained empirical likelihood estimation procedure, where the covariate information is effectively incorporated into the form of general estimating equations. Furthermore, this ELW approach separates the estimation of ATE from the analysis of the covariate-outcome relationship, which implies that our approach maintains objectivity. In theory, we show that the proposed ELW estimator is semiparametric efficient. We extend our estimator to tackle the scenarios where the outcomes are missing at random (MAR), and prove the double robustness and multiple robustness properties of our estimator. Furthermore, we derive the semiparametric efficiency bound of all regular and asymptotically linear semiparametric ATE estimators under MAR mechanism and prove that our proposed estimator attains this bound. We conduct simulations to make comparisons with other existing estimators, which confirm the efficiency and multiple robustness property of our proposed ELW estimator. An application to the AIDS Clinical Trials Group Protocol 175 (ACTG 175) data is conducted.

preprint, 2020, arXiv

Robust Encoder-Decoder Learning Framework towards Offline Handwritten Mathematical Expression Recognition Based on Multi-Scale Deep Neural Network

Offline handwritten mathematical expression recognition is a challenging task because it poses two main problems. The first is how to correctly recognize different mathematical symbols. The second is how to correctly recognize the two-dimensional structure existing in mathematical expressions. Inspired by recent work in deep learning, a new neural network model that combines a Multi-Scale convolutional neural network (CNN) with an Attention recurrent neural network (RNN) is proposed to identify two-dimensional handwritten mathematical expressions as one-dimensional LaTeX sequences. As a result, the model proposed in the present work has achieved a WER of 25.715% and an ExpRate of 28.216%.

preprint, 2020, arXiv

Strong decays of the newly observed narrow $Ω_b$ structures

Motivated by the newly observed narrow structures $Ω_b(6316)^-$, $Ω_b(6330)^-$, $Ω_b(6340)^-$, and $Ω_b(6350)^-$ in the $Ξ_b^0 K^-$ mass spectrum, we investigate the strong decays of the low-lying $Ω_b$ states within the $^3P_0$ model systematically. According to their masses and decay widths, the observed $Ω_b(6316)^-$, $Ω_b(6330)^-$, $Ω_b(6340)^-$, and $Ω_b(6350)^-$ resonances can be reasonably assigned as the $λ-$mode $Ω_b(1P)$ states with $J^P=1/2^-, 3/2^-, 3/2^-$, and $5/2^-$. Meanwhile, the remaining $P-$wave state with $J^P=1/2^-$ should have a rather broad width, which can hardly be observed by experiments. For the $Ω_b(2S)$ and $Ω_b(1D)$ states, our predictions show that these states have relatively narrow total widths and mainly decay into the $Ξ_b \bar K$, $Ξ_b^\prime \bar K$ and $Ξ_b^{\prime*} \bar K$ final states. These abundant theoretical predictions may be valuable for searching for more excited $Ω_b$ states in future experiments.

preprint, 2020, arXiv

The newly observed $Λ_b(6072)^0$ structure and its $ρ-$mode nonstrange partners

Inspired by the newly observed $Λ_b(6072)$ structure, we investigate its strong decay behaviors under various assignments within the $^3P_0$ model. Compared with the mass and total decay width, our results suggest that the $Λ_b(6072)$ can be regarded as the lowest $ρ-$mode excitation in the $Λ_b$ family. Then, the strong decays of $ρ-$mode nonstrange partners for the $Λ_b(6072)$ are calculated. It is found that the $J^P=5/2^-$ $Λ_b$ and $Λ_c$ states are relatively narrow, and mainly decay into the $Σ_b^{(*)} π$ and $Σ_c^{(*)} π$ final states, respectively. These two states have good potential to be observed in future experiments, which may help us to distinguish the three-quark model and diquark model.

preprint, 2015, arXiv

A Connectivity-Aware Approximation Algorithm for Relay Node Placement in Wireless Sensor Networks

In two-tiered Wireless Sensor Networks (WSNs), relay node placement is one of the key factors impacting the network energy consumption and the system overhead. In this paper, a novel connectivity-aware approximation algorithm for relay node placement in WSNs is proposed to offer a major step forward in saving system overhead. Specifically, a unique Local Search Approximation Algorithm (LSAA) is introduced to solve the Relay Node Single Cover (RNSC) problem. In this proposed LSAA approach, the sensor nodes are allocated into groups and then a local Set Cover (SC) for each group is achieved by a local search algorithm. The union set of all local SCs constitutes an SC of the RNSC problem. The approximation ratio and the time complexity of the LSAA are analyzed by rigorous proof. Additionally, the LSAA approach has been extended to solve the relay node double cover problem. Then, a Relay Location Selection Algorithm (RLSA) is proposed to utilize the resulting SC from LSAA in combining RLSA with the minimum spanning tree heuristic to build the high-tier connectivity. As the RLSA searches for the nearest location to the sink node for each relay node, the high-tier network built by RLSA becomes denser than that built by existing works. As a result, the number of added relay nodes for building the connectivity of the high-tier WSN can be significantly reduced. Simulation results clearly demonstrate that the proposed LSAA outperforms the approaches reported in the literature and the RLSA-based algorithm can noticeably save relay nodes newly deployed for the high-tier connectivity.

preprint, 2015, arXiv

Set Covering-based Approximation Algorithm for Delay Constrained Relay Node Placement in Wireless Sensor Networks

The Delay Constrained Relay Node Placement (DCRNP) problem in Wireless Sensor Networks (WSNs) aims to deploy the minimum number of relay nodes such that for each sensor node there is a path connecting this sensor node to the sink without violating the delay constraint. As WSNs are gradually employed in time-critical applications, the importance of the DCRNP problem becomes noticeable. Because the DCRNP problem is NP-hard, an approximation algorithm, Set-Covering-based Relay Node Placement (SCA), is proposed to solve the DCRNP problem in this paper. The proposed SCA algorithm deploys relay nodes iteratively from the sink to the given sensor nodes in hops, i.e., in the $k$th iteration SCA deploys relay nodes at the locations that are $k$ hops apart from the sink. Specifically, in each iteration, SCA first finds the candidate deployment locations located within 1 hop of the relay nodes and sensor nodes that have already been connected to the sink. Then, a subset of these candidate deployment locations, which can guarantee the existence of paths connecting unconnected sensor nodes to the sink within the delay constraint, is selected to deploy relay nodes based on the set covering method. As the SCA algorithm iterates, the sensor nodes are gradually connected to the sink while satisfying the delay constraint. A detailed analysis of the approximation ratio of the SCA algorithm is given, and we also prove that the SCA is a polynomial time algorithm through rigorous time complexity analysis. To evaluate the performance of the proposed SCA algorithm, extensive simulations are implemented, and the simulation results show that the SCA algorithm can significantly save deployed relay nodes compared with the existing algorithms, i.e., at most 31.48% of deployed relay nodes can be saved by the SCA algorithm.
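
The abstract does not spell out the selection rule used in its set covering step. As a rough illustration only, the sketch below uses the classical greedy set-cover heuristic, which is swapped in here as an assumption and is not claimed to be the authors' SCA. The function name `greedy_set_cover`, the location labels, and the sensor names are all hypothetical, and the delay constraint is assumed to be already encoded in which sensors each candidate location can serve.

```python
from typing import Dict, Hashable, List, Set


def greedy_set_cover(
    universe: Set[Hashable],
    candidates: Dict[Hashable, Set[Hashable]],
) -> List[Hashable]:
    """Repeatedly pick the candidate that covers the most still-uncovered elements."""
    uncovered = set(universe)
    chosen: List[Hashable] = []
    while uncovered:
        best = max(
            candidates,
            key=lambda name: len(candidates[name] & uncovered),
            default=None,
        )
        gained = candidates[best] & uncovered if best is not None else set()
        if not gained:
            raise ValueError("some elements cannot be covered by any candidate")
        chosen.append(best)
        uncovered -= gained
    return chosen


# Hypothetical instance: each candidate relay location lists the sensors it
# could connect (assumed to already respect the delay constraint).
sensors = {"s1", "s2", "s3", "s4", "s5"}
locations = {
    "A": {"s1", "s2", "s3"},
    "B": {"s3", "s4"},
    "C": {"s4", "s5"},
    "D": {"s1"},
}
print(greedy_set_cover(sensors, locations))  # -> ['A', 'C']
```

Location A is picked first because it covers three sensors; C then covers the remaining two, so B and D are never deployed.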

preprint, 2014, arXiv

Generation of a coherent near-infrared Kerr frequency comb in a monolithic microresonator with normal GVD

We demonstrate experimentally, and explain theoretically, generation of a wide, fundamentally phase locked Kerr frequency comb in a nonlinear resonator with a normal group velocity dispersion. A magnesium fluoride whispering gallery resonator characterized by a 10 GHz free spectral range and pumped either at 780 nm or 795 nm is used in the experiment. The envelope of the observed frequency comb differs significantly from the Kerr frequency comb spectra reported previously. We show via numerical simulation that, while the frequency comb does not correspond to generation of short optical pulses, the relative phases of the generated harmonics are fixed.

preprint, 2012, arXiv

Chaotic dynamics of frequency combs generated with continuously pumped nonlinear microresonators

We theoretically and experimentally investigate the chaotic regime of optical frequency combs generated in nonlinear ring microresonators pumped with continuous wave light. We show that the chaotic regime reveals itself, in an apparently counter-intuitive way, by a flat top symmetric envelope of the frequency spectrum, when observed by means of an optical spectrum analyzer. The comb demodulated on a fast photodiode produces a noisy radio frequency signal with a spectral width significantly exceeding the linear bandwidth of the microresonator mode.

preprint, 2012, arXiv

Empirical Likelihood for Right Censored Lifetime Data

This paper considers the empirical likelihood (EL) construction of confidence intervals for a linear functional based on right censored lifetime data. Many of the results in the literature show that log EL has a limiting scaled chi-square distribution, where the scale parameter is a function of the unknown asymptotic variance. The scale parameter has to be estimated for the construction. This additional estimation would reduce the coverage accuracy for the parameter, which diminishes a main advantage of the EL method for censored data. By utilizing certain influence functions in an estimating equation, it is shown that under very general conditions, log EL converges weakly to a standard chi-square distribution, thereby eliminating the need for estimating the scale parameter. Moreover, a special way of employing influence functions eases the otherwise very demanding computations of the EL method. Our approach yields a smaller asymptotic variance of the influence function than the comparable ones considered by Wang and Jing (2001) and Qin and Zhao (2007). Thus it is not surprising that confidence intervals using influence functions give a better coverage accuracy, as demonstrated by simulations.

preprint, 2011, arXiv

Transient Regime of Kerr Frequency Comb Formation

Temporal growth of an optical Kerr frequency comb generated in a microresonator is studied both experimentally and numerically. We find that the comb emerges from vacuum fluctuations of the electromagnetic field on timescales significantly exceeding the ringdown time of the resonator modes. The frequency harmonics of the comb spread starting from the optically pumped mode if the microresonator is characterized by anomalous group velocity dispersion. The harmonics have different growth rates resulting from a sequential four-wave mixing process, which explains the intrinsic modelocking of the comb.
