Researcher profile

Jie Luo

Jie Luo contributes to research discovery and scholarly infrastructure.

Researcher; affiliation not imported; open to collaborate

Trust snapshot

Quick read

Trust 21 (Emerging), Verification L1, Unclaimed author
20 works
0 followers
12 topics
4 close collaborators

Published work

20 published items

preprint, 2026, arXiv

Arbitrary Reflectionless Optical Routing via Non-Hermitian Zero-Index Networks

Optical routers are fundamental to photonic systems, but their performance is often limited by unwanted reflections and constrained functionalities. Existing design strategies generally lack complete control over reflectionless pathways and typically require computationally intensive iterative optimization. A general analytical framework for the inverse design of arbitrary reflectionless routing has remained unavailable. Here, we present an analytical inverse-design approach based on non-Hermitian zero-index networks, which enables arbitrary reflectionless routing for nearly any desired scattering response. By establishing a direct algebraic mapping between target scattering responses and the network's physical parameters, we transform the design process from iterative optimization into deterministic calculation. This approach enables the precise engineering of arbitrary reflectionless optical routing. We demonstrate its broad utility by designing devices from unicast and multicast routers with full amplitude and phase control to coherent beam combiners and spatial mode demultiplexers in four-port and six-port networks. Our work provides a systematic and analytical route to designing advanced light-control devices.

preprint, 2026, arXiv

Enjoy Your Layer Normalization with the Computational Efficiency of RMSNorm

Layer normalization (LN) is a fundamental component in modern deep learning, but its per-sample centering and scaling introduce non-negligible inference overhead. RMSNorm improves efficiency by removing the centering operation, yet this may discard benefits associated with centering. This paper proposes a framework to determine whether an LN in an arbitrary DNN can be replaced by RMSNorm without changing the model function. The key idea is to fold LN's centering operation into upstream general linear layers by enforcing zero-mean outputs through the column-centered constraint (CCC) and column-based weight centering (CBWC). We extend the analysis to arbitrary DNNs, define such LNs as foldable LNs, and develop a graph-based detection algorithm. Our analysis shows that many LNs in widely used architectures are foldable, enabling exact inference-time conversion and end-to-end acceleration of 2% to 12% without changing model predictions. Experiments across multiple task families further show that, when exact equivalence is partially broken in practical training settings, our method remains competitive with vanilla LN while improving efficiency.
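The folding idea in this abstract can be illustrated with a minimal sketch. It assumes a single linear layer feeding an LN without learnable gain or bias, and it assumes that "centering" means subtracting, for each input feature, the mean weight across output units (the paper's exact CCC/CBWC definitions and matrix orientation may differ). The helper `center_over_outputs` is hypothetical and not taken from the paper.

```python
import math
import random


def layer_norm(x, eps=1e-6):
    """LN without learnable gain/bias: center, then scale by the standard deviation."""
    mean = sum(x) / len(x)
    var = sum((v - mean) ** 2 for v in x) / len(x)
    return [(v - mean) / math.sqrt(var + eps) for v in x]


def rms_norm(x, eps=1e-6):
    """RMSNorm without learnable gain: scale by the root mean square, no centering."""
    mean_sq = sum(v * v for v in x) / len(x)
    return [v / math.sqrt(mean_sq + eps) for v in x]


def linear(W, b, h):
    """Compute W @ h + b for W of shape (out_dim, in_dim)."""
    return [sum(w * v for w, v in zip(row, h)) + bi for row, bi in zip(W, b)]


def center_over_outputs(W, b):
    """Hypothetical helper: force zero-mean layer outputs by subtracting, for each
    input feature, the mean weight across output units, and by centering the bias."""
    out_dim, in_dim = len(W), len(W[0])
    col_means = [sum(W[i][j] for i in range(out_dim)) / out_dim for j in range(in_dim)]
    b_mean = sum(b) / out_dim
    Wc = [[W[i][j] - col_means[j] for j in range(in_dim)] for i in range(out_dim)]
    bc = [bi - b_mean for bi in b]
    return Wc, bc


rng = random.Random(0)
W = [[rng.gauss(0, 1) for _ in range(5)] for _ in range(8)]
b = [rng.gauss(0, 1) for _ in range(8)]
h = [rng.gauss(0, 1) for _ in range(5)]

# Original model: linear layer followed by LN.
reference = layer_norm(linear(W, b, h))

# Folded model: centered linear layer followed by RMSNorm.
Wc, bc = center_over_outputs(W, b)
folded = rms_norm(linear(Wc, bc, h))

max_diff = max(abs(p - q) for p, q in zip(reference, folded))
```

Because the centered layer already outputs a zero-mean vector, its mean square equals the variance LN would compute, so the two pipelines agree up to floating-point rounding in this toy setting.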

preprint, 2025, arXiv

Time-Aware Adaptive Side Information Fusion for Sequential Recommendation

Incorporating item-side information, such as category and brand, into sequential recommendation is a well-established and effective approach for improving performance. However, despite significant advancements, current models are generally limited by three key challenges: they often overlook the fine-grained temporal dynamics inherent in timestamps, exhibit vulnerability to noise in user interaction sequences, and rely on computationally expensive fusion architectures. To systematically address these challenges, we propose the Time-Aware Adaptive Side Information Fusion (TASIF) framework. TASIF integrates three synergistic components: (1) a simple, plug-and-play time span partitioning mechanism to capture global temporal patterns; (2) an adaptive frequency filter that leverages a learnable gate to denoise feature sequences adaptively, thereby providing higher-quality inputs for subsequent fusion modules; and (3) an efficient adaptive side information fusion layer that employs a "guide-not-mix" architecture, where attributes guide the attention mechanism without being mixed into the content-representing item embeddings, ensuring deep interaction while preserving computational efficiency. Extensive experiments on four public datasets demonstrate that TASIF significantly outperforms state-of-the-art baselines while maintaining excellent efficiency in training. Our source code is available at https://github.com/jluo00/TASIF.

preprint, 2023, arXiv

ESP: Exploiting Symmetry Prior for Multi-Agent Reinforcement Learning

Multi-agent reinforcement learning (MARL) has achieved promising results in recent years. However, most existing reinforcement learning methods require a large amount of data for model training. In addition, data-efficient reinforcement learning requires the construction of strong inductive biases, which are ignored in the current MARL approaches. Inspired by the symmetry phenomenon in multi-agent systems, this paper proposes a framework for exploiting prior knowledge by integrating data augmentation and a well-designed consistency loss into the existing MARL methods. In addition, the proposed framework is model-agnostic and can be applied to most of the current MARL algorithms. Experimental tests on multiple challenging tasks demonstrate the effectiveness of the proposed framework. Moreover, the proposed framework is applied to a physical multi-robot testbed to show its superiority.

preprint, 2022, arXiv

All-Around Real Label Supervision: Cyclic Prototype Consistency Learning for Semi-supervised Medical Image Segmentation

Semi-supervised learning has substantially advanced medical image segmentation since it alleviates the heavy burden of acquiring the costly expert-examined annotations. In particular, consistency-based approaches have attracted more attention for their superior performance, wherein the real labels are only utilized to supervise their paired images via supervised loss while the unlabeled images are exploited by enforcing the perturbation-based "unsupervised" consistency without explicit guidance from those real labels. However, intuitively, the expert-examined real labels contain more reliable supervision signals. Observing this, we ask an unexplored but interesting question: can we exploit the unlabeled data via explicit real label supervision for semi-supervised training? To this end, we discard the previous perturbation-based consistency but absorb the essence of non-parametric prototype learning. Based on the prototypical network, we then propose a novel cyclic prototype consistency learning (CPCL) framework, which is constructed by a labeled-to-unlabeled (L2U) prototypical forward process and an unlabeled-to-labeled (U2L) backward process. These two processes synergistically enhance the segmentation network by encouraging more discriminative and compact features. In this way, our framework turns previous "unsupervised" consistency into new "supervised" consistency, obtaining the "all-around real label supervision" property of our method. Extensive experiments on brain tumor segmentation from MRI and kidney segmentation from CT images show that our CPCL can effectively exploit the unlabeled data and outperform other state-of-the-art semi-supervised medical image segmentation methods.

preprint, 2022, arXiv

Delving into the Estimation Shift of Batch Normalization in a Network

Batch normalization (BN) is a milestone technique in deep learning. It normalizes the activation using mini-batch statistics during training but the estimated population statistics during inference. This paper focuses on investigating the estimation of population statistics. We define the estimation shift magnitude of BN to quantitatively measure the difference between its estimated population statistics and expected ones. Our primary observation is that the estimation shift can accumulate due to the stacking of BN layers in a network, which has detrimental effects on test performance. We further find that a batch-free normalization (BFN) can block such an accumulation of estimation shift. These observations motivate our design of XBNBlock, which replaces one BN with BFN in the bottleneck block of residual-style networks. Experiments on the ImageNet and COCO benchmarks show that XBNBlock consistently improves the performance of different architectures, including ResNet and ResNeXt, by a significant margin and seems to be more robust to distribution shift.

preprint, 2022, arXiv

Double-Uncertainty Guided Spatial and Temporal Consistency Regularization Weighting for Learning-based Abdominal Registration

To tackle the difficulty associated with the ill-posed nature of the image registration problem, regularization is often used to constrain the solution space. For most learning-based registration approaches, the regularization usually has a fixed weight and only constrains the spatial transformation. This convention has two limitations: (i) besides the laborious grid search for the optimal fixed weight, the regularization strength of a specific image pair should be associated with the content of the images, so the "one value fits all" training scheme is not ideal; (ii) only spatially regularizing the transformation may neglect some informative clues related to the ill-posedness. In this study, we propose a mean-teacher based registration framework, which incorporates an additional temporal consistency regularization term by encouraging the teacher model's prediction to be consistent with that of the student model. More importantly, instead of searching for a fixed weight, the teacher enables automatically adjusting the weights of the spatial regularization and the temporal consistency regularization by taking advantage of the transformation uncertainty and appearance uncertainty. Extensive experiments on the challenging abdominal CT-MRI registration show that our training strategy can promisingly advance the original learning-based method in terms of efficient hyperparameter tuning and a better tradeoff between accuracy and smoothness.

preprint, 2022, arXiv

Environment induced emergence of collective behaviour in evolving swarms with limited sensing

Designing controllers for robot swarms is challenging because human developers typically have no good understanding of the link between the details of a controller that governs individual robots and the swarm behavior that is an indirect result of the interactions between swarm members and the environment. In this paper we investigate whether an evolutionary approach can mitigate this problem. We consider a very challenging task where robots with limited sensing and communication abilities must follow the gradient of an environmental feature and use Differential Evolution to evolve a neural network controller for simulated robots. We conduct a systematic study to measure the flexibility and scalability of the method by varying the size of the arena and number of robots in the swarm. The experiments confirm the feasibility of our approach: the evolved robot controllers induced swarm behavior that solved the task. We found that solutions evolved under the harshest conditions (where the environmental clues were the weakest) were the most flexible and that there is a sweet spot regarding the swarm size. Furthermore, we observed collective motion of the swarm, showcasing truly emergent behavior that was neither represented in nor selected for during evolution.

preprint, 2022, arXiv

Trust It or Not: Confidence-Guided Automatic Radiology Report Generation

Medical imaging plays a pivotal role in diagnosis and treatment in clinical practice. Inspired by the significant progress in automatic image captioning, various deep learning (DL)-based methods have been proposed to generate radiology reports for medical images. Despite promising results, previous works overlook the uncertainties of their models and are thus unable to provide clinicians with the reliability/confidence of the generated radiology reports to assist their decision-making. In this paper, we propose a novel method to explicitly quantify both the visual uncertainty and the textual uncertainty for DL-based radiology report generation. Such multi-modal uncertainties can sufficiently capture the model confidence degree at both the report level and the sentence level, and thus they are further leveraged to weight the losses for more comprehensive model optimization. Experimental results have demonstrated that the proposed method for model uncertainty characterization and estimation can produce more reliable confidence scores for radiology report generation, and the modified loss function, which takes into account the uncertainties, leads to better model performance on two public radiology report datasets. In addition, the quality of the automatically generated reports was manually evaluated by human raters and the results also indicate that the proposed uncertainties can reflect the variance of clinical diagnosis.

preprint, 2021, arXiv

Accurate Mode-Coupling Characterization of Low-Crosstalk Ring-Core Fibers using Integral Calculation based Swept-Wavelength Interferometry Measurement

In this paper, to accurately characterize the low inter-mode coupling of the weakly-coupled few mode fibers (FMFs), we propose a modified inter-mode coupling characterization method based on swept-wavelength interferometry measurement, in which an integral calculation approach is used to eliminate significant sources of error that may lead to underestimation of the power coupling coefficient. Using the proposed characterization method, a low-crosstalk ring-core fiber (RCF) with low mode dependent loss (MDL) and with single span length up to 100 km is experimentally measured to have low power coupling coefficients between high-order orbital angular momentum (OAM) mode groups of below -30 dB/km over C band. The measured low coupling coefficients based on the proposed method are verified by the direct system power measurements, proving the feasibility and reliability of the proposed inter-mode coupling characterization method.

preprint, 2021, arXiv

Arm locking using laser frequency comb

In this work, we describe an updated version of single arm locking in which the noise amplification due to the nulls can be flexibly restricted with the help of an optical frequency comb. We show that the laser phase noise can be divided by a specific factor, with the optical frequency comb as the bridge. The analytical results indicate that the peaks in the science band are greatly reduced. The noise-suppression performance shows that the total noise after arm locking can satisfy the requirement of time delay interferometry, even with a free-running laser source. We also estimate the frequency pulling characteristics of the updated single arm locking, and the results suggest that the pulling rate can be tolerated without the risk of mode hopping. Arm locking will be a valuable solution for noise reduction in space-borne gravitational-wave (GW) detectors. We demonstrate that, with precise control of the returned laser phase noise, the noise amplification in the science band can be efficiently suppressed based on the updated single arm locking. Our method not only allows suppression of the peaks, high gain, and a low pulling rate, but can also operate for a full year without the potential risk of locking failure due to arm length mismatch. We finally discuss the unified demonstration of the updated single arm locking, where both the local and the returned laser phase noises can be tuned to generate the expected arm-locking sensor. Our work could provide a powerful method for arm locking in future space-borne GW detectors.

preprint, 2021, arXiv

Ultra-broadband reflectionless Brewster absorber protected by reciprocity

Brewster's law predicts zero reflection of p-polarized light on a dielectric surface at a particular angle. However, when loss is introduced into the permittivity of the dielectric, the Brewster condition breaks down and reflection unavoidably appears. In this work, we found an exception to this long-standing dilemma by creating a class of nonmagnetic anisotropic metamaterials, where an anomalous Brewster effect with independently tunable absorption and refraction emerges. This loss-independent Brewster effect is bestowed by the extra degrees of freedom introduced by anisotropy and strictly protected by the reciprocity principle. The bandwidth can cover an extremely wide spectrum from dc to optical frequencies. Two examples of reflectionless Brewster absorbers with different Brewster angles are both demonstrated to achieve large absorbance in a wide spectrum via microwave experiments. Our work extends the scope of the Brewster effect to the horizon of nonmagnetic absorptive materials, which promises an unprecedented wide bandwidth for reflectionless absorption with high efficiency.
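The conventional behavior described at the start of this abstract can be checked numerically. The sketch below uses the textbook Fresnel coefficient for p-polarization at an interface between two isotropic, nonmagnetic media; it does not model the anisotropic metamaterial from the paper. The permittivity values are illustrative assumptions, and a positive imaginary part is taken to represent loss (an exp(-iωt) time convention).

```python
import cmath
import math


def reflection_p(eps1, eps2, theta):
    """Fresnel reflection coefficient for p-polarization at a planar interface
    between two isotropic nonmagnetic media, incident from medium 1 at angle theta."""
    kx = cmath.sqrt(eps1) * math.sin(theta)   # tangential wavevector, with k0 = 1
    k1z = cmath.sqrt(eps1 - kx * kx)          # normal wavevector in medium 1
    k2z = cmath.sqrt(eps2 - kx * kx)          # normal wavevector in medium 2
    return (eps2 * k1z - eps1 * k2z) / (eps2 * k1z + eps1 * k2z)


eps_air = 1.0
eps_lossless = 4.0          # illustrative lossless dielectric
eps_lossy = 4.0 + 1.0j      # same real part, with loss added (assumed value)

# Brewster angle of the lossless interface: tan(theta_B) = sqrt(eps2 / eps1).
theta_b = math.atan(math.sqrt(eps_lossless / eps_air))

r_lossless = reflection_p(eps_air, eps_lossless, theta_b)
r_lossy = reflection_p(eps_air, eps_lossy, theta_b)
```

In this isotropic setting, `abs(r_lossless)` vanishes up to rounding while `abs(r_lossy)` stays clearly nonzero at the same angle, which is the breakdown the abstract refers to before introducing its anisotropic design.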

preprint, 2020, arXiv

Adversarial Uni- and Multi-modal Stream Networks for Multimodal Image Registration

Deformable image registration between Computed Tomography (CT) images and Magnetic Resonance (MR) imaging is essential for many image-guided therapies. In this paper, we propose a novel translation-based unsupervised deformable image registration method. Distinct from other translation-based methods that attempt to convert the multimodal problem (e.g., CT-to-MR) into a unimodal problem (e.g., MR-to-MR) via image-to-image translation, our method leverages the deformation fields estimated from both: (i) the translated MR image and (ii) the original CT image in a dual-stream fashion, and automatically learns how to fuse them to achieve better registration performance. The multimodal registration network can be effectively trained by computationally efficient similarity metrics without any ground-truth deformation. Our method has been evaluated on two clinical datasets and demonstrates promising results compared to state-of-the-art traditional and learning-based methods.

preprint, 2020, arXiv

An Investigation of Feature-based Nonrigid Image Registration using Gaussian Process

For a wide range of clinical applications, such as adaptive treatment planning or intraoperative image update, feature-based deformable registration (FDR) approaches are widely employed because of their simplicity and low computational complexity. FDR algorithms estimate a dense displacement field by interpolating a sparse field, which is given by the established correspondence between selected features. In this paper, we consider the deformation field as a Gaussian Process (GP), whereas the selected features are regarded as prior information on the valid deformations. Using GP, we are able to estimate both the dense displacement field and a corresponding uncertainty map at once. Furthermore, we evaluated the performance of different hyperparameter settings for squared exponential kernels with synthetic, phantom and clinical data respectively. The quantitative comparison shows that GP-based interpolation has performance on par with state-of-the-art B-spline interpolation. The greatest clinical benefit of GP-based interpolation is that it gives a reliable estimate of the mathematical uncertainty of the calculated dense displacement map.
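A minimal one-dimensional sketch of GP-based interpolation with a squared exponential kernel is given below. It uses standard GP regression formulas rather than the authors' implementation; the feature positions, displacements, kernel hyperparameters, and the function names `sq_exp`, `solve`, and `gp_predict` are all illustrative assumptions.

```python
import math


def sq_exp(a, b, length=1.0, sigma=1.0):
    """Squared exponential kernel between two scalar positions."""
    return sigma ** 2 * math.exp(-0.5 * ((a - b) / length) ** 2)


def solve(A, y):
    """Solve A x = y by Gaussian elimination with partial pivoting."""
    n = len(A)
    M = [row[:] + [yi] for row, yi in zip(A, y)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= f * M[c][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(M[r][k] * x[k] for k in range(r + 1, n))
        x[r] = (M[r][n] - tail) / M[r][r]
    return x


def gp_predict(xs, ds, query, length=1.0, sigma=1.0, noise=1e-6):
    """Posterior mean and variance of the displacement at `query`, given sparse
    feature positions `xs` with observed displacements `ds` (zero-mean prior)."""
    K = [[sq_exp(a, b, length, sigma) + (noise if i == j else 0.0)
          for j, b in enumerate(xs)] for i, a in enumerate(xs)]
    k_star = [sq_exp(query, a, length, sigma) for a in xs]
    mean = sum(k * w for k, w in zip(k_star, solve(K, ds)))
    reduction = sum(k * w for k, w in zip(k_star, solve(K, k_star)))
    var = sq_exp(query, query, length, sigma) - reduction
    return mean, max(var, 0.0)  # clamp tiny negative values from rounding


# Hypothetical sparse correspondences: positions and their displacements.
xs = [0.0, 1.0, 3.0]
ds = [0.2, -0.1, 0.4]

mean_at_feature, var_at_feature = gp_predict(xs, ds, 1.0)
mean_between, var_between = gp_predict(xs, ds, 2.0)
mean_far, var_far = gp_predict(xs, ds, 10.0)
```

The posterior variance is near zero at a matched feature, grows between features, and approaches the prior variance far from any feature, which is the kind of uncertainty map the abstract describes alongside the dense displacement estimate.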

preprint, 2020, arXiv

Are Registration Uncertainty and Error Monotonically Associated

In image-guided neurosurgery, current commercial systems usually provide only rigid registration, partly because it is harder to predict, validate and understand non-rigid registration error. For instance, when surgeons see a discrepancy in aligned image features, they may not be able to distinguish between registration error and actual tissue deformation caused by tumor resection. In this case, the spatial distribution of registration error could help them make more informed decisions, e.g., ignoring the registration where the estimated error is high. However, error estimates are difficult to acquire. Probabilistic image registration (PIR) methods provide measures of registration uncertainty, which could be a surrogate for assessing the registration error. It is intuitive and believed by many clinicians that high uncertainty indicates a large error. However, the monotonic association between uncertainty and error has not been examined in image registration literature. In this pilot study, we attempt to address this fundamental problem by looking at one PIR method, the Gaussian process (GP) registration. We systematically investigate the relation between GP uncertainty and error based on clinical data and show empirically that there is a weak-to-moderate positive monotonic correlation between point-wise GP registration uncertainty and non-rigid registration error.
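A monotonic association of this kind is commonly quantified with Spearman's rank correlation; whether the authors used exactly this statistic is an assumption here. The sketch below computes it from scratch on made-up point-wise values, which are not data from the study.

```python
import math


def average_ranks(values):
    """1-based ranks, with tied values sharing the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(a, b):
    """Pearson correlation; assumes neither sequence is constant."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    var_a = sum((x - ma) ** 2 for x in a)
    var_b = sum((y - mb) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)


def spearman(a, b):
    """Spearman's rho: Pearson correlation of the rank-transformed data."""
    return pearson(average_ranks(a), average_ranks(b))


# Hypothetical point-wise values, for illustration only.
uncertainty = [0.1, 0.4, 0.2, 0.8, 0.5]
error = [1.0, 1.2, 2.0, 3.1, 1.5]

rho = spearman(uncertainty, error)
```

A value of rho close to 1 would mean that ranking points by uncertainty nearly reproduces their ranking by error, while values well below 1 indicate that uncertainty is only a rough surrogate for error.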

preprint, 2020, arXiv

Do Public Datasets Assure Unbiased Comparisons for Registration Evaluation?

With the increasing availability of new image registration approaches, an unbiased evaluation is becoming more needed so that clinicians can choose the most suitable approaches for their applications. Current evaluations typically use landmarks in manually annotated datasets. As a result, the quality of annotations is crucial for unbiased comparisons. Even though most data providers claim to have quality control over their datasets, an objective third-party screening can be reassuring for intended users. In this study, we use the variogram to screen the manually annotated landmarks in two datasets used to benchmark registration in image-guided neurosurgeries. The variogram provides an intuitive 2D representation of the spatial characteristics of annotated landmarks. Using variograms, we identified potentially problematic cases and had them examined by experienced radiologists. We found that (1) a small number of annotations may have fiducial localization errors; (2) the landmark distribution for some cases is not ideal to offer fair comparisons. If unresolved, both findings could incur bias in registration evaluation.

preprint, 2020, arXiv

On the Applicability of Registration Uncertainty

Estimating the uncertainty in (probabilistic) image registration enables, e.g., surgeons to assess the operative risk based on the trustworthiness of the registered image data. If surgeons receive inaccurately calculated registration uncertainty and place unwarranted confidence in the alignment solutions, severe consequences may result. For probabilistic image registration (PIR), the predominant way to quantify the registration uncertainty is using summary statistics of the distribution of transformation parameters. The majority of existing research focuses on trying out different summary statistics as well as means to exploit them. Distinctively, in this paper, we study two rarely examined topics: (1) whether those summary statistics of the transformation distribution most informatively represent the registration uncertainty; (2) whether utilizing the registration uncertainty is always beneficial. We show that there are two types of uncertainties: the transformation uncertainty, Ut, and the label uncertainty, Ul. The conventional way of using Ut to quantify Ul is inappropriate and can be misleading. By a real data experiment, we also share a potentially critical finding that making use of the registration uncertainty may not always be an improvement.

preprint, 2020, arXiv

Unbiased Scene Graph Generation via Rich and Fair Semantic Extraction

Extracting graph representations of visual scenes in images is a challenging task in computer vision. Although there has been encouraging progress in scene graph generation in the past decade, we surprisingly find that the performance of existing approaches is largely limited by strong biases, which mainly stem from (1) unconsciously assuming relations with certain semantic properties such as symmetry and (2) imbalanced annotations over different relations. To alleviate the negative effects of these biases, we propose a new and simple architecture named the Rich and Fair semantic extraction network (RiFa for short), which not only captures rich semantic properties of the relations but also fairly predicts relations with different scales of annotations. Using pseudo-siamese networks, RiFa embeds the subject and object separately to distinguish their semantic differences while preserving their underlying semantic properties. Then, it further predicts subject-object relations based on both the visual and semantic features of entities under a certain contextual area, and fairly ranks the relation predictions for those with few annotations. Experiments on the popular Visual Genome dataset show that RiFa achieves state-of-the-art performance under several challenging settings of the scene graph task. In particular, it performs significantly better on capturing different semantic properties of relations, and obtains the best overall per-relation performance.

preprint, 2019, arXiv

Three-dimensional acoustic double-zero-index medium with a Dirac-like point

We report a design and experimental realization of a three-dimensional (3D) acoustic double-zero-index medium (DZIM), whose effective mass density and compressibility are nearly zero simultaneously. The DZIM is constructed from a cubic lattice of three orthogonally-aligned metal rods in air. The combination of lattice symmetry and accidental degeneracy yields a four-fold degenerate point with conical dispersion at the Brillouin zone center, where the material becomes a 3D DZIM. Though occupying a finite volume, the 3D DZIM maintains the wave properties of a "void space," and enables rich applications. For demonstration, we fabricate an acoustic "periscope" by placing the designed 3D DZIM inside a 3D bending waveguide, and observe the unusual wave tunneling effect through this waveguide with undisturbed planar wavefront. Our findings establish a practical route to realize 3D DZIM as an effective acoustic "void space," which offers unprecedented opportunities for advanced sound manipulation.

preprint, 2019, arXiv

Two-Dimensional Optomechanical Crystal Cavity with High Quantum Cooperativity

Optomechanical systems offer new opportunities in quantum information processing and quantum sensing. Many solid-state quantum devices operate at millikelvin temperatures -- however, it has proven challenging to operate nanoscale optomechanical devices at these ultralow temperatures due to their limited thermal conductance and parasitic optical absorption. Here, we demonstrate a two-dimensional optomechanical crystal resonator capable of achieving large cooperativity $C$ and small effective bath occupancy $n_b$, resulting in a quantum cooperativity $C_{\text{eff}}\equiv C/n_b \approx 1.3 > 1$ under continuous-wave optical driving. This is realized using a two-dimensional phononic bandgap structure to host the optomechanical cavity, simultaneously isolating the acoustic mode of interest in the bandgap while allowing heat to be removed by phonon modes outside of the bandgap. This achievement paves the way for a variety of applications requiring quantum-coherent optomechanical interactions, such as transducers capable of bi-directional conversion of quantum states between microwave frequency superconducting quantum circuits and optical photons in a fiber optic network.