Source author record

Lei Sun

Lei Sun appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher (unclaimed source record)

Catalog footprint

What is connected

45 works

20 topics

4 close collaborators

preprint, 2026, arXiv

DeformMaster: An Interactive Physics-Neural World Model for Deformable Objects from Videos

World models for deformable objects should recover not only geometry and appearance, but also underlying physical dynamics, interaction grounding, and material behavior. Learning such a model from real videos is challenging because deformable linear, planar, and volumetric objects evolve under high-dimensional deformation, noisy interactions, and complex material response. The model must therefore infer a physical state from visual observations, roll it forward under new interactions, and render the resulting dynamics with high visual fidelity. We present DeformMaster, a video-derived interactive physics-neural world model that turns real interaction videos into an online interactive model of deformable objects within a unified dynamics-and-appearance framework. DeformMaster preserves structured physical rollout while using a neural residual to compensate for unmodeled effects, grounds sparse hand motion as a distributed compliant actuator for hand-continuum interaction, represents material response with spatially varying constitutive experts, and drives high-fidelity 4D appearance from the predicted physical evolution. Experiments on real-world deformable-object sequences demonstrate DeformMaster's ability to roll out future dynamics and render dynamic appearance, outperforming state-of-the-art baselines while supporting novel action rollout, material-parameter variation, and dynamic novel-view synthesis.

preprint, 2025, arXiv

Real-world Reinforcement Learning from Suboptimal Interventions

Real-world reinforcement learning (RL) offers a promising approach to training precise and dexterous robotic manipulation policies in an online manner, enabling robots to learn from their own experience while gradually reducing human labor. However, prior real-world RL methods often assume that human interventions are optimal across the entire state space, overlooking the fact that even expert operators cannot consistently provide optimal actions in all states or completely avoid mistakes. Indiscriminately mixing intervention data with robot-collected data inherits the sample inefficiency of RL, while purely imitating intervention data can ultimately degrade the final performance achievable by RL. The question of how to leverage potentially suboptimal and noisy human interventions to accelerate learning without being constrained by them thus remains open. To address this challenge, we propose SiLRI, a state-wise Lagrangian reinforcement learning algorithm for real-world robot manipulation tasks. Specifically, we formulate the online manipulation problem as a constrained RL optimization, where the constraint bound at each state is determined by the uncertainty of human interventions. We then introduce a state-wise Lagrange multiplier and solve the problem via a min-max optimization, jointly optimizing the policy and the Lagrange multiplier to reach a saddle point. Built upon a human-as-copilot teleoperation system, our algorithm is evaluated through real-world experiments on diverse manipulation tasks. Experimental results show that SiLRI effectively exploits human suboptimal interventions, reducing the time required to reach a 90% success rate by at least 50% compared with the state-of-the-art RL method HIL-SERL, and achieving a 100% success rate on long-horizon manipulation tasks where other RL methods struggle to succeed. Project website: https://silri-rl.github.io/.

preprint, 2023, arXiv

Event-Based Fusion for Motion Deblurring with Cross-modal Attention

Traditional frame-based cameras inevitably suffer from motion blur due to long exposure times. As a kind of bio-inspired camera, the event camera records the intensity changes in an asynchronous way with high temporal resolution, providing valid image degradation information within the exposure time. In this paper, we rethink the event-based image deblurring problem and unfold it into an end-to-end two-stage image restoration network. To effectively fuse event and image features, we design an event-image cross-modal attention module applied at multiple levels of our network, which allows the network to focus on relevant features from the event branch and filter out noise. We also introduce a novel symmetric cumulative event representation specifically for image deblurring as well as an event mask gated connection between the two stages of our network which helps avoid information loss. At the dataset level, to foster event-based motion deblurring and to facilitate evaluation on challenging real-world images, we introduce the Real Event Blur (REBlur) dataset, captured with an event camera in an illumination-controlled optical laboratory. Our Event Fusion Network (EFNet) sets the new state of the art in motion deblurring, surpassing both the prior best-performing image-based method and all event-based methods with public implementations on the GoPro dataset (by up to 2.47dB) and on our REBlur dataset, even in extreme blurry conditions. The code and our REBlur dataset will be made publicly available.

preprint, 2022, arXiv

Additional evidence for a pulsar wind nebula in the heart of SN 1987A from multi-epoch X-ray data and MHD modeling

Since the day of its explosion, supernova (SN) 1987A has been closely monitored to study its evolution and to detect its central compact relic. In fact, the formation of a neutron star is strongly supported by the detection of neutrinos from the SN. However, besides the detection in the Atacama Large Millimeter/submillimeter Array (ALMA) data of a feature that is compatible with the emission arising from a proto-pulsar wind nebula (PWN), the only hint for the existence of such an elusive compact object is provided by the detection of hard emission in NuSTAR data up to ~20 keV. We report on the simultaneous analysis of multi-epoch observations of SN 1987A performed with Chandra, XMM-Newton and NuSTAR. We also compare the observations with a state-of-the-art 3D magnetohydrodynamic (MHD) simulation of SN 1987A. A heavily absorbed power-law, consistent with the emission from a PWN embedded in the heart of SN 1987A, is needed to properly describe the high-energy part of the observed spectra. The spectral parameters of the best-fit power-law are in agreement with the previous estimate, and exclude diffusive shock acceleration as a possible mechanism responsible for the observed non-thermal emission. The information extracted from our analysis is used to infer the physical characteristics of the pulsar and the broad-band emission of its nebula, in agreement with the ALMA data. Analysis of the synthetic spectra also shows that, in the near future, the main contribution to the Fe K emission line will originate in the outermost shocked ejecta of SN 1987A.

preprint, 2022, arXiv

Annular Computational Imaging: Capture Clear Panoramic Images through Simple Lens

A Panoramic Annular Lens (PAL) composed of few lenses has great potential in panoramic surrounding sensing tasks for mobile and wearable devices because of its tiny size and large Field of View (FoV). However, the image quality of a tiny-volume PAL is confined to the optical limit due to the lack of lenses for aberration correction. In this paper, we propose an Annular Computational Imaging (ACI) framework to break the optical limit of light-weight PAL design. To facilitate learning-based image restoration, we introduce a wave-based simulation pipeline for panoramic imaging and tackle the synthetic-to-real gap through multiple data distributions. The proposed pipeline can be easily adapted to any PAL with design parameters and is suitable for loose-tolerance designs. Furthermore, we design the Physics Informed Image Restoration Network (PI2RNet) considering the physical priors of panoramic imaging and a single-pass physics-informed engine. At the dataset level, we create the DIVPano dataset, and extensive experiments on it illustrate that our proposed network sets the new state of the art in panoramic image restoration under spatially-variant degradation. In addition, the evaluation of the proposed ACI on a simple PAL with only 3 spherical lenses reveals the delicate balance between high-quality panoramic imaging and compact design. To the best of our knowledge, we are the first to explore Computational Imaging (CI) in PAL. Code and datasets are publicly available at https://github.com/zju-jiangqi/ACI-PI2RNet.

preprint, 2022, arXiv

CE-based white-box adversarial attacks will not work using super-fitting

Deep neural networks are widely used in various fields because of their powerful performance. However, recent studies have shown that deep learning models are vulnerable to adversarial attacks, i.e., adding a slight perturbation to the input will make the model obtain wrong results. This is especially dangerous for some systems with high-security requirements, so this paper proposes a new defense method by using the model super-fitting state to improve the model's adversarial robustness (i.e., the accuracy under adversarial attacks). This paper mathematically proves the effectiveness of super-fitting and enables the model to reach this state quickly by minimizing unrelated category scores (MUCS). Theoretically, super-fitting can resist any existing (even future) CE-based white-box adversarial attacks. In addition, this paper uses a variety of powerful attack algorithms to evaluate the adversarial robustness of super-fitting, and the proposed method is compared with nearly 50 defense models from recent conferences. The experimental results show that the super-fitting method in this paper can make the trained model obtain the highest adversarial robustness.

preprint, 2022, arXiv

Efficient Human Pose Estimation via 3D Event Point Cloud

Human Pose Estimation (HPE) based on RGB images has experienced rapid development benefiting from deep learning. However, event-based HPE has not been fully studied, although it holds great potential for applications in extreme scenes and efficiency-critical conditions. In this paper, we are the first to estimate 2D human pose directly from a 3D event point cloud. We propose a novel representation of events, the rasterized event point cloud, aggregating events on the same position of a small time slice. It maintains the 3D features from multiple statistical cues and significantly reduces memory consumption and computation complexity, which proves efficient in our work. We then leverage the rasterized event point cloud as input to three different backbones, PointNet, DGCNN, and Point Transformer, with two linear layer decoders to predict the location of human keypoints. We find that based on our method, PointNet achieves promising results with much faster speed, whereas Point Transformer reaches much higher accuracy, even close to previous event-frame-based methods. A comprehensive set of results demonstrates that our proposed method is consistently effective for these 3D backbone models in event-driven human pose estimation. Our method based on PointNet with 2048 points input achieves 82.46mm in MPJPE3D on the DHP19 dataset, with a latency of only 12.29ms on an NVIDIA Jetson Xavier NX edge computing platform, which is ideally suited to real-time detection with event cameras. Code is available at https://github.com/MasterHow/EventPointPose.

preprint, 2022, arXiv

FaceFormer: Scale-aware Blind Face Restoration with Transformers

Blind face restoration usually encounters face inputs of diverse scales, especially in the real world. However, most current works support only faces of a specific scale, which limits their applicability in real-world scenarios. In this work, we propose a novel scale-aware blind face restoration framework, named FaceFormer, which formulates facial feature restoration as scale-aware transformation. The proposed Facial Feature Up-sampling (FFUP) module dynamically generates upsampling filters based on the original scale-factor priors, which helps our network adapt to arbitrary face scales. Moreover, we propose the facial feature embedding (FFE) module, which leverages a transformer to hierarchically extract diverse and robust facial latent features. Thus, our FaceFormer restores faces with fidelity and robustness, which possess realistic and symmetrical details of facial components. Extensive experiments demonstrate that our proposed method, trained on a synthetic dataset, generalizes better to natural low-quality images than current state-of-the-art methods.

preprint, 2022, arXiv

Five-channel frequency-division multiplexing using low-loss epsilon-near-zero metamaterial waveguide

The rapidly growing global data usage has demanded more efficient ways to utilize the scarce electromagnetic spectrum resource. Recent research has focused on the development of efficient multiplexing techniques in the millimeter-wave band (1-10 mm, or 30-300 GHz) due to the promise of large available bandwidth for future wireless networks. Frequency-division multiplexing is still one of the most commonly used techniques to maximize the transmission capacity of a wireless network. Based on the frequency-selective tunnelling effect of the low-loss epsilon-near-zero metamaterial waveguide, we numerically and experimentally demonstrate five-channel frequency-division multiplexing and demultiplexing in the millimeter-wave range. We show that this device architecture offers great flexibility to manipulate the filter Q-factors and the transmission spectra of different channels, by changing the epsilon-near-zero metamaterial waveguide topology and by adding a standard waveguide between two epsilon-near-zero channels. This strategy of frequency-division multiplexing may pave the way for efficiently allocating the spectrum for future communication networks.

preprint, 2022, arXiv

Measuring the severity of multi-collinearity in high dimensions

Multi-collinearity is a widespread phenomenon in modern statistical applications and, when ignored, can negatively impact model selection and statistical inference. Classic tools and measures that were developed for "$n>p$" data are neither applicable nor interpretable in the high-dimensional regime. Here we propose 1) new individualized measures that can be used to visualize patterns of multi-collinearity, and subsequently 2) global measures to assess the overall burden of multi-collinearity without limiting the observed data dimensions. We applied these measures to genomic applications to investigate patterns of multi-collinearity in genetic variations across individuals with diverse ancestral backgrounds. The measures were able to visually distinguish genomic regions of excessive multi-collinearity and contrast the level of multi-collinearity between different continental populations.

preprint, 2022, arXiv

Multi-Task Learning Framework for Emotion Recognition in-the-wild

This paper presents our system for the Multi-Task Learning (MTL) Challenge in the 4th Affective Behavior Analysis in-the-wild (ABAW) competition. We explore the research problems of this challenge from three aspects: 1) For obtaining efficient and robust visual feature representations, we propose MAE-based unsupervised representation learning and IResNet/DenseNet-based supervised representation learning methods; 2) Considering the importance of temporal information in videos, we explore three types of sequential encoders to capture the temporal information, including the encoder based on transformer, the encoder based on LSTM, and the encoder based on GRU; 3) For modeling the correlation between these different tasks (i.e., valence, arousal, expression, and AU) for multi-task affective analysis, we first explore the dependency between these different tasks and propose three multi-task learning frameworks to model the correlations effectively. Our system achieves the performance of $1.7607$ on the validation dataset and $1.4361$ on the test dataset, ranking first in the MTL Challenge. The code is available at https://github.com/AIM3-RUC/ABAW4.

preprint, 2022, arXiv

Novel boron nitride polymorphs with graphite-diamond hybrid structure

Both boron nitride (BN) and carbon (C) have sp, sp2 and sp3 hybridization modes, resulting in a variety of BN and C polymorphs with similar structures, such as hexagonal BN (hBN) and graphite, cubic BN (cBN) and diamond. Here, five types of BN polymorph structures were proposed theoretically, inspired by the graphite-diamond hybrid structures discovered in a recent experiment. These BN polymorphs with graphite-diamond hybrid structures possessed excellent mechanical properties with combined high hardness and high ductility, and also exhibited various electronic properties such as semi-conductivity, semi-metallicity, and even one- and two-dimensional conductivity, differing from known insulators hBN and cBN. The simulated diffraction patterns of these BN hybrid structures could account for the unsolved diffraction patterns of intermediate products composed of "compressed hBN" and diamond-like BN, caused by phase transitions in previous experiments. Thus, this work provides a theoretical basis for the presence of these types of hybrid materials during phase transitions between graphite-like and diamond-like BN polymorphs.

preprint, 2022, arXiv

NTIRE 2022 Challenge on Efficient Super-Resolution: Methods and Results

This paper reviews the NTIRE 2022 challenge on efficient single image super-resolution with a focus on the proposed solutions and results. The task of the challenge was to super-resolve an input image with a magnification factor of $\times$4 based on pairs of low and corresponding high resolution images. The aim was to design a network for single image super-resolution that achieved improvement of efficiency measured according to several metrics including runtime, parameters, FLOPs, activations, and memory consumption while at least maintaining the PSNR of 29.00dB on the DIV2K validation set. IMDN is set as the baseline for efficiency measurement. The challenge had 3 tracks including the main track (runtime), sub-track one (model complexity), and sub-track two (overall performance). In the main track, the practical runtime performance of the submissions was evaluated. The ranks of the teams were determined directly by the absolute value of the average runtime on the validation set and test set. In sub-track one, the number of parameters and FLOPs were considered, and the individual rankings on the two metrics were summed to determine a final ranking in this track. In sub-track two, all five metrics mentioned in the description of the challenge, including runtime, parameter count, FLOPs, activations, and memory consumption, were considered. Similar to sub-track one, the rankings on the five metrics were summed to determine a final ranking. The challenge had 303 registered participants, and 43 teams made valid submissions. The submissions gauge the state of the art in efficient single image super-resolution.
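
The summed-rank scoring described for the two sub-tracks can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the team names and metric values are hypothetical, lower values are assumed to be better on every metric, and the tie handling (tied teams share the best rank, final ties broken by name) is a choice made for this sketch, not something the abstract specifies.

```python
from typing import Dict, List


def rank_positions(scores: Dict[str, float]) -> Dict[str, int]:
    """Rank teams on one metric, 1 = best (lowest value).

    Assumption for this sketch: tied teams share the best available rank.
    """
    ordered = sorted(scores.values())
    return {team: ordered.index(value) + 1 for team, value in scores.items()}


def summed_ranking(metrics: Dict[str, Dict[str, float]]) -> List[str]:
    """Order teams by the sum of their per-metric ranks (smaller is better)."""
    totals: Dict[str, int] = {}
    for per_team in metrics.values():
        for team, rank in rank_positions(per_team).items():
            totals[team] = totals.get(team, 0) + rank
    # Assumption: final ties are broken alphabetically by team name.
    return sorted(totals, key=lambda team: (totals[team], team))


# Hypothetical values for illustration only (not challenge results).
example_metrics = {
    "parameters_k": {"team_a": 300.0, "team_b": 250.0, "team_c": 400.0},
    "flops_g": {"team_a": 15.0, "team_b": 25.0, "team_c": 18.0},
}
final_order = summed_ranking(example_metrics)
```

Here `team_a` ranks second on parameters and first on FLOPs, so its rank sum of 3 places it ahead of the other two hypothetical teams.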

preprint, 2022, arXiv

On the Connection between Local Attention and Dynamic Depth-wise Convolution

Vision Transformer (ViT) attains state-of-the-art performance in visual recognition, and the variant, Local Vision Transformer, makes further improvements. The major component in Local Vision Transformer, local attention, performs the attention separately over small local windows. We rephrase local attention as a channel-wise locally-connected layer and analyze it from two network regularization manners, sparse connectivity and weight sharing, as well as weight computation. Sparse connectivity: there is no connection across channels, and each position is connected to the positions within a small local window. Weight sharing: the connection weights for one position are shared across channels or within each group of channels. Dynamic weight: the connection weights are dynamically predicted according to each image instance. We point out that local attention resembles depth-wise convolution and its dynamic version in sparse connectivity. The main difference lies in weight sharing - depth-wise convolution shares connection weights (kernel weights) across spatial positions. We empirically observe that the models based on depth-wise convolution and the dynamic variant with lower computation complexity perform on-par with or sometimes slightly better than Swin Transformer, an instance of Local Vision Transformer, for ImageNet classification, COCO object detection and ADE semantic segmentation. These observations suggest that Local Vision Transformer takes advantage of two regularization forms and dynamic weight to increase the network capacity. Code is available at https://github.com/Atten4Vis/DemystifyLocalViT.

preprint, 2022, arXiv

Real Image Restoration via Structure-preserving Complementarity Attention

Since convolutional neural networks perform well in learning generalizable image priors from large-scale data, these models have been widely used in image denoising tasks. However, the computational complexity also increases dramatically for complex models. In this paper, we propose a novel lightweight Complementary Attention Module, which includes a density module and a sparse module that cooperatively mine dense and sparse features for complementary feature learning, to build an efficient lightweight architecture. Moreover, to reduce the loss of details caused by denoising, this paper constructs a gradient-based structure-preserving branch. We utilize gradient-based branches to obtain additional structural priors for denoising, and make the model pay more attention to image geometric details through gradient loss optimization. Based on the above, we propose an efficient Unet-structured network with dual branches. The visual results show that it can effectively preserve the structural details of the original image. We evaluate on benchmarks including SIDD and DND, where SCANet achieves state-of-the-art performance in PSNR and SSIM while significantly reducing computational cost.

preprint, 2022, arXiv

Rethinking Classifier and Adversarial Attack

Various defense models have been proposed to resist adversarial attack algorithms, but existing adversarial robustness evaluation methods always overestimate the adversarial robustness of these models (i.e., not approaching the lower bound of robustness). To solve this problem, this paper uses the proposed decouple space method to divide the classifier into two parts: non-linear and linear. Then, this paper defines the representation vector of the original example (and its space, i.e., the representation space) and uses the iterative optimization of Absolute Classification Boundaries Initialization (ACBI) to obtain a better attack starting point. Particularly, this paper applies ACBI to nearly 50 widely-used defense models (including 8 architectures). Experimental results show that ACBI achieves lower robust accuracy in all cases.

preprint, 2022, arXiv

TSRFormer: Table Structure Recognition with Transformers

We present a new table structure recognition (TSR) approach, called TSRFormer, to robustly recognize the structures of complex tables with geometrical distortions from various table images. Unlike previous methods, we formulate table separation line prediction as a line regression problem instead of an image segmentation problem and propose a new two-stage DETR based separator prediction approach, dubbed Separator REgression TRansformer (SepRETR), to predict separation lines from table images directly. To make the two-stage DETR framework work efficiently and effectively for the separation line prediction task, we propose two improvements: 1) A prior-enhanced matching strategy to solve the slow convergence issue of DETR; 2) A new cross attention module to sample features from a high-resolution convolutional feature map directly so that high localization accuracy is achieved with low computational cost. After separation line prediction, a simple relation network based cell merging module is used to recover spanning cells. With these new techniques, our TSRFormer achieves state-of-the-art performance on several benchmark datasets, including SciTSR, PubTabNet and WTW. Furthermore, we have validated the robustness of our approach to tables with complex structures, borderless cells, large blank spaces, empty or spanning cells as well as distorted or even curved shapes on a more challenging real-world in-house dataset.

preprint, 2022, arXiv

Unusually high HCO+/CO ratios in and outside supernova remnant W49B

Galactic supernova remnants (SNRs) and their environments provide the nearest laboratories to study SN feedback. We performed molecular observations toward SNR W49B, the most luminous Galactic SNR in the X-ray band, aiming to explore signs of multiple feedback channels of SNRs on nearby molecular clouds (MCs). We found very broad HCO+ lines with widths of dv = 48--75 km/s in the SNR southwest, providing strong evidence that W49B is perturbing MCs at a systemic velocity of $V_{LSR}=61$--65 km/s, and placing W49B at a distance of $7.9\pm 0.6$ kpc. We observed unusually high-intensity ratios of HCO+ J=1-0/CO J=1-0 not only at shocked regions ($1.1\pm 0.4$ and $0.70\pm 0.16$), but also in quiescent clouds over 1 pc away from the SNR's eastern boundary (> 0.2). By comparing with the magnetohydrodynamics shock models, we interpret that the high ratio in the broad-line regions can result from a cosmic-ray (CR) induced chemistry in shocked MCs, where the CR ionization rate is enhanced to around 10--100 times of the Galactic level. The high HCO+/CO ratio outside the SNR is probably caused by the radiation precursor, while the luminous X-ray emission of W49B can explain a few properties in this region. The above results provide observational evidence that SNRs can strongly influence the molecular chemistry in and outside the shock boundary via their shocks, CRs, and radiation. We propose that the HCO+/CO ratio is a potentially useful tool to probe an SNR's multichannel influence on MCs.

preprint, 2021, arXiv

TriVoC: Efficient Voting-based Consensus Maximization for Robust Point Cloud Registration with Extreme Outlier Ratios

Correspondence-based point cloud registration is a cornerstone in robotics perception and computer vision, which seeks to estimate the best rigid transformation aligning two point clouds from the putative correspondences. However, due to the limited robustness of 3D keypoint matching approaches, outliers, probably in large numbers, are prone to exist among the correspondences, which makes robust registration methods imperative. Unfortunately, existing robust methods have their own limitations (e.g. high computational cost or limited robustness) when facing high or extreme outlier ratios, probably unsuitable for practical use. In this paper, we present a novel, fast, deterministic and guaranteed robust solver, named TriVoC (Triple-layered Voting with Consensus maximization), for the robust registration problem. We decompose the selecting of the minimal 3-point sets into 3 consecutive layers, and in each layer we design an efficient voting and correspondence sorting framework on the basis of the pairwise equal-length constraint. In this manner, the 3-point sets can be selected independently from the reduced correspondence sets according to the sorted sequence, which can significantly lower the computational cost and meanwhile provide a strong guarantee to achieve the largest consensus set (as the final inlier set) as long as a probabilistic termination condition is fulfilled. Varied experiments show that our solver TriVoC is robust against up to 99% outliers, highly accurate, time-efficient even with extreme outlier ratios, and also practical for real-world applications, showing performance superior to other state-of-the-art competitors.
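
The pairwise equal-length constraint mentioned above rests on the fact that a rigid transformation preserves distances between points. The NumPy sketch below illustrates only that constraint and a simple vote count built on it; it is not the TriVoC solver, and the function names, the tolerance `tol`, and the toy data are all assumptions made for illustration.

```python
import numpy as np


def equal_length_compatible(src, dst, tol):
    """Boolean matrix whose (i, j) entry is True when correspondences i and j
    preserve their pairwise distance within tol."""
    src = np.asarray(src, dtype=float)
    dst = np.asarray(dst, dtype=float)
    d_src = np.linalg.norm(src[:, None, :] - src[None, :, :], axis=-1)
    d_dst = np.linalg.norm(dst[:, None, :] - dst[None, :, :], axis=-1)
    return np.abs(d_src - d_dst) <= tol


def vote_counts(src, dst, tol):
    """Number of other correspondences each correspondence is compatible with."""
    compat = equal_length_compatible(src, dst, tol)
    np.fill_diagonal(compat, False)  # a correspondence does not vote for itself
    return compat.sum(axis=1)


# Toy data: four inliers under a 90-degree rotation about z plus a translation,
# and one deliberately corrupted correspondence (index 4).
src_pts = np.array([[0, 0, 0], [1, 0, 0], [0, 1, 0], [0, 0, 1], [1, 1, 1]], dtype=float)
rotation = np.array([[0, -1, 0], [1, 0, 0], [0, 0, 1]], dtype=float)
translation = np.array([0.5, -0.2, 1.0])
dst_pts = src_pts @ rotation.T + translation
dst_pts[4] = [10.0, 10.0, 10.0]  # outlier

votes = vote_counts(src_pts, dst_pts, tol=1e-6)
```

In this toy case each inlier collects three votes and the corrupted correspondence collects none, which is the kind of signal a voting-and-sorting scheme can use to prioritize likely inliers.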

preprint, 2020, arXiv

Adaptive Operator Selection Based on Dynamic Thompson Sampling for MOEA/D

In evolutionary computation, different reproduction operators have various search dynamics. To strike a good balance between exploration and exploitation, it is attractive to have an adaptive operator selection (AOS) mechanism that automatically chooses the most appropriate operator on the fly according to the current status. This paper proposes a new AOS mechanism for multi-objective evolutionary algorithm based on decomposition (MOEA/D). More specifically, the AOS is formulated as a multi-armed bandit problem where the dynamic Thompson sampling (DYTS) is applied to adapt the bandit learning model, originally proposed with an assumption of a fixed reward distribution, to a non-stationary setup. In particular, each arm of our bandit learning model represents a reproduction operator and is assigned a prior reward distribution. The parameters of these reward distributions are progressively updated according to each operator's performance collected from the evolutionary process. When generating an offspring, an operator is chosen by sampling from those reward distributions according to the DYTS. Experimental results fully demonstrate the effectiveness and competitiveness of our proposed AOS mechanism compared with four other state-of-the-art MOEA/D variants.
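
A minimal sketch of bandit-style operator selection with dynamic Thompson sampling is given below. It assumes Beta reward distributions, a binary reward (1 if the chosen operator produced an improving offspring, 0 otherwise), and the commonly used capped update in which the Beta parameters are rescaled once their sum reaches a threshold. The class name, the `threshold` parameter, and the reward definition are illustrative assumptions and may differ from the paper's exact formulation.

```python
import random


class DynamicThompsonSampler:
    """Beta-Bernoulli Thompson sampling with a cap on accumulated evidence."""

    def __init__(self, n_operators, threshold=100.0, seed=None):
        self.alpha = [1.0] * n_operators  # prior pseudo-count of successes
        self.beta = [1.0] * n_operators   # prior pseudo-count of failures
        self.threshold = threshold        # hypothetical cap on alpha + beta
        self.rng = random.Random(seed)

    def select(self):
        """Sample a success rate for each operator and pick the largest."""
        samples = [self.rng.betavariate(a, b) for a, b in zip(self.alpha, self.beta)]
        return max(range(len(samples)), key=samples.__getitem__)

    def update(self, op, reward):
        """Update operator `op` with a reward in {0, 1}."""
        a, b = self.alpha[op], self.beta[op]
        if a + b < self.threshold:
            # Standard Bayesian update while evidence is below the cap.
            a, b = a + reward, b + (1.0 - reward)
        else:
            # Rescale so old evidence decays and alpha + beta stops growing.
            scale = self.threshold / (self.threshold + 1.0)
            a, b = (a + reward) * scale, (b + 1.0 - reward) * scale
        self.alpha[op], self.beta[op] = a, b


sampler = DynamicThompsonSampler(n_operators=3, threshold=4.0, seed=0)
for _ in range(3):
    sampler.update(0, 1.0)  # three successes for operator 0
chosen = sampler.select()
```

Capping the total pseudo-count keeps the posterior from becoming too concentrated, so the sampler can track an operator whose usefulness changes during the search.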

preprint, 2020, arXiv

An XMM-Newton X-ray View of Supernova Remnant W49B: Revisiting its Recombining Plasmas and Progenitor Type

We present a comprehensive X-ray spectroscopy and imaging study of supernova remnant W49B using archival XMM-Newton observations. The overionization state of the shocked ejecta in W49B is clearly indicated by the radiative recombination continua of Si XIV, S XV, and Fe XXV, combined with the Ly$α$ lines of Ca and Fe. The line flux images of W49B indicate high emission measures of the central bar-like region for almost all the emission lines, while the equivalent width maps reveal a stratified structure for the metal abundance distributions. The global spectrum of W49B is well reproduced by a model containing one collisional ionization equilibrium (CIE) plasma component and two recombining plasma (RP) components. The CIE plasma represents the shocked interstellar medium, which dominates the X-ray emitting volume in W49B with a mass $\sim450M_\odot$. The two RP components with a total mass $\sim4.6M_\odot$ are both dominated by the ejecta material, but characterized by different electron temperatures ($\sim1.60$ keV and $\sim0.64$ keV). The recombination ages of the RP components are estimated as $\sim6000$ yr and $\sim3400$ yr, respectively. We then reveal the possibility of a thermal conduction origin for the high-temperature RP in W49B by calculating the conduction timescale. The metal abundance ratios of the ejecta in W49B are roughly consistent with a core-collapse explosion model with a $\lesssim15M_\odot$ progenitor, except for a rather high Mn/Fe. A Type Ia origin can explain the Mn abundance, while it predicts much higher ejecta masses than observed values for all the metal species considered in our analysis.

preprint, 2020, arXiv

Real-time Fusion Network for RGB-D Semantic Segmentation Incorporating Unexpected Obstacle Detection for Road-driving Images

Semantic segmentation has made striking progress due to the success of deep convolutional neural networks. Considering the demands of autonomous driving, real-time semantic segmentation has become a research hotspot in recent years. However, few real-time RGB-D fusion semantic segmentation studies are carried out despite readily accessible depth information nowadays. In this paper, we propose a real-time fusion semantic segmentation network termed RFNet that effectively exploits complementary cross-modal information. Building on an efficient network architecture, RFNet is capable of running swiftly, which satisfies autonomous vehicle applications. Multi-dataset training is leveraged to incorporate unexpected small obstacle detection, enriching the recognizable classes required to face unforeseen hazards in the real world. A comprehensive set of experiments demonstrates the effectiveness of our framework. On Cityscapes, our method outperforms previous state-of-the-art semantic segmenters, with excellent accuracy and 22Hz inference speed at the full 2048x1024 resolution, outperforming most existing RGB-D networks.

preprint2020arXiv

ReLaText: Exploiting Visual Relationships for Arbitrary-Shaped Scene Text Detection with Graph Convolutional Networks

We introduce a new arbitrary-shaped text detection approach named ReLaText by formulating text detection as a visual relationship detection problem. To demonstrate the effectiveness of this new formulation, we first use a "link" relationship to address the challenging text-line grouping problem. The key idea is to decompose text detection into two subproblems, namely detection of text primitives and prediction of link relationships between nearby text primitive pairs. Specifically, an anchor-free region proposal network based text detector is first used to detect text primitives of different scales from different feature maps of a feature pyramid network, from which a text primitive graph is constructed by linking each pair of nearby text primitives detected from the same feature map with an edge. Then, a Graph Convolutional Network (GCN) based link relationship prediction module is used to prune wrongly-linked edges in the text primitive graph to generate a number of disjoint subgraphs, each representing a detected text instance. As GCN can effectively leverage context information to improve link prediction accuracy, our GCN based text-line grouping approach can achieve better text detection accuracy than previous text-line grouping methods, especially when dealing with text instances with large inter-character or very small inter-line spacings. Consequently, the proposed ReLaText achieves state-of-the-art performance on five public text detection benchmarks, namely RCTW-17, MSRA-TD500, Total-Text, CTW1500 and DAST1500.

preprint2016arXiv

A Generalized Levene's Scale Test for Variance Heterogeneity in the Presence of Sample Correlation and Group Uncertainty

We generalize Levene's test for variance (scale) heterogeneity between $k$ groups to more complex data that include sample correlation and group membership uncertainty. Following a two-stage regression framework, we show that least absolute deviation regression must be used in the stage 1 analysis to ensure a correct asymptotic $\chi^2_{k-1}/(k-1)$ distribution of the generalized scale ($gS$) test statistic. We then show that the proposed $gS$ test is independent of the generalized location test, under the joint null hypothesis of no mean and no variance heterogeneity. Consequently, we generalize the recently proposed joint location-scale ($gJLS$) test, which is valuable in settings where there is an interaction effect but one interacting variable is not available. We evaluate the proposed method via an extensive simulation study and two genetic association application studies.

preprint2016arXiv

Conetronics in 2D Metal-Organic Frameworks: Double Dirac Cones, Magnetic Half Dirac Cones and Quantum Anomalous Hall Effect

Based on recently synthesized Ni3C12S12 class 2D metal-organic frameworks, we predict electronic properties of M3C12S12 and M3C12O12, where M is Zn, Cd, Hg, Be, or Mg, with no M orbital contributions to bands near the Fermi level. For M3C12S12, the band structures exhibit double Dirac cones with different Fermi velocities that are n and p type, respectively, which are switchable by few-percent strain. The crossing of the two cones is symmetry-protected to be non-hybridizing, leading to two independent channels in 2D node-line semimetals at the same k-point, akin to spin channels in spintronics, making conetronics devices possible. The node line rings right at their crossing, which are both electron and hole pockets at the Fermi level, can give rise to magnetoresistance that will not saturate when the magnetic field is infinitely large, due to perfect n-p compensation. For M3C12O12, together with conjugated metal-tricatecholate polymers M3(HHTP)2, the spin-polarized slow Dirac cone center is pinned precisely at the Fermi level, making the systems conducting in only one spin or cone channel. The quantum anomalous Hall effect can arise in MOFs with non-negligible spin-orbit coupling like Cu3C12O12. Compounds of M3C12S12 and M3C12O12 with different M can be used to build spintronic and cone-selecting heterostructure devices, tunable by strain or electrostatic gating.

preprint2016arXiv

Constrained Maximum Correntropy Adaptive Filtering

Constrained adaptive filtering algorithms including constrained least mean square (CLMS), constrained affine projection (CAP) and constrained recursive least squares (CRLS) have been extensively studied in many applications. Most existing constrained adaptive filtering algorithms are developed under the mean square error (MSE) criterion, which is an ideal optimality criterion under Gaussian noise. This assumption, however, fails to model the behavior of non-Gaussian noise found in practice. Motivated by the robustness and simplicity of the maximum correntropy criterion (MCC) in non-Gaussian impulsive noise, this paper proposes a new adaptive filtering algorithm called constrained maximum correntropy criterion (CMCC). Specifically, CMCC incorporates a linear constraint into an MCC filter to solve a constrained optimization problem explicitly. The proposed adaptive filtering algorithm is easy to implement and has low computational complexity, and in terms of convergence accuracy (say, lower mean square deviation) and stability, it can significantly outperform MSE-based constrained adaptive algorithms in the presence of heavy-tailed impulsive noise. Additionally, the mean square convergence behaviors are studied under the energy conservation relation, and a sufficient condition to ensure the mean square convergence and the steady-state mean square deviation (MSD) of the proposed algorithm are obtained. Simulation results confirm the theoretical predictions under both Gaussian and non-Gaussian noise, and demonstrate the excellent performance of the novel algorithm by comparing it with other conventional methods.
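
As a rough illustration of the idea described in this abstract, the sketch below takes a correntropy-weighted LMS step and then projects the weights back onto a single linear constraint c·w = f. The update form, the Gaussian kernel width `sigma`, the step size `mu`, and the names `dot` and `cmcc_step` are assumptions chosen for illustration; this is not the paper's exact algorithm.

```python
import math


def dot(a, b):
    """Inner product of two equal-length sequences."""
    return sum(x * y for x, y in zip(a, b))


def cmcc_step(w, x, d, c, f, mu=0.05, sigma=1.0):
    """One hypothetical constrained-MCC update for a single constraint c.w = f.

    w: current weights, x: input vector, d: desired output.
    mu (step size) and sigma (kernel width) are illustrative values.
    """
    e = d - dot(w, x)  # a priori error
    # The Gaussian kernel shrinks the step when the error is very large,
    # which is what makes the update robust to impulsive outliers.
    kernel = math.exp(-e * e / (2.0 * sigma * sigma))
    v = [wi + mu * kernel * e * xi for wi, xi in zip(w, x)]
    # Project back onto the hyperplane c.w = f.
    shift = (f - dot(c, v)) / dot(c, c)
    return [vi + shift * ci for vi, ci in zip(v, c)]


w = [0.5, 0.5]
w = cmcc_step(w, x=[1.0, -2.0], d=0.0, c=[1.0, 1.0], f=1.0)
```

With an extreme desired value such as `d=100.0`, the kernel weight underflows to zero and the weights stay essentially where they were, whereas a plain LMS step would jump by a large amount.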

preprint2016arXiv

On "Exponential Lower Bounds for Polytopes in Combinatorial Optimization" by Fiorini et al. (2015): A Refutation For Models With Disjoint Sets of Descriptive Variables

We provide a numerical refutation of the developments of Fiorini et al. (2015)* for models with disjoint sets of descriptive variables. We also provide an insight into the meaning of the existence of a one-to-one linear map between solutions of such models. *: Fiorini, S., S. Massar, S. Pokutta, H.R. Tiwary, and R. de Wolf (2015). Exponential Lower Bounds for Polytopes in Combinatorial Optimization. Journal of the ACM 62:2, Article No. 17.

preprint2015arXiv

Diffraction-free optical beam propagation with near-zero phase variation in extremely anisotropic metamaterials

Extremely anisotropic metal-dielectric multilayer metamaterials are designed to have the effective permittivity tensor of a transverse component (parallel to the interfaces of the multilayer) with zero real part and a longitudinal component (normal to the interfaces of the multilayer) with ultra-large imaginary part at the same wavelength, including the optical nonlocality analysis based on the transfer-matrix method. The diffraction-free deep-subwavelength optical beam propagation with near-zero phase variation in the designed multilayer stack due to the near-flat iso-frequency contour is demonstrated and analyzed, including the effects of the multilayer period and the material loss.

preprint2015arXiv

Nonlocal effective medium analysis in symmetric metal-dielectric multilayer metamaterials

The optical nonlocality in symmetric metal-dielectric multilayer metamaterials is theoretically and experimentally investigated with respect to transverse-magnetic-polarized incident light. A nonlocal effective medium theory is derived from the transfer-matrix method to determine the nonlocal effective permittivity depending on both the frequency and wave vector in a symmetric metal-dielectric multilayer stack. In contrast to the local effective medium theory, our proposed nonlocal effective medium theory can accurately predict measured incident angle-dependent reflection spectra from a fabricated multilayer stack and provide nonlocal dispersion relations. Moreover, the bulk plasmon polaritons with large wave vectors supported in the multilayer stack are also investigated with the nonlocal effective medium theory through the analysis of the dispersion relation and eigenmode.
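
For contrast with the nonlocal theory in this abstract, the sketch below evaluates the standard local effective medium formulas for a two-material layered stack: a filling-ratio-weighted average for the component parallel to the interfaces, and a weighted harmonic mean for the component normal to them. The function name `local_emt` and the permittivity values are illustrative assumptions, not values from the paper.

```python
def local_emt(eps_metal, eps_dielectric, fill):
    """Local effective permittivities of a metal-dielectric multilayer.

    fill is the metal filling ratio (0..1). Returns (eps_parallel, eps_normal),
    the components parallel and normal to the layer interfaces.
    """
    eps_parallel = fill * eps_metal + (1 - fill) * eps_dielectric
    eps_normal = 1 / (fill / eps_metal + (1 - fill) / eps_dielectric)
    return eps_parallel, eps_normal


# Hypothetical lossless values: the parallel component vanishes
# (an epsilon-near-zero condition) when fill = eps_d / (eps_d - eps_m).
eps_par, eps_norm = local_emt(-4 + 0j, 2 + 0j, 1 / 3)
```

These local formulas depend only on frequency, through the material permittivities. The nonlocal theory in the abstract additionally makes the effective permittivity depend on the wave vector.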

preprint2014arXiv

A novel joint location-scale testing framework for improved detection of variants with main or interaction effects

We propose a novel and easy-to-implement joint location-scale association testing procedure that can account for complex genetic architecture without explicitly modeling interaction effects, and is suitable for large-scale whole-genome scans and meta-analyses. We focus on Fisher's method and use it to combine evidence from the standard location test and the more recent scale test, and we describe its use for single-variant, gene-set and pathway association analyses.
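
A minimal sketch of Fisher's method as it might be used to combine a location-test p-value with a scale-test p-value. It assumes the component tests are independent under the null. The function name `fisher_combine` and the example p-values are illustrative and not taken from the paper.

```python
import math


def fisher_combine(p_values):
    """Combine independent p-values with Fisher's method.

    The statistic -2 * sum(log p) follows a chi-square distribution with
    2k degrees of freedom under the null. For even degrees of freedom the
    survival function has the closed form used below.
    """
    k = len(p_values)
    half_stat = -sum(math.log(p) for p in p_values)  # statistic / 2
    tail = sum(half_stat ** i / math.factorial(i) for i in range(k))
    return math.exp(-half_stat) * tail


# Hypothetical p-values from a location test and a scale test.
p_joint = fisher_combine([0.05, 0.05])
```

For two p-values the result reduces to p1 * p2 * (1 - log(p1 * p2)), so two individually borderline results can yield a noticeably smaller joint p-value.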

preprint2014arXiv

Pooled Association Tests for Rare Genetic Variants: A Review and Some New Results

In the search for genetic factors that are associated with complex heritable human traits, considerable attention is now being focused on rare variants that individually have small effects. In response, numerous recent papers have proposed testing strategies to assess association between a group of rare variants and a trait, with competing claims about the performance of various tests. The power of a given test in fact depends on the nature of any association and on the rareness of the variants in question. We review such tests within a general framework that covers a wide range of genetic models and types of data. We study the performance of specific tests through exact or asymptotic power formulas and through novel simulation studies of over 10,000 different models. The tests considered are also applied to real sequence data from the 1000 Genomes project and provided by the GAW17. We recommend a testing strategy, but our results show that power to detect association in plausible genetic scenarios is low for studies of medium size unless a high proportion of the chosen variants are causal. Consequently, considerable attention must be given to relevant biological information that can guide the selection of variants for testing.

preprint2014arXiv

Realizing broadband electromagnetic transparency with a graded-permittivity sphere

Broadband electromagnetic transparency phenomenon is realized with a well-designed graded-permittivity sphere, which has an extremely low scattering cross section over a wide frequency range, based on the generalized Mie scattering theory and numerical simulation in full-wave condition. The dynamic polarization cancellation is revealed by studying the variation of the polarization with respect to the frequency. Furthermore, a properly-designed multi-shell sphere is also proposed and examined in order to reduce the rigorous conditions for realizing the broadband transparency in experiments.

preprint2013arXiv

Broadband Epsilon-Near-Zero Metamaterials with Step-Like Metal-Dielectric Multilayer Structures

The concept of the broadband epsilon-near-zero meta-atom consisting of layered stacks with specified metallic filling ratio and thickness is proposed based on the Bergman spectral representation of the effective permittivity. The step-like metal-dielectric multilayer structures are designed to achieve realistic broadband epsilon-near-zero meta-atoms in optical frequency range. These meta-atoms can be integrated as building blocks for unconventional optical components with exotic electromagnetic properties over a wide frequency range, such as the demonstrated broadband directional emission and phase front shaping.

preprint2013arXiv

Deep subwavelength beam propagation in extremely loss-anisotropic metamaterials

Metal-dielectric multilayer metamaterials with extreme loss-anisotropy, in which the longitudinal component of the permittivity tensor has ultra-large imaginary part, are proposed and designed. Diffraction-free deep subwavelength beam propagation and manipulation, due to the nearly flat iso-frequency contour (IFC), is demonstrated in such loss-anisotropic metamaterials. It is also shown that deep subwavelength beam propagation can be realized in practical multilayer structures with large multilayer period, when the nonlocal effect is considered.

preprint2013arXiv

Experimental realization of epsilon-near-zero metamaterial slabs with metal-dielectric multilayers

Epsilon-near-zero (ENZ) metamaterial slabs at visible frequencies based on metal-dielectric multilayers are experimentally realized. Transmission, reflection and absorption spectra are measured and used to determine the complex refractive indices and the effective permittivities of the ENZ slabs, which agree with the results obtained from both the numerical simulations and the optical nonlocalities analysis. Furthermore, light propagation in ENZ slabs and directional emission from ENZ prisms are also analyzed. The accurate determination of the ENZ wavelength for metal-dielectric multilayer metamaterial slabs is important for realizing many unique applications, such as phase front manipulation and enhancement of photonic density of states.

preprint2013arXiv

Giant optical nonlocality near the Dirac point in metal-dielectric multilayer metamaterials

The giant optical nonlocality near the Dirac point in lossless metal-dielectric multilayer metamaterials is revealed and investigated through the analysis of the band structure of the multilayer stack in the three-dimensional omega-k space, according to the transfer-matrix method with the optical nonlocal effect. The position of the Dirac point is analytically located in the omega-k space. It is revealed that the emergence of the Dirac point is due to the degeneracy of the symmetric and the asymmetric eigenmodes of the coupled surface plasmon polaritons. The optical nonlocality induced epsilon-near-zero frequency shift for the multilayer stack compared to the effective medium is studied. Furthermore, the giant optical nonlocality around the Dirac point is explored with the iso-frequency contour analysis, while the beam splitting phenomenon at the Dirac point due to the optical nonlocal effect is also demonstrated.

preprint2013arXiv

Quantum entanglement in plasmonic waveguides with near-zero mode indices

We investigate the quantum entanglement between two quantum dots in a plasmonic waveguide with near-zero mode index, considering the dependence of concurrence on interdot distance, quantum dot-waveguide frequency detuning and coupling strength ratio. High concurrence is achieved for a wide range of interdot distance due to the near-zero mode index, which largely relaxes the strict requirement of interdot distance in conventional dielectric waveguides or metal nanowires. The proposed quantum dot-waveguide system with near-zero phase variation along the waveguide near the mode cutoff frequency shows very promising potential in quantum optics and quantum information processing.

preprint2012arXiv

A Generalized Kruskal-Wallis Test Incorporating Group Uncertainty with Application to Genetic Association Studies

Motivated by genetic association studies of SNPs with genotype uncertainty, we propose a generalization of the Kruskal-Wallis test that incorporates group uncertainty when comparing k samples. The extended test statistic is based on probability-weighted rank-sums and follows an asymptotic chi-square distribution with k-1 degrees of freedom under the null hypothesis. Simulation studies confirm the validity and robustness of the proposed test in finite samples. Application to a genome-wide association study of type 1 diabetic complications further demonstrates the utilities of this generalized Kruskal-Wallis test for studies with group uncertainty.

preprint2012arXiv

Bayesian Latent Variable Modeling of Longitudinal Family Data for Genetic Pleiotropy Studies

Motivated by genetic association studies of pleiotropy, we propose here a Bayesian latent variable approach to jointly study multiple outcomes or phenotypes. The proposed method models both continuous and binary phenotypes, and it accounts for serial and familial correlations when longitudinal and pedigree data have been collected. We present a Bayesian estimation method for the model parameters, and we develop a novel MCMC algorithm that builds upon hierarchical centering and parameter expansion techniques to efficiently sample the posterior distribution. We discuss phenotype and model selection in the Bayesian setting, and we study the performance of two selection strategies based on Bayes factors and spike-and-slab priors. We evaluate the proposed method via extensive simulations and demonstrate its utility with an application to a genome-wide association study of various complication phenotypes related to type 1 diabetes.

preprint2012arXiv

Integrated optical devices based on broadband epsilon-near-zero meta-atoms

We verify the feasibility of the proposed theoretical strategy for designing the broadband near-zero permittivity (ENZ) metamaterial in the optical frequency range with numerical simulations. In addition, the designed broadband ENZ stacks are used as meta-atoms to build functional nanophotonic devices with extraordinary properties, including an ultranarrow electromagnetic energy tunneling channel and an ENZ concave focusing lens.

preprint2012arXiv

Loss enhanced transmission and collimation in anisotropic epsilon-near-zero metamaterials

We verify the extraordinary transmission enhancement and collimation induced by the material loss in anisotropic near-zero permittivity (ENZ) metamaterials, and reveal the physical mechanism of this exotic electromagnetic phenomenon via the iso-frequency contour (IFC) analysis. In addition, we demonstrate the possibility in realization of such loss enhanced transmission of Gaussian beam in realistic silver-germanium multilayered structures by applying full-wave numerical simulations.

preprint2012arXiv

Strategy for designing broadband epsilon-near-zero metamaterial with loss compensation by gain media

A strategy is proposed to design the broadband gain-doped epsilon-near-zero (GENZ) metamaterial. Based on the Milton representation of effective permittivity, the strategy starts in a dimensionless spectral space, where the effective permittivity of GENZ metamaterial is simply determined by a pole-zero structure corresponding to the operating frequency range. The physical structure of GENZ metamaterial is retrieved from the pole-zero structure via a tractable inverse problem. The strategy is of great advantage in practical applications and also theoretically reveals the cancellation mechanism dominating the broadband near-zero permittivity phenomenon in the spectral space.

preprint2012arXiv

Triggering the Continuous Growth of Graphene toward Millimeter Size Grain

In this report, we demonstrate a simple but efficient strategy to synthesize millimeter-sized graphene single crystal grains by regulating the supply of reactants in the chemical vapor deposition process. Polystyrene was used as a carbon source. Pulse heating of the carbon source was utilized to minimize the nucleation density of graphene on copper foil, while a gradual increase in the temperature of the carbon source and the flow rate of hydrogen was adopted to drive the continuous growth of graphene grains. As a result, the nucleation density of graphene grains can be controlled to as low as ~100 nuclei/cm², and the dimension of a single crystal grain can grow up to ~1.2 mm. Raman spectroscopy, transmission electron microscopy and electrical transport measurements show that the graphene grains obtained are of high quality. The strategy presented here provides very good controllability and makes large graphene single crystals possible, which is of vital importance for practical applications.

preprint2011arXiv

Bayesian methods to overcome the winner's curse in genetic studies

Parameter estimates for associated genetic variants, reported in the initial discovery samples, are often grossly inflated compared to the values observed in the follow-up replication samples. This type of bias is a consequence of the sequential procedure in which the estimated effect of an associated genetic marker must first pass a stringent significance threshold. We propose a hierarchical Bayes method in which a spike-and-slab prior is used to account for the possibility that the significant test result may be due to chance. We examine the robustness of the method using different priors corresponding to different degrees of confidence in the testing results and propose a Bayesian model averaging procedure to combine estimates produced by different models. The Bayesian estimators yield smaller variance compared to the conditional likelihood estimator and outperform the latter in studies with low power. We investigate the performance of the method with simulations and applications to four real data examples.

preprint2008arXiv

Constraining Cosmological Parameters with Observational Data Including Weak Lensing Effects

In this paper, we study the cosmological implications of the 100 square degree Weak Lensing survey (the CFHTLS-Wide, RCS, VIRMOS-DESCART and GaBoDS surveys). We combine these weak lensing data with the cosmic microwave background (CMB) measurements from WMAP5, BOOMERanG, CBI, VSA and ACBAR, the SDSS LRG matter power spectrum, and the Type Ia Supernovae (SNIa) data from the "Union" compilation (307 sample), using the Markov Chain Monte Carlo method to determine the cosmological parameters. Our results show that the ΛCDM model remains a good fit to all of these data. For the dynamical dark energy model with a time-evolving EoS parameterized as $w_{\mathrm{DE}}(a) = w_0 + w_a (1-a)$, we find that the best-fit model implies a mild preference for the Quintom model, whose EoS crosses the cosmological constant boundary during evolution. Regarding the total neutrino mass limit, we obtain the upper limit $\sum m_\nu < 0.471$ eV (95% C.L.) within the framework of the flat ΛCDM model. Due to the obvious degeneracies between the neutrino mass and the EoS of the dark energy model, this upper limit will be relaxed by a factor of 2 in the framework of dynamical dark energy models. For the constraints on the inflation parameters, we find that the upper limit on the tensor-to-scalar ratio is $r < 0.35$ (95% C.L.), and inflationary models with slope $n_s \geq 1$ are excluded at more than the 2σ confidence level. In this paper we pay particular attention to the contribution from the weak lensing data and find that the current weak lensing data do improve the constraints on the matter density $\Omega_m$, $\sigma_8$, $\sum m_\nu$, and the EoS of dark energy.
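
To make the parameterization in this abstract concrete, the sketch below evaluates w(a) = w_0 + w_a (1 - a) and solves for the scale factor at which it crosses the cosmological constant boundary w = -1. The function names and the parameter values are illustrative assumptions, not the paper's best-fit results.

```python
def w_de(a, w0, wa):
    """Dark energy equation of state at scale factor a (a = 1 today)."""
    return w0 + wa * (1.0 - a)


def crossing_scale_factor(w0, wa):
    """Scale factor in (0, 1] where w(a) = -1, or None if there is none.

    Solving w0 + wa * (1 - a) = -1 gives a = 1 + (1 + w0) / wa.
    """
    if wa == 0:
        return None
    a = 1.0 + (1.0 + w0) / wa
    return a if 0.0 < a <= 1.0 else None


# Hypothetical values: w is below -1 today and above -1 in the past.
a_cross = crossing_scale_factor(w0=-1.1, wa=0.5)
```

A non-`None` result corresponds to the Quintom-like behavior described in the abstract, where the EoS crosses w = -1 at some point in the past.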

45 published item(s)

DeformMaster: An Interactive Physics-Neural World Model for Deformable Objects from Videos

Real-world Reinforcement Learning from Suboptimal Interventions

Event-Based Fusion for Motion Deblurring with Cross-modal Attention

Additional evidence for a pulsar wind nebula in the heart of SN 1987A from multi-epoch X-ray data and MHD modeling

Annular Computational Imaging: Capture Clear Panoramic Images through Simple Lens

CE-based white-box adversarial attacks will not work using super-fitting

Efficient Human Pose Estimation via 3D Event Point Cloud

FaceFormer: Scale-aware Blind Face Restoration with Transformers

Five-channel frequency-division multiplexing using low-loss epsilon-near-zero metamaterial waveguide

Measuring the severity of multi-collinearity in high dimensions

Multi-Task Learning Framework for Emotion Recognition in-the-wild

Novel boron nitride polymorphs with graphite-diamond hybrid structure

NTIRE 2022 Challenge on Efficient Super-Resolution: Methods and Results

On the Connection between Local Attention and Dynamic Depth-wise Convolution

Real Image Restoration via Structure-preserving Complementarity Attention

Rethinking Classifier and Adversarial Attack

TSRFormer: Table Structure Recognition with Transformers

Unusually high HCO+/CO ratios in and outside supernova remnant W49B

TriVoC: Efficient Voting-based Consensus Maximization for Robust Point Cloud Registration with Extreme Outlier Ratios

Adaptive Operator Selection Based on Dynamic Thompson Sampling for MOEA/D

An XMM-Newton X-ray View of Supernova Remnant W49B: Revisiting its Recombining Plasmas and Progenitor Type

Real-time Fusion Network for RGB-D Semantic Segmentation Incorporating Unexpected Obstacle Detection for Road-driving Images

ReLaText: Exploiting Visual Relationships for Arbitrary-Shaped Scene Text Detection with Graph Convolutional Networks

A Generalized Levene's Scale Test for Variance Heterogeneity in the Presence of Sample Correlation and Group Uncertainty

Conetronics in 2D Metal-Organic Frameworks: Double Dirac Cones, Magnetic Half Dirac Cones and Quantum Anomalous Hall Effect

Constrained Maximum Correntropy Adaptive Filtering

On "Exponential Lower Bounds for Polytopes in Combinatorial Optimization" by Fiorini et al. (2015): A Refutation For Models With Disjoint Sets of Descriptive Variables

Diffraction-free optical beam propagation with near-zero phase variation in extremely anisotropic metamaterials

Nonlocal effective medium analysis in symmetric metal-dielectric multilayer metamaterials

A novel joint location-scale testing framework for improved detection of variants with main or interaction effects

Pooled Association Tests for Rare Genetic Variants: A Review and Some New Results

Realizing broadband electromagnetic transparency with a graded-permittivity sphere

Broadband Epsilon-Near-Zero Metamaterials with Step-Like Metal-Dielectric Multilayer Structures

Deep subwavelength beam propagation in extremely loss-anisotropic metamaterials

Experimental realization of epsilon-near-zero metamaterial slabs with metal-dielectric multilayers

Giant optical nonlocality near the Dirac point in metal-dielectric multilayer metamaterials

Quantum entanglement in plasmonic waveguides with near-zero mode indices

A Generalized Kruskal-Wallis Test Incorporating Group Uncertainty with Application to Genetic Association Studies

Bayesian Latent Variable Modeling of Longitudinal Family Data for Genetic Pleiotropy Studies

Integrated optical devices based on broadband epsilon-near-zero meta-atoms

Loss enhanced transmission and collimation in anisotropic epsilon-near-zero metamaterials

Strategy for designing broadband epsilon-near-zero metamaterial with loss compensation by gain media

Triggering the Continuous Growth of Graphene toward Millimeter Size Grain

Bayesian methods to overcome the winner's curse in genetic studies

Constraining Cosmological Parameters with Observational Data Including Weak Lensing Effects