Researcher profile

Yi Jin

Yi Jin contributes to research discovery and scholarly infrastructure.

Researcher | Affiliation not imported | Open to collaborate

Trust snapshot

Quick read

Trust 21 - Emerging | Verification L1 | Unclaimed author
19 works
0 followers
11 topics
4 close collaborators

Published work

19 published items

preprint | 2026 | arXiv

JOGS: Joint Optimization of Pose Estimation and 3D Gaussian Splatting

Traditional novel view synthesis methods heavily rely on external camera pose estimation tools such as COLMAP, which often introduce computational bottlenecks and propagate errors. To address these challenges, we propose a unified framework that jointly optimizes 3D Gaussian points and camera poses without requiring pre-calibrated inputs. Our approach iteratively refines 3D Gaussian parameters and updates camera poses through a novel co-optimization strategy, ensuring simultaneous improvements in scene reconstruction fidelity and pose estimation accuracy. The key innovation lies in decoupling the joint optimization into two interleaved phases: first, updating 3D Gaussian parameters via differentiable rendering with fixed poses, and second, refining camera poses using a customized 3D optical flow algorithm that incorporates geometric and photometric constraints. This formulation progressively reduces projection errors, particularly in challenging scenarios with large viewpoint variations and sparse feature distributions, where traditional methods struggle. Extensive evaluations on multiple datasets demonstrate that our approach significantly outperforms existing COLMAP-free techniques in reconstruction quality, and also surpasses the standard COLMAP-based baseline in general.

preprint | 2026 | arXiv

Synthetic FMCW Radar Range Azimuth Maps Augmentation with Generative Diffusion Model

The scarcity and low diversity of well-annotated automotive radar datasets often limit the performance of deep-learning-based environmental perception. To overcome these challenges, we propose a conditional generative framework for synthesizing realistic Frequency-Modulated Continuous-Wave radar Range-Azimuth Maps. Our approach leverages a generative diffusion model to generate radar data for multiple object categories, including pedestrians, cars, and cyclists. Specifically, conditioning is achieved via Confidence Maps, where each channel represents a semantic class and encodes Gaussian-distributed annotations at target locations. To address radar-specific characteristics, we incorporate Geometry Aware Conditioning and Temporal Consistency Regularization into the generative process. Experiments on the ROD2021 dataset demonstrate that signal reconstruction quality improves by 3.6 dB in Peak Signal-to-Noise Ratio over baseline methods, while training with a combination of real and synthetic datasets improves overall mean Average Precision by 4.15% compared with conventional image-processing-based augmentation. These results indicate that our generative framework not only produces physically plausible and diverse radar spectra but also substantially improves model generalization in downstream tasks.
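
As an illustration of the conditioning described in the abstract above, the following minimal sketch builds a confidence map with one channel per semantic class and a Gaussian bump at each annotated target location. The function name, the annotation format, the grid size and the `sigma` value are assumptions made for this example; they are not taken from the paper.

```python
import math

CLASSES = ["pedestrian", "car", "cyclist"]  # categories named in the abstract


def make_confidence_map(targets, height, width, sigma=2.0):
    """Return a [class][row][col] nested list of Gaussian confidences.

    targets: list of (class_name, row, col) annotations (hypothetical format).
    sigma: assumed Gaussian spread, in grid cells.
    """
    cmap = [[[0.0] * width for _ in range(height)] for _ in CLASSES]
    for name, r0, c0 in targets:
        channel = cmap[CLASSES.index(name)]
        for r in range(height):
            for c in range(width):
                d2 = (r - r0) ** 2 + (c - c0) ** 2
                value = math.exp(-d2 / (2.0 * sigma ** 2))
                # Keep the strongest response where targets overlap.
                channel[r][c] = max(channel[r][c], value)
    return cmap


# Two illustrative annotations on a 16 x 16 range-azimuth grid.
cmap = make_confidence_map([("car", 4, 6), ("pedestrian", 10, 3)], 16, 16)
```

Each channel peaks at 1.0 on its target cell and decays with distance, so a class with no annotations stays at zero everywhere.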

preprint | 2022 | arXiv

2D+3D facial expression recognition via embedded tensor manifold regularization

In this paper, a novel approach via embedded tensor manifold regularization for 2D+3D facial expression recognition (FERETMR) is proposed. Firstly, 3D tensors are constructed from 2D face images and 3D face shape models to keep the structural information and correlations. To maintain the local structure (geometric information) of 3D tensor samples in the low-dimensional tensor space during the dimensionality reduction, the $\ell_0$-norm of the core tensors and a tensor manifold regularization scheme embedded on core tensors are adopted via a low-rank truncated Tucker decomposition on the generated tensors. As a result, the obtained factor matrices will be used for facial expression classification prediction. To make the resulting tensor optimization more tractable, an $\ell_1$-norm surrogate is employed to relax the $\ell_0$-norm, and hence the resulting tensor optimization problem has a nonsmooth objective function due to the $\ell_1$-norm and orthogonal constraints from the orthogonal Tucker decomposition. To efficiently tackle this tensor optimization problem, we establish the first-order optimality condition in terms of stationary points, and then design a block coordinate descent (BCD) algorithm together with an analysis of its convergence and computational complexity. Numerical results on the BU-3DFE and Bosphorus databases demonstrate the effectiveness of our proposed approach.

preprint | 2022 | arXiv

GLAN: A Graph-based Linear Assignment Network

Differentiable solvers for the linear assignment problem (LAP), which are usually embedded into learning frameworks as components, have attracted much research attention in recent years. However, previous algorithms, with or without learning strategies, usually suffer from degraded optimality as the problem size increases. In this paper, we propose a learnable linear assignment solver based on deep graph networks. Specifically, we first transform the cost matrix to a bipartite graph and convert the assignment task to the problem of selecting reliable edges from the constructed graph. Subsequently, a deep graph network is developed to aggregate and update the features of nodes and edges. Finally, the network predicts a label for each edge that indicates the assignment relationship. The experimental results on a synthetic dataset reveal that our method outperforms state-of-the-art baselines and achieves consistently high accuracy as the problem size increases. Furthermore, we also embed the proposed solver, in comparison with state-of-the-art baseline solvers, into a popular multi-object tracking (MOT) framework to train the tracker in an end-to-end manner. The experimental results on MOT benchmarks illustrate that the proposed LAP solver improves the tracker by the largest margin.
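
The reformulation in the abstract above (cost matrix to bipartite graph, then one label per edge) can be sketched as follows. This is not the GLAN network: a brute-force search over permutations stands in for the learned solver, purely to show what the per-edge target labels look like. The function names and the example matrix are assumptions made for this illustration.

```python
from itertools import permutations


def cost_matrix_to_edges(cost):
    """List every (row, col, cost) entry as an edge of a bipartite graph."""
    return [(i, j, cost[i][j]) for i in range(len(cost)) for j in range(len(cost[i]))]


def optimal_edge_labels(cost):
    """Label each edge 1 if it belongs to a minimum-cost assignment, else 0.

    Brute force over permutations; only sensible for tiny square matrices.
    """
    n = len(cost)
    best = min(permutations(range(n)),
               key=lambda p: sum(cost[i][p[i]] for i in range(n)))
    return {(i, j): int(best[i] == j) for i in range(n) for j in range(n)}


# Hypothetical 3 x 3 cost matrix: rows and columns are the two node sets.
cost = [[4, 1, 3],
        [2, 0, 5],
        [3, 2, 2]]
edges = cost_matrix_to_edges(cost)
labels = optimal_edge_labels(cost)
```

In this example the selected edges are (0, 1), (1, 0) and (2, 2), with total cost 5. A learned solver would be trained to predict such labels directly from node and edge features.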

preprint | 2022 | arXiv

Near-field radiative heat transfer between hybrid polaritonic structures

Near-field radiative heat transfer between close objects may exceed the far-field blackbody radiation by orders of magnitude when exploiting polaritonic materials. Great efforts have been made to experimentally measure this fundamental stochastic effect, but mostly based on simple materials. In this work, we foster an all-optical method to characterize the heat transfer between less explored plasmon-phonon hybrid polaritonic systems made of graphene-SiC heterostructures. A large heat flux, about 26 times the blackbody radiation limit, is obtained over a 150-nm vacuum gap, attributed to the couplings of three different surface modes (plasmon, phonon polaritons and frustrated mode). The interaction of polaritonic modes in the hybrid system is also explored to build a switchable thermophotonic device with nearly unity heat flux tunability. This work paves the way for understanding mode-mediated near-field heat transfer and provides a platform for building thermophotonic or thermo-optoelectronic blocks for various applications.

preprint | 2022 | arXiv

Prototype Guided Network for Anomaly Segmentation

Semantic segmentation methods cannot directly identify abnormal objects in images. Anomaly segmentation algorithms, designed for this realistic setting, can distinguish between in-distribution objects and Out-Of-Distribution (OOD) objects and output the anomaly probability for each pixel. In this paper, a Prototype Guided Anomaly segmentation Network (PGAN) is proposed to extract semantic prototypes for in-distribution training data from limited annotated images. In the model, prototypes are used to model the hierarchical category semantic information and distinguish OOD pixels. The proposed PGAN model includes a semantic segmentation network and a prototype extraction network. Similarity measures are adopted to optimize the prototypes. The learned semantic prototypes are used as category semantics to compare the similarity with features extracted from test images and then to generate the semantic segmentation prediction. The proposed prototype extraction network can also be integrated into most semantic segmentation networks to recognize OOD pixels. On the StreetHazards dataset, the proposed PGAN model achieves an mIoU of 53.4% for anomaly segmentation. The experimental results demonstrate that PGAN may achieve SOTA performance in anomaly segmentation tasks.

preprint | 2022 | arXiv

Reusing the Task-specific Classifier as a Discriminator: Discriminator-free Adversarial Domain Adaptation

Adversarial learning has achieved remarkable performance for unsupervised domain adaptation (UDA). Existing adversarial UDA methods typically adopt an additional discriminator to play the min-max game with a feature extractor. However, most of these methods fail to effectively leverage the predicted discriminative information, and thus cause mode collapse for the generator. In this work, we address this problem from a different perspective and design a simple yet effective adversarial paradigm in the form of a discriminator-free adversarial learning network (DALN), wherein the category classifier is reused as a discriminator, which achieves explicit domain alignment and category distinguishment through a unified objective, enabling the DALN to leverage the predicted discriminative information for sufficient feature alignment. Basically, we introduce a Nuclear-norm Wasserstein discrepancy (NWD) that has definite guidance meaning for performing discrimination. Such NWD can be coupled with the classifier to serve as a discriminator satisfying the K-Lipschitz constraint without requiring additional weight clipping or a gradient penalty strategy. Without bells and whistles, DALN compares favorably against the existing state-of-the-art (SOTA) methods on a variety of public datasets. Moreover, as a plug-and-play technique, NWD can be directly used as a generic regularizer to benefit existing UDA algorithms. Code is available at https://github.com/xiaoachen98/DALN.

preprint | 2021 | arXiv

Attention Models for Point Clouds in Deep Learning: A Survey

Recently, the advancement of 3D point clouds in deep learning has attracted intensive research in different application domains such as computer vision and robotic tasks. However, creating robust, discriminative feature representations from unordered and irregular point clouds is challenging. In this paper, our ultimate goal is to provide a comprehensive overview of point cloud feature representations that use attention models. More than 75 key contributions from the recent three years are summarized in this survey, including 3D object detection, 3D semantic segmentation, 3D pose estimation, point cloud completion, etc. We provide a detailed characterization of (1) the role of attention mechanisms, (2) the usability of attention models in different tasks, and (3) the development trends of key technologies.

preprint | 2021 | arXiv

Camera-aware Style Separation and Contrastive Learning for Unsupervised Person Re-identification

Unsupervised person re-identification (ReID) is a challenging task without data annotation to guide discriminative learning. Existing methods attempt to solve this problem by clustering extracted embeddings to generate pseudo labels. However, most methods ignore the intra-class gap caused by camera style variance, and some methods are relatively complex and indirect although they try to solve the negative impact of the camera style on feature distribution. To solve this problem, we propose a camera-aware style separation and contrastive learning method (CA-UReID), which directly separates camera styles in the feature space with the designed camera-aware attention module. It can explicitly divide the learnable feature into camera-specific and camera-agnostic parts, reducing the influence of different cameras. Moreover, to further narrow the gap across cameras, we design a camera-aware contrastive center loss to learn more discriminative embedding for each identity. Extensive experiments demonstrate the superiority of our method over the state-of-the-art methods on the unsupervised person ReID task.

preprint | 2021 | arXiv

Investigating the $Z^\prime$ gauge boson at the future lepton colliders

The $Z^\prime$ boson, a new gauge boson, has been proposed in many new physics models. The interactions of the $Z^\prime$ coupling to fermions have been studied in detail at the Large Hadron Collider. A $Z^\prime$ with a mass of a few TeV has been excluded in some special models. The future lepton colliders will focus on studies of Higgs physics, which provides an advantage for investigating the interactions of the Higgs boson with the new gauge bosons. We investigate the $Z^\prime ZH$ interaction via the process $e^+e^- \to Z^\prime/Z \to ZH \to l^+l^- b \bar{b}$. The angular distribution of the final leptons decaying from the $Z$-boson is related to the $Z^\prime$-$Z$ mixing and the mass of the $Z^\prime$. The forward-backward asymmetry has been proposed as an observable to investigate the $Z^\prime$-$Z$ mixing. The angular distributions change significantly with some special beam polarizations compared with the unpolarized condition.
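
For readers unfamiliar with the observable named in the abstract above, a forward-backward asymmetry is conventionally defined as $A_{FB} = (N_F - N_B)/(N_F + N_B)$, where $N_F$ and $N_B$ count events with $\cos\theta > 0$ and $\cos\theta < 0$. The sketch below applies that generic definition to a list of $\cos\theta$ values. The paper's exact angle convention is not given in the abstract, so the function and the sample values are assumptions for illustration only.

```python
def forward_backward_asymmetry(cos_thetas):
    """Generic A_FB = (N_F - N_B) / (N_F + N_B) from per-event cos(theta).

    Events with cos(theta) exactly 0 are ignored (an assumed convention).
    """
    forward = sum(1 for c in cos_thetas if c > 0)
    backward = sum(1 for c in cos_thetas if c < 0)
    total = forward + backward
    if total == 0:
        raise ValueError("no forward or backward events")
    return (forward - backward) / total


# Made-up cos(theta) values, not simulation output.
sample = [0.9, 0.4, 0.1, -0.3, -0.8, 0.6]
afb = forward_backward_asymmetry(sample)
```

With four forward and two backward events, the sample gives an asymmetry of one third. A symmetric angular distribution would give a value near zero.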

preprint | 2021 | arXiv

Same-Sign Tetralepton Signature in Type-II Seesaw at Lepton Colliders

The same-sign tetralepton signature via mixing of neutral Higgs bosons and their cascade decays to charged Higgs bosons is a unique signal in the type-II seesaw model. In this paper, we study this signature at future lepton colliders, such as ILC, CLIC, and MuC. Constrained by direct search, $H^{\pm\pm}\to W^\pm W^\pm$ is the only viable decay mode for $M_{A^0}=400$ GeV at $\sqrt{s}=1$ TeV ILC. With an integrated luminosity of $\mathcal{L}=8~ \mathrm{ab}^{-1}$, the promising region with about 150 signal events corresponds to a narrow band in the range of $10^{-4}~\text{GeV}\lesssim v_Δ\lesssim10^{-2}$ GeV. For heavier triplet scalars $M_{A^0}\gtrsim 900$ GeV, although the $H^{\pm\pm}\to \ell^\pm \ell^\pm$ decay mode is allowed, the cascade decays are suppressed. A maximum event number $\sim 16$ can be obtained around $v_Δ\sim4\times10^{-4}$ GeV and $λ_4\sim0.26$ for $M_{A^0}=1000$ GeV with $\mathcal{L}=5~ \mathrm{ab}^{-1}$ at $\sqrt{s}=3$ TeV CLIC. Meanwhile, we find that this signature is not promising for $M_{A^0}=1500$ GeV at $\sqrt{s}=6$ TeV MuC.

preprint | 2020 | arXiv

Cross-ethnicity Face Anti-spoofing Recognition Challenge: A Review

Face anti-spoofing is critical to prevent face recognition systems from a security breach. The biometrics community has achieved impressive progress recently due to the excellent performance of deep neural networks and the availability of large datasets. Although ethnic bias has been verified to severely affect the performance of face recognition systems, it still remains an open research problem in face anti-spoofing. Recently, a multi-ethnic face anti-spoofing dataset, CASIA-SURF CeFA, has been released with the goal of measuring the ethnic bias. It is the largest up-to-date cross-ethnicity face anti-spoofing dataset, covering $3$ ethnicities, $3$ modalities, $1,607$ subjects, and 2D plus 3D attack types, and the first dataset including explicit ethnic labels among the recently released datasets for face anti-spoofing. We organized the Chalearn Face Anti-spoofing Attack Detection Challenge, which consists of single-modal (e.g., RGB) and multi-modal (e.g., RGB, Depth, Infrared (IR)) tracks around this novel resource, to boost research aiming to alleviate the ethnic bias. Both tracks attracted $340$ teams in the development stage, and finally 11 and 8 teams submitted their codes in the single-modal and multi-modal face anti-spoofing recognition challenges, respectively. All the results were verified and re-run by the organizing team, and the results were used for the final ranking. This paper presents an overview of the challenge, including its design, evaluation protocol and a summary of results. We analyze the top-ranked solutions and draw conclusions derived from the competition. In addition, we outline future work directions.

preprint | 2020 | arXiv

EDCNN: Edge enhancement-based Densely Connected Network with Compound Loss for Low-Dose CT Denoising

In the past few decades, to reduce the risk of X-ray exposure in computed tomography (CT), low-dose CT image denoising has attracted extensive attention from researchers and has become an important research issue in the field of medical imaging. In recent years, with the rapid development of deep learning technology, many algorithms have emerged that apply convolutional neural networks to this task, achieving promising results. However, there are still some problems, such as low denoising efficiency and over-smoothed results. In this paper, we propose the Edge enhancement based Densely connected Convolutional Neural Network (EDCNN). In our network, we design an edge enhancement module using the proposed novel trainable Sobel convolution. Based on this module, we construct a model with dense connections to fuse the extracted edge information and realize end-to-end image denoising. Besides, when training the model, we introduce a compound loss that combines MSE loss and multi-scale perceptual loss to solve the over-smoothing problem and attain a marked improvement in image quality after denoising. Compared with the existing low-dose CT image denoising algorithms, our proposed model performs better in preserving details and suppressing noise.

preprint | 2020 | arXiv

Enhancing single photon emission through quasi-bound states in the continuum of monolithic hexagonal boron nitride metasurface

A patterned structure of monolithic hexagonal boron nitride (hBN) on a glass substrate, which can enhance the emission of the embedded single photon emitters (SPEs), is useful for high-quality on-chip single-photon sources. Here, we design and demonstrate a monolithic hBN metasurface with a quasi-bound state in the continuum mode at the emission wavelength with ultrahigh Q values to enhance the fluorescence emission of SPEs in hBN. Because of the ultrahigh electric field enhancement inside the proposed hBN metasurface, an ultrahigh Purcell factor ($3.3\times10^4$) is achieved. In addition, the Purcell factor can also be strongly enhanced in most parts of the hBN structure, which makes the hBN metasurface suitable for, e.g., monolithic quantum photonics.

preprint | 2020 | arXiv

High-Temperature Ultra-Broad UV-MIR High-Efficiency Absorber Based on Double Ring-Shaped Titanium Nitride Resonators

An ultrabroad absorber based on double-ring-shaped titanium nitride (TiN) nanoresonators, which can work at high temperatures, is proposed and numerically studied. The absorber with some optimal parameters exhibits an averaged absorption of 94.6% in the range of 200 - 4000 nm (from ultraviolet to mid-infrared) and a band from 200 - 3518 nm having an absorption > 90%. We have demonstrated in detail the physical mechanisms of the ultra-broad absorption, including the dielectric lossy property of the TiN material itself at shorter wavelengths and plasmonic resonances caused by the metallic property of the TiN nanoresonators at longer wavelengths. In addition, the absorber shows polarization independence and wide-angle acceptance. Another absorber with double TiN nano-rings of different heights has flatter and higher absorption efficiency (more than 95% absorption) in the 200 - 2860 nm waveband. These properties give the proposed TiN-based absorbers great potential in many applications, such as light trapping, photovoltaics, and thermal emitters.

preprint | 2020 | arXiv

Intelligent Reflecting Surface Assisted Secure Wireless Communications with Multiple-Transmit and Multiple-Receive Antennas

In this paper, we propose intelligent reflecting surface (IRS) assisted secure wireless communications with multi-input and multi-output antennas (IRS-MIMOME). In the considered scenario, an access point (AP) equipped with multiple antennas communicates with a multi-antenna legitimate user in the downlink in the presence of an eavesdropper configured with multiple antennas. In particular, the joint optimization of the transmit covariance matrix at the AP and the reflecting coefficients at the IRS to maximize the secrecy rate for the IRS-MIMOME system is investigated under two different assumptions on the phase shifting capabilities at the IRS, i.e., continuous reflecting coefficients and discrete reflecting coefficients. For the former case, due to the non-convexity of the formulated problem, an alternating optimization (AO)-based algorithm is proposed: given the reflecting coefficients at the IRS, a successive convex approximation (SCA)-based algorithm is used to solve the transmit covariance matrix optimization, while given the transmit covariance matrix at the AP, alternating optimization is used again to optimize each reflecting coefficient at the IRS individually with the other reflecting coefficients fixed. For the individual reflecting coefficient optimization, a closed-form expression or an interval for the optimal solution is provided. Then, the proposed algorithm is extended to the discrete reflecting coefficient model at the IRS. Finally, numerical simulations demonstrate that the proposed algorithm outperforms other benchmark schemes.

preprint | 2020 | arXiv

Leptogenesis and Dark Matter from Low Scale Seesaw

In this paper, we perform a detailed analysis of leptogenesis and dark matter from low scale seesaw. In the framework of $ν$2HDM, we further introduce one scalar singlet $ϕ$ and one Dirac fermion singlet $χ$, which are charged under a $Z_2$ symmetry. Assuming the coupling of $χ$ is extremely small, it serves as a FIMP dark matter candidate. The heavy right-handed neutrinos $N$ provide a common origin for tiny neutrino mass (via the seesaw mechanism), leptogenesis (via $N\to \ell_L Φ_ν^*,\bar{\ell}_L Φ_ν$) and dark matter (via $N\to χϕ$). With hierarchical right-handed neutrino masses, the explicit calculation shows that successful thermal leptogenesis is viable even for TeV scale $N_1$ with $0.4 \lesssim v_ν\lesssim1$ GeV and lightest neutrino mass $m_1\lesssim 10^{-11}$ eV. In such a scenario, light FIMP dark matter in the keV to MeV range is naturally expected. The common parameter space for neutrino mass, natural leptogenesis and FIMP DM is also obtained in this paper.

preprint | 2020 | arXiv

Time-Frequency Analysis based Blind Modulation Classification for Multiple-Antenna Systems

Blind modulation classification is an important step in implementing cognitive radio networks. The multiple-input multiple-output (MIMO) technique is widely used in military and civil communication systems. Due to the lack of prior information about channel parameters and the overlapping of signals in MIMO systems, the traditional likelihood-based and feature-based approaches cannot be applied in these scenarios directly. Hence, in this paper, to resolve the problem of blind modulation classification in MIMO systems, a time-frequency analysis method based on the windowed short-time Fourier transform is used to analyse the time-frequency characteristics of time-domain modulated signals. Then the extracted time-frequency characteristics are converted into RGB spectrogram images, and a convolutional neural network based on transfer learning is applied to classify the modulation types according to the RGB spectrogram images. Finally, a decision fusion module is used to fuse the classification results of all the receive antennas. Through simulations, we analyse the classification performance at different signal-to-noise ratios (SNRs). The results indicate that, for the single-input single-output (SISO) network, our proposed scheme can achieve 92.37% and 99.12% average classification accuracy at SNRs of -4 dB and 10 dB, respectively. For the MIMO network, our scheme achieves 80.42% and 87.92% average classification accuracy at -4 dB and 10 dB, respectively. This outperforms the existing classification methods based on baseband signals.
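
The first stage described in the abstract above, a windowed short-time Fourier transform, can be sketched in a few lines. This is a generic textbook STFT with a Hann window, not the authors' pipeline: the window size, hop length and test tone are assumptions chosen for the example, and the later steps (RGB spectrogram conversion, transfer learning, decision fusion) are not shown.

```python
import cmath
import math


def stft_magnitude(signal, window_size=16, hop=8):
    """Return a list of frames, each a list of magnitudes for bins 0..window_size/2."""
    # Symmetric Hann window (an assumed choice; the abstract only says "windowed").
    window = [0.5 - 0.5 * math.cos(2 * math.pi * n / (window_size - 1))
              for n in range(window_size)]
    frames = []
    for start in range(0, len(signal) - window_size + 1, hop):
        segment = [signal[start + n] * window[n] for n in range(window_size)]
        spectrum = []
        for k in range(window_size // 2 + 1):
            # Direct DFT of the windowed segment at frequency bin k.
            acc = sum(segment[n] * cmath.exp(-2j * math.pi * k * n / window_size)
                      for n in range(window_size))
            spectrum.append(abs(acc))
        frames.append(spectrum)
    return frames


# Test tone with 4 cycles per 16 samples, so its energy should sit in bin 4.
tone = [math.cos(2 * math.pi * 4 * n / 16) for n in range(64)]
spec = stft_magnitude(tone)
peak_bin = max(range(len(spec[0])), key=lambda k: spec[0][k])
```

Stacking the frames over time gives the time-frequency image that a classifier would then consume. In practice a library FFT would replace the direct DFT loop used here for self-containment.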

preprint | 2019 | arXiv

Exclusive Production Ratio of Neutral over Charged Kaon Pair in $e^+e^-$ Annihilation Continuum via `Straton Model'

A completely relativistic quark model in the Bethe-Salpeter framework is employed to calculate the exclusive production ratio of the neutral over charged Kaon pair in the $e^+e^-$ annihilation continuum region for center of mass energies smaller than the $J/Ψ$ mass. The valence quark charge plays the key rôle. The cancellation of the diagrams for the same charge case (in $K_S + K_L$) and the non-cancellation of the diagrams for the different charge case (in $K^-+K^+$) lead to a ratio of $(m_s-m_d)^2/M_{Kaon}^2 \sim 1/10$.