Researcher profile

Lin Wu

Lin Wu contributes to research discovery and scholarly infrastructure.

Researcher | Affiliation not imported | Open to collaborate

Trust snapshot

Quick read

Trust 21 - Emerging
Verification L1
Unclaimed author
12 works
0 followers
8 topics
4 close collaborators

Published work

12 published items

preprint | 2026 | arXiv

Plug-and-play Class-aware Knowledge Injection for Prompt Learning with Visual-Language Model

Prompt learning has become an effective and widely used technique in enhancing vision-language models (VLMs) such as CLIP for various downstream tasks, particularly in zero-shot classification within specific domains. Existing methods typically focus on either learning class-shared prompts for a given domain or generating instance-specific prompts through conditional prompt learning. While these methods have achieved promising performance, they often overlook class-specific knowledge in prompt design, leading to suboptimal outcomes. The underlying reasons are: 1) class-specific prompts offer more fine-grained supervision compared to coarse class-shared prompts, which helps prevent misclassification of data from different classes into a single class; 2) compared to class-specific prompts, instance-specific prompts neglect the richer class-level information across multiple instances, potentially causing data from the same class to be divided into multiple classes. To effectively supplement the class-specific knowledge into existing methods, we propose a plug-and-play Class-Aware Knowledge Injection (CAKI) framework. CAKI comprises two key components, i.e., class-specific prompt generation and query-key prompt matching. The former encodes class-specific knowledge into prompts from few-shot samples that belong to the same class and stores the learned prompts in a class-level knowledge bank. The latter provides a plug-and-play mechanism for each test instance to retrieve relevant class-level knowledge from the knowledge bank and inject such knowledge to refine model predictions. Extensive experiments demonstrate that our CAKI effectively improves the performance of existing methods on base and novel classes. Code is publicly available at https://github.com/yjh576/CAKI.
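The query-key retrieval step described in the abstract above can be pictured with a minimal sketch. It assumes the knowledge bank is a dictionary from class name to a (key vector, prompt) pair and that matching uses cosine similarity; the bank layout, the scoring rule, and every name below are hypothetical and not taken from the CAKI code.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def retrieve_class_prompts(query, bank, top_k=1):
    """Return the top_k (class_name, score) pairs whose keys best match the query.

    Assumption: `bank` maps class name -> (key vector, prompt). This layout
    and the cosine scoring are illustrative only.
    """
    scored = [(name, cosine(query, key)) for name, (key, _prompt) in bank.items()]
    scored.sort(key=lambda item: item[1], reverse=True)
    return scored[:top_k]

# Toy bank with two classes and 2-D keys (hypothetical values).
bank = {
    "cat": ([1.0, 0.0], "prompt_cat"),
    "dog": ([0.0, 1.0], "prompt_dog"),
}
best = retrieve_class_prompts([0.9, 0.1], bank, top_k=1)
```

In this toy setup the query lies closest to the "cat" key, so that class's prompt would be the one injected at test time.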

preprint | 2022 | arXiv

Designing light-element materials with large effective spin-orbit coupling

Spin-orbit coupling (SOC), the core of numerous condensed-matter phenomena such as nontrivial band gaps and magnetocrystalline anisotropy, is generally considered to be appreciable only in heavy elements, which is detrimental to the synthesis and application of functional materials. Therefore, amplifying the SOC effect in light elements is of great importance. Here, focusing on 3d and 4d systems, we demonstrate that the interplay between crystal symmetry and electron correlation can dramatically enhance the SOC effect in certain partially occupied orbital multiplets, through the self-consistently reinforced orbital polarization as a pivot. We then provide design principles and comprehensive databases, in which we list all the Wyckoff positions and site symmetries, in all two-dimensional (2D) and three-dimensional crystals that potentially have such an enhanced SOC effect. As an important demonstration, we predict nine material candidates from our selected 2D material pool as high-temperature quantum anomalous Hall insulators with large nontrivial band gaps of hundreds of meV. Our work provides an efficient and straightforward way to predict promising SOC-active materials, releasing the burden of requiring heavy elements for next-generation spin-orbitronic materials and devices.

preprint | 2022 | arXiv

Exploring variational quantum eigensolver ansatzes for the long-range XY model

Finding the ground state energy and wavefunction of a quantum many-body system is a key problem in quantum physics and chemistry. We study this problem for the long-range XY model by using the variational quantum eigensolver (VQE) algorithm. We consider VQE ansatzes with full and linear entanglement structures consisting of different building gates: the CNOT gate, the controlled-rotation (CRX) gate, and the two-qubit rotation (TQR) gate. We find that the full-entanglement CRX and TQR ansatzes can sufficiently describe the ground state energy of the long-range XY model. In contrast, only the full-entanglement TQR ansatz can represent the ground state wavefunction with a fidelity close to one. In addition, we find that instead of using full-entanglement ansatzes, restricted-entanglement ansatzes where entangling gates are applied only between qubits that are a fixed distance from each other already suffice to give acceptable solutions. Using the entanglement entropy to characterize the expressive powers of the VQE ansatzes, we show that the full-entanglement TQR ansatz has the highest expressive power among them.
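The three entanglement layouts mentioned in the abstract above differ only in which qubit pairs receive an entangling gate. The sketch below lists those pairs under stated assumptions: "linear" is read as an open nearest-neighbor chain, "full" as all unordered pairs, and "restricted" as pairs exactly `distance` apart without wrap-around. The function names are hypothetical and the gates themselves are not modeled.

```python
from itertools import combinations

def full_pairs(n):
    """All unordered qubit pairs (i, j) with i < j."""
    return list(combinations(range(n), 2))

def linear_pairs(n):
    """Nearest-neighbor pairs along an open chain (assumed meaning of 'linear')."""
    return [(i, i + 1) for i in range(n - 1)]

def fixed_distance_pairs(n, distance):
    """Pairs of qubits exactly `distance` apart, with no wrap-around (assumption)."""
    return [(i, i + distance) for i in range(n - distance)]

pairs_full = full_pairs(4)
pairs_linear = linear_pairs(4)
pairs_restricted = fixed_distance_pairs(4, 2)
```

The pair count grows quadratically with the number of qubits for the full layout and only linearly for the other two, which is why a restricted layout that still gives acceptable solutions is attractive.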

preprint | 2022 | arXiv

Fiber spectrum analyzer based on planar waveguide array aligned to a camera without lens

We propose and experimentally demonstrate a fiber spectrum analyzer based on a planar waveguide chip butt-coupled with an input fiber and aligned to a standard camera without any free-space optical elements. The chip consists of a single-mode waveguide to connect with the fiber, a beam broadening area, and a waveguide array in which the lengths of the waveguides are designed for both wavelength separation and beam focusing. The facet of the chip is diced open so that the outputs of the array form a near-field emitter. The far field is calculated by the Rayleigh-Sommerfeld diffraction integral. We show that the chip can provide a focal depth on the millimeter scale, allowing relaxed alignment to the camera without any fine-positioning stage. Two devices with 120 and 220 waveguides are fabricated on the polymer waveguide platform. The measured spectral widths are 0.63 nm and 0.42 nm, respectively. This simple and practical approach may lead to the development of a spectrum analyzer for fiber that is easily mountable to any commercial camera, thereby avoiding the complication of customized detectors and the electronic circuits that follow them.

preprint | 2022 | arXiv

Learning Resolution-Adaptive Representations for Cross-Resolution Person Re-Identification

The cross-resolution person re-identification (CRReID) problem aims to match low-resolution (LR) query identity images against high-resolution (HR) gallery images. It is a challenging and practical problem since the query images often suffer from resolution degradation due to the different capturing conditions of real-world cameras. To address this problem, state-of-the-art (SOTA) solutions either learn a resolution-invariant representation or adopt a super-resolution (SR) module to recover the missing information from the LR query. This paper explores an alternative SR-free paradigm to directly compare HR and LR images via a dynamic metric, which is adaptive to the resolution of a query image. We realize this idea by learning resolution-adaptive representations for cross-resolution comparison. Specifically, we propose two resolution-adaptive mechanisms. The first one disentangles the resolution-specific information into different sub-vectors in the penultimate layer of the deep neural networks, and thus creates a varying-length representation. To better extract resolution-dependent information, we further propose to learn resolution-adaptive masks for intermediate residual feature blocks. A novel progressive learning strategy is proposed to train those masks properly. These two mechanisms are combined to boost the performance of CRReID. Experimental results show that the proposed method is superior to existing approaches and achieves SOTA performance on multiple CRReID benchmarks.

preprint | 2022 | arXiv

Multi-modal Visual Place Recognition in Dynamics-Invariant Perception Space

Visual place recognition is one of the essential and challenging problems in the field of robotics. In this letter, we explore for the first time the use of multi-modal fusion of semantic and visual modalities in dynamics-invariant space to improve place recognition in dynamic environments. We achieve this by first designing a novel deep learning architecture to generate the static semantic segmentation and recover the static image directly from the corresponding dynamic image. We then leverage the spatial-pyramid-matching model to encode the static semantic segmentation into feature vectors. In parallel, the static image is encoded using the popular bag-of-words model. On the basis of the above multi-modal features, we finally measure the similarity between the query image and target landmark by the joint similarity of their semantic and visual codes. Extensive experiments demonstrate the effectiveness and robustness of the proposed approach for place recognition in dynamic environments.

preprint | 2022 | arXiv

Pseudo-Pair based Self-Similarity Learning for Unsupervised Person Re-identification

Person re-identification (re-ID) is of great importance to video surveillance systems by estimating the similarity between a pair of cross-camera person shots. Current methods for estimating such similarity require a large number of labeled samples for supervised training. In this paper, we present a pseudo-pair based self-similarity learning approach for unsupervised person re-ID without human annotations. Unlike conventional unsupervised re-ID methods that use pseudo labels based on global clustering, we construct patch surrogate classes as initial supervision, and propose to assign pseudo labels to images through the pairwise gradient-guided similarity separation. This can cluster images in pseudo pairs, and the pseudo labels can be updated during training. Based on pseudo pairs, we propose to improve the generalization of the similarity function via a novel self-similarity learning: it learns local discriminative features from individual images via intra-similarity, and discovers the patch correspondence across images via inter-similarity. The intra-similarity learning is based on channel attention to detect diverse local features from an image. The inter-similarity learning employs a deformable convolution with a non-local block to align patches for cross-image similarity. Experimental results on several re-ID benchmark datasets demonstrate the superiority of the proposed method over the state of the art.

preprint | 2020 | arXiv

Exhaustive List of Topological Hourglass Band Crossings in 230 Space Groups

Topological semimetals with band crossings (BCs) near the Fermi level have attracted intense research activity in the past several years. Among various BCs, those enforced by an hourglass-like connectivity pattern, which are located exactly at the vertex in the neck of an hourglass and thus called hourglass BCs (HBCs), show interesting topological properties and are intimately related to the space group symmetry. Through checking compatibility relations in the Brillouin zone (BZ), we list all possible HBCs for all 230 space groups by identifying positions of HBCs as well as the compatibility relations related to the HBCs. The HBCs can coexist with conventional topological BCs such as Dirac and Weyl fermions, and based on our exhaustive list, the dimensionality and degeneracy of the HBCs can be quickly identified. It is also found that the HBCs can be classified into two categories: one contains essential HBCs which are guaranteed to exist, while the HBCs in the other category may be tuned to disappear. Our results can help in efficiently predicting hourglass semimetals combined with first-principles calculations as well as studying transitions among various topological crystalline phases.

preprint | 2020 | arXiv

Reconfigurable photon sources based on quantum plexcitonic systems

A single photon in a strongly nonlinear cavity is able to block the transmission of the second photon, thereby converting incident coherent light into anti-bunched light, which is known as the photon blockade effect. On the other hand, photon anti-pairing, where only the entry of two photons is blocked and the emission of bunches of three or more photons is allowed, is based on an unconventional photon blockade mechanism due to destructive interference of two distinct excitation pathways. We propose quantum plexcitonic systems with moderate nonlinearity to generate both anti-bunched and anti-paired photons. The proposed plexcitonic systems benefit from subwavelength field localizations that make quantum emitters spatially distinguishable, thus enabling a reconfigurable photon source between anti-bunched and anti-paired states via tailoring the energy bands. For a realistic nanoprism plexcitonic system, two schemes of reconfiguration are suggested: (i) the chemical means by partially changing the type of the emitters; or (ii) the optical approach by rotating the polarization angle of the incident light to tune the coupling rate of the emitters. These results pave the way to realize reconfigurable nonclassical photon sources in a simple quantum plexcitonic platform with readily accessible experimental conditions.

preprint | 2020 | arXiv

Unsupervised Domain Adaptive Object Detection using Forward-Backward Cyclic Adaptation

We present a novel approach to perform unsupervised domain adaptation for object detection through forward-backward cyclic (FBC) training. Recent adversarial-training-based domain adaptation methods have shown their effectiveness in minimizing domain discrepancy via marginal feature distribution alignment. However, aligning the marginal feature distributions does not guarantee the alignment of class conditional distributions. This limitation is more evident when adapting object detectors, as the domain discrepancy is larger compared to the image classification task, e.g., a varying number of objects exist in one image and the majority of content in an image is the background. This motivates us to learn domain invariance for category-level semantics via gradient alignment. Intuitively, if the gradients of two domains point in similar directions, then the learning of one domain can improve that of another domain. To achieve gradient alignment, we propose Forward-Backward Cyclic Adaptation, which iteratively computes adaptation from source to target via backward hopping and from target to source via forward passing. In addition, we align low-level features for adapting holistic color/texture via adversarial training. However, a detector that performs well on both domains is not ideal for the target domain. As such, in each cycle, domain diversity is enforced by maximum entropy regularization on the source domain to penalize confident source-specific learning and minimum entropy regularization on the target domain to trigger target-specific learning. Theoretical analysis of the training process is provided, and extensive experiments on challenging cross-domain object detection datasets have shown the superiority of our approach over the state of the art.
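The entropy terms described in the abstract above can be illustrated with a minimal sketch. It assumes the regularizer is a plain difference of Shannon entropies over predicted class probabilities, to be minimized; the function names, the `weight` parameter, and this exact form are hypothetical and not taken from the paper.

```python
import math

def entropy(probs):
    """Shannon entropy (in nats) of a probability vector; zero entries are skipped."""
    return -sum(p * math.log(p) for p in probs if p > 0)

def diversity_regularizer(source_probs, target_probs, weight=1.0):
    """Illustrative loss term to minimize (assumed form).

    It is small when source predictions are uncertain (high entropy) and
    target predictions are confident (low entropy).
    """
    return weight * (entropy(target_probs) - entropy(source_probs))

# Uncertain source prediction, confident target prediction.
reg = diversity_regularizer([0.5, 0.5], [1.0, 0.0])
```

Under this assumed form, minimizing the term pushes source predictions toward higher entropy and target predictions toward lower entropy, matching the direction of the two regularizers named in the abstract.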

preprint | 2019 | arXiv

CORAL8: Concurrent Object Regression for Area Localization in Medical Image Panels

This work tackles the problem of generating a medical report for multi-image panels. We apply our solution to the Renal Direct Immunofluorescence (RDIF) assay, which requires a pathologist to generate a report based on observations across eight different whole-slide images (WSIs) in concert with existing clinical features. To this end, we propose a novel attention-based multi-modal generative recurrent neural network (RNN) architecture capable of dynamically sampling image data concurrently across the RDIF panel. The proposed methodology incorporates text from the clinical notes of the requesting physician to regulate the output of the network to align with the overall clinical context. In addition, we found it important to regularize the attention weights for the word generation process, because the system can otherwise ignore the attention mechanism by assigning equal weights to all members. Thus, we propose two regularizations which force the system to utilize the attention mechanism. Experiments on our novel collection of RDIF WSIs provided by a large clinical laboratory demonstrate that our framework offers significant improvements over existing methods.

preprint | 2019 | arXiv

Medi-Care AI: Predicting Medications From Billing Codes via Robust Recurrent Neural Networks

In this paper, we present an effective deep prediction framework based on robust recurrent neural networks (RNNs) to predict the likely therapeutic classes of medications a patient is taking, given a sequence of diagnostic billing codes in their record. Accurately capturing the list of medications currently taken by a given patient is extremely challenging due to undefined errors and omissions. We present a general robust framework that explicitly models the possible contamination through an over-time decay mechanism on the input billing codes and noise injection into the recurrent hidden states, respectively. By doing this, billing codes are reformulated into their temporal patterns with decay rates on each medical variable, and the hidden states of the RNNs are regularised by random noise, which serves as dropout to improve the RNNs' robustness towards data variability in terms of missing values and multiple errors. The proposed method is extensively evaluated on real health care data to demonstrate its effectiveness in suggesting medication orders from contaminated values.
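The two robustness mechanisms named in the abstract above can be sketched as follows. This is an illustration under stated assumptions: the decay is taken to be exponential in the time since each variable was last observed, and the noise is taken to be additive Gaussian. The function names, parameters, and both functional forms are hypothetical and not taken from the paper.

```python
import math
import random

def decay_inputs(last_values, elapsed, rates):
    """Shrink each variable's last observed value by exp(-rate * elapsed time).

    Assumption: one decay rate per medical variable, exponential form.
    """
    return [v * math.exp(-r * dt) for v, dt, r in zip(last_values, elapsed, rates)]

def inject_noise(hidden, sigma, rng):
    """Add zero-mean Gaussian noise to each hidden-state entry (assumed noise model)."""
    return [h + rng.gauss(0.0, sigma) for h in hidden]

rng = random.Random(0)  # seeded so the sketch is reproducible
decayed = decay_inputs([1.0, 1.0], elapsed=[0.0, 2.0], rates=[0.5, 0.5])
noisy_hidden = inject_noise([0.2, -0.1], sigma=0.05, rng=rng)
```

A code observed just now keeps its full weight, while one last seen two time units ago is down-weighted, which is one way to express that older billing codes are less reliable evidence of current medication.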