Researcher profile

Yuxuan Wan

Yuxuan Wan contributes to research discovery and scholarly infrastructure.

Researcher | Affiliation not imported | Open to collaborate

Trust snapshot

Quick read

Trust 21 (Emerging) | Verification L1 | Unclaimed author
7 works
0 followers
5 topics
4 close collaborators

Published work

7 published items

Preprint | 2026 | arXiv

Attribution-Guided Continual Learning for Large Language Models

Large language models (LLMs) often suffer from catastrophic forgetting in continual learning: after learning new tasks sequentially, they perform worse on earlier tasks. Existing methods mitigate catastrophic forgetting by data replay, parameter freezing, or regularization. However, these methods lack semantic awareness of the internal knowledge distribution in LLMs. As a result, they cannot distinguish parameters that should be preserved from those that should be updated. We propose an attribution-guided continual fine-tuning framework for LLMs. Our method estimates task-specific, element-wise parameter importance in each Transformer layer and uses these scores to modulate gradients. Parameters important to previous tasks receive smaller updates, while less relevant ones remain plastic for learning new tasks. Experiments on continual learning benchmarks show that our method consistently outperforms baselines, achieving better retention of old tasks while maintaining competitive performance on new tasks.
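
The gradient modulation described in this abstract can be illustrated with a minimal sketch. The abstract does not give the attribution method or the scaling rule, so everything here is an assumption for illustration: the importance scores are taken as given, and the rule "scale = 1 - normalized importance" and the names `modulate_gradient` and `sgd_step` are hypothetical, not the authors' method.

```python
import numpy as np

def modulate_gradient(grad, importance, floor=0.0):
    """Shrink gradient elements that mattered to earlier tasks.

    `importance` holds non-negative element-wise scores with the same
    shape as `grad`. The scaling rule (1 - normalized importance) is an
    illustrative assumption, not taken from the paper.
    """
    grad = np.asarray(grad, dtype=float)
    importance = np.asarray(importance, dtype=float)
    peak = importance.max()
    # Normalize scores to [0, 1]; all-zero scores leave the gradient untouched.
    normalized = importance / peak if peak > 0 else np.zeros_like(importance)
    # Important parameters get a small scale, unimportant ones stay plastic.
    scale = np.maximum(1.0 - normalized, floor)
    return grad * scale

def sgd_step(weights, grad, importance, lr=0.1):
    """One plain SGD step using the modulated gradient."""
    return np.asarray(weights, dtype=float) - lr * modulate_gradient(grad, importance)

# Hypothetical toy values: three parameters with increasing importance.
weights = np.array([1.0, 1.0, 1.0])
grad = np.array([0.5, 0.5, 0.5])
importance = np.array([0.0, 1.0, 2.0])
updated = sgd_step(weights, grad, importance)
print(updated)
```

In this toy case the parameter with the highest score is left unchanged, while the parameter with zero importance receives the full update.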

Preprint | 2026 | arXiv

ChronoEarth-492K: A Large Scale and Long Horizon Spatiotemporal Hyperspectral Earth Observation Dataset and Benchmark

Hyperspectral imaging (HSI) provides dense spectral information for the Earth's surface, enabling material-level understanding of land cover and ecosystem dynamics. Despite recent progress in hyperspectral self-supervised learning (SSL), existing datasets remain temporally shallow, limiting the development of long-horizon spatiotemporal modeling. To address this gap, we introduce ChronoEarth-492K, the first large-scale, temporally calibrated hyperspectral SSL dataset built upon NASA's EO-1 Hyperion mission, the world's longest continuous hyperspectral archive to date (2001-2017). ChronoEarth-492K comprises 492,354 radiometrically harmonized patches across 185,398 global locations over 17 years, with 28,786 sites containing multi-temporal sequences ($\geq 3$ observations) that enable both short- and long-horizon temporal analysis. Building on this foundation, we establish the ChronoEarth-Benchmark, a unified evaluation suite spanning static, short-horizon, and long-horizon temporal tasks, constructed from six open-source geospatial products covering land cover, crop type, forest dynamics, and soil properties. We further introduce a standardized evaluation protocol and report extensive baseline results across state-of-the-art hyperspectral foundation models. Together, ChronoEarth-492K and the ChronoEarth-Benchmark provide the first large-scale, temporally grounded platform for systematic spatiotemporal hyperspectral representation learning.

Preprint | 2026 | arXiv

LESSViT: Robust Hyperspectral Representation Learning under Spectral Configuration Shift

Modeling hyperspectral imagery (HSI) across different sensors presents a fundamental challenge due to variations in wavelength coverage, band sampling, and channel dimensionality. As a result, models trained under a fixed spectral configuration often fail to generalize to other sensors. Existing Vision Transformer (ViT) approaches either rely on implicit spectral modeling with fixed channel assumptions or adopt explicit spatial-spectral attention with prohibitive computational cost, leading to a fundamental trade-off between efficiency and expressiveness. In this work, we introduce Low-rank Efficient Spatial-Spectral ViT (LESSViT), a sensor-flexible architecture for cross-spectral generalization. LESSViT is built on LESS Attention, a structured low-rank factorization that models joint spatial-spectral interactions through separable spatial and spectral components, reducing the complexity of full spatial-spectral attention from $O(N^2 C^2)$ to $O(rNC)$, where $N$ is the number of spatial tokens, $C$ is the number of spectral channels, and $r$ is the rank of the low-rank approximation. We further incorporate channel-agnostic patch embedding and wavelength-aware positional encoding to support flexible spectral inputs. To enable efficient and robust pretraining, we introduce a hyperspectral masked autoencoder (HyperMAE) with decoupled spatial-spectral masking and hierarchical channel sampling. We evaluate LESSViT under a cross-spectral generalization setting that simulates cross-sensor variability. Experiments on the SpectralEarth benchmark demonstrate that LESSViT improves robustness under spectral shifts while remaining competitive in-distribution, and that explicit and efficient spatial-spectral modeling is essential for scalable and generalizable hyperspectral representation learning.
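
To give a feel for the complexity reduction quoted in this abstract, the sketch below compares the leading terms $N^2 C^2$ and $rNC$ for one configuration. The values of `n_tokens`, `n_channels` and `rank` are hypothetical, chosen only for illustration, and the function names are not from the paper. Constant factors hidden by the big-O notation are ignored, so the ratio indicates scaling rather than measured speedup.

```python
def full_attention_cost(n_tokens, n_channels):
    """Leading term of full spatial-spectral attention: N^2 * C^2."""
    return (n_tokens ** 2) * (n_channels ** 2)

def low_rank_cost(n_tokens, n_channels, rank):
    """Leading term of the low-rank factorization: r * N * C."""
    return rank * n_tokens * n_channels

# Hypothetical configuration, not taken from the paper.
n_tokens, n_channels, rank = 196, 200, 8

full = full_attention_cost(n_tokens, n_channels)
low = low_rank_cost(n_tokens, n_channels, rank)
ratio = full / low  # algebraically equal to N * C / r

print(f"full: {full:,}  low-rank: {low:,}  ratio: {ratio:,.0f}")
```

The ratio simplifies to $NC/r$, so the gap widens as either the token count or the channel count grows.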

Preprint | 2022 | arXiv

Defense Against Gradient Leakage Attacks via Learning to Obscure Data

Federated learning is considered an effective privacy-preserving learning mechanism that separates the client's data from the model training process. However, federated learning is still at risk of privacy leakage because attackers can deliberately conduct gradient leakage attacks to reconstruct the client data. Recently, popular strategies such as gradient perturbation methods and input encryption methods have been proposed to defend against gradient leakage attacks. Nevertheless, these defenses can either greatly sacrifice model performance or be evaded by more advanced attacks. In this paper, we propose a new defense method that protects the privacy of clients' data by learning to obscure data. Our defense method can generate synthetic samples that are totally distinct from the original samples, yet maximally preserve their predictive features and guarantee the model performance. Furthermore, our defense strategy makes it extremely difficult for the gradient leakage attack and its variants to reconstruct the client data. Through extensive experiments, we show that our proposed defense method obtains better privacy protection while preserving high accuracy compared with state-of-the-art methods.

Preprint | 2021 | arXiv

Selective observation of surface and bulk bands in polar WTe2 by laser-based spin- and angle-resolved photoemission spectroscopy

The electronic state of WTe2, a candidate type-II Weyl semimetal, is investigated using laser-based spin- and angle-resolved photoemission spectroscopy (SARPES). We prepare a pair of WTe2 samples, one with a (001) surface and the other with a (00-1) surface, by the "sandwich method", and measure the band structure of each surface separately. Fermi arcs are observed on both surfaces. We identify that the Fermi arcs on the two surfaces both originate from surface states. We further find a surface resonance band, which connects with the Fermi-arc band, forming a Dirac-cone-like band dispersion. Our results indicate that the bulk electron and hole bands are much closer in momentum space than band calculations predict.

Preprint | 2020 | arXiv

Anisotropic spin distribution and perpendicular magnetic anisotropy in the layered ferromagnetic semiconductor (Ba,K)(Zn,Mn)$_{2}$As$_{2}$

The perpendicular magnetic anisotropy of the new ferromagnetic semiconductor (Ba,K)(Zn,Mn)$_{2}$As$_{2}$ is studied by angle-dependent x-ray magnetic circular dichroism (XMCD) measurements. A large magnetic anisotropy with an anisotropy field of 0.85 T is deduced by fitting the Stoner-Wohlfarth model to the magnetic-field-angle dependence of the projected magnetic moment. Transverse XMCD spectra highlight the anisotropic distribution of Mn 3$d$ electrons, where the $d_{xz}$ and $d_{yz}$ orbitals are less populated than the $d_{xy}$ state because of the $D_{2d}$ splitting arising from the elongated MnAs$_{4}$ tetrahedra. It is suggested that the magnetic anisotropy originates from the degeneracy lifting of the $p$-$d_{xz}$, $d_{yz}$ hybridized states at the Fermi level and the resulting energy gain due to spin-orbit coupling when spins are aligned along the $z$ direction.
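
The kind of model behind the fit mentioned in this abstract can be sketched with the textbook uniaxial Stoner-Wohlfarth energy. This is a sketch under stated assumptions, not the authors' fitting procedure: the easy axis is taken along $z$, the moment is projected onto the applied-field direction, the equilibrium angle is the global energy minimum (hysteresis is ignored), and the function name `projected_moment` is hypothetical. The field is expressed as a fraction of the anisotropy field.

```python
import numpy as np

def projected_moment(field_angle_deg, reduced_field, n_grid=200001):
    """Normalized moment projected onto the field direction.

    Uniaxial Stoner-Wohlfarth energy with the easy axis along z, written in
    units of (saturation magnetization x anisotropy field):
        e(theta) = -0.5 * cos(theta)^2 - h * cos(theta - theta_H)
    where theta and theta_H are the magnetization and field angles from z,
    and h is the applied field divided by the anisotropy field.
    """
    theta_h = np.deg2rad(field_angle_deg)
    theta = np.linspace(-np.pi, np.pi, n_grid)
    energy = -0.5 * np.cos(theta) ** 2 - reduced_field * np.cos(theta - theta_h)
    # Global minimum on a dense grid: a simplification that ignores hysteresis.
    best = theta[np.argmin(energy)]
    return float(np.cos(best - theta_h))

# Hypothetical applied field equal to half the anisotropy field.
h = 0.5
for angle in (0, 30, 60, 90):
    print(angle, round(projected_moment(angle, h), 3))
```

In a fit of this kind, the anisotropy field would be the parameter adjusted until the computed angular dependence matches the measured projected moment.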

Preprint | 2019 | arXiv

Magnetization process of the insulating ferromagnetic semiconductor (Al,Fe)Sb

We have studied the magnetization process of the new insulating ferromagnetic semiconductor (Al,Fe)Sb by means of x-ray magnetic circular dichroism. For an optimally doped sample with 10% Fe, the magnetization was found to increase rapidly at low magnetic fields and to saturate at high magnetic fields at room temperature, well above the Curie temperature of 40 K. We attribute this behavior to the existence of nanoscale Fe-rich ferromagnetic domains acting as superparamagnets. By fitting the magnetization curves using the Langevin function representing superparamagnetism plus a linear paramagnetic term, we estimated the average magnetic moment of the nanoscale ferromagnetic domains to be 300-400 $\mu_{B}$, and the fraction of Fe atoms participating in the nanoscale ferromagnetism to be $\sim$50%. Such behavior was also reported for (In,Fe)As:Be and Ge:Fe, and seems to be a universal characteristic of Fe-doped ferromagnetic semiconductors. Further Fe doping up to 14% led to a weakening of the ferromagnetism, probably because the antiferromagnetic superexchange interaction between nearest-neighbor Fe-Fe pairs becomes dominant.
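
The fitting function named in this abstract, a Langevin term plus a linear paramagnetic term, can be written out as a short sketch. The abstract gives only the form of the model, so the parameter names `m_sat` and `chi`, their example values, and the choice of 350 $\mu_{B}$ and 300 K as inputs are assumptions for illustration. The Langevin function itself, $L(x) = \coth x - 1/x$, is standard.

```python
import math

MU_B = 9.2740100783e-24  # Bohr magneton in J/T
K_B = 1.380649e-23       # Boltzmann constant in J/K

def langevin(x):
    """Langevin function L(x) = coth(x) - 1/x, with a series form near zero."""
    if abs(x) < 1e-4:
        return x / 3.0
    return 1.0 / math.tanh(x) - 1.0 / x

def langevin_argument(field_t, moment_mu_b, temperature_k):
    """Ratio of Zeeman energy to thermal energy for one superparamagnetic domain."""
    return moment_mu_b * MU_B * field_t / (K_B * temperature_k)

def magnetization(field_t, moment_mu_b, temperature_k, m_sat, chi):
    """Superparamagnetic Langevin term plus a linear paramagnetic term.

    m_sat and chi are hypothetical fit parameters in arbitrary units.
    """
    x = langevin_argument(field_t, moment_mu_b, temperature_k)
    return m_sat * langevin(x) + chi * field_t

# Hypothetical inputs: a 350 mu_B domain at 300 K, arbitrary m_sat and chi.
for field in (0.0, 1.0, 5.0):
    print(field, magnetization(field, 350.0, 300.0, m_sat=1.0, chi=0.01))
```

Because the domain moment enters only through the ratio of Zeeman to thermal energy, a larger moment makes the curve rise and saturate at lower fields, which is what allows the moment to be estimated from the curve shape.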