Source author record

Xilu Wang

Xilu Wang appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher (unclaimed source record)

astro-ph.HE, astro-ph.SR, Machine Learning, Artificial Intelligence, astro-ph.GA, astro-ph.IM, Computer Vision, Neural and Evolutionary Computing

Catalog footprint

What is connected

6 works

8 topics

4 close collaborators


Preprint, 2026, arXiv

ELAS: Efficient Pre-Training of Low-Rank Large Language Models via 2:4 Activation Sparsity

Large Language Models (LLMs) have achieved remarkable capabilities, but their immense computational demands during training remain a critical bottleneck for widespread adoption. Low-rank training has received attention in recent years due to its ability to significantly reduce training memory usage. Meanwhile, applying 2:4 structured sparsity to weights and activations to leverage NVIDIA GPU support for 2:4 structured sparse format has become a promising direction. However, existing low-rank methods often leave activation matrices in full-rank, which dominates memory consumption and limits throughput during large-batch training. Furthermore, directly applying sparsity to weights often leads to non-negligible performance degradation. To achieve efficient pre-training of LLMs, this paper proposes ELAS: Efficient pre-training of Low-rank LLMs via 2:4 Activation Sparsity, a novel framework for low-rank models via 2:4 activation sparsity. ELAS applies squared ReLU activation functions to the feed-forward networks in low-rank models and implements 2:4 structured sparsity on the activations after the squared ReLU operation. We evaluated ELAS through pre-training experiments on LLaMA models ranging from 60M to 1B parameters. The results demonstrate that ELAS maintains performance with minimal degradation after applying 2:4 activation sparsity, while achieving training and inference acceleration. Moreover, ELAS reduces activation memory overhead, particularly with large batch sizes. Code is available at ELAS Repo.
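The activation path described in this abstract (squared ReLU followed by 2:4 structured sparsity) can be illustrated with a small sketch. This is not the ELAS implementation: the function names are hypothetical, and selecting the two largest-magnitude values in each group of four is an assumed pruning rule that the abstract does not specify. The sketch uses plain Python lists rather than GPU tensors and does not produce the compressed sparse format that hardware would consume.

```python
# Illustrative sketch only: squared ReLU followed by 2:4 structured sparsity.
# Function names and the magnitude-based selection rule are assumptions,
# not taken from the ELAS paper or its code.

def squared_relu(x):
    """Squared ReLU, max(v, 0) ** 2, applied elementwise to a list of floats."""
    return [max(v, 0.0) ** 2 for v in x]


def sparsify_2_4(x):
    """Keep the 2 largest-magnitude values in every group of 4; zero the rest."""
    if len(x) % 4 != 0:
        raise ValueError("length must be a multiple of 4")
    out = []
    for start in range(0, len(x), 4):
        group = x[start:start + 4]
        # Indices of the two largest magnitudes (ties broken by position).
        keep = sorted(range(4), key=lambda i: (-abs(group[i]), i))[:2]
        out.extend(v if i in keep else 0.0 for i, v in enumerate(group))
    return out


# One row of pre-activation values for a feed-forward layer (made-up numbers).
row = [0.5, -1.0, 2.0, 1.5, 1.0, 0.5, 2.0, 3.0]
activated = squared_relu(row)
sparse = sparsify_2_4(activated)
```

In this toy row, the negative input is already zeroed by the squared ReLU, so the first group loses only one additional nonzero value when the 2:4 pattern is enforced. That is one plausible reading of why the two operations are paired, though the abstract itself does not state the reasoning.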

Preprint, 2022, arXiv

Alleviating Search Bias in Bayesian Evolutionary Optimization with Many Heterogeneous Objectives

Multi-objective optimization problems whose objectives have different evaluation costs are commonly seen in the real world. Such problems are now known as multi-objective optimization problems with heterogeneous objectives (HE-MOPs). So far, however, only a few studies have been reported to address HE-MOPs, and most of them focus on bi-objective problems with one fast objective and one slow objective. In this work, we aim to deal with HE-MOPs having more than two black-box and heterogeneous objectives. To this end, we develop a multi-objective Bayesian evolutionary optimization approach to HE-MOPs by exploiting the different data sets on the cheap and expensive objectives in HE-MOPs to alleviate the search bias caused by the heterogeneous evaluation costs for evaluating different objectives. To make the best use of two different training data sets, one with solutions evaluated on all objectives and the other with those only evaluated on the fast objectives, two separate Gaussian process models are constructed. In addition, a new acquisition function that mitigates search bias towards the fast objectives is suggested, thereby achieving a balance between convergence and diversity. We demonstrate the effectiveness of the proposed algorithm by testing it on widely used multi-/many-objective benchmark problems whose objectives are assumed to be heterogeneously expensive.

Preprint, 2022, arXiv

Near-Earth Supernovae in the Past 10 Myr: Implications for the Heliosphere

We summarize evidence that multiple supernovae exploded within 100 pc of Earth in the past few Myr. These events had dramatic effects on the heliosphere, compressing it to within ~20 au. We advocate for cross-disciplinary research of nearby supernovae, including on interstellar dust and cosmic rays. We urge for support of theory work, direct exploration, and study of extrasolar astrospheres.

Preprint, 2020, arXiv

Learning pose variations within shape population by constrained mixtures of factor analyzers

Mining and learning the shape variability of an underlying population has benefited applications including parametric shape modeling, 3D animation, and image segmentation. Current statistical shape modeling methods work well for learning unstructured shape variations without obvious pose changes (relative rotations of the body parts). Studying the pose variations within a shape population involves segmenting the shapes into different articulated parts and learning the transformations of the segmented parts. This paper formulates the pose learning problem as mixtures of factor analyzers. The segmentation is obtained from the components' posterior probabilities, and the rotations in pose variations are learned through the factor loading matrices. To guarantee that the factor loading matrices are composed of rotation matrices, constraints are imposed and the corresponding closed-form optimal solution is derived. With the proposed method, the pose variations are learned automatically from the given shape populations. The method is applied to motion animation, where new poses are generated by interpolating the existing poses in the training set. The obtained results are smooth and realistic.
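For orientation, the generic textbook form of a mixture of factor analyzers is sketched below. The symbols ($\pi_k$, $\mu_k$, $\Lambda_k$, $\Psi_k$, $r_k$) are conventional notation assumed here, not taken from the paper, and the rotation constraints on the factor loading matrices that the abstract describes are not reproduced.

```latex
% Generic mixture of factor analyzers (standard form; notation assumed).
% Component k has mixing weight \pi_k, mean \mu_k, factor loading matrix
% \Lambda_k, and diagonal noise covariance \Psi_k.
\begin{align}
  p(x) &= \sum_{k=1}^{K} \pi_k \,
          \mathcal{N}\!\left(x \mid \mu_k,\; \Lambda_k \Lambda_k^{\top} + \Psi_k\right), \\
  r_k(x) &= \frac{\pi_k \, \mathcal{N}\!\left(x \mid \mu_k,\; \Lambda_k \Lambda_k^{\top} + \Psi_k\right)}
                 {\sum_{j=1}^{K} \pi_j \, \mathcal{N}\!\left(x \mid \mu_j,\; \Lambda_j \Lambda_j^{\top} + \Psi_j\right)}.
\end{align}
```

Here $r_k(x)$ is the posterior probability that observation $x$ belongs to component $k$, which is the kind of quantity the abstract uses to segment shapes into articulated parts.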

Preprint, 2020, arXiv

Sandblasting the $\textit{r}$-Process: Spallation of Ejecta from Neutron Star Mergers

Neutron star mergers (NSMs) are rapid neutron capture ($\textit{r}$-process) nucleosynthesis sites that expel matter at high velocities, from $0.1c$ to as high as $0.6c$. Nuclei ejected at these speeds are sufficiently energetic to initiate spallation nuclear reactions with interstellar medium particles. We adopt a thick-target model for the propagation of high-speed heavy nuclei in the interstellar medium, similar to the transport of cosmic rays. We find that spallation may create observable perturbations to NSM isotopic abundances, particularly around the low-mass edges of the $\textit{r}$-process peaks where neighboring nuclei have very different abundances. The extent to which spallation modifies the final NSM isotopic yields depends on: (1) the ejected abundances, which are determined by the NSM astrophysical conditions and the properties of nuclei far from stability, (2) the ejecta velocity distribution and propagation in interstellar matter, and (3) the spallation cross-sections. Observed solar and stellar $\textit{r}$-process yields could thus constrain the velocity distribution of ejected neutron star matter, assuming NSMs are the dominant $\textit{r}$-process source. We suggest avenues for future work, including measurement of relevant cross sections.

Preprint, 2020, arXiv

The R-Process Alliance: The Peculiar Chemical Abundance Pattern of RAVE J183013.5-455510

We report on the spectroscopic analysis of RAVE J183013.5-455510, an extremely metal-poor star, highly enhanced in CNO, and with discernible contributions from the rapid neutron-capture process. There is no evidence of binarity for this object. At [Fe/H]=-3.57, this is one of the lowest-metallicity stars currently observed, with 18 measured abundances of neutron-capture elements. The presence of Ba, La, and Ce abundances above the Solar System r-process predictions suggests that there must have been a non-standard source of r-process elements operating at such low metallicities. One plausible explanation is that this enhancement originates from material ejected at unusually fast velocities in a neutron star merger event. We also explore the possibility that the neutron-capture elements were produced during the evolution and explosion of a rotating massive star. In addition, based on comparisons with yields from zero-metallicity faint supernovae, we speculate that RAVE J1830-4555 was formed from a gas cloud pre-enriched by both progenitor types. From analysis based on Gaia DR2 measurements, we show that this star has orbital properties similar to the Galactic metal-weak thick-disk stellar population.
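As a reading aid, the bracket notation in this abstract is the standard logarithmic abundance ratio relative to the Sun. The short calculation below converts the quoted [Fe/H] value into a linear fraction. The number-density symbols $N_{\mathrm{Fe}}$ and $N_{\mathrm{H}}$ are conventional notation assumed here, and the result is rounded.

```latex
% Standard definition of [Fe/H] and the linear ratio implied by -3.57.
\begin{align}
  [\mathrm{Fe/H}] &= \log_{10}\!\left(\frac{N_{\mathrm{Fe}}}{N_{\mathrm{H}}}\right)_{\star}
                   - \log_{10}\!\left(\frac{N_{\mathrm{Fe}}}{N_{\mathrm{H}}}\right)_{\odot}, \\
  \frac{(N_{\mathrm{Fe}}/N_{\mathrm{H}})_{\star}}{(N_{\mathrm{Fe}}/N_{\mathrm{H}})_{\odot}}
                  &= 10^{-3.57} \approx 2.7 \times 10^{-4}.
\end{align}
```

In other words, the star's iron-to-hydrogen ratio is roughly 1/3700 of the solar value.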
