Researcher profile

Shan Huang

Shan Huang contributes to research discovery and scholarly infrastructure.

Researcher | Affiliation not imported | Open to collaborate

Trust snapshot

Quick read

Trust 21 (Emerging) | Verification L1 | Unclaimed author
12 works
0 followers
17 topics
4 close collaborators

Published work

12 published items

preprint | 2026 | arXiv

ARC: Active and Reflection-driven Context Management for Long-Horizon Information Seeking Agents

Large language models are increasingly deployed as research agents for deep search and long-horizon information seeking, yet their performance often degrades as interaction histories grow. This degradation, known as context rot, reflects a failure to maintain coherent and task-relevant internal states over extended reasoning horizons. Existing approaches primarily manage context through raw accumulation or passive summarization, treating it as a static artifact and allowing early errors or misplaced emphasis to persist. Motivated by this perspective, we propose ARC, which is the first framework to systematically formulate context management as an active, reflection-driven process that treats context as a dynamic internal reasoning state during execution. ARC operationalizes this view through reflection-driven monitoring and revision, allowing agents to actively reorganize their working context when misalignment or degradation is detected. Experiments on challenging long-horizon information-seeking benchmarks show that ARC consistently outperforms passive context compression methods, achieving up to an 11% absolute improvement in accuracy on BrowseComp-ZH with Qwen2.5-32B-Instruct.

preprint | 2026 | arXiv

CM-EVS: Sparse Panoramic RGB-D-Pose Data for Complete Scene Coverage

Modern 3D visual learning relies on observations sampled from metric 3D assets, yet existing scans, meshes, point clouds, simulations, and reconstructions do not directly provide a sparse, comparable, and geometry-consistent panoramic training interface. Dense trajectories duplicate nearby views, source-specific rendering policies yield heterogeneous annotations, and sparse heuristics may miss important regions or introduce depth-inconsistent observations. We study how to convert 3D assets into sparse panoramic RGB-D-pose data that preserves complete scene coverage with low redundancy and auditable provenance. We propose COVER (Coverage-Oriented Viewpoint curation with ERP Range-depth warping), a training-free ERP viewpoint curator that projects geometry observed from selected views into candidate ERP probes, scores incremental coverage, and penalizes depth conflicts. Under bounded proxy error, its greedy coverage proxy preserves the standard coverage-style approximation behavior up to an additive error term. Using COVER, we build CM-EVS (Coverage-curated Metric ERP View Set), a panoramic RGB-D-pose dataset with 36,373 curated ERP frames from 1,275 indoor scenes across Blender indoor, HM3D, and ScanNet++, complemented by outdoor panoramas from TartanGround and OB3D re-encoded into the same schema. Each frame provides full-sphere RGB, metric range depth, and calibrated pose; COVER-produced indoor frames include per-step provenance logs. With a median of only 25 frames per indoor scene, CM-EVS covers all 13 unified room types while maintaining compact scene-level coverage. Experiments show that COVER improves the coverage-conflict trade-off, making CM-EVS a sparse, compact, and auditable RGB-D-pose resource for geometry-consistent panoramic 3D learning.
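
The greedy idea in this abstract (score each candidate view by the coverage it adds, minus a penalty for depth conflicts) can be illustrated with a toy sketch. This is not the COVER implementation: the function name `greedy_select`, the set-of-cells scene model, the `conflicts` scores, and the `penalty` weight are all hypothetical stand-ins chosen for illustration.

```python
from typing import Dict, List, Set


def greedy_select(
    candidates: Dict[str, Set[int]],
    conflicts: Dict[str, float],
    budget: int,
    penalty: float = 1.0,
) -> List[str]:
    """Pick up to `budget` views by incremental coverage minus a conflict penalty.

    `candidates` maps a view name to the set of scene cells it observes.
    `conflicts` maps a view name to a hypothetical depth-conflict cost.
    """
    covered: Set[int] = set()
    chosen: List[str] = []
    remaining = dict(candidates)  # shallow copy, so the caller's dict is untouched

    while remaining and len(chosen) < budget:

        def score(name: str) -> float:
            # New cells this view would add, minus its weighted conflict cost.
            gain = len(remaining[name] - covered)
            return gain - penalty * conflicts.get(name, 0.0)

        # Iterate in sorted order so ties are broken deterministically by name.
        best = max(sorted(remaining), key=score)
        if score(best) <= 0:
            break  # no remaining view adds net value
        chosen.append(best)
        covered |= remaining.pop(best)

    return chosen


# Toy scene with six cells; view "d" sees everything but conflicts on depth.
views = {
    "a": {1, 2, 3, 4},
    "b": {3, 4, 5},
    "c": {5, 6},
    "d": {1, 2, 3, 4, 5, 6},
}
depth_conflicts = {"d": 4.0}
selected = greedy_select(views, depth_conflicts, budget=3)
print(selected)
```

In this toy scene the conflict penalty steers the selection away from the single all-seeing view and toward two cleaner views that together cover the same cells. With an empty `conflicts` mapping, the sketch reduces to plain greedy set cover.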

preprint | 2026 | arXiv

Layout optimization for the LUXE-NPOD experiment

Beam dump experiments represent an effective way to probe new physics in a parameter space where new particles have feeble couplings to the Standard Model sector and masses below the GeV scale. The LUXE experiment, designed primarily to study strong-field quantum electrodynamics, can also be used as a photon beam dump experiment with a unique reach for new spin-0 particles with masses in the $10$-$350~\mathrm{MeV}$ range and couplings to photons in the $10^{-6}$-$10^{-3}~\mathrm{GeV}^{-1}$ range. This is achieved via the "New Physics search with Optical Dump" (NPOD) concept. While prior estimates were obtained with a simplified model of the experimental setup, in this work we present a systematic study of the new physics reach in the full, realistic experimental apparatus, including an existing detector to be used in the LUXE-NPOD context. We furthermore investigate updated scenarios of LUXE's experimental plan and confirm that our results are in agreement with the original estimates of a background-free operation.

preprint | 2022 | arXiv

Axionlike-particle generation by laser-plasma interaction

The hypothetical axion and axion-like particles (ALPs), feebly coupled to photons, have not yet been found in any experiment. With improvements in laser technology, much stronger but shorter quasi-static electric and magnetic fields than those of large magnets can be created in the laboratory using laser-plasma interaction, which can help the search for axions. In this article, we discuss the feasibility of ALP exploration using planarly or cylindrically symmetric laser-plasma fields as background and an x-ray free-electron laser as probe. Both the probe and the background fields are polarized such that the existence of ALPs in the corresponding parameter space will cause polarization rotation of the probe, which can be detected with high accuracy. In addition, a structured field in the plasma creates a tunable transverse profile for the interaction and improves the signal-to-noise ratio via a phase-matching mechanism. The ALP mass discussed in this article ranges from $10^{-3}$ eV to 1 keV. Some simple schemes and estimates of ALP production and polarization rotation of probe photons are given, which reveal the possibility of a future laser-plasma ALP source in the laboratory.

preprint | 2022 | arXiv

Building Embedded Systems Like It's 1996

Embedded devices are ubiquitous. However, preliminary evidence shows that attack mitigations protecting our desktops/servers/phones are missing in embedded devices, posing a significant threat to embedded security. To this end, this paper presents an in-depth study on the adoption of common attack mitigations on embedded devices. Specifically, it measures the presence of standard mitigations against memory corruptions in over 10k Linux-based firmware images of deployed embedded devices. The study reveals that embedded devices largely omit both user-space and kernel-level attack mitigations. The adoption rates on embedded devices are multiple times lower than their desktop counterparts. An equally important observation is that the situation is not improving over time. Without changing the current practices, the attack mitigations will remain missing, which may become a bigger threat in the upcoming IoT era. Through follow-up analyses, we further inferred a set of factors possibly contributing to the absence of attack mitigations. Examples include massive reuse of non-protected software, lateness in upgrading outdated kernels, and restrictions imposed by automated building tools. We envision these will turn into insights towards improving the adoption of attack mitigations on embedded devices in the future.

preprint | 2022 | arXiv

Deep learning study of an electromagnetic calorimeter

The accurate and precise extraction of information from a modern particle physics detector, such as an electromagnetic calorimeter, may be complicated and challenging. To overcome these difficulties, we propose processing the detector output using deep-learning methodology. Our algorithmic approach makes use of a known network architecture, which is modified to fit the problems at hand. The results are of high quality (biases of order 2%) and, moreover, indicate that most of the information may be derived from only a fraction of the detector. We conclude that such an analysis helps us understand the essential mechanism of the detector and should be performed as part of its design procedure.

preprint | 2022 | arXiv

Emotions in Online Content Diffusion

Social media-transmitted online information, which is associated with emotional expressions, shapes our thoughts and actions. In this study, we incorporate social network theories and analyses and use a computational approach to investigate how emotional expressions, particularly negative discrete emotional expressions (i.e., anxiety, sadness, anger, and disgust), lead to differential diffusion of online content in social media networks. We rigorously quantify diffusion cascades' structural properties (i.e., size, depth, maximum breadth, and structural virality) and analyze the individual characteristics (i.e., age, gender, and network degree) and social ties (i.e., strong and weak) involved in the cascading process. In our sample, more than six million unique individuals transmitted 387,486 randomly selected articles in a massive-scale online social network, WeChat. We detect the expression of discrete emotions embedded in these articles, using a newly generated domain-specific and up-to-date emotion lexicon. We apply a partial-linear instrumental variable approach with a double machine learning framework to causally identify the impact of the negative discrete emotions on online content diffusion. We find that articles with more expressions of anxiety spread to a larger number of individuals and diffuse more deeply, broadly, and virally. Expressions of anger and sadness, however, reduce cascades' size and maximum breadth. We further show that the articles with different degrees of negative emotional expressions tend to spread differently based on individual characteristics and social ties. Our results shed light on content marketing and regulation, utilizing negative emotional expressions.
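
The four cascade properties named in this abstract (size, depth, maximum breadth, structural virality) can be computed on a small tree as a rough illustration. The sketch below is not the authors' pipeline. It assumes a cascade is a tree given as a child-to-parent mapping with exactly one root, and it uses the common definition of structural virality as the mean shortest-path distance over all pairs of nodes. The name `cascade_metrics` and the toy cascades are hypothetical.

```python
from collections import deque
from itertools import combinations
from typing import Dict, Optional


def cascade_metrics(parent: Dict[str, Optional[str]]) -> Dict[str, float]:
    """Size, depth, maximum breadth and structural virality of a cascade tree.

    `parent` maps each node to the node it received the content from;
    the root maps to None. Every parent must itself be a key (assumed input).
    """
    # Build an undirected adjacency list and locate the root.
    adj = {node: [] for node in parent}
    root = None
    for node, par in parent.items():
        if par is None:
            root = node
        else:
            adj[node].append(par)
            adj[par].append(node)

    def bfs(start: str) -> Dict[str, int]:
        # Hop distance from `start` to every reachable node.
        dist = {start: 0}
        queue = deque([start])
        while queue:
            cur = queue.popleft()
            for nxt in adj[cur]:
                if nxt not in dist:
                    dist[nxt] = dist[cur] + 1
                    queue.append(nxt)
        return dist

    levels = bfs(root)
    depth = max(levels.values())
    # Breadth of a level is the number of nodes at that distance from the root.
    max_breadth = max(
        sum(1 for d in levels.values() if d == k) for k in range(depth + 1)
    )

    nodes = list(parent)
    all_dist = {n: bfs(n) for n in nodes}
    pairs = list(combinations(nodes, 2))
    virality = sum(all_dist[a][b] for a, b in pairs) / len(pairs) if pairs else 0.0

    return {
        "size": len(nodes),
        "depth": depth,
        "max_breadth": max_breadth,
        "structural_virality": virality,
    }


# Broadcast-like cascade: one root shared directly with three people.
star = cascade_metrics({"r": None, "a": "r", "b": "r", "c": "r"})
# Chain-like cascade: each person passes the article to exactly one other.
chain = cascade_metrics({"r": None, "a": "r", "b": "a", "c": "b"})
print(star)
print(chain)
```

Both toy cascades have the same size, but the chain is deeper and has higher structural virality than the star, which is the distinction this measure is meant to capture. The all-pairs breadth-first search is quadratic in cascade size, so it suits small examples only.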

preprint | 2022 | arXiv

LUXE: A new experiment to study non-perturbative QED in electron-laser and photon-laser collisions

LUXE (Laser Und XFEL Experiment) is a new experiment in planning at DESY Hamburg using the electron beam of the European XFEL. LUXE is intended to study collisions between a high-intensity optical laser and 16.5 GeV electrons from the XFEL electron beam, as well as collisions between the optical laser and high-energy secondary photons. The physics objectives of LUXE are processes of quantum electrodynamics (QED) at the strong-field frontier, where the electromagnetic field of the laser is above the Schwinger limit. In this regime, QED is non-perturbative. This manifests itself in the creation of physical electron-positron pairs from the QED vacuum, similar to Hawking radiation from black holes. LUXE intends to measure the positron production rate in an unprecedented laser intensity regime. An overview of the LUXE experimental setup and its challenges will be given, followed by a discussion of the expected physics reach in the context of testing QED in the non-perturbative regime.

preprint | 2022 | arXiv

WebUAV-3M: A Benchmark for Unveiling the Power of Million-Scale Deep UAV Tracking

Unmanned aerial vehicle (UAV) tracking is of great significance for a wide range of applications, such as delivery and agriculture. Previous benchmarks in this area mainly focused on small-scale tracking problems while ignoring the amounts of data, types of data modalities, diversities of target categories and scenarios, and evaluation protocols involved, greatly hiding the massive power of deep UAV tracking. In this work, we propose WebUAV-3M, the largest public UAV tracking benchmark to date, to facilitate both the development and evaluation of deep UAV trackers. WebUAV-3M contains over 3.3 million frames across 4,500 videos and offers 223 highly diverse target categories. Each video is densely annotated with bounding boxes by an efficient and scalable semiautomatic target annotation (SATA) pipeline. Importantly, to take advantage of the complementary superiority of language and audio, we enrich WebUAV-3M by innovatively providing both natural language specifications and audio descriptions. We believe that such additions will greatly boost future research in terms of exploring language features and audio cues for multimodal UAV tracking. In addition, a fine-grained UAV tracking-under-scenario constraint (UTUSC) evaluation protocol and seven challenging scenario subtest sets are constructed to enable the community to develop, adapt and evaluate various types of advanced trackers. We provide extensive evaluations and detailed analyses of 43 representative trackers and envision future research directions in the field of deep UAV tracking and beyond. The dataset, toolkits and baseline results are available at https://github.com/983632847/WebUAV-3M.

preprint | 2020 | arXiv

A Multi-oriented Chinese Keyword Spotter Guided by Text Line Detection

Chinese keyword spotting is a challenging task because there are no visual blanks between Chinese words. Unlike English words, which are naturally separated by visual blanks, Chinese words are generally separated only by semantic information. In this paper, we propose a new Chinese keyword spotter for natural images, which is inspired by Mask R-CNN. We propose to predict the keyword masks guided by text line detection. First, proposals of text lines are generated by Faster R-CNN; then, text line masks and keyword masks are predicted by segmentation in the proposals. In this way, the text lines and keywords are predicted in parallel. We create two Chinese keyword datasets based on RCTW-17 and ICPR MTWI2018 to verify the effectiveness of our method.

preprint | 2020 | arXiv

Application of Seq2Seq Models on Code Correction

We apply various seq2seq models to programming language correction tasks on the Juliet Test Suite for C/C++ and Java of the Software Assurance Reference Dataset (SARD), and achieve 75% (for C/C++) and 56% (for Java) repair rates on these tasks. We introduce a Pyramid Encoder in these seq2seq models, which largely increases computational and memory efficiency while retaining repair rates similar to those of their non-pyramid counterparts. We successfully carry out an error type classification task on ITC benchmark examples (with only 685 code instances) using transfer learning with models pre-trained on the Juliet Test Suite, pointing out a novel way of processing small programming language datasets.

preprint | 2019 | arXiv

Host Galaxies of Type Ic and Broad-lined Type Ic Supernovae from the Palomar Transient Factory: Implication for Jet Production

Unlike ordinary supernovae (SNe), some of which are hydrogen and helium deficient (called Type Ic SNe), broad-lined Type Ic SNe (SNe Ic-bl) are very energetic events, and all SNe coincident with bona fide long duration gamma-ray bursts (LGRBs) are of Type Ic-bl. Understanding the progenitors and the mechanism driving SN Ic-bl explosions versus those of their SNe Ic cousins is key to understanding the SN-GRB relationship and jet production in massive stars. Here we present the largest set of host-galaxy spectra of 28 SNe Ic and 14 SNe Ic-bl, all discovered before 2013 by the same untargeted survey, namely the Palomar Transient Factory (PTF). We carefully measure their gas-phase metallicities, stellar masses (M*s) and star-formation rates (SFRs) by taking into account recent progress in the metallicity field and propagating uncertainties correctly. We further re-analyze the hosts of 10 literature SN-GRBs using the same methods and compare them to our PTF SN hosts with the goal of constraining their progenitors from their local environments by conducting a thorough statistical comparison, including upper limits. We find that the metallicities, SFRs and M*s of our PTF SN Ic-bl hosts are statistically comparable to those of SN-GRBs, but significantly lower than those of the PTF SNe Ic. The mass-metallicity relations as defined by the SNe Ic-bl and SN-GRBs are not significantly different from the same relations as defined by the SDSS galaxies, in contrast to claims by earlier works. Our findings point towards low metallicity as a crucial ingredient for SN Ic-bl and SN-GRB production since we are able to break the degeneracy between high SFR and low metallicity. We suggest that the PTF SNe Ic-bl may have produced jets that were choked inside the star or were able to break out of the star as unseen low-luminosity or off-axis GRBs.