Source author record

Xuan Zhang

Xuan Zhang appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Catalog footprint

What is connected

46 works

35 topics

4 close collaborators

Preprint, 2026, arXiv

1-GHz VIS-to-MIR frequency combs enabled by CMOS-compatible nanophotonic waveguides

A fully stabilized frequency comb is essential for precision metrology and coherent optical synthesis. However, fully stabilized frequency combs generally require separate stages for supercontinuum generation (SCG) and self-referencing, largely limiting their compactness. Here, enabled by low-threshold multi-octave supercontinuum generation and concurrent third-harmonic generation in low-loss silicon nitride waveguides, we present a novel approach to a self-referenced frequency comb source at 1 GHz repetition rate spanning from the full visible (VIS) to the mid-infrared (MIR). Our coherent comb is seeded by an all-polarization-maintaining ultrafast fiber laser at 1556 nm, with a pulse duration of 73 fs at 1 GHz repetition rate. With an injected energy of merely 110 pJ, the pulses propagate through dispersion-engineered Si3N4 waveguides, generating a supercontinuum spanning over three octaves from 350 to 3280 nm, i.e., 0.76 PHz of coherent bandwidth. Moreover, the on-chip third-harmonic generation provides a carrier-envelope offset beat note via f-3f with a signal-to-noise ratio of 43 dB. Fueled by evolving photonic integration, which offers the possibility of on-chip filtering and photodetectors, this approach for single-chip self-referencing of high-repetition-rate frequency combs paves the way for ultrabroadband comb sources with unprecedented compactness and field-readiness.
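
The bandwidth figures in this abstract can be checked with a few lines of arithmetic. The sketch below converts the quoted span edges (350 nm and 3280 nm) to optical frequencies and counts octaves; it also illustrates the textbook f-3f relation, in which the beat between a frequency-tripled comb line and the directly generated line at three times the mode number equals twice the carrier-envelope offset frequency. The mode number and offset frequency used here are hypothetical values chosen for illustration; only the 1 GHz repetition rate and the span edges come from the abstract, and the abstract does not say which comb lines the authors actually beat together.

```python
import math

C = 299_792_458.0  # speed of light in vacuum, m/s (exact SI value)


def wavelength_nm_to_thz(wavelength_nm: float) -> float:
    """Convert a vacuum wavelength in nm to an optical frequency in THz."""
    return C / (wavelength_nm * 1e-9) / 1e12


# Span edges quoted in the abstract.
f_high_thz = wavelength_nm_to_thz(350.0)
f_low_thz = wavelength_nm_to_thz(3280.0)
bandwidth_phz = (f_high_thz - f_low_thz) / 1000.0
octaves = math.log2(3280.0 / 350.0)


def f3f_beat_hz(n: int, f_rep_hz: float, f_ceo_hz: float) -> float:
    """Beat between the tripled line 3*f_n and the comb line f_{3n},
    using the comb equation f_n = n * f_rep + f_ceo."""
    tripled = 3 * (n * f_rep_hz + f_ceo_hz)
    direct = 3 * n * f_rep_hz + f_ceo_hz
    return tripled - direct  # algebraically 2 * f_ceo


# Hypothetical mode number and offset; only f_rep = 1 GHz is from the abstract.
beat_hz = f3f_beat_hz(n=100_000, f_rep_hz=1e9, f_ceo_hz=0.2e9)

print(f"bandwidth = {bandwidth_phz:.3f} PHz, span = {octaves:.2f} octaves")
print(f"f-3f beat = {beat_hz / 1e6:.0f} MHz")
```

The computed bandwidth comes out near 0.765 PHz over roughly 3.2 octaves, consistent with the "over three octaves" and "0.76 PHz" figures quoted above.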

Preprint, 2026, arXiv

Aligning by Misaligning: Boundary-aware Curriculum Learning for Multimodal Alignment

Most multimodal models treat every negative pair alike, ignoring the ambiguous negatives that differ from the positive by only a small detail. We propose Boundary-Aware Curriculum with Local Attention (BACL), a lightweight add-on that turns these borderline cases into a curriculum signal. A Boundary-aware Negative Sampler gradually raises difficulty, while a Contrastive Local Attention loss highlights where the mismatch occurs. The two modules are fully differentiable and work with any off-the-shelf dual encoder. Theory predicts a fast O(1/n) error rate; practice shows up to +32% R@1 over CLIP and new SOTA on four large-scale benchmarks, all without extra labels.

Preprint, 2026, arXiv

First Submillimeter Lights from Dome A: Tracing the Carbon Cycle in the Feedback of Massive Stars

The cycling of carbon between its ionized, atomic, and molecular phases shapes the chemical compositions and physical conditions of the interstellar medium (ISM). However, ground-based studies of the full carbon cycle have been limited by atmospheric absorption. Dome A, the most promising site for submillimeter astronomy, has long resisted successful submillimeter astronomical observations. Using the 60 cm Antarctic Terahertz Explorer, we present the first successful CO ($4-3$) and [CI] ($^3P_1 - ^3P_0$) mapping observations of two archetypal triggered massive star-formation regions at Dome A. These data, together with archival [CII], provide the first complete characterization of all three carbon phases in these environments. We find elevated C$^{0}$/CO abundance ratios in high-extinction regions, plausibly driven by deep penetration of intense radiation fields from massive stars into a clumpy ISM. These findings mark a major milestone for submillimeter astronomy at Dome A and offer valuable insights into the impact of massive star feedback on the surrounding ISM.

Preprint, 2026, arXiv

GR-Ben: A General Reasoning Benchmark for Evaluating Process Reward Models

Process reward models (PRMs) have shown remarkable potential for test-time scaling. Since large language models (LLMs) regularly generate flawed intermediate reasoning steps when tackling a broad spectrum of reasoning and decision-making tasks, PRMs must be able to detect process-level errors in real-world scenarios. However, existing benchmarks primarily focus on mathematical reasoning, thereby failing to comprehensively evaluate the error detection ability of PRMs across diverse reasoning scenarios. To address this gap, we introduce GR-Ben, a process-level benchmark specifically designed for assessing PRMs' performance across two primary reasoning domains (science and logic) and nine subdomains. We conduct extensive experiments on a diverse set of 22 models, encompassing both PRMs and LLMs, and derive two key findings: (1) In domains beyond mathematical reasoning, the error-detection ability of existing PRMs and LLMs is markedly weaker. (2) In general, PRMs are less adept at identifying knowledge-based errors, whereas LLMs perform worse at detecting computational errors. We hope GR-Ben can foster future research on PRMs for general domains, thereby enhancing the reasoning capabilities of LLMs.

Preprint, 2026, arXiv

Synergy over Discrepancy: A Partition-Based Approach to Multi-Domain LLM Fine-Tuning

Large language models (LLMs) demonstrate impressive generalization abilities, yet adapting them effectively across multiple heterogeneous domains remains challenging due to inter-domain interference. To overcome this challenge, we propose a partition-based multi-stage fine-tuning framework designed to exploit inter-domain synergies while minimizing negative transfer. Our approach strategically partitions domains into subsets (stages) by balancing domain discrepancy, synergy, and model capacity constraints. We theoretically analyze the proposed framework and derive novel generalization bounds that justify our partitioning strategy. Extensive empirical evaluations on various language understanding tasks show that our method consistently outperforms state-of-the-art baselines.

Preprint, 2023, arXiv

Building Concise Logical Patterns by Constraining Tsetlin Machine Clause Size

Tsetlin machine (TM) is a logic-based machine learning approach with the crucial advantages of being transparent and hardware-friendly. While TMs match or surpass deep learning accuracy for an increasing number of applications, large clause pools tend to produce clauses with many literals (long clauses). As such, they become less interpretable. Further, longer clauses increase the switching activity of the clause logic in hardware, consuming more power. This paper introduces a novel variant of TM learning - Clause Size Constrained TMs (CSC-TMs) - where one can set a soft constraint on the clause size. As soon as a clause includes more literals than the constraint allows, it starts expelling literals. Accordingly, oversized clauses only appear transiently. To evaluate CSC-TM, we conduct classification, clustering, and regression experiments on tabular data, natural language text, images, and board games. Our results show that CSC-TM maintains accuracy with up to 80 times fewer literals. Indeed, the accuracy increases with shorter clauses for TREC, IMDb, and BBC Sports. After the accuracy peaks, it drops gracefully as the clause size approaches a single literal. We finally analyze CSC-TM power consumption and derive new convergence properties.

Preprint, 2023, arXiv

Joint Hybrid Beamforming and User Scheduling for Multi-Satellite Cooperative Networks

In this paper, we consider a cooperative communication network where multiple satellites provide services for ground users (GUs) (at the same time and on the same frequency). The communication and computational resources on satellites are usually restricted and the satellite-GU link determination affects the communication performance significantly when multiple satellites provide services for multiple GUs in a collaborative manner. Therefore, considering the limitation of the on-board radio-frequency chains, we first propose a hybrid beamforming method consisting of analog beamforming for beam alignment and digital beamforming for interference mitigation. Then, to establish appropriate connections between satellites and GUs, we propose a heuristic user scheduling algorithm which determines the connections according to the total spectral efficiency (SE) increment of the multi-satellite cooperative network. Next, a joint hybrid beamforming and user scheduling scheme is proposed to dramatically improve the performance of the multi-satellite cooperative network. Moreover, simulations are conducted to compare the proposed schemes with representative baselines and analyze the key factors influencing the performance of the multi-satellite cooperative network. It is shown that the proposed joint beamforming and user scheduling approach can provide 47.2% SE improvement on average as compared with its non-joint counterpart.

Preprint, 2022, arXiv

A Fair and Efficient Hybrid Federated Learning Framework based on XGBoost for Distributed Power Prediction

In a modern power system, real-time data on power generation/consumption and its relevant features are stored in various distributed parties, including household meters, transformer stations and external organizations. To fully exploit the underlying patterns of these distributed data for accurate power prediction, federated learning is needed as a collaborative but privacy-preserving training scheme. However, current federated learning frameworks are polarized towards addressing either the horizontal or vertical separation of data, and tend to overlook the case where both are present. Furthermore, in mainstream horizontal federated learning frameworks, only artificial neural networks are employed to learn the data patterns, which are considered less accurate and interpretable compared to tree-based models on tabular datasets. To this end, we propose a hybrid federated learning framework based on XGBoost, for distributed power prediction from real-time external features. In addition to introducing boosted trees to improve accuracy and interpretability, we combine horizontal and vertical federated learning, to address the scenario where features are scattered in local heterogeneous parties and samples are scattered in various local districts. Moreover, we design a dynamic task allocation scheme such that each party gets a fair share of information, and the computing power of each party can be fully leveraged to boost training efficiency. A follow-up case study is presented to justify the necessity of adopting the proposed framework. The advantages of the proposed framework in fairness, efficiency and accuracy performance are also confirmed.

Preprint, 2022, arXiv

CATCH: Chasing All Transients Constellation Hunters Space Mission

In time-domain astronomy, a substantial number of transients will be discovered by multi-wavelength and multi-messenger observatories, posing a great challenge for follow-up capabilities. We have thus proposed an intelligent X-ray constellation, the Chasing All Transients Constellation Hunters (CATCH) space mission. Consisting of 126 micro-satellites in three types, CATCH will have the capability to perform follow-up observations for a large number of different types of transients simultaneously. Each satellite in the constellation will carry lightweight X-ray optics and use a deployable mast to increase the focal length. The combination of different optics and detector systems enables different types of satellites to have multiform observation capabilities, including timing, spectroscopy, imaging, and polarization. Controlled by the intelligent system, different satellites can cooperate to perform uninterrupted monitoring, all-sky follow-up observations, and scanning observations with a flexible field of view (FOV) and multi-dimensional observations. Therefore, CATCH will be a powerful mission to study the dynamic universe. Here, we present the current design of the spacecraft, optics, detector system, constellation configuration and observing modes, as well as the development plan.

Preprint, 2022, arXiv

Data-driven identification of the spatio-temporal structure of turbulent flows by streaming Dynamic Mode Decomposition

Streaming Dynamic Mode Decomposition (sDMD) (Hemati et al., Phys. Fluids 26 (2014)) is a low-storage version of Dynamic Mode Decomposition (DMD) (Schmid, J. Fluid Mech. 656 (2010)), a data-driven method to extract spatio-temporal flow patterns. Streaming DMD avoids storing the entire data sequence in memory by approximating the dynamic modes through incremental updates with newly available data. In this paper, we use sDMD to identify and extract dominant spatio-temporal structures of different turbulent flows, requiring the analysis of large datasets. First, the efficiency and accuracy of sDMD are compared to the classical DMD, using a publicly available test dataset that consists of velocity field snapshots obtained by direct numerical simulation of a wake flow behind a cylinder. Streaming DMD not only reliably reproduces the most important dynamical features of the flow; our calculations also highlight its advantage in terms of the required computational resources. We subsequently use sDMD to analyse three different turbulent flows that all show some degree of large-scale coherence: rapidly rotating Rayleigh-Bénard convection, horizontal convection and the asymptotic suction boundary layer. Structures of different frequencies and spatial extent can be clearly separated, and the prominent features of the dynamics are captured with just a few dynamic modes. In summary, we demonstrate that sDMD is a powerful tool for the identification of spatio-temporal structures in a wide range of turbulent flows.
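
For readers unfamiliar with DMD, the following is a minimal sketch of the classical batch algorithm that the abstract uses as its reference point. It is not the streaming variant studied in the paper: it stores all snapshots and takes a full SVD, which is exactly the memory cost that sDMD is designed to avoid. The function name, the toy damped-rotation data, and the rank are assumptions made for illustration.

```python
import numpy as np


def dmd_eigenvalues(snapshots: np.ndarray, rank: int) -> np.ndarray:
    """Eigenvalues of the rank-truncated DMD operator (classical batch form).

    `snapshots` holds one state per column, ordered in time.
    """
    X, Y = snapshots[:, :-1], snapshots[:, 1:]  # Y is X shifted by one step
    U, s, Vh = np.linalg.svd(X, full_matrices=False)
    U, s, Vh = U[:, :rank], s[:rank], Vh[:rank, :]
    # Project the best-fit linear map Y ~ A X onto the leading POD modes.
    A_tilde = U.conj().T @ Y @ Vh.conj().T @ np.diag(1.0 / s)
    return np.linalg.eigvals(A_tilde)


# Toy data (hypothetical): a damped rotation x_{k+1} = A x_k in two dimensions.
theta, decay = 0.3, 0.9
A = decay * np.array([[np.cos(theta), -np.sin(theta)],
                      [np.sin(theta), np.cos(theta)]])
states = [np.array([1.0, 0.0])]
for _ in range(19):
    states.append(A @ states[-1])
snapshots = np.column_stack(states)

eigs = dmd_eigenvalues(snapshots, rank=2)
print("growth per step:", np.abs(eigs))
print("phase per step:", np.angle(eigs))
```

On this noise-free linear system the DMD eigenvalues coincide with those of the generating matrix, so their magnitudes give the decay rate and their phases give the rotation angle per step.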

Preprint, 2022, arXiv

Data-Driven Robust Control for Discrete Linear Time-Invariant Systems: A Descriptor System Approach

Given the recent surge of interest in data-driven control, this paper proposes a two-step method to study robust data-driven control for a parameter-unknown linear time-invariant (LTI) system that is affected by energy-bounded noise. First, two data experiments are designed and the corresponding data are collected; the investigated system is then equivalently rewritten as a data-based descriptor system with structured parametric uncertainties. Second, using model-based control theory for descriptor systems, state feedback controllers are designed for this data-based descriptor system, which stabilize the original LTI system and guarantee the ${H_\infty}$ performance. Finally, a simulation example is provided to illustrate the effectiveness and merits of our method.

Preprint, 2022, arXiv

Domain Knowledge-Based Automated Analog Circuit Design with Deep Reinforcement Learning

The design automation of analog circuits is a longstanding challenge in the integrated circuit field. This paper presents a deep reinforcement learning method to expedite the design of analog circuits at the pre-layout stage, where the goal is to find device parameters to fulfill desired circuit specifications. Our approach is inspired by experienced human designers who rely on domain knowledge of analog circuit design (e.g., circuit topology and couplings between circuit specifications) to tackle the problem. Unlike all prior methods, our method incorporates such key domain knowledge into policy learning with a graph-based policy network, thereby best modeling the relations between circuit parameters and design targets. Experimental results on exemplary circuits show it achieves human-level design accuracy (~99%) with 1.5x the efficiency of existing best-performing methods. Our method also shows better generalization ability to unseen specifications and optimality in circuit performance optimization. Moreover, it applies to designing diverse analog circuits across different semiconductor technologies, breaking the limitations of prior ad-hoc methods in designing one particular type of analog circuit with conventional semiconductor technology.

Preprint, 2022, arXiv

Domain Knowledge-Infused Deep Learning for Automated Analog/Radio-Frequency Circuit Parameter Optimization

The design automation of analog circuits is a longstanding challenge. This paper presents a reinforcement learning method enhanced by graph learning to automate the analog circuit parameter optimization at the pre-layout stage, i.e., finding device parameters to fulfill desired circuit specifications. Unlike all prior methods, our approach is inspired by human experts who rely on domain knowledge of analog circuit design (e.g., circuit topology and couplings between circuit specifications) to tackle the problem. By incorporating such key domain knowledge into policy training with a multimodal network, the method best learns the complex relations between circuit parameters and design targets, enabling optimal decisions in the optimization process. Experimental results on exemplary circuits show it achieves human-level design accuracy (99%) with 1.5x the efficiency of existing best-performing methods. Our method also shows better generalization ability to unseen specifications and optimality in circuit performance optimization. Moreover, it applies to the design of radio-frequency circuits on emerging semiconductor technologies, breaking the limitations of prior learning methods in designing conventional analog circuits.

Preprint, 2022, arXiv

Hercules: Heterogeneity-Aware Inference Serving for At-Scale Personalized Recommendation

Personalized recommendation is an important class of deep-learning applications that powers a large collection of internet services and consumes a considerable amount of datacenter resources. As the scale of production-grade recommendation systems continues to grow, optimizing their serving performance and efficiency in a heterogeneous datacenter is important and can translate into infrastructure capacity saving. In this paper, we propose Hercules, an optimized framework for personalized recommendation inference serving that targets diverse industry-representative models and cloud-scale heterogeneous systems. Hercules performs a two-stage optimization procedure - offline profiling and online serving. The first stage searches the large under-explored task scheduling space with a gradient-based search algorithm achieving up to 9.0x latency-bounded throughput improvement on individual servers; it also identifies the optimal heterogeneous server architecture for each recommendation workload. The second stage performs heterogeneity-aware cluster provisioning to optimize resource mapping and allocation in response to fluctuating diurnal loads. The proposed cluster scheduler in Hercules achieves 47.7% cluster capacity saving and reduces the provisioned power by 23.7% over a state-of-the-art greedy scheduler.

Preprint, 2022, arXiv

Lattice Convolutional Networks for Learning Ground States of Quantum Many-Body Systems

Deep learning methods have been shown to be effective in representing ground-state wave functions of quantum many-body systems. Existing methods use convolutional neural networks (CNNs) for square lattices due to their image-like structures. For non-square lattices, existing methods use graph neural networks (GNNs), in which structure information is not precisely captured, thereby requiring additional hand-crafted sublattice encoding. In this work, we propose lattice convolutions in which a set of proposed operations are used to convert non-square lattices into grid-like augmented lattices on which regular convolution can be applied. Based on the proposed lattice convolutions, we design lattice convolutional networks (LCN) that use self-gating and attention mechanisms. Experimental results show that our method achieves performance on par with or better than existing methods on the spin-1/2 $J_1$-$J_2$ Heisenberg model over the square, honeycomb, triangular, and kagome lattices, without using hand-crafted encoding.

Preprint, 2022, arXiv

Music Influence Modeling Based on Directed Network Model

Studying the history of music may provide a glimpse into the development of human creativity as we examine the evolutionary and revolutionary trends in music and genres. First, a musical influence metric was created to construct a directed network of musical influence. Second, we examined the revolutions and development of musical genres, modeled the similarity, and explored similarities and influences within and between genres. Hierarchical cluster analysis and time series analysis of genres were used to explore the correlation between genres. Finally, Network Analysis, Semantic Analysis, and Random Forest Model are employed to find the revolutionaries. The above work was applied to Country music to sort out and analyze its evolution. In studying the connection between music and the social environment, time series analysis is used to determine the impact of social, political, or technological changes on music.

Preprint, 2022, arXiv

Neural-PIM: Efficient Processing-In-Memory with Neural Approximation of Peripherals

Processing-in-memory (PIM) architectures have demonstrated great potential in accelerating numerous deep learning tasks. Particularly, resistive random-access memory (RRAM) devices provide a promising hardware substrate to build PIM accelerators due to their abilities to realize efficient in-situ vector-matrix multiplications (VMMs). However, existing PIM accelerators suffer from frequent and energy-intensive analog-to-digital (A/D) conversions, severely limiting their performance. This paper presents a new PIM architecture to efficiently accelerate deep learning tasks by minimizing the required A/D conversions with analog accumulation and neural approximated peripheral circuits. We first characterize the different dataflows employed by existing PIM accelerators, based on which a new dataflow is proposed to remarkably reduce the required A/D conversions for VMMs by extending shift and add (S+A) operations into the analog domain before the final quantizations. We then leverage a neural approximation method to design both analog accumulation circuits (S+A) and quantization circuits (ADCs) with RRAM crossbar arrays in a highly efficient manner. Finally, we apply them to build an RRAM-based PIM accelerator (i.e., Neural-PIM) upon the proposed analog dataflow and evaluate its system-level performance. Evaluations on different benchmarks demonstrate that Neural-PIM can improve energy efficiency by 5.36x (1.73x) and speed up throughput by 3.43x (1.59x) without losing accuracy, compared to the state-of-the-art RRAM-based PIM accelerators, i.e., ISAAC (CASCADE).

Preprint, 2022, arXiv

Robust and fair work allocation

In today's digital world, interaction with online platforms is ubiquitous, and thus content moderation is important for protecting users from content that does not comply with pre-established community guidelines. Having a robust content moderation system throughout every stage of planning is particularly important. We study the short-term planning problem of allocating human content reviewers to different harmful content categories. We use tools from fair division and study the application of competitive equilibrium and leximin allocation rules. Furthermore, we incorporate into the traditional Fisher market setup novel aspects that are of practical importance. The first aspect is the forecasted workload of different content categories. We show how a formulation that is inspired by the celebrated Eisenberg-Gale program allows us to find an allocation that not only satisfies the forecasted workload, but also fairly allocates the remaining reviewing hours among all content categories. The resulting allocation is also robust, as the additional allocation provides a guardrail in cases where the actual workload deviates from the predicted workload. The second practical consideration is time-dependent allocation, motivated by the fact that partners need scheduling guidance for the reviewers across days to achieve efficiency. To address the time component, we introduce new extensions of the various fair allocation approaches for the single-time-period setting, and we show that many properties extend in essence, albeit with some modifications. Related to the time component, we additionally investigate how to satisfy markets' desire for smooth allocation (e.g., partners for content reviewers prefer an allocation that does not vary much from time to time, to minimize staffing switches). We demonstrate the performance of our proposed approaches through real-world data obtained from Meta.
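
For reference, the standard Eisenberg-Gale convex program for a Fisher market with linear utilities is written out below. This is the textbook form only; the abstract says the authors' formulation is "inspired by" it and adds forecasted-workload and time-dependent terms, which are not given in the abstract and are not reproduced here. Reading buyers $i$ as content categories and goods $j$ as pools of reviewer hours is an assumed interpretation, and the symbols $B_i$ (budget), $v_{ij}$ (value of good $j$ to buyer $i$), and $x_{ij}$ (allocated fraction) are the usual textbook notation rather than the paper's.

```latex
\begin{aligned}
\max_{x \ge 0} \quad & \sum_{i} B_i \log u_i \\
\text{s.t.} \quad & u_i = \sum_{j} v_{ij}\, x_{ij} && \text{for every buyer } i, \\
& \sum_{i} x_{ij} \le 1 && \text{for every good } j.
\end{aligned}
```

An optimal solution of this program is a competitive equilibrium allocation of the market, and the optimal dual variables of the supply constraints are the equilibrium prices, which is why the program is a natural starting point for the fair-division rules the abstract mentions.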

Preprint, 2022, arXiv

Self-Supervised Graph Neural Networks for Improved Electroencephalographic Seizure Analysis

Automated seizure detection and classification from electroencephalography (EEG) can greatly improve seizure diagnosis and treatment. However, several modeling challenges remain unaddressed in prior automated seizure detection and classification studies: (1) representing non-Euclidean data structure in EEGs, (2) accurately classifying rare seizure types, and (3) lacking a quantitative interpretability approach to measure model ability to localize seizures. In this study, we address these challenges by (1) representing the spatiotemporal dependencies in EEGs using a graph neural network (GNN) and proposing two EEG graph structures that capture the electrode geometry or dynamic brain connectivity, (2) proposing a self-supervised pre-training method that predicts preprocessed signals for the next time period to further improve model performance, particularly on rare seizure types, and (3) proposing a quantitative model interpretability approach to assess a model's ability to localize seizures within EEGs. When evaluating our approach on seizure detection and classification on a large public dataset, we find that our GNN with self-supervised pre-training achieves 0.875 Area Under the Receiver Operating Characteristic Curve on seizure detection and 0.749 weighted F1-score on seizure classification, outperforming previous methods for both seizure detection and classification. Moreover, our self-supervised pre-training strategy significantly improves classification of rare seizure types. Furthermore, quantitative interpretability analysis shows that our GNN with self-supervised pre-training precisely localizes 25.4% of focal seizures, a 21.9 point improvement over existing CNNs. Finally, by superimposing the identified seizure locations on both raw EEG signals and EEG graphs, our approach could provide clinicians with an intuitive visualization of localized seizure regions.

Preprint, 2022, arXiv

Simultaneous Detection of Optical Flares of the Magnetically Active M Dwarf Wolf 359

We present detections of stellar flares of Wolf 359, an M6.5 dwarf in the solar neighborhood (2.41 pc) known to be prone to flares due to surface magnetic activity. The observations were carried out from 2020 April 23 to 29 with a 1-m and a 0.5-m telescope separated by nearly 300 km in Xinjiang, China. In 27 hr of photometric monitoring, a total of 13 optical flares were detected, each with a total energy of $\gtrsim 5 \times 10^{29}$ erg. The measured event rate of about once every two hours is consistent with those reported previously in radio, X-ray and optical wavelengths for this star. One such flare, detected by both telescopes on 26 April, was an energetic event with a released energy of nearly $10^{33}$ erg. The two-telescope lightcurves of this major event, sampled at different cadences and exposure timings, enabled us to better estimate the intrinsic flare profile, which reached a peak of up to 1.6 times the stellar quiescent brightness and would otherwise have been underestimated at the observed flare amplitudes of about $0.4$ and $0.8$, respectively, with single telescopes alone. The compromise between fast sampling to resolve a flare profile and a longer integration time for higher photometric signal-to-noise provides useful guidance for the experimental design of future flare observations.

Preprint, 2021, arXiv

Control Reconfiguration of Dynamical Systems for Improved Performance via Reverse- and Forward-engineering

This paper presents a control reconfiguration approach to improve the performance of two classes of dynamical systems. Motivated by recent research on re-engineering cyber-physical systems, we propose a three-step control retrofit procedure. First, we reverse-engineer a dynamical system to dig out an optimization problem it actually solves. Second, we forward-engineer the system by applying a corresponding faster algorithm to solve this optimization problem. Finally, by comparing the original and accelerated dynamics, we obtain the implementation of the redesigned part (the extra dynamics). As a result, the convergence rate/speed or transient behavior of the given system can be improved while the system control structure is maintained. Internet congestion control and distributed proportional-integral (PI) control, as applications in the two different classes of target systems, are used to show the effectiveness of the proposed approach.

Preprint, 2021, arXiv

Data-Driven Controllability Analysis and Stabilization for Linear Descriptor Systems

For a parameter-unknown linear descriptor system, this paper proposes data-driven methods to test the system's type and controllability and then to stabilize it. First, a data-based condition is developed to identify whether this unknown system is a descriptor system or is equivalent to a normal system. Furthermore, various controllability concepts are tested by replacing the descriptor system's matrices with data. Finally, a data-based decomposition method is proposed to transform the nominal system into its slow-fast subsystem form, so that a state feedback controller for the slow subsystem can be obtained from persistently exciting input and state sequences. Meanwhile, because the nominal system and its slow subsystem are equivalent in stabilizability, a state feedback controller that stabilizes the nominal system is also obtained. A simulation example is provided to illustrate the effectiveness of those methods.

Preprint, 2021, arXiv

Disentangled Recurrent Wasserstein Autoencoder

Learning disentangled representations leads to interpretable models and facilitates data generation with style transfer, which has been extensively studied on static data such as images in an unsupervised learning framework. However, only a few works have explored unsupervised disentangled sequential representation learning due to challenges of generating sequential data. In this paper, we propose recurrent Wasserstein Autoencoder (R-WAE), a new framework for generative modeling of sequential data. R-WAE disentangles the representation of an input sequence into static and dynamic factors (i.e., time-invariant and time-varying parts). Our theoretical analysis shows that, R-WAE minimizes an upper bound of a penalized form of the Wasserstein distance between model distribution and sequential data distribution, and simultaneously maximizes the mutual information between input data and different disentangled latent factors, respectively. This is superior to (recurrent) VAE which does not explicitly enforce mutual information maximization between input data and disentangled latent representations. When the number of actions in sequential data is available as weak supervision information, R-WAE is extended to learn a categorical latent representation of actions to improve its disentanglement. Experiments on a variety of datasets show that our models outperform other baselines with the same settings in terms of disentanglement and unconditional video generation both quantitatively and qualitatively.

preprint2021arXiv

Millimeter-sized Dust Grains Appear Surviving the Water-sublimating Temperature in the Inner 10 au of the FU Ori Disk

Previous observations have shown that the $\lesssim$10 au, $\gtrsim$400 K hot inner disk of the archetypal accretion outburst young stellar object, FU Ori, is dominated by viscous heating. To constrain dust properties in this region, we have performed radio observations toward this disk using the Karl G. Jansky Very Large Array (JVLA) in 2020 June-July, September, and November. We also performed complementary optical photometric monitoring observations. We found that the dust thermal emission from the hot inner disk mid-plane of FU Ori has been approximately stationary and the maximum dust grain size is $\gtrsim$1.6 mm in this region. If the hot inner disk of FU Ori, which is inward of the 150-170 K water snowline, is turbulent (e.g., corresponding to a Sunyaev & Shakura viscous $α_{t}\gtrsim$0.1), or if the actual maximum grain size is still larger than the lower limit we presently constrain, then, as suggested by recent analytical calculations and laboratory measurements, water-ice free dust grains may be stickier than water-ice coated dust grains in protoplanetary disks. Additionally, we find that the free-free emission and the Johnson B- and V-band magnitudes of these binary stars brightened during 2016-2020. The optical and radio variability might be related to the dynamically evolving protostellar or disk accretion activities. Our results highlight that hot inner disks of outbursting objects are important laboratories for testing models of dust grain growth. Given the active nature of such systems, it is important to carry out coordinated multi-wavelength radio observations to robustly diagnose the maximum dust grain sizes.

preprint2021arXiv

On the Convergence of Tsetlin Machines for the XOR Operator

The Tsetlin Machine (TM) is a novel machine learning algorithm with several distinct properties, including transparent inference and learning using hardware-near building blocks. Although numerous papers explore the TM empirically, many of its properties have not yet been analyzed mathematically. In this article, we analyze the convergence of the TM when input is non-linearly related to output by the XOR operator. Our analysis reveals that the TM, with just two conjunctive clauses, can converge almost surely to reproducing XOR, learning from training data over an infinite time horizon. Furthermore, the analysis shows how the hyper-parameter T guides clause construction so that the clauses capture the distinct sub-patterns in the data. Our analysis of convergence for XOR thus lays the foundation for analyzing other, more complex logical expressions. Taken together, these analyses provide new mathematical insight into why TMs have obtained state-of-the-art performance on several pattern recognition problems.
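As a small illustration of the target that the two clauses are said to converge to, XOR can be written as the disjunction of two conjunctive clauses, one per sub-pattern. The sketch below is not the Tsetlin Machine learning procedure; the function names are hypothetical, and it only checks that such a pair of clauses reproduces XOR.

```python
from itertools import product


def clause_a(x1: int, x2: int) -> int:
    """Conjunctive clause for the sub-pattern (1, 0): x1 AND (NOT x2)."""
    return x1 & (1 - x2)


def clause_b(x1: int, x2: int) -> int:
    """Conjunctive clause for the sub-pattern (0, 1): (NOT x1) AND x2."""
    return (1 - x1) & x2


def xor_from_clauses(x1: int, x2: int) -> int:
    """The output is 1 when either clause fires."""
    return clause_a(x1, x2) | clause_b(x1, x2)


# Evaluate the clause pair on all four binary inputs.
truth_table = {
    (x1, x2): xor_from_clauses(x1, x2)
    for x1, x2 in product((0, 1), repeat=2)
}
```

Each clause covers exactly one of the two inputs on which XOR is 1, which is the sense in which the clauses capture distinct sub-patterns.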

preprint2020arXiv

Fluctuations of ergodic sums on periodic orbits under specification

We study the fluctuations of ergodic sums using global and local specifications on periodic points. We obtain Lindeberg-type central limit theorems in both situations. As an application, when the system possesses a unique measure of maximal entropy, we show weak convergence of ergodic sums to a mixture of normal distributions. Our results suggest decomposing the variances of ergodic sums according to global and local sources.

preprint2020arXiv

Legal Assignments and fast EADAM with consent via classical theory of stable matchings

Gale and Shapley's stable assignment problem has been extensively studied, applied, and extended. In the context of school choice, mechanisms often aim at finding an assignment that is more favorable to students. We investigate two extensions introduced in this framework -- legal assignments and the EADAM algorithm -- through the lens of the classical theory of stable matchings. In any instance, the set ${\cal L}$ of legal assignments is known to contain all stable assignments. We prove that ${\cal L}$ is exactly the set of stable assignments in another instance. Moreover, we show that essentially all optimization problems over ${\cal L}$ can be solved within the same time bound needed for solving them over the set of stable assignments. A key tool for this latter result is an algorithm that finds the student-optimal legal assignment. We then generalize our algorithm to obtain the assignment output of EADAM with any given set of consenting students without sacrificing the running time, hence largely improving in both theory and practice over known algorithms. Lastly, we show that the set ${\cal L}$ can be much larger than the set of stable matchings, connecting legal matchings with certain concepts and open problems in the literature.

preprint2020arXiv

Machine Translation System Selection from Bandit Feedback

Adapting machine translation systems in the real world is a difficult problem. In contrast to offline training, users cannot provide the type of fine-grained feedback (such as correct translations) typically used for improving the system. Moreover, different users have different translation needs, and even a single user's needs may change over time. In this work we take a different approach, treating the problem of adaptation as one of selection. Instead of adapting a single system, we train many translation systems using different architectures, datasets, and optimization methods. Using bandit learning techniques on simulated user feedback, we learn a policy to choose which system to use for a particular translation task. We show that our approach can (1) quickly adapt to address domain changes in translation tasks, (2) outperform the single best system in mixed-domain translation tasks, and (3) make effective instance-specific decisions when using contextual bandit strategies.
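The selection idea can be sketched with a simple non-contextual bandit. The example below uses epsilon-greedy, which is one standard bandit strategy and not necessarily the one used in the paper; the per-system quality values, the binary reward, and all names are assumptions made for illustration, standing in for simulated user feedback.

```python
import random


def select_system(true_quality, rounds=2000, epsilon=0.1, seed=0):
    """Epsilon-greedy choice among candidate systems.

    true_quality[i] is a hypothetical probability that system i's output
    receives positive (binary) feedback from a simulated user.
    """
    rng = random.Random(seed)
    n = len(true_quality)
    counts = [0] * n    # how often each system was chosen
    means = [0.0] * n   # running mean of observed feedback per system
    for _ in range(rounds):
        if rng.random() < epsilon:
            arm = rng.randrange(n)                       # explore
        else:
            arm = max(range(n), key=means.__getitem__)   # exploit
        reward = 1.0 if rng.random() < true_quality[arm] else 0.0
        counts[arm] += 1
        means[arm] += (reward - means[arm]) / counts[arm]  # incremental mean
    return counts, means


counts, means = select_system([0.3, 0.5, 0.8])
```

A contextual variant would condition the choice on features of the sentence to be translated, which is what allows instance-specific decisions.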

preprint2020arXiv

OralCam: Enabling Self-Examination and Awareness of Oral Health Using a Smartphone Camera

Due to a lack of medical resources or oral health awareness, oral diseases are often left unexamined and untreated, affecting a large population worldwide. With the advent of low-cost, sensor-equipped smartphones, mobile apps offer a promising possibility for promoting oral health. However, to the best of our knowledge, no mobile health (mHealth) solution can directly support users in self-examining their oral health condition. This paper presents OralCam, the first interactive app that enables end-users to self-examine five common oral conditions (diseases or early disease signals) by taking smartphone photos of their oral cavity. OralCam allows a user to annotate additional information (e.g., living habits, pain, and bleeding) to augment the input image, and presents the output hierarchically, probabilistically, and with visual explanations to help a lay user understand the examination results. Developed on our in-house dataset of 3,182 oral photos annotated by dental experts, our deep learning based framework achieved an average detection sensitivity of 0.787 over five conditions with high localization accuracy. In a week-long in-the-wild user study (N=18), most participants had no trouble using OralCam and interpreting the examination results. Two expert interviews further validate the feasibility of OralCam for promoting users' awareness of oral health.

preprint2020arXiv

Quaternion Product Units for Deep Learning on 3D Rotation Groups

We propose a novel quaternion product unit (QPU) to represent data on 3D rotation groups. The QPU leverages quaternion algebra and the law of the 3D rotation group, representing 3D rotation data as quaternions and merging them via a weighted chain of Hamilton products. We prove that the representations derived by the proposed QPU can be disentangled into "rotation-invariant" features and "rotation-equivariant" features, which supports the rationality and the efficiency of the QPU in theory. We design quaternion neural networks based on our QPUs and make our models compatible with existing deep learning models. Experiments on both synthetic and real-world data show that the proposed QPU is beneficial for learning tasks that require rotation robustness.
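For readers unfamiliar with the operation, the Hamilton product that the QPU chains together can be sketched as follows. This shows only the standard quaternion product and an unweighted chain; the weighting used by the QPU is omitted, and the function names are hypothetical.

```python
from functools import reduce


def hamilton(p, q):
    """Hamilton product of quaternions p and q, each given as (w, x, y, z)."""
    a1, b1, c1, d1 = p
    a2, b2, c2, d2 = q
    return (
        a1 * a2 - b1 * b2 - c1 * c2 - d1 * d2,
        a1 * b2 + b1 * a2 + c1 * d2 - d1 * c2,
        a1 * c2 - b1 * d2 + c1 * a2 + d1 * b2,
        a1 * d2 + b1 * c2 - c1 * b2 + d1 * a2,
    )


def chain(quaternions):
    """Left-to-right product of a sequence of quaternions (unweighted)."""
    identity = (1, 0, 0, 0)
    return reduce(hamilton, quaternions, identity)


# Basis quaternions i, j, k.
I = (0, 1, 0, 0)
J = (0, 0, 1, 0)
K = (0, 0, 0, 1)
```

The product is not commutative (`hamilton(I, J)` and `hamilton(J, I)` differ in sign), so the order of the chain matters.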

preprint2020arXiv

Site testing campaign for the Large Optical/infrared Telescope of China: Overview

The Large Optical/infrared Telescope (LOT) is a ground-based optical/infrared telescope with a 12 m diameter that is proposed to be built in the western part of China in the next decade. Based on satellite remote sensing data, along with geographical, logistical and political considerations, three candidate sites were chosen for ground-based astronomical performance monitoring: Ali in Tibet, Daocheng in Sichuan, and Muztagh Ata in Xinjiang. To date, all three sites have continuously collected data for two years. In this paper, we introduce this site testing campaign and present the monitoring results obtained between March 2017 and March 2019.

preprint2020arXiv

Site-testing at Muztagh-ata site I: Ground Meteorology and Sky Brightness

Site testing is crucial to achieving scientific goals, and analyzing meteorological and optical observing conditions is one of its basic tasks. As one of three potential sites to host the 12-meter Large Optical/infrared Telescope (LOT), the Muztagh-ata site, located on the Pamirs Plateau in Xinjiang in western China, began its site-testing task in the spring of 2017. In this paper, we first introduce the site and then present a statistical analysis of ground-level meteorological properties such as air temperature, barometric pressure, relative humidity, and wind speed and direction, recorded over two years by an automatic weather station with standard meteorological sensors. We also show the monitoring results of sky brightness during this period.

preprint2020arXiv

Site-testing at Muztagh-ata site II: Seeing statistics

In this article, we present a detailed analysis of the statistical properties of seeing at the Muztagh-ata site, a candidate site for hosting the future Chinese Large Optical/infrared Telescope (LOT) project. The measurements were obtained with a Differential Image Motion Monitor (DIMM) from April 2017 to November 2018 at different heights during different periods. The median seeing values at 11 meters and 6 meters are very close but differ significantly from that on the ground. We mainly analyzed the seeing at 11 meters by month and by hour, and found that the best season for observing was from late autumn to early winter and that seeing tended to improve during the night only in autumn. We also analyzed the dependence of seeing on temperature inversion and on wind speed and direction, and give the best meteorological conditions for seeing.

preprint2020arXiv

Stable intense 1 kHz supercontinuum light generation in air

Supercontinuum (SC) light sources have advanced ultrafast laser spectroscopy in condensed matter science, biology, physics, and chemistry. Compared to the frequently used photonic crystal fibers and bulk materials, femtosecond laser filamentation in gases is damage-immune for supercontinuum generation. A bottleneck is the strong jitter from filament-induced self-heating at kHz repetition rates. We demonstrate stable kHz supercontinuum generation directly in air with multi-mJ-level pulse energy. This is achieved by applying an external DC electric field to the air plasma filament through the effects of plasma wave guiding and Coulomb interaction. Both the pointing and intensity jitters of the SC light induced by a 1 kHz air filament are reduced by more than twofold. This offers opportunities for stable intense SC generation and other laser filament based applications in air.

preprint2020arXiv

The Architectural Implications of Facebook's DNN-based Personalized Recommendation

The widespread application of deep learning has changed the landscape of computation in the data center. In particular, personalized recommendation for content ranking is now largely accomplished leveraging deep neural networks. However, despite the importance of these models and the amount of compute cycles they consume, relatively little research attention has been devoted to systems for recommendation. To facilitate research and to advance the understanding of these workloads, this paper presents a set of real-world, production-scale DNNs for personalized recommendation coupled with relevant performance metrics for evaluation. In addition to releasing a set of open-source workloads, we conduct in-depth analysis that underpins future system design and optimization for at-scale recommendation: Inference latency varies by 60% across three Intel server generations, batching and co-location of inferences can drastically improve latency-bounded throughput, and the diverse composition of recommendation models leads to different optimization strategies.

preprint2020arXiv

The Wide-field Photometric System of the Nanshan One-meter Telescope

The Nanshan One-meter Wide-field Telescope (NOWT) is a prime focus system located at Nanshan Station of Xinjiang Astronomical Observatories (XAO). The field of view (FOV) was designed to be 1.5 degree × 1.5 degree, and the Johnson-Cousins UBVRI system was chosen as the main filter set. The telescope has been providing observation services for astronomers since Sept. 2013. Variable source searching and time-domain surveys are the main scientific goals. The system's test results are reported, including linearity, dark current, bias, readout noise and gain of the CCD camera. Accurate instrumental calibration coefficients in the UBVRI bands were derived with Landolt standard stars during photometric nights. Finally, the limiting magnitudes are given for various signal-to-noise ratios and exposure times for observers.

preprint2019arXiv

Boundary Zonal Flow in Rotating Turbulent Rayleigh-Bénard Convection

For rapidly rotating turbulent Rayleigh-Bénard convection in a slender cylindrical cell, experiments and direct numerical simulations reveal a boundary zonal flow (BZF) that replaces the classical large-scale circulation. The BZF is located near the vertical side wall and enables enhanced heat transport there. Although the azimuthal velocity of the BZF is cyclonic (in the rotating frame), the temperature is an anticyclonic traveling wave of mode one whose signature is a bimodal temperature distribution near the radial boundary. The BZF width is found to scale like $Ra^{1/4}Ek^{2/3}$, where the Ekman number $Ek$ decreases with increasing rotation rate.
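To make the quoted scaling concrete, the sketch below computes how the BZF width would change when $Ra$ and $Ek$ are rescaled, assuming the width is exactly proportional to $Ra^{1/4}Ek^{2/3}$ with an unknown prefactor that cancels in the ratio. The function name is hypothetical.

```python
def bzf_width_ratio(ra_ratio: float, ek_ratio: float) -> float:
    """Ratio of BZF widths for two states, assuming width ~ Ra^(1/4) * Ek^(2/3).

    ra_ratio is Ra_new / Ra_old and ek_ratio is Ek_new / Ek_old.
    """
    if ra_ratio <= 0 or ek_ratio <= 0:
        raise ValueError("ratios must be positive")
    return ra_ratio ** 0.25 * ek_ratio ** (2.0 / 3.0)


# Faster rotation lowers Ek; an eightfold decrease in Ek at fixed Ra
# narrows the BZF under this scaling.
narrowing = bzf_width_ratio(1.0, 1.0 / 8.0)
```

Under this assumption the width depends more strongly on rotation than on thermal driving, since the exponent on $Ek$ is larger than the one on $Ra$.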

preprint2019arXiv

RecNMP: Accelerating Personalized Recommendation with Near-Memory Processing

Personalized recommendation systems leverage deep learning models and account for the majority of data center AI cycles. Their performance is dominated by memory-bound sparse embedding operations with unique irregular memory access patterns that pose a fundamental challenge to accelerate. This paper proposes a lightweight, commodity DRAM compliant, near-memory processing solution to accelerate personalized recommendation inference. The in-depth characterization of production-grade recommendation models shows that embedding operations with high model-, operator- and data-level parallelism lead to memory bandwidth saturation, limiting recommendation inference performance. We propose RecNMP which provides a scalable solution to improve system throughput, supporting a broad range of sparse embedding models. RecNMP is specifically tailored to production environments with heavy co-location of operators on a single server. Several hardware/software co-optimization techniques such as memory-side caching, table-aware packet scheduling, and hot entry profiling are studied, resulting in up to 9.8x memory latency speedup over a highly-optimized baseline. Overall, RecNMP offers 4.2x throughput improvement and 45.8% memory energy savings.

preprint2016arXiv

Forward model with space-variant of source size for reconstruction on x-ray radiographic image

The forward imaging technique is the basis of a combined method for density reconstruction that couples forward calculation with the solution of the inverse problem. In this paper, we introduce the projection equation for a radiographic system with areal source blur and detector blur, obtain the projection matrix from any point source to any detector pixel with an x-ray tracing technique, propose the idea of gridding the areal source into many point sources with different weights, and use a blurring window to model the effect of the detector blur. We use the forward projection equation to obtain the same deviation information about the object edge as in the experimental image. Our forward projection equation is combined with the Constrained Conjugate Gradient method to form a new method for density reconstruction, XTRACE-CCG. The new method was applied to a simulated image of the French Test Object and to an experimental image. Both cases lead to the same conclusion: the range affected by the blur is reduced and can be limited to one or two pixels. The method is also suitable for reconstructing density-variant objects. The capability of our method to handle blur effects is useful for all radiographic systems whose source size is large compared with the pixel size.

preprint2016arXiv

Magnetic fingerprint of interfacial coupling between CoFe and nanoscale ferroelectric domain walls

Magnetoelectric coupling in ferromagnet/multiferroic systems is often manifested in the exchange bias effect, which may have combined contributions from multiple sources, such as domain walls, chemical defects or strain. In this study we magnetically "fingerprint" the coupling behavior of CoFe grown on epitaxial BiFeO3 (BFO) thin films by magnetometry and first-order-reversal-curves (FORC). The contribution to exchange bias from 71°, 109° and charged ferroelectric domain walls (DWs) was elucidated by the FORC distribution. CoFe samples grown on BFO with 71° DWs only exhibit an enhancement of the coercivity, but little exchange bias. Samples grown on BFO with 109° DWs and mosaic DWs exhibit a much larger exchange bias, with the main enhancement attributed to 109° and charged DWs. Based on the Malozemoff random field model, a varying-anisotropy model is proposed to account for the exchange bias enhancement. This work sheds light on the relationship between the exchange bias effect of the CoFe/BFO heterointerface and the ferroelectric DWs, and provides a path for multiferroic device analysis and design.

preprint2016arXiv

The laboratory measurement of radioactivity purification for Pb212 in liquid scintillator

Liquid scintillator (LS) has been widely used in past, running and future neutrino experiments, and the requirements on LS radio-purity are increasingly stringent. Water extraction is a powerful method to remove soluble radioactive nuclei, and a mini-extraction station has been constructed. To evaluate the extraction efficiency and optimize the operating parameters, we developed a setup to load radioactivity into LS and a laboratory-scale setup to measure radioactivity using the $^{212}$Bi-$^{212}$Po-$^{208}$Pb cascade decay. Experience from the laboratory study will be useful for designing large-scale water extraction plants and optimizing their operation in the future.

preprint2015arXiv

p/${π^+}$ response of single layer THGEM detector in Ar/3%iC$_4$H$_{10}$

In this work, we study the response of a single layer Thick GEM (THGEM) detector to p/${π^+}$ at the E3 line of the Beijing Test Beam Facility. In our experiment, the drift gap of the THGEM detector is 4 mm, and the working gas is Ar/3% iC$_4$H$_{10}$. The results show that, for momenta from 500 MeV/c to 1000 MeV/c, the detection efficiency for p ranges from 93% to 99% at a relatively low gain ($\sim$2000), while the detection efficiency for ${π^+}$ is slightly lower, ranging from 82% to 88%. Simple Geant4 simulations have also been performed, and the beam test results are largely consistent with them. We preliminarily study the feasibility of THGEM detectors as sampling elements for a Digital Hadronic Calorimeter (DHCAL), which may provide a reference for the possible application of THGEMs in the Circular Electron Positron Collider (CEPC) HCAL.

preprint2015arXiv

Preliminary study of light yield dependence on LAB liquid scintillator composition

Liquid scintillator (LS) will be adopted as the detector material in JUNO (Jiangmen Underground Neutrino Observatory). The energy resolution requirement of JUNO is 3%, which has never previously been reached. To achieve this energy resolution, the light yield of the liquid scintillator is an important factor. PPO (the fluor) and bis-MSB (the wavelength shifter) are the two main materials dissolved in LAB. To study the influence of these two materials on the transmission of scintillation photons in LS, quartz vessels 25 cm and 12 cm long were used in a light yield experiment. LS samples with different concentrations of PPO and bis-MSB were tested. At these lengths, the light yield growth is not obvious when the concentration of PPO is higher than 4 g/L. The influence of bis-MSB becomes insignificant when its concentration is higher than 8 mg/L. This result could provide some useful suggestions for the JUNO LS.

preprint2015arXiv

Spectroscopic study of light scattering in linear alkylbenzene for liquid scintillator neutrino detectors

We have set up a light scattering spectrometer to study the depolarization of light scattering in linear alkylbenzene. From the scattering spectra it can be unambiguously shown that the depolarized part of the light scattering belongs to Rayleigh scattering. The additional depolarized Rayleigh scattering can make the effective transparency of linear alkylbenzene much better than expected. Therefore, sufficient scintillation photons can be transmitted through the large liquid scintillator detector of JUNO. Our study is crucial to achieving the unprecedented energy resolution of $3\%/\sqrt{E\,\mathrm{(MeV)}}$ that the JUNO experiment needs to determine the neutrino mass hierarchy. The spectroscopic method can also be used to identify the origin of the depolarization in other organic solvents used in neutrino experiments.
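As a worked reading of the quoted resolution target, the sketch below evaluates a purely stochastic resolution of the form $a/\sqrt{E}$ with $a = 3\%$ and $E$ in MeV. Treating the resolution as this single term, with no constant or noise contributions, is a simplifying assumption, and the function name is hypothetical.

```python
import math


def relative_resolution(energy_mev: float, stochastic_term: float = 0.03) -> float:
    """Relative energy resolution sigma_E / E = a / sqrt(E), with E in MeV.

    Only the stochastic term is modeled here; real detectors also have
    constant and noise terms, which this sketch ignores.
    """
    if energy_mev <= 0:
        raise ValueError("energy must be positive")
    return stochastic_term / math.sqrt(energy_mev)


# The relative resolution at 1 MeV equals the stochastic term itself,
# and it halves when the energy is quadrupled.
at_1_mev = relative_resolution(1.0)
at_4_mev = relative_resolution(4.0)
```

Because the stochastic term shrinks with the number of detected photons, the light yield and transparency of the scintillator bear directly on whether such a target is reachable.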

preprint2014arXiv

Aging research of the LAB-based liquid scintillator in stainless steel container

Stainless steel is the material used for the storage vessels and piping systems of the LAB-based liquid scintillator in the JUNO experiment. Aging is recognized as one of the main degradation mechanisms affecting the properties of liquid scintillator. Aging experiments on LAB-based liquid scintillator were carried out in containers of different materials (type 316 and 304 stainless steel and glass) at two different temperatures (40 and 25 degrees Celsius). In continuous tests of the liquid scintillator properties, the light yield and the absorption spectrum of the aged samples are nearly the same as those of the unaged sample. The attenuation length of the aged samples is 6% to 12% shorter than that of the unaged sample, but the concentration of Fe in the LAB-based liquid scintillator does not show a clear change. Self-aging therefore has only a small effect on the liquid scintillator, as does quenching by stainless steel impurities. Type 316 and 304 stainless steel can be used as vessel and transport pipeline materials for LAB-based liquid scintillator.

preprint2014arXiv

Simulation of background reduction and Compton depression in low-background HPGe spectrometer at a surface laboratory

High-purity germanium (HPGe) detectors are well suited to analyzing the radioactivity of samples. To reduce the environmental background, low-activity lead and oxygen-free copper are installed outside the probe to shield gammas, the outermost layer is a plastic scintillator that vetoes cosmic rays, and an anti-Compton detector improves the Peak-to-Compton ratio. Using the GEANT4 tools and taking into account a detailed description of the detector, we optimize the sizes of the detectors to reach the design specifications. A set of experimental data from an HPGe spectrometer currently in use was compared with the simulation. For the simulation of the new HPGe detector, considering different thicknesses of the BGO crystals and different anti-coincidence efficiencies, the results show that the optimal thickness is 5.5 cm, and the Peak-to-Compton ratio of $^{40}$K is raised to 1000 when the anti-coincidence efficiency is 0.85. In the background simulation, 15 cm of oxygen-free copper plus 10 cm of lead can reduce the environmental gamma rays to 0.0024 cps/100 cm$^3$ Ge (50 keV to 2.8 MeV), which is about $10^{-5}$ of the environmental background.
