Source author record

Fan Zhou

Fan Zhou appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher · Unclaimed source record

Catalog footprint

What is connected

29 works

22 topics

4 close collaborators


preprint · 2026 · arXiv

Beyond Retrieval: A Multitask Benchmark and Model for Code Search

Code search has usually been evaluated as first-stage retrieval, even though production systems rely on broader pipelines with reranking and developer-style queries. Existing benchmarks also suffer from data contamination, label noise, and degenerate binary relevance. In this paper, we introduce CoREB, a contamination-limited, multitask code retrieval and reranking benchmark, together with a fine-tuned code reranker, that goes beyond retrieval to cover the full code search pipeline. CoREB is built from counterfactually rewritten LiveCodeBench problems in five programming languages and delivered as timed releases with graded relevance judgments. We benchmark eleven embedding models and five rerankers across three tasks: text-to-code, code-to-text, and code-to-code. Our experiments reveal that: (1) code-specialised embeddings dominate code-to-code retrieval (roughly 2× over general encoders), yet no single model wins all three tasks; (2) short keyword queries, the format closest to real developer search, collapse every model to near-zero nDCG@10; (3) off-the-shelf rerankers are task-asymmetric, with a 12-point swing on code-to-code and no baseline net-positive across all tasks; (4) our fine-tuned CoREB-Reranker is the first to achieve consistent gains across all three tasks. The data and model are released.
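
For readers unfamiliar with the metric quoted in this abstract, the following is a minimal sketch of nDCG@10 over graded relevance judgments. It is not code from the paper: the function names are hypothetical, the gain is assumed to be linear in the relevance grade (some evaluations use an exponential gain instead), and the ideal ranking is assumed to be computed from the same list of judged candidates.

```python
import math

def dcg_at_k(relevances, k=10):
    """Discounted cumulative gain over the first k graded relevance labels.

    Assumes linear gain: each label is divided by log2(rank + 1), rank starting at 1.
    """
    return sum(rel / math.log2(pos + 2) for pos, rel in enumerate(relevances[:k]))

def ndcg_at_k(ranked_relevances, k=10):
    """nDCG@k: DCG of the given ranking divided by the DCG of the ideal ordering.

    ranked_relevances holds the graded labels of the candidates in ranked order;
    the ideal ordering is assumed to be the same labels sorted in descending order.
    """
    ideal = dcg_at_k(sorted(ranked_relevances, reverse=True), k)
    if ideal == 0:
        return 0.0  # no relevant candidate at all
    return dcg_at_k(ranked_relevances, k) / ideal

if __name__ == "__main__":
    # A ranking that puts the only relevant item second scores below 1.0.
    print(ndcg_at_k([0, 3]))
```

A ranking that already lists candidates in descending order of relevance scores exactly 1.0, and a ranking with no relevant candidate scores 0.0, which is the "near-zero" regime the abstract describes for keyword queries.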

preprint · 2026 · arXiv

Decompose to Understand, Fuse to Detect: Frequency-Decoupled Anomaly Detection for Encrypted Network Traffic

Network traffic anomaly detection represents a critical cybersecurity task, yet widespread encryption makes this task increasingly challenging. In response, image-based methods that model traffic as visual patterns have emerged as the dominant approach. However, this work pioneers the identification of a pervasive "full-frequency" characteristic and an associated limitation termed "spectral mismatch" within this paradigm. Specifically, while encrypted traffic exhibits prominent high-frequency components, mainstream reconstruction methods demonstrate an inherent bias toward learning low-frequency information. This fundamental mismatch results in incomplete representations that consequently degrade anomaly detection performance. To address this challenge, we propose FreeUp, a novel frequency-decoupled framework designed explicitly for encrypted traffic analysis. FreeUp decomposes traffic data into distinct low- and high-frequency bands, processing them through separate, dedicated branches along with a customized training strategy that ensures stable and independent frequency-specific learning. Furthermore, recognizing that simple reconstruction error proves inadequate for evaluating dual-branch architectures, we introduce an uncertainty-inspired fusion scoring mechanism. This mechanism quantifies the reconstruction uncertainty of the frequency-specific branches and dynamically integrates their outputs, yielding a more comprehensive and reliable anomaly score. Extensive experiments across multiple benchmarks demonstrate that FreeUp consistently outperforms state-of-the-art baselines. The code is available at https://github.com/ikun0124/FreeUp.

preprint · 2024 · arXiv

DB-GPT: Empowering Database Interactions with Private Large Language Models

The recent breakthroughs in large language models (LLMs) are positioned to transition many areas of software. Database technologies particularly have an important entanglement with LLMs as efficient and intuitive database interactions are paramount. In this paper, we present DB-GPT, a revolutionary and production-ready project that integrates LLMs with traditional database systems to enhance user experience and accessibility. DB-GPT is designed to understand natural language queries, provide context-aware responses, and generate complex SQL queries with high accuracy, making it an indispensable tool for users ranging from novice to expert. The core innovation in DB-GPT lies in its private LLM technology, which is fine-tuned on domain-specific corpora to maintain user privacy and ensure data security while offering the benefits of state-of-the-art LLMs. We detail the architecture of DB-GPT, which includes a novel retrieval augmented generation (RAG) knowledge system, an adaptive learning mechanism to continuously improve performance based on user feedback and a service-oriented multi-model framework (SMMF) with powerful data-driven agents. Our extensive experiments and user studies confirm that DB-GPT represents a paradigm shift in database interactions, offering a more natural, efficient, and secure way to engage with data repositories. The paper concludes with a discussion of the implications of DB-GPT framework on the future of human-database interaction and outlines potential avenues for further enhancements and applications in the field. The project code is available at https://github.com/eosphoros-ai/DB-GPT. Experience DB-GPT for yourself by installing it with the instructions https://github.com/eosphoros-ai/DB-GPT#install and view a concise 10-minute video at https://www.youtube.com/watch?v=KYs4nTDzEhk.

preprint · 2023 · arXiv

Gap Minimization for Knowledge Sharing and Transfer

Learning from multiple related tasks by knowledge sharing and transfer has become increasingly relevant over the last two decades. In order to successfully transfer information from one task to another, it is critical to understand the similarities and differences between the domains. In this paper, we introduce the notion of performance gap, an intuitive and novel measure of the distance between learning tasks. Unlike existing measures which are used as tools to bound the difference of expected risks between tasks (e.g., $\mathcal{H}$-divergence or discrepancy distance), we theoretically show that the performance gap can be viewed as a data- and algorithm-dependent regularizer, which controls the model complexity and leads to finer guarantees. More importantly, it also provides new insights and motivates a novel principle for designing strategies for knowledge sharing and transfer: gap minimization. We instantiate this principle with two algorithms: (1) gapBoost, a novel and principled boosting algorithm that explicitly minimizes the performance gap between source and target domains for transfer learning; and (2) gapMTNN, a representation learning algorithm that reformulates gap minimization as semantic conditional matching for multitask learning. Our extensive evaluation on both transfer learning and multitask learning benchmark data sets shows that our methods outperform existing baselines.

preprint · 2023 · arXiv

Value Enhancement of Reinforcement Learning via Efficient and Robust Trust Region Optimization

Reinforcement learning (RL) is a powerful machine learning technique that enables an intelligent agent to learn an optimal policy that maximizes the cumulative rewards in sequential decision making. Most methods in the existing literature are developed in online settings where the data are easy to collect or simulate. Motivated by high-stakes domains such as mobile health studies with limited and pre-collected data, in this paper, we study offline reinforcement learning methods. To efficiently use these datasets for policy optimization, we propose a novel value enhancement method to improve the performance of a given initial policy computed by existing state-of-the-art RL algorithms. Specifically, when the initial policy is not consistent, our method will output a policy whose value is no worse and often better than that of the initial policy. When the initial policy is consistent, under some mild conditions, our method will yield a policy whose value converges to the optimal one at a faster rate than the initial policy, achieving the desired "value enhancement" property. The proposed method is generally applicable to any parametrized policy that belongs to a certain pre-specified function class (e.g., deep neural networks). Extensive numerical studies are conducted to demonstrate the superior performance of our method.

preprint · 2022 · arXiv

Automatic Registration of Images with Inconsistent Content Through Line-Support Region Segmentation and Geometrical Outlier Removal

The implementation of automatic image registration is still difficult in various applications. In this paper, an automatic image registration approach through line-support region segmentation and geometrical outlier removal (ALRS-GOR) is proposed. This new approach is designed to address the problems associated with the registration of images with affine deformations and inconsistent content, such as remote sensing images with different spectral content or noise interference, or map images with inconsistent annotations. To begin with, line-support regions, namely straight regions whose points share roughly the same image gradient angle, are extracted to address the issues of inconsistent content existing in images. To alleviate the incompleteness of line segments, an iterative strategy with multi-resolution is employed to preserve global structures that are masked at full resolution by image details or noise. Then, Geometrical Outlier Removal (GOR) is developed to provide reliable feature point matching, which is based on affine-invariant geometrical classifications for corresponding matches initialized by SIFT. The candidate outliers are selected by comparing the disparity of accumulated classifications among all matches, unlike conventional methods, which rely only on local geometrical relations. Various image sets have been considered in this paper for the evaluation of the proposed approach, including aerial images with simulated affine deformations, remote sensing optical and synthetic aperture radar images taken in different situations (multispectral, multisensor, and multitemporal), and map images with inconsistent annotations. Experimental results demonstrate the superior performance of the proposed method over the existing approaches for the whole data set.

preprint · 2022 · arXiv

Domain Generalization via Optimal Transport with Metric Similarity Learning

Generalizing knowledge to unseen domains, where data and labels are unavailable, is crucial for machine learning models. We tackle the domain generalization problem to learn from multiple source domains and generalize to a target domain with unknown statistics. The crucial idea is to extract the underlying invariant features across all the domains. Previous domain generalization approaches mainly focused on learning invariant features and stacking the learned features from each source domain to generalize to a new target domain while ignoring the label information, which leads to indistinguishable features with an ambiguous classification boundary. One possible solution is to constrain the label similarity when extracting the invariant features and to take advantage of the label similarities for class-specific cohesion and separation of features across domains. Therefore, we adopt optimal transport with Wasserstein distance, which can constrain the class label similarity, for adversarial training and further deploy a metric learning objective to leverage the label information for achieving a distinguishable classification boundary. Empirical results show that our proposed method outperforms most of the baselines. Furthermore, ablation studies demonstrate the effectiveness of each component of our method.

preprint · 2022 · arXiv

Evolving Domain Generalization

Domain generalization aims to learn a predictive model from multiple different but related source tasks that can generalize well to a target task without the need of accessing any target data. Existing domain generalization methods ignore the relationship between tasks, implicitly assuming that all the tasks are sampled from a stationary environment. Therefore, they can fail when deployed in an evolving environment. To this end, we formulate and study the evolving domain generalization (EDG) scenario, which exploits not only the source data but also their evolving pattern to generate a model for the unseen task. Our theoretical result reveals the benefits of modeling the relation between two consecutive tasks by learning a globally consistent directional mapping function. In practice, our analysis also suggests solving the EDG problem in a meta-learning manner, which leads to the directional prototypical network, the first method for the EDG problem. Empirical evaluation on both synthetic and real-world data sets validates the effectiveness of our approach.

preprint · 2022 · arXiv

Forgetting Prevention for Cross-regional Fraud Detection with Heterogeneous Trade Graph

With the booming growth of e-commerce, detecting financial fraud has become an urgent task to avoid transaction risks. Despite the successful applications of Graph Neural Networks (GNNs) in fraud detection, the existing solutions are only suitable for a narrow scope due to the limitation in data collection. Especially when expanding a business into new territory, e.g., new cities or new countries, developing a totally new model incurs additional cost and results in forgetting previous knowledge. Moreover, recent works strive to devise GNNs to expose the implicit interactions behind financial transactions. However, most existing GNN-based solutions concentrate on either homogeneous graphs or decomposing heterogeneous interactions into several homogeneous connections for convenience. To this end, this study proposes a novel solution based on heterogeneous trade graphs, namely HTG-CFD, to prevent knowledge forgetting of cross-regional fraud detection. In particular, the heterogeneous trade graph (HTG) is meticulously constructed from original transaction records to explore the complex semantics among different types of entities and relationships. Motivated by recent continual learning, we present a practical and task-oriented forgetting prevention method to alleviate knowledge forgetting in the context of cross-regional detection. Extensive experiments demonstrate that the proposed HTG-CFD not only promotes the performance in cross-regional scenarios but also significantly contributes to single-regional fraud detection.

preprint · 2022 · arXiv

Quantification and Analysis of Layer-wise and Pixel-wise Information Discarding

This paper presents a method to explain how the information of each input variable is gradually discarded during the forward propagation in a deep neural network (DNN), which provides new perspectives to explain DNNs. We define two types of entropy-based metrics, i.e. (1) the discarding of pixel-wise information used in the forward propagation, and (2) the uncertainty of the input reconstruction, to measure input information contained by a specific layer from two perspectives. Unlike previous attribution metrics, the proposed metrics ensure the fairness of comparisons between different layers of different DNNs. We can use these metrics to analyze the efficiency of information processing in DNNs, which exhibits strong connections to the performance of DNNs. We analyze information discarding in a pixel-wise manner, which is different from the information bottleneck theory measuring feature information w.r.t. the sample distribution. Experiments have shown the effectiveness of our metrics in analyzing classic DNNs and explaining existing deep-learning techniques.

preprint · 2022 · arXiv

Table Pre-training: A Survey on Model Architectures, Pre-training Objectives, and Downstream Tasks

Since a vast number of tables can be easily collected from web pages, spreadsheets, PDFs, and various other document types, a flurry of table pre-training frameworks have been proposed following the success of text and images, and they have achieved new state-of-the-art results on various tasks such as table question answering, table type recognition, column relation classification, table search, formula prediction, etc. To fully use the supervision signals in unlabeled tables, a variety of pre-training objectives have been designed and evaluated, for example, denoising cell values, predicting numerical relationships, and implicitly executing SQL queries. To best leverage the characteristics of (semi-)structured tables, various tabular language models, particularly with specially designed attention mechanisms, have been explored. Since tables usually appear and interact with free-form text, table pre-training usually takes the form of table-text joint pre-training, which attracts significant research interest from multiple domains. This survey aims to provide a comprehensive review of different model designs, pre-training objectives, and downstream tasks for table pre-training, and we further share our thoughts and vision on existing challenges and future opportunities.

preprint · 2022 · arXiv

TaCube: Pre-computing Data Cubes for Answering Numerical-Reasoning Questions over Tabular Data

Existing auto-regressive pre-trained language models (PLMs) like T5 and BART have been well applied to table question answering by UNIFIEDSKG and TAPEX, respectively, and demonstrated state-of-the-art results on multiple benchmarks. However, auto-regressive PLMs are challenged by recently emerging numerical reasoning datasets, such as TAT-QA, due to the error-prone implicit calculation. In this paper, we present TaCube, which pre-computes aggregation/arithmetic results for the table in advance, so that they are handy and readily available for PLMs to answer numerical reasoning questions. TaCube systematically and comprehensively covers a collection of computational operations over table segments. By simply concatenating TaCube to the input sequence of PLMs, it shows significant experimental effectiveness. TaCube promotes the F1 score from 49.6% to 66.2% on TAT-QA and achieves new state-of-the-art results on WikiTQ (59.6% denotation accuracy). TaCube's improvements on numerical reasoning cases are even more notable: on TAT-QA, TaCube promotes the exact match accuracy of BART-large by 39.6% on sum, 52.5% on average, 36.6% on subtraction, and 22.2% on division. We believe that TaCube is a general and portable pre-computation solution that can potentially be integrated into various numerical reasoning frameworks.
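
The pre-computation idea in this abstract can be illustrated with a small sketch. This is not the TaCube implementation: the function names, the textual format of the pre-computed facts, and the " | " separator are all assumptions made for illustration. The sketch pre-computes sum, average, pairwise difference and pairwise division for one numeric column and appends the results to the text a language model would receive.

```python
from itertools import combinations

def precompute_cube(column_name, row_labels, values):
    """Pre-compute aggregation and pairwise arithmetic results for one numeric column.

    Hypothetical helper: returns the results as short text facts.
    """
    if not values:
        raise ValueError("values must not be empty")
    facts = [
        f"sum({column_name}) = {sum(values):g}",
        f"average({column_name}) = {sum(values) / len(values):g}",
    ]
    # Pairwise operations over every unordered pair of rows, in row order.
    for (a_label, a), (b_label, b) in combinations(zip(row_labels, values), 2):
        facts.append(f"{a_label} - {b_label} = {a - b:g}")
        if b != 0:  # skip undefined divisions
            facts.append(f"{a_label} / {b_label} = {a / b:g}")
    return facts

def build_model_input(question, flattened_table, facts):
    """Concatenate question, flattened table and pre-computed facts (assumed format)."""
    return " | ".join([question, flattened_table, "; ".join(facts)])

if __name__ == "__main__":
    facts = precompute_cube("revenue", ["2019", "2020"], [100.0, 150.0])
    print(build_model_input("What is the total revenue?", "2019: 100, 2020: 150", facts))
```

With the answer already present in the input, the model can copy a pre-computed value instead of carrying out the arithmetic implicitly, which is the failure mode the abstract attributes to auto-regressive PLMs.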

preprint · 2021 · arXiv

Construction D' Lattices for Power-Constrained Communications

Designs and methods for nested lattice codes using Construction D' lattices for coding and convolutional code lattices for shaping are described. Two encoding methods and a decoding algorithm for Construction D' coding lattices that can be used with shaping lattices for power-constrained channels are given. We construct nested lattice codes with good coding properties, a high shaping gain, and low-complexity encoding and decoding. Convolutional code generator polynomials for Construction A lattices with the greatest shaping gain are given, as a result of an extensive search. It is shown that rate 1/3 convolutional codes provide a more favorable performance-complexity trade-off than rate 1/2 convolutional codes. Tail-biting convolutional codes have higher shaping gain than that of zero-tailed convolutional codes. A design for quasi-cyclic low-density parity-check (QC-LDPC) codes to form Construction D' lattices which have efficient encoding and indexing is presented. The resulting QC-LDPC Construction D' lattices are evaluated using four shaping lattices: the $E_8$ lattice, the $BW_{16}$ lattice, the Leech lattice and our best-found convolutional code lattice, showing a shaping gain of approximately 0.65 dB, 0.86 dB, 1.03 dB and 1.25 dB at dimension 2304.

preprint · 2021 · arXiv

Embedding Symbolic Temporal Knowledge into Deep Sequential Models

Sequences and time-series often arise in robot tasks, e.g., in activity recognition and imitation learning. In recent years, deep neural networks (DNNs) have emerged as an effective data-driven methodology for processing sequences given sufficient training data and compute resources. However, when data is limited, simpler models such as logic/rule-based methods work surprisingly well, especially when relevant prior knowledge is applied in their construction. Unlike DNNs, though, these "structured" models can be difficult to extend, and do not work well with raw unstructured data. In this work, we seek to learn flexible DNNs, yet leverage prior temporal knowledge when available. Our approach is to embed symbolic knowledge expressed as linear temporal logic (LTL) and use these embeddings to guide the training of deep models. Specifically, we construct semantic-based embeddings of automata generated from LTL formulas via a Graph Neural Network. Experiments show that these learnt embeddings can lead to improvements in downstream robot tasks such as sequential action recognition and imitation learning.

preprint · 2021 · arXiv

High-Order Statistical Functional Expansion and Its Application To Some Nonsmooth Problems

Let $\mathbf{x}_j = \boldsymbol{\theta} + \boldsymbol{\varepsilon}_j$, $j=1,\ldots,n$, be observations of an unknown parameter $\boldsymbol{\theta}$ in a Euclidean or separable Hilbert space $\mathscr{H}$, where $\boldsymbol{\varepsilon}_j$ are noises as random elements in $\mathscr{H}$ from a general distribution. We study the estimation of $f(\boldsymbol{\theta})$ for a given functional $f:\mathscr{H}\rightarrow \mathbb{R}$ based on $\mathbf{x}_j$'s. The key element of our approach is a new method which we call High-Order Degenerate Statistical Expansion. It leverages the use of classical multivariate Taylor expansion and degenerate $U$-statistic and yields an elegant explicit formula. In the univariate case of $\mathscr{H}=\mathbb{R}$, the formula expresses the error of the proposed estimator as a sum of order $k$ degenerate $U$-products of the noises with coefficient $f^{(k)}(\boldsymbol{\theta})/k!$ and an explicit remainder term in the form of the Riemann-Liouville integral as in the Taylor expansion around the true $\boldsymbol{\theta}$. For general $\mathscr{H}$, the formula expresses the estimation error in terms of the inner product of $f^{(k)}(\boldsymbol{\theta})/k!$ and the average of the tensor products of $k$ noises with distinct indices and a parallel extension of the remainder term from the univariate case. This makes the proposed method a natural statistical version of the classical Taylor expansion. The proposed estimator can be viewed as a jackknife estimator of an ideal degenerate expansion of $f(\cdot)$ around the true $\boldsymbol{\theta}$ with the degenerate $U$-product of the noises, and can be approximated by bootstrap. Thus, the jackknife, bootstrap and Taylor expansion approaches all converge to the proposed estimator. We develop risk bounds for the proposed estimator and a central limit theorem under a second moment condition (even in expansions of higher than the second order). We apply this new method to generalize several existing results with smooth and nonsmooth $f$ to universal $\boldsymbol{\varepsilon}_j$'s with only minimum moment constraints.

preprint · 2021 · arXiv

Multi-task Learning by Leveraging the Semantic Information

One crucial objective of multi-task learning is to align distributions across tasks so that the information between them can be transferred and shared. However, existing approaches only focused on matching the marginal feature distribution while ignoring the semantic information, which may hinder the learning performance. To address this issue, we propose to leverage the label information in multi-task learning by exploring the semantic conditional relations among tasks. We first theoretically analyze the generalization bound of multi-task learning based on the notion of Jensen-Shannon divergence, which provides new insights into the value of label information in multi-task learning. Our analysis also leads to a concrete algorithm that jointly matches the semantic distribution and controls label distribution divergence. To confirm the effectiveness of the proposed method, we first compare the algorithm with several baselines on some benchmarks and then test the algorithms under label space shift conditions. Empirical results demonstrate that the proposed method could outperform most baselines and achieve state-of-the-art performance, particularly showing the benefits under the label shift conditions.

preprint · 2021 · arXiv

Overcoming Catastrophic Forgetting in Graph Neural Networks with Experience Replay

Graph Neural Networks (GNNs) have recently received significant research attention due to their superior performance on a variety of graph-related learning tasks. Most of the current works focus on either static or dynamic graph settings, addressing a single particular task, e.g., node/graph classification, link prediction. In this work, we investigate the question: can GNNs be applied to continuously learning a sequence of tasks? Towards that, we explore the Continual Graph Learning (CGL) paradigm and present the Experience Replay based framework ER-GNN for CGL to alleviate the catastrophic forgetting problem in existing GNNs. ER-GNN stores knowledge from previous tasks as experiences and replays them when learning new tasks to mitigate the catastrophic forgetting issue. We propose three experience node selection strategies: mean of feature, coverage maximization, and influence maximization, to guide the process of selecting experience nodes. Extensive experiments on three benchmark datasets demonstrate the effectiveness of our ER-GNN and shed light on the incremental graph (non-Euclidean) structure learning.
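
One of the three selection strategies named in this abstract, mean of feature, can be sketched as follows. This is an illustrative reading rather than the ER-GNN code: it assumes that, for each class, the nodes whose feature vectors lie closest to the class mean are kept as experiences, and the function names and the `budget_per_class` parameter are hypothetical.

```python
import math

def select_by_mean_of_feature(features, labels, budget_per_class):
    """Pick, per class, the nodes whose feature vectors lie closest to the class mean.

    features: list of equal-length numeric vectors, indexed by node id.
    labels:   list of class labels, indexed by node id.
    """
    by_class = {}
    for node_id, label in enumerate(labels):
        by_class.setdefault(label, []).append(node_id)

    selected = []
    for node_ids in by_class.values():
        dim = len(features[node_ids[0]])
        # Class prototype: coordinate-wise mean of the class's feature vectors.
        mean = [sum(features[i][d] for i in node_ids) / len(node_ids) for d in range(dim)]
        ranked = sorted(node_ids, key=lambda i: math.dist(features[i], mean))
        selected.extend(ranked[:budget_per_class])
    return selected

def build_training_set(new_task_nodes, experience_buffer):
    """Replay step: the new task's nodes followed by stored experience nodes."""
    seen = set(new_task_nodes)
    return list(new_task_nodes) + [n for n in experience_buffer if n not in seen]

if __name__ == "__main__":
    feats = [[0, 0], [1, 1], [2, 2], [10, 10], [11, 11], [30, 30]]
    buffer = select_by_mean_of_feature(feats, [0, 0, 0, 1, 1, 1], budget_per_class=1)
    print(build_training_set([6, 7], buffer))
```

Keeping only a small per-class budget bounds the memory cost of replay, at the price of representing each earlier task by a handful of prototypical nodes.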

preprint · 2020 · arXiv

A Heterogeneous Dynamical Graph Neural Networks Approach to Quantify Scientific Impact

Quantifying and predicting the long-term impact of scientific writings or individual scholars has important implications for many policy decisions, such as funding proposal evaluation and identifying emerging research fields. In this work, we propose an approach based on Heterogeneous Dynamical Graph Neural Network (HDGNN) to explicitly model and predict the cumulative impact of papers and authors. HDGNN extends heterogeneous GNNs by incorporating temporally evolving characteristics and capturing both structural properties of attributed graph and the growing sequence of citation behavior. HDGNN is significantly different from previous models in its capability of modeling the node impact in a dynamic manner while taking into account the complex relations among nodes. Experiments conducted on a real citation dataset demonstrate its superior performance of predicting the impact of both papers and authors.

preprint · 2020 · arXiv

Beyond $\mathcal{H}$-Divergence: Domain Adaptation Theory With Jensen-Shannon Divergence

We reveal the incoherence between the widely-adopted empirical domain adversarial training and its generally-assumed theoretical counterpart based on $\mathcal{H}$-divergence. Concretely, we find that $\mathcal{H}$-divergence is not equivalent to Jensen-Shannon divergence, the optimization objective in domain adversarial training. To this end, we establish a new theoretical framework by directly proving the upper and lower target risk bounds based on joint distributional Jensen-Shannon divergence. We further derive bi-directional upper bounds for marginal and conditional shifts. Our framework exhibits inherent flexibilities for different transfer learning problems, which is usable for various scenarios where $\mathcal{H}$-divergence-based theory fails to adapt. From an algorithmic perspective, our theory enables a generic guideline unifying principles of semantic conditional matching, feature marginal matching, and label marginal shift correction. We employ algorithms for each principle and empirically validate the benefits of our framework on real datasets.
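
As background for the divergence this abstract builds on, here is a minimal sketch of the Jensen-Shannon divergence between two discrete distributions, using natural logarithms. It illustrates the standard definition only, not the paper's bounds or training procedure; the function names are hypothetical and the inputs are assumed to be probability vectors of equal length.

```python
import math

def kl_divergence(p, q):
    """Kullback-Leibler divergence KL(p || q) in nats; terms with p_i = 0 contribute 0."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def js_divergence(p, q):
    """Jensen-Shannon divergence: average KL of p and q to their mixture m = (p + q) / 2."""
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    # m_i > 0 whenever p_i > 0 or q_i > 0, so the KL terms are always well defined.
    return 0.5 * kl_divergence(p, m) + 0.5 * kl_divergence(q, m)

if __name__ == "__main__":
    print(js_divergence([0.5, 0.5], [0.5, 0.5]))  # identical distributions
    print(js_divergence([1.0, 0.0], [0.0, 1.0]))  # disjoint supports
```

Unlike the KL divergence, this quantity is symmetric and bounded: it is 0 for identical distributions and reaches its maximum of ln 2 when the two distributions have disjoint supports.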

preprint · 2020 · arXiv

Deep Active Learning: Unified and Principled Method for Query and Training

In this paper, we propose a unified and principled method for both the querying and training processes in deep batch active learning. We provide theoretical insights from the intuition of modeling the interactive procedure in active learning as distribution matching, by adopting the Wasserstein distance. As a consequence, we derive a new training loss from the theoretical analysis, which is decomposed into optimizing deep neural network parameters and batch query selection through alternative optimization. In addition, the loss for training a deep neural network is naturally formulated as a min-max optimization problem through leveraging the unlabeled data information. Moreover, the proposed principles also indicate an explicit uncertainty-diversity trade-off in the query batch selection. Finally, we evaluate our proposed method on different benchmarks, consistently showing better empirical performance and a more time-efficient query strategy compared to the baselines.

preprint · 2020 · arXiv

Discriminative Active Learning for Domain Adaptation

Domain adaptation, which aims to learn a transferable feature between different but related domains, has been well investigated and has shown excellent empirical performance. Previous works mainly focused on matching the marginal feature distributions using adversarial training methods while assuming the conditional relations between the source and target domain remained unchanged, i.e., ignoring the conditional shift problem. However, recent works have shown that such a conditional shift problem exists and can hinder the adaptation process. To address this issue, we have to leverage labelled data from the target domain, but collecting labelled data can be quite expensive and time-consuming. To this end, we introduce a discriminative active learning approach for domain adaptation to reduce the efforts of data annotation. Specifically, we propose three-stage active adversarial training of neural networks: invariant feature space learning (first stage), uncertainty and diversity criteria and their trade-off for query strategy (second stage) and re-training with queried target labels (third stage). Empirical comparisons with existing domain adaptation methods using four benchmark datasets demonstrate the effectiveness of the proposed approach.

preprint · 2020 · arXiv

Metric learning by Similarity Network for Deep Semi-Supervised Learning

Deep semi-supervised learning has been widely implemented in the real world due to the rapid development of deep learning. Recently, attention has shifted to approaches such as Mean-Teacher, which penalize the inconsistency between two perturbed input sets. Although these methods may achieve positive results, they ignore the relationship information between data instances. To solve this problem, we propose a novel method named Metric Learning by Similarity Network (MLSN), which aims to learn a distance metric adaptively on different domains. By co-training with the classification network, the similarity network can learn more information about pairwise relationships and performs better on some empirical tasks than state-of-the-art methods.

preprint · 2020 · arXiv

Relational State-Space Model for Stochastic Multi-Object Systems

Real-world dynamical systems often consist of multiple stochastic subsystems that interact with each other. Modeling and forecasting the behavior of such dynamics are generally not easy, due to the inherent hardness in understanding the complicated interactions and evolutions of their constituents. This paper introduces the relational state-space model (R-SSM), a sequential hierarchical latent variable model that makes use of graph neural networks (GNNs) to simulate the joint state transitions of multiple correlated objects. By letting GNNs cooperate with SSM, R-SSM provides a flexible way to incorporate relational information into the modeling of multi-object dynamics. We further suggest augmenting the model with normalizing flows instantiated for vertex-indexed random variables and propose two auxiliary contrastive objectives to facilitate the learning. The utility of R-SSM is empirically evaluated on synthetic and real time-series datasets.

preprint 2016 arXiv

The LAMOST spectroscopic survey of star clusters in M31. II. Metallicities, ages and masses

We select from Paper I a sample of 306 massive star clusters observed with the Large Sky Area Multi-Object Fibre Spectroscopic Telescope (LAMOST) in the vicinity fields of M31 and M33 and determine their metallicities, ages and masses. Metallicities and ages are estimated by fitting the observed integrated spectra with stellar synthesis population (SSP) models with a pixel-to-pixel spectral fitting technique. Ages for most young clusters are also derived by fitting the multi-band photometric measurements with model spectral energy distributions (SEDs). The estimated cluster ages span a wide range, from several million years to the age of the universe. The numbers of clusters younger and older than 1 Gyr are respectively 46 and 260. With ages and metallicities determined, cluster masses are then estimated by comparing the multi-band photometric measurements with SSP model SEDs. The derived masses range from $\sim 10^{3}$ to $\sim 10^{7}$ $M_{\odot}$, peaking at $\sim 10^{4.3}$ and $\sim 10^{5.7}$ $M_{\odot}$ for young ($< 1$ Gyr) and old ($> 1$ Gyr) clusters, respectively. Our estimated metallicities, ages and masses are in good agreement with available literature values. Old clusters richer than [Fe/H] $\sim -0.7$ dex have a wide range of ages. Those poorer than [Fe/H] $\sim -0.7$ dex seem to be composed of two groups, as previously found for Galactic GCs -- one of the oldest ages with all values of metallicity down to $\sim -2$ dex and another with metallicity increasing with decreasing age. The old clusters in the inner disk of M31 (0 -- 30 kpc) show a clear metallicity gradient measured at $-0.038\pm0.023$ dex/kpc.

preprint 2016 arXiv

The Voronoi formula and double Dirichlet series

We prove a Voronoi formula for coefficients of a large class of $L$-functions including Maass cusp forms, Rankin-Selberg convolutions, and certain isobaric sums. Our proof is based on the functional equations of $L$-functions twisted by Dirichlet characters and does not directly depend on automorphy. Hence it has wider application than previous proofs. The key ingredient is the construction of a double Dirichlet series.

preprint 2015 arXiv

Ultrafast All-optical Modulation Exploiting the Vibrational Dynamic of Metallic Meta-atoms

Optical control over elementary molecular vibration establishes fundamental capabilities for exploiting the broad range of optical linear and nonlinear phenomena. However, experimental demonstration of coherently driven molecular vibration remains a challenging task due to the weak optical force imposed on natural materials. Here we report the design of a "meta-atom" that exhibits giant artificial optical nonlinearity. These "meta-atoms" support co-localized magnetic resonance at optical frequency and vibration resonance at GHz frequency with a deep-sub-diffraction-limit spatial confinement ($λ^2/100$). The coherent coupling of those two distinct resonances manifests a strong optical force, which is fundamentally different from the commonly studied forms of radiation forces, gradient forces, or photo-thermally induced deformation. It results in a giant third-order susceptibility $χ^{(3)}$ of $10^{-13}$ $\mathrm{m}^2/\mathrm{V}^2$, which is more than six orders of magnitude larger than that found in natural materials. All-optical modulation at frequencies well above 1 GHz has thus been demonstrated experimentally.

preprint 2014 arXiv

On the Hecke Eigenvalues of Maass Forms

Let $ϕ$ denote a primitive Hecke-Maass cusp form for $Γ_0(N)$ with the Laplacian eigenvalue $λ_ϕ=1/4+t_ϕ^2$. In this work we show that there exists a prime $p$ such that $p\nmid N$, $|α_{p}|=|β_{p}| = 1$, and $p\ll(N(1+|t_ϕ|))^c$, where $α_{p},\;β_{p}$ are the Satake parameters of $ϕ$ at $p$, and $c$ is an absolute constant with $0<c<1$. In fact, $c$ can be taken as $0.27332$. In addition, we prove that the natural density of such primes $p$ ($p\nmid N$ and $|α_{p}|=|β_{p}| = 1$) is at least $34/35$.

preprint 2011 arXiv

Hiding a Realistic Object Using a Broadband Terahertz Invisibility Cloak

The invisibility cloak has been a long-standing dream for many researchers over the decades. The introduction of transformational optics has revitalized this field by providing a general method to design material distributions to hide the subject from detection. By transforming space and light propagation, a three-dimensional (3D) object is perceived as having a reduced number of dimensions, in the form of points, lines, and thin sheets, making it "undetectable" judging from the scattered field. Although a variety of cloaking devices have been reported at microwave and optical frequencies, the spectroscopically important Terahertz (THz) domain remains unexplored. Moreover, due to the difficulties in fabricating cloaking devices that are optically large in all three dimensions, hiding realistic 3D objects has yet to be demonstrated. Here, we report the first experimental demonstration of a 3D THz cloaking device fabricated using a scalable Projection Microstereolithography process. The cloak operates over a broad frequency range between 0.3 and 0.6 THz, and is placed over a rectangular α-lactose monohydrate absorber. Characterized using angular-resolved reflection THz time-domain spectroscopy (THz-TDS), the results indicate that the THz invisibility cloak has successfully concealed both the geometrical and spectroscopic signatures of the absorber, making it undetectable to the observer.

preprint 2010 arXiv

Three-Dimensional Cloaking Device Operates at Terahertz Frequencies

The invisibility cloak has been a long-standing dream for many researchers over the decades. By transforming space and light propagation, a three-dimensional (3D) object can be perceived as having a reduced number of dimensions, in the form of points, lines, and thin sheets, making it "undetectable" judging from the scattered field. Although a variety of cloaking devices have been reported at microwave and optical frequencies, the Terahertz (THz) domain remains unexplored. Moreover, all previous experimental demonstrations were performed in a two-dimensional (2D) waveguide configuration. Although those works represent a critical step in validating the concept of the invisibility cloak, one would expect the cloaking device to be realized in 3D with the ability to cloak an object of realistic size. This requires the construction of an optically large cloaking device with features much smaller than the wavelength. Fabricating 3D structures with an aspect ratio close to 100:1 is a challenging task. Here, we report an experimental demonstration of a 3D THz ground plane cloak. Reflection terahertz time-domain spectroscopy (THz-TDS) was employed to characterize the cloaking samples. Two distinct reflection peaks can be clearly observed across a broad frequency range; they are caused by reflection at the surface of the bump. The measured peak positions are consistent with those from numerical simulation. By contrast, in the spectral map of the cloak sample, the wavefront is relatively smooth with a single peak.

29 published items

Beyond Retrieval: A Multitask Benchmark and Model for Code Search

Decompose to Understand, Fuse to Detect: Frequency-Decoupled Anomaly Detection for Encrypted Network Traffic

DB-GPT: Empowering Database Interactions with Private Large Language Models

Gap Minimization for Knowledge Sharing and Transfer

Value Enhancement of Reinforcement Learning via Efficient and Robust Trust Region Optimization

Automatic Registration of Images with Inconsistent Content Through Line-Support Region Segmentation and Geometrical Outlier Removal

Domain Generalization via Optimal Transport with Metric Similarity Learning

Evolving Domain Generalization

Forgetting Prevention for Cross-regional Fraud Detection with Heterogeneous Trade Graph

Quantification and Analysis of Layer-wise and Pixel-wise Information Discarding

Table Pre-training: A Survey on Model Architectures, Pre-training Objectives, and Downstream Tasks

TaCube: Pre-computing Data Cubes for Answering Numerical-Reasoning Questions over Tabular Data

Construction D' Lattices for Power-Constrained Communications

Embedding Symbolic Temporal Knowledge into Deep Sequential Models

High-Order Statistical Functional Expansion and Its Application To Some Nonsmooth Problems

Multi-task Learning by Leveraging the Semantic Information

Overcoming Catastrophic Forgetting in Graph Neural Networks with Experience Replay

A Heterogeneous Dynamical Graph Neural Networks Approach to Quantify Scientific Impact

Beyond $\mathcal{H}$-Divergence: Domain Adaptation Theory With Jensen-Shannon Divergence

Deep Active Learning: Unified and Principled Method for Query and Training

Discriminative Active Learning for Domain Adaptation

Metric learning by Similarity Network for Deep Semi-Supervised Learning

Relational State-Space Model for Stochastic Multi-Object Systems

The LAMOST spectroscopic survey of star clusters in M31. II. Metallicities, ages and masses

The Voronoi formula and double Dirichlet series

Ultrafast All-optical Modulation Exploiting the Vibrational Dynamic of Metallic Meta-atoms

On the Hecke Eigenvalues of Maass Forms

Hiding a Realistic Object Using a Broadband Terahertz Invisibility Cloak

Three-Dimensional Cloaking Device Operates at Terahertz Frequencies