Researcher profile

Zilong Wang

Zilong Wang contributes to research discovery and scholarly infrastructure.

Researcher
Affiliation not imported
Open to collaborate

Trust snapshot

Quick read

Trust 21 - Emerging
Verification L1
Unclaimed author
19 works
0 followers
13 topics
4 close collaborators

Published work

19 published items

preprint, 2026, arXiv

A Safety Report on GPT-5.2, Gemini 3 Pro, Qwen3-VL, Grok 4.1 Fast, Nano Banana Pro, and Seedream 4.5

The rapid evolution of Large Language Models (LLMs) and Multimodal Large Language Models (MLLMs) has driven major gains in reasoning, perception, and generation across language and vision, yet whether these advances translate into comparable improvements in safety remains unclear, partly due to fragmented evaluations that focus on isolated modalities or threat models. In this report, we present an integrated safety evaluation of six frontier models--GPT-5.2, Gemini 3 Pro, Qwen3-VL, Grok 4.1 Fast, Nano Banana Pro, and Seedream 4.5--assessing each across language, vision-language, and image generation using a unified protocol that combines benchmark, adversarial, multilingual, and compliance evaluations. By aggregating results into safety leaderboards and model profiles, we reveal a highly uneven safety landscape: while GPT-5.2 demonstrates consistently strong and balanced performance, other models exhibit clear trade-offs across benchmark safety, adversarial robustness, multilingual generalization, and regulatory compliance. Despite strong results under standard benchmarks, all models remain highly vulnerable under adversarial testing, with worst-case safety rates dropping below 6%. Text-to-image models show slightly stronger alignment in regulated visual risk categories, yet remain fragile when faced with adversarial or semantically ambiguous prompts. Overall, these findings highlight that safety in frontier models is inherently multidimensional--shaped by modality, language, and evaluation design--underscoring the need for standardized, holistic safety assessments to better reflect real-world risk and guide responsible deployment.

preprint, 2026, arXiv

Terminal-Bench: Benchmarking Agents on Hard, Realistic Tasks in Command Line Interfaces

AI agents may soon become capable of autonomously completing valuable, long-horizon tasks in diverse domains. Current benchmarks either do not measure real-world tasks or are not sufficiently difficult to meaningfully measure frontier models. To this end, we present Terminal-Bench 2.0: a carefully curated hard benchmark composed of 89 tasks in computer terminal environments inspired by problems from real workflows. Each task features a unique environment, a human-written solution, and comprehensive tests for verification. We show that frontier models and agents score less than 65% on the benchmark and conduct an error analysis to identify areas for model and agent improvement. We publish the dataset and evaluation harness to assist developers and researchers in future work at https://www.tbench.ai/.

preprint, 2025, arXiv

WonderHuman: Hallucinating Unseen Parts in Dynamic 3D Human Reconstruction

In this paper, we present WonderHuman to reconstruct dynamic human avatars from a monocular video for high-fidelity novel view synthesis. Previous dynamic human avatar reconstruction methods typically require the input video to have full coverage of the observed human body. However, in daily practice, one typically has access to limited viewpoints, such as monocular front-view videos, making it a cumbersome task for previous methods to reconstruct the unseen parts of the human avatar. To tackle the issue, we present WonderHuman, which leverages 2D generative diffusion model priors to achieve high-quality, photorealistic reconstructions of dynamic human avatars from monocular videos, including accurate rendering of unseen body parts. Our approach introduces a Dual-Space Optimization technique, applying Score Distillation Sampling (SDS) in both canonical and observation spaces to ensure visual consistency and enhance realism in dynamic human reconstruction. Additionally, we present a View Selection strategy and Pose Feature Injection to enforce the consistency between SDS predictions and observed data, ensuring pose-dependent effects and higher fidelity in the reconstructed avatar. In the experiments, our method achieves SOTA performance in producing photorealistic renderings from the given monocular video, particularly for those challenging unseen parts. The project page and source code can be found at https://wyiguanw.github.io/WonderHuman/.

preprint, 2022, arXiv

A Note on "Optimum Sets of Interference-Free Sequences With Zero Autocorrelation Zone"

In this paper, a simple construction of interference-free zero correlation zone (IF-ZCZ) sequence sets is proposed using a well-designed finite Zak transform lattice tessellation. Each set is characterized by the period of sequences $KM$, the set size $K$ and the length of zero correlation zone $M-1$, which is optimal with respect to the Tang-Fan-Matsufuji bound. Secondly, the transformations that keep the properties of the optimal IF-ZCZ sequence set unchanged are given, and the equivalence relation on optimal IF-ZCZ sequence sets is defined based on these transformations. Then, it is proved that the general construction of the optimal IF-ZCZ sequence set proposed by Popovic is equivalent to the simple construction of the optimal IF-ZCZ sequence set, which indicates that the generation of the optimal IF-ZCZ sequence set can be simplified. Moreover, it is pointed out that the alphabet size for the special case of the simple construction of the optimal IF-ZCZ sequence set can be a factor of the period. Finally, both the simple construction of the optimal IF-ZCZ sequence set and its special case have sparse and highly structured Zak spectra, which can greatly reduce the computational complexity of implementing matched filter banks.

preprint, 2022, arXiv

Bilateral-ViT for Robust Fovea Localization

The fovea is an important anatomical landmark of the retina. Detecting the location of the fovea is essential for the analysis of many retinal diseases. However, robust fovea localization remains a challenging problem, as the fovea region often appears fuzzy, and retinal diseases may further obscure its appearance. This paper proposes a novel Vision Transformer (ViT) approach that integrates information both inside and outside the fovea region to achieve robust fovea localization. Our proposed network, named Bilateral-Vision-Transformer (Bilateral-ViT), consists of two network branches: a transformer-based main network branch for integrating global context across the entire fundus image and a vessel branch for explicitly incorporating the structure of blood vessels. The encoded features from both network branches are subsequently merged with a customized Multi-scale Feature Fusion (MFF) module. Our comprehensive experiments demonstrate that the proposed approach is significantly more robust for diseased images and establishes a new state of the art on the Messidor and PALM datasets.

preprint, 2022, arXiv

Boolean Functions of Binary Type-II and Type-II/III Complementary Array Pair

The sequence pairs of length $2^{m}$ projected from complementary array pairs of Type-II of size $\mathbf{2}^{(m)}$ and of mixed Type-II/III of size $\mathbf{2}^{(m-1)}\times2$ are complementary sequence pairs of Type-II and Type-III, respectively. An exhaustive search for binary Type-II and Type-III complementary sequence pairs of small lengths $2^{m}$ ($m=1,2,3,4$) shows that they are all projected from the aforementioned complementary array pairs, whose algebraic normal forms satisfy specified expressions. It is natural to ask whether the conclusion holds for all $m$. In this paper, we prove that these expressions of algebraic normal forms determine all the binary complementary array pairs of Type-II of size $\mathbf{2}^{(m)}$ and of mixed Type-II/III of size $\mathbf{2}^{(m-1)}\times2$, respectively.

preprint, 2022, arXiv

CFL: Cluster Federated Learning in Large-scale Peer-to-Peer Networks

Federated learning (FL) has sparked extensive interest in exploiting the private data on clients' local devices. However, the parameter server setting of FL not only has high bandwidth requirements, but also poses data privacy issues and a single point of failure. In this paper, we propose an efficient and privacy-preserving protocol, dubbed CFL, which is the first fine-grained global model training protocol for FL in large-scale peer-to-peer (P2P) networks. Unlike previous FL in P2P networks, CFL aggregates local model update parameters hierarchically, which improves communication efficiency when the number of clients is large. Also, the aggregation in CFL is performed in a secure manner by introducing an authenticated encryption scheme, whose key is established through a random pairwise key scheme enhanced by a proposed voting-based key revocation mechanism. Rigorous analyses show that CFL guarantees the privacy, data integrity, and authenticity of local model update parameters under two widespread threat models. More importantly, the proposed key revocation mechanism can effectively resist hijack attacks, thereby ensuring the confidentiality of the communication keys. Experiments on the Trec06p and Trec07 datasets show that the global model trained by CFL has good classification accuracy, model generalization, and a rapid convergence rate, and that the system is robust to dropouts. Compared to PPT, the first global model training protocol for FL in P2P networks, CFL improves communication efficiency by 43.25%. CFL also outperforms PPT in terms of computational efficiency.

preprint, 2022, arXiv

Dap-FL: Federated Learning flourishes by adaptive tuning and secure aggregation

Federated learning (FL), an attractive and promising distributed machine learning paradigm, has sparked extensive interest in exploiting the tremendous amount of data stored on ubiquitous mobile devices. However, conventional FL suffers severely from resource heterogeneity, as clients with weak computational and communication capability may be unable to complete local training using the same local training hyper-parameters. In this paper, we propose Dap-FL, a deep deterministic policy gradient (DDPG)-assisted adaptive FL system, in which local learning rates and local training epochs are adaptively adjusted by all resource-heterogeneous clients through locally deployed DDPG-assisted adaptive hyper-parameter selection schemes. In particular, the rationality of the proposed hyper-parameter selection scheme is confirmed through rigorous mathematical proof. Besides, because previous studies of adaptive FL systems gave little consideration to security, we introduce the Paillier cryptosystem to aggregate local models in a secure and privacy-preserving manner. Rigorous analyses show that the proposed Dap-FL system can guarantee the security of clients' private local models against chosen-plaintext attacks and chosen-message attacks in a widely used security model with honest-but-curious participants and active adversaries. In addition, in extensive experiments, the proposed Dap-FL achieves higher global model prediction accuracy and faster convergence rates than conventional FL, and the comprehensiveness of the adjusted local training hyper-parameters is validated. More importantly, experimental results also show that the proposed Dap-FL achieves higher model prediction accuracy than two state-of-the-art RL-assisted FL methods, i.e., 6.03% higher than DDPG-based FL and 7.85% higher than DQN-based FL.

preprint, 2022, arXiv

Improving Differential-Neural Distinguisher Model For DES, Chaskey, and PRESENT

In CRYPTO'19, Gohr proposed a new cryptanalysis strategy using machine learning algorithms. Combining the differential-neural distinguisher with a differential path and integrating the advanced key recovery procedure, Gohr achieved a 12-round key recovery attack on Speck32/64. Chen and Yu improved the prediction accuracy of the differential-neural distinguisher by considering features derived from multiple-ciphertext pairs instead of single-ciphertext pairs. By modifying the kernel size of the initial convolutional layer to capture more dimensional information, the prediction accuracy of the differential-neural distinguisher can be improved for three reduced symmetric ciphers. For DES, we improve the prediction accuracy of the (5-6)-round differential-neural distinguisher and train a new 7-round differential-neural distinguisher. For Chaskey, we improve the prediction accuracy of the (3-4)-round differential-neural distinguisher. For PRESENT, we improve the prediction accuracy of the (6-7)-round differential-neural distinguisher. The source code is available at https://drive.google.com/drive/folders/1i0RciZlGZsEpCyW-wQAy7zzJeOLJNWqL?usp=sharing.

preprint, 2022, arXiv

Non-standard Golay Complementary Sequence Pair over QAM

We generalize the three-stage process for constructing and enumerating Golay array and sequence pairs given in 2008 by Frank Fiedler et al. [A multi-dimensional approach to the construction and enumeration of Golay complementary sequences, Journal of Combinatorial Theory, Series A 115 (2008) 753-776] to the $4^{q}$-QAM constellation based on the para-unitary matrix method, which partly solves their open questions. Our work not only includes the main part of the known results on Golay complementary sequences over $4^{q}$-QAM based on Boolean functions and standard Golay sequence pairs over QPSK, but also generates new Golay complementary arrays (sequences) over $4^{q}$-QAM based on non-standard Golay array pairs over QPSK.

preprint, 2022, arXiv

Spatio-temporal sampling of near-petahertz vortex fields

Measuring the field of visible light with high spatial resolution has been challenging, as many established methods only detect a focus-averaged signal. Here, we introduce a near-field method for optical field sampling that overcomes that limitation by employing the localization of the enhanced near-field of a nanometric needle tip. A probe field perturbs the photoemission from the tip, which is induced by a pump pulse, generating a field-dependent current modulation that can easily be captured with our electronic detection scheme. The approach provides reliable characterization of near-petahertz fields. We show that not only the spiral wave-front of visible femtosecond light pulses carrying orbital angular momentum (OAM) can be resolved, but also the field evolution with time in the focal plane. Additionally, our method is polarization sensitive, which makes it applicable to vectorial field reconstruction.

preprint, 2022, arXiv

The $q$-ary Golay complementary arrays of size $\mathbf{2}^{(m)}$ are standard

Finding non-standard binary Golay complementary sequences (GCSs) of length $2^{m}$, or proving theoretically that none exist, remains an open problem. Since it has been shown that all the standard $q$-ary (where $q$ is even) GCSs of length $2^m$ can be obtained from standard $q$-ary Golay complementary array pairs (GAPs) of dimension $m$ and size $2\times 2 \times \cdots \times 2$ (abbreviated to size $\mathbf{2}^{(m)}$), it is natural to ask whether all the $q$-ary GAPs of size $\mathbf{2}^{(m)}$ are standard. We give a positive answer to this question.
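
As a minimal illustration of the objects this abstract refers to (a sketch, not code from the paper): a binary Golay complementary pair is usually defined as two ±1 sequences whose aperiodic autocorrelations sum to zero at every nonzero shift, and the "standard" pairs of length $2^m$ are usually taken to be those given by the Davis-Jedwab Boolean-function form. The sketch below assumes both of those definitions; all function names are hypothetical.

```python
from itertools import product


def aperiodic_autocorrelation(seq, shift):
    """Sum of seq[i] * seq[i + shift] over all valid i."""
    return sum(seq[i] * seq[i + shift] for i in range(len(seq) - shift))


def is_golay_pair(a, b):
    """True if the autocorrelations of a and b cancel at every nonzero shift."""
    n = len(a)
    return len(b) == n and all(
        aperiodic_autocorrelation(a, u) + aperiodic_autocorrelation(b, u) == 0
        for u in range(1, n)
    )


def standard_binary_pair(m, perm, linear, const=0):
    """Binary pair of length 2**m in the assumed Davis-Jedwab form.

    perm   -- a permutation of range(m) ordering the quadratic terms
    linear -- m bits, the coefficients of the linear terms
    const  -- constant bit
    """
    a, b = [], []
    for x in product((0, 1), repeat=m):  # x runs over all m-bit vectors
        f = const
        f += sum(x[perm[k]] * x[perm[k + 1]] for k in range(m - 1))
        f += sum(c * xi for c, xi in zip(linear, x))
        a.append((-1) ** (f % 2))
        # The partner sequence adds the variable at one end of the path.
        b.append((-1) ** ((f + x[perm[0]]) % 2))
    return a, b


a, b = standard_binary_pair(3, perm=(2, 0, 1), linear=(1, 0, 1))
print(is_golay_pair(a, b))
```

The checker works for any pair of ±1 sequences, so it can also be used to test candidate sequences that are not of the standard form.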

preprint, 2022, arXiv

Towards Few-shot Entity Recognition in Document Images: A Label-aware Sequence-to-Sequence Framework

Entity recognition is a fundamental task in understanding document images. Traditional sequence labeling frameworks treat the entity types as class IDs and rely on extensive data and high-quality annotations, which are typically expensive in practice, to learn semantics. In this paper, we aim to build an entity recognition model requiring only a few shots of annotated document images. To overcome the data limitation, we propose to leverage the label surface names to better inform the model of the target entity type semantics and also embed the labels into the spatial embedding space to capture the spatial correspondence between regions and labels. Specifically, we go beyond sequence labeling and develop a novel label-aware seq2seq framework, LASER. The proposed model follows a new labeling scheme that generates the label surface names word-by-word explicitly after generating the entities. During training, LASER refines the label semantics by updating the label surface name representations and also strengthens the label-region correlation. In this way, LASER recognizes the entities from document images through both semantic and layout correspondence. Extensive experiments on two benchmark datasets demonstrate the superiority of LASER under the few-shot setting.

preprint, 2021, arXiv

The emergence of macroscopic currents in photoconductive sampling of optical fields

Photoconductive field sampling is a key methodology for advancing our understanding of light-matter interaction and ultrafast optoelectronic applications. For visible light, the bandwidth of photoconductive sampling of fields and field-induced dynamics can be extended to the petahertz domain. Despite the growing importance of ultrafast photoconductive measurements, a rigorous model for connecting the microscopic electron dynamics to the macroscopic external signal is lacking. This has caused conflicting interpretations about the origin of macroscopic currents. Here, we present systematic experimental studies on the macroscopic signal formation of ultrafast currents in gases. We developed a theoretical model based on the Ramo-Shockley theorem that overcomes the previously introduced artificial separation into dipole and current contributions. Extensive numerical particle-in-cell (PIC)-type simulations based on this model permit a quantitative comparison with experimental results and help to identify the roles of electron scattering and Coulomb interactions. The results imply that most of the heuristic models utilized so far will need to be amended. Our approach can aid in the design of more sensitive and more efficient photoconductive devices. We demonstrate for the case of gases that over an order of magnitude increase in signal is achievable, paving the way towards petahertz field measurements with the highest sensitivity.

preprint, 2020, arXiv

Constructions of complementary sequence sets and complete complementary codes by 2-level autocorrelation sequences and permutation polynomials

In this paper, a recent method to construct complementary sequence sets and complete complementary codes by Hadamard matrices is studied in depth. By taking the algebraic structure of Hadamard matrices into consideration, our main result determines the so-called $δ$-linear terms and $δ$-quadratic terms. As a first consequence, a powerful theory linking Golay complementary sets of $p$-ary ($p$ prime) sequences and the generalized Reed-Muller codes by Kasami et al. is developed. These codes enjoy good error-correcting capability and tightly controlled PMEPR, and they significantly extend the range of coding options for applications of OFDM using $p^n$ subcarriers. As another consequence, we make a previously unrecognized connection between the sequences in CSSs and CCCs and sequences with 2-level autocorrelation, trace functions, and permutation polynomials (PPs) over finite fields.

preprint, 2020, arXiv

New Construction of Complementary Sequence (or Array) Sets and Complete Complementary Codes

A new method to construct $q$-ary complementary sequence sets (CSSs) and complete complementary codes (CCCs) of size $N$ is proposed by using desired para-unitary (PU) matrices. The concept of seed PU matrices is introduced, and a systematic approach to computing the explicit forms of the functions in the constructed CSSs and CCCs from the seed PU matrices is given. A general form of these functions depends only on a basis of the functions from $\mathbb{Z}_N$ to $\mathbb{Z}_q$ and representatives in the equivalence class of Butson-type Hadamard (BH) matrices. In particular, the realization of Golay pairs from our general form exactly coincides with the standard Golay pairs. The realization of ternary complementary sequences of size $3$ is first reported here. For the realization of the quaternary complementary sequences of size 4, almost all the sequences derived here have not been reported before. Generalized seed PU matrices and the recursive constructions of the desired PU matrices are also studied, and a large number of new constructions of CSSs and CCCs are given accordingly. From the perspective of this paper, all the known results of CSSs and CCCs with explicit GBF form in the literature (except non-standard Golay pairs) are constructed from the Walsh matrices of order 2. This suggests that the proposed method with BH matrices of higher orders will yield a large number of new CSSs and CCCs, with an exponentially increasing number of sequences of low peak-to-mean envelope power ratio.

preprint, 2020, arXiv

New Construction of Complementary Sequence (or Array) Sets and Complete Complementary Codes (I)

A new method to construct $q$-ary complementary sequence (or array) sets (CSSs) and complete complementary codes (CCCs) of size $N$ is introduced in this paper. An algorithm for computing the explicit form of the functions in the constructed CSSs and CCCs is also given. A general form of these functions depends only on a basis of functions from $\mathbb{Z}_N$ to $\mathbb{Z}_q$ and representatives in the equivalence class of Butson-type Hadamard matrices. Surprisingly, all the functions fill up a larger number of cosets of a linear code, compared with the existing constructions. From our general construction, the realization of $q$-ary Golay pairs exactly coincides with the standard Golay sequences. The realization of ternary complementary sequences of size $3$ is first reported here. For binary and quaternary complementary sequences of size 4, a general Boolean function form of these sequences is obtained. Most of these sequences are also new. Moreover, most of the quaternary sequences cannot be generalized from binary sequences, which is different from known constructions. More importantly, both the binary and the quaternary sequences of size 4 constitute a large number of cosets of the linear code, respectively.

preprint, 2020, arXiv

New Construction of Complementary Sequence (or Array) Sets and Complete Complementary Codes (II)

Previously, we presented a framework that uses the para-unitary (PU) matrix-based approach to construct new complementary sequence sets (CSSs), complete complementary codes (CCCs), complementary sequence arrays (CSAs), and complete complementary arrays (CCAs). In this paper, we introduce a new class of delay matrices for the PU construction. In this way, generalized Boolean functions (GBFs) derived from a PU matrix can be represented by an array of size $2\times 2 \times \cdots \times 2$. In addition, we introduce a new method to construct PU matrices using block matrices. With these two new ingredients, our new framework can construct an extremely large number of new CSAs, CCAs, CSSs and CCCs, and their respective GBFs can also be determined recursively. Furthermore, we show that the known constructions of CSSs, proposed by Paterson and Schmidt respectively, and the known CCCs based on Reed-Muller codes are all special cases of this new framework. In addition, we are able to explain the bound on the PMEPR of the sequences in part of the open question proposed in 2000 by Paterson.

preprint, 2020, arXiv

New Construction of Optimal Interference-Free ZCZ Sequence Sets by Zak Transform

In this paper, a new construction of interference-free zero correlation zone (IF-ZCZ) sequence sets is proposed using a well-designed finite Zak transform lattice tessellation. Each set is characterized by the period of sequences $KM^2$, the set size $K$ and the length of zero correlation zone $M^2-1$, which is optimal with respect to the Tang-Fan-Matsufuji bound. In particular, all sequences in these sets have sparse and highly structured Zak and Fourier spectra, which can decrease the computational complexity of implementing banks of matched filters. Moreover, for the parameters proposed in this paper, the new construction is essentially different from the general construction of optimal IF-ZCZ sequence sets given by Popovic.
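
A quick arithmetic check of the optimality claim in this abstract, assuming the Tang-Fan-Matsufuji bound is taken in the form $K(Z+1) \le N$ for a set of $K$ sequences of period $N$ with zero correlation zone length $Z$. This form of the bound is an assumption about the convention in use; the abstract itself does not state it.

```latex
\[
K\,(Z+1) \le N, \qquad
N = KM^2,\quad Z = M^2 - 1
\;\Longrightarrow\;
K\,(Z+1) = K M^2 = N .
\]
```

Under that convention, the stated parameters meet the bound with equality, which is what "optimal" means here.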