Source author record

Jing Yang

Jing Yang appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Catalog footprint

What is connected

92 works

40 topics

4 close collaborators

preprint, 2026, arXiv

$f$-Divergence Regularized RLHF: Two Tales of Sampling and Unified Analyses

Reinforcement Learning from Human Feedback (RLHF) has become a cornerstone technique for post-training large language models. While most existing approaches rely on the reverse KL-regularization, recent empirical studies have begun exploring alternative divergences (e.g., forward KL, chi-squared) as regularizers in RLHF. However, a unified theoretical understanding of general $f$-divergence regularization remains under-explored. To fill this gap, this work develops a comprehensive theoretical framework for online RLHF with a general $f$-divergence regularized objective. Rather than treating each possible divergence function individually, we adopt a holistic perspective across the entire function class and propose two algorithms based on distinct sampling principles. The first extends the classical optimism principle with a carefully designed exploration bonus, while the second introduces a new method that exploits the sensitivity of the optimal policy to reward perturbations under $f$-divergence regularization. Theoretical analysis shows that $O(\log T)$ regret and $O(1/T)$ sub-optimality gap are achievable, establishing provable efficiency of both algorithms and, to the best of our knowledge, the first performance bounds for online RLHF under general $f$-divergence regularization.

preprint, 2026, arXiv

Breaking the Computational Barrier: Provably Efficient Actor-Critic for Low-Rank MDPs

Reinforcement learning (RL) is a fundamental framework for sequential decision-making, in which an agent learns an optimal policy through interactions with an unknown environment. In settings with function approximation, many existing RL algorithms achieve favorable sample complexity, but often rely on computationally intractable oracles. In this paper, we use supervised learning as a computational proxy to establish a clear hierarchy of commonly adopted RL oracles under low-rank Markov Decision Processes (MDPs). This hierarchy shows that policy evaluation is the most computationally efficient oracle, provided that supervised learning can be efficiently solved. Motivated by this observation, we propose a novel optimistic actor-critic algorithm that relies solely on the policy evaluation oracle. We prove that our algorithm outperforms the existing sample complexity guarantees for low-rank MDPs while avoiding computationally expensive planning or optimization oracles commonly assumed in prior works. We further extend our theoretical results to approximately low-rank MDPs and demonstrate that this setting captures a broad class of real-world environments. Finally, we validate our theoretical results with experiments on several standard Gym environments.

preprint, 2026, arXiv

Efficient Multi-objective Prompt Optimization via Pure-exploration Bandits

Prompt engineering has become central to eliciting the capabilities of large language models (LLMs). At its core lies prompt selection -- efficiently identifying the most effective prompts. However, most prior investigations overlook a key challenge: the inherently multi-faceted nature of prompt performance, which cannot be captured by a single metric. To fill this gap, we study the multi-objective prompt selection problem under two practical settings: Pareto prompt set recovery and best feasible prompt identification. Casting the problem into the pure-exploration bandits framework, we adapt provably efficient algorithms from multi-objective bandits and further introduce a novel design for best feasible arm identification in structured bandits, with theoretical guarantees on the identification error in the linear case. Extensive experiments across multiple LLMs show that the bandit-based approaches yield significant improvements over baselines, establishing a principled and efficient framework for multi-objective prompt optimization.

preprint, 2026, arXiv

Enhancing Multilingual Counterfactual Generation through Alignment-as-Preference Optimization

Self-generated counterfactual explanations (SCEs) are minimally modified inputs (minimality) generated by large language models (LLMs) that flip their own predictions (validity), offering a causally grounded approach to unraveling black-box LLM behavior. Yet extending them beyond English remains challenging: existing methods struggle to produce valid SCEs in non-dominant languages, and a persistent trade-off between validity and minimality undermines explanation quality. We introduce Macro, a preference alignment framework that applies Direct Preference Optimization (DPO) to multilingual SCE generation, using a composite scoring function to construct preference pairs that effectively translate the trade-off into measurable preference signals. Experiments across four LLMs and seven typologically diverse languages show that Macro improves validity by 12.55% on average over the chain-of-thought baseline without degrading minimality, while avoiding the severe minimality violations of the translation-based baseline. Compared to supervised fine-tuning, Macro achieves superior performance on both metrics, confirming that explicit preference optimization is essential for balancing this trade-off. Further analyses reveal that Macro increases cross-lingual perturbation alignment and mitigates common generation errors. Our results highlight preference optimization as a promising direction for enhancing multilingual model explanations.

preprint, 2026, arXiv

GeoSym127K: Scalable Symbolically-verifiable Synthesis for Multimodal Geometric Reasoning

Large Multimodal Models (LMMs) often struggle with geometric reasoning due to visual hallucinations and a lack of mathematically precise Chain-of-Thought (CoT) data. To address this, we propose the GeoSym Engine, an automated and scalable neuro-symbolic framework. By leveraging a type-conditional grammar and an analytic SymGT Solver, it derives exact symbolic ground truths and seamlessly integrates with a robust rendering pipeline to produce high-precision geometric diagrams. Using this engine, we construct GeoSym127K, a difficulty-stratified dataset featuring 51K high-resolution images, 127K questions with symbolic ground truths, and 55K answer-verified CoT QA pairs. We also introduce GeoSym-Bench, an expert-curated suite of 511 complex samples for rigorous evaluation. Through extensive supervised fine-tuning (SFT), we demonstrate that GeoSym drives concentrated improvements specifically on diagram-dependent and multi-step geometry tasks. Our Qwen3-VL-8B model gains an absolute +22.21% on the MathVerse Vision-Only subset and reaches 61.52% (+6.19% improvement) on WeMath, mitigating long-horizon logic fragmentation and outperforming advanced closed-source models like Doubao-1.8. Furthermore, applying Reinforcement Learning with Verifiable Rewards (RLVR) via GRPO reveals that initializing from structural SFT checkpoints substantially elevates the performance ceiling over zero-shot RL. Driven by deterministic exact-match signals, this showcases the robust scaling potential of our verifiable reasoning synthesis. Datasets and code are available at https://huggingface.co/datasets/Tomie0506/GeoSym127K and https://github.com/Tomie56/GeoSym127K.

preprint, 2026, arXiv

HisTrackMap: Global Vectorized High-Definition Map Construction via History Map Tracking

As an essential component of autonomous driving systems, high-definition (HD) maps provide rich and precise environmental information for auto-driving scenarios; however, existing methods, which primarily rely on query-based detection frameworks to directly model map elements or implicitly propagate queries over time, often struggle to maintain consistent temporal perception outcomes. These inconsistencies pose significant challenges to the stability and reliability of real-world autonomous driving and map data collection systems. To address this limitation, we propose a novel end-to-end tracking framework for global map construction by temporally tracking map elements' historical trajectories. First, an instance-level historical rasterization map representation is designed to explicitly store previous perception results, which can control and maintain the history information of different global instances in a fine-grained way. Second, we introduce a Map-Trajectory Prior Fusion module within this tracking framework, leveraging historical priors for tracked instances to improve temporal smoothness and continuity. Third, we propose a global perspective metric to evaluate the quality of temporal geometry construction in HD maps, filling the gap in current metrics for assessing global geometric perception results. Extensive experiments on the nuScenes and Argoverse2 datasets demonstrate that the proposed method outperforms state-of-the-art (SOTA) methods in both single-frame and temporal metrics. The project page is available at: https://yj772881654.github.io/HisTrackMap.

preprint, 2026, arXiv

Judge Circuits

LLM-as-a-judge has become the dominant paradigm for grading model outputs at scale, yet the same model assigns systematically different scores when its output format changes (e.g., a 1-5 rating vs. a True/False label). Existing diagnoses of these format-induced inconsistencies stop at the input-output level. Using Position-aware Edge Attribution Patching (PEAP), we causally investigate the internal mechanism in Gemma-3, Qwen2.5, and Llama-3. We find that judgments across structured understanding and open-ended preference tasks share a sparse, generalized Latent Evaluator sub-graph in the mid-to-late multi-layer perceptrons (MLPs); zero-ablating it collapses judgment while preserving world knowledge in architecturally modular models. By structurally decoupling abstract judging from output formatting, we provide a mechanistic account of format-induced inconsistency on the open-weight models we study: a continuous judgment signal computed in the shared trunk is mapped through fragile, format-specific terminal branches, enabling format-independent preference to be isolated downstream of the requested output format. Our findings imply that benchmark-level reliability comparisons across formats are partially measuring formatter geometry rather than evaluation quality.

preprint, 2026, arXiv

Lean Clients, Full Accuracy: Hybrid Zeroth- and First-Order Split Federated Learning

Split Federated Learning (SFL) enables collaborative training between resource-constrained edge devices and a compute-rich server. Communication overhead is a central issue in SFL and can be mitigated with auxiliary networks. Yet, the fundamental client-side computation challenge remains, as back-propagation requires substantial memory and computation costs, severely limiting the scale of models that edge devices can support. To enable more resource-efficient client computation and reduce the client-server communication, we propose HERON-SFL, a novel hybrid optimization framework that integrates zeroth-order (ZO) optimization for local client training while retaining first-order (FO) optimization on the server. With the assistance of auxiliary networks, ZO updates enable clients to approximate local gradients using perturbed forward-only evaluations per step, eliminating memory-intensive activation caching and avoiding explicit gradient computation in the traditional training process. Leveraging the low effective rank assumption, we theoretically prove that HERON-SFL's convergence rate is independent of model dimensionality, addressing a key scalability concern common to ZO algorithms. Empirically, on ResNet training and language model (LM) fine-tuning tasks, HERON-SFL matches benchmark accuracy while reducing client peak memory by up to 64% and client-side compute cost by up to 33% per step, substantially expanding the range of models that can be trained or adapted on resource-limited devices.
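The idea of approximating a gradient from perturbed forward-only evaluations can be illustrated with a generic two-point zeroth-order estimator. This is a minimal sketch and not HERON-SFL itself: the function names, the Gaussian perturbation, the smoothing parameter `mu`, and the averaging step are all assumptions made for illustration, and the abstract does not specify which estimator the method uses.

```python
import random


def zo_gradient(f, x, mu=1e-3, rng=random):
    """One two-point zeroth-order gradient estimate of f at x.

    Uses only two forward evaluations of f; no back-propagation.
    The Gaussian direction u and the step mu are illustrative choices.
    """
    u = [rng.gauss(0.0, 1.0) for _ in x]
    x_plus = [xi + mu * ui for xi, ui in zip(x, u)]
    x_minus = [xi - mu * ui for xi, ui in zip(x, u)]
    # Finite-difference estimate of the directional derivative along u.
    scale = (f(x_plus) - f(x_minus)) / (2.0 * mu)
    return [scale * ui for ui in u]


def averaged_zo_gradient(f, x, num_samples, mu=1e-3, rng=random):
    """Average several independent estimates to reduce variance."""
    total = [0.0] * len(x)
    for _ in range(num_samples):
        g = zo_gradient(f, x, mu, rng)
        total = [t + gi for t, gi in zip(total, g)]
    return [t / num_samples for t in total]
```

A single estimate is noisy; averaging more samples trades extra forward passes for lower variance, which is the kind of cost that dimension-dependence analyses of ZO methods are concerned with.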

preprint, 2026, arXiv

Order in the Evaluation Court: A Critical Analysis of NLG Evaluation Trends

Despite advances in Natural Language Generation (NLG), evaluation remains challenging. Although various new metrics and LLM-as-a-judge (LaaJ) methods are proposed, human judgment persists as the gold standard. To systematically review how NLG evaluation has evolved, we employ an automatic information extraction scheme to gather key information from NLG papers, focusing on different evaluation methods (metrics, LaaJ and human evaluation). With extracted metadata from 14,171 papers across four major conferences (ACL, EMNLP, NAACL, and INLG) over the past six years, we reveal several critical findings: (1) Task Divergence: While Dialogue Generation demonstrates a rapid shift toward LaaJ (>40% in 2025), Machine Translation remains locked into n-gram metrics, and Question Answering exhibits a substantial decline in the proportion of studies conducting human evaluation. (2) Metric Inertia: Despite the development of semantic metrics, general-purpose metrics (e.g., BLEU, ROUGE) continue to be widely used across tasks without empirical justification, often lacking the discriminative power to distinguish between specific quality criteria. (3) Human-LaaJ Divergence: Our association analysis challenges the assumption that LLMs act as mere proxies for humans; LaaJ and human evaluations prioritize very different signals, and explicit validation is scarce (<8% of papers comparing the two), with only moderate to low correlation. Based on these observations, we derive practical recommendations to improve the rigor of future NLG evaluation.

preprint, 2026, arXiv

Unlabeled Data Can Provably Enhance In-Context Learning of Transformers

Large language models (LLMs) exhibit impressive in-context learning (ICL) capabilities, yet the quality of their predictions is fundamentally limited by the few costly labeled demonstrations that can fit into a prompt. Meanwhile, there exist vast and continuously growing amounts of unlabeled data that may be closely related to the ICL task. How to utilize such unlabeled data to provably enhance the performance of ICL thus becomes an emerging fundamental question. In this work, we propose a novel augmented ICL framework, in which the prompt includes a small set of labeled examples alongside a block of unlabeled inputs. We focus on the multi-class linear classification setting and demonstrate that, with chain-of-thought (CoT) prompting, a multi-layer transformer can effectively emulate an expectation-maximization (EM) algorithm. This enables the transformer to implicitly extract useful information from both labeled and unlabeled data, leading to provable improvements in ICL accuracy. Moreover, we show that such a transformer can be trained via teacher forcing, with its parameters converging to the desired solution at a linear rate. Experiments demonstrate that the augmented ICL framework consistently outperforms conventional few-shot ICL, providing empirical support for our theoretical findings. To the best of our knowledge, this is the first theoretical study on the impact of unlabeled data on the ICL performance of transformers.

preprint, 2024, arXiv

AID-DTI: Accelerating High-fidelity Diffusion Tensor Imaging with Detail-Preserving Model-based Deep Learning

Deep learning has shown great potential in accelerating diffusion tensor imaging (DTI). Nevertheless, existing methods tend to suffer from Rician noise and detail loss in reconstructing the DTI-derived parametric maps especially when sparsely sampled q-space data are used. This paper proposes a novel method, AID-DTI (Accelerating hIgh fiDelity Diffusion Tensor Imaging), to facilitate fast and accurate DTI with only six measurements. AID-DTI is equipped with a newly designed Singular Value Decomposition (SVD)-based regularizer, which can effectively capture fine details while suppressing noise during network training. Experimental results on Human Connectome Project (HCP) data consistently demonstrate that the proposed method estimates DTI parameter maps with fine-grained details and outperforms three state-of-the-art methods both quantitatively and qualitatively.

preprint, 2024, arXiv

Simultaneous q-Space Sampling Optimization and Reconstruction for Fast and High-fidelity Diffusion Magnetic Resonance Imaging

Diffusion Magnetic Resonance Imaging (dMRI) plays a crucial role in the noninvasive investigation of tissue microstructural properties and structural connectivity in the in vivo human brain. However, to effectively capture the intricate characteristics of water diffusion at various directions and scales, it is important to employ comprehensive q-space sampling. Unfortunately, this requirement leads to long scan times, limiting the clinical applicability of dMRI. To address this challenge, we propose SSOR, a Simultaneous q-Space sampling Optimization and Reconstruction framework. We jointly optimize a subset of q-space samples using a continuous representation of spherical harmonic functions and a reconstruction network. Additionally, we integrate the unique properties of dMRI in both the q-space and image domains by applying $\ell_1$-norm and total-variation regularization. Experiments conducted on HCP data demonstrate that SSOR achieves promising results both quantitatively and qualitatively and is robust to noise.

preprint, 2023, arXiv

Determinate Node Selection for Semi-supervised Classification Oriented Graph Convolutional Networks

Graph Convolutional Networks (GCNs) have proven successful in semi-supervised node classification by extracting structural information from graph data. However, the random selection of labeled nodes used by GCNs may lead to unstable generalization performance. In this paper, we propose an efficient method for the deterministic selection of labeled nodes: the Determinate Node Selection (DNS) algorithm. The DNS algorithm identifies two categories of representative nodes in the graph: typical nodes and divergent nodes. These labeled nodes are selected by exploring the structure of the graph and determining the ability of the nodes to represent the distribution of data within the graph. The DNS algorithm can be applied quite simply to a wide range of semi-supervised graph neural network models for node classification tasks. Through extensive experimentation, we demonstrate that incorporating the DNS algorithm leads to a remarkable improvement in the average accuracy of the model and a significant decrease in the standard deviation, as compared to the original method.

preprint, 2022, arXiv

A gated group sequential design for seamless Phase II/III trial with subpopulation selection

Due to the high cost and high failure rate of Phase III trials, seamless Phase II/III designs are increasingly popular as a way to improve trial efficiency. A potential attraction of a Phase II/III design is that it allows a randomized proof-of-concept stage prior to committing to the full cost of the Phase III trial. Population selection during the trial allows a trial to adapt and focus investment where it is most likely to provide patient benefit. Motivated by a clinical trial that seeks the population that potentially benefits, with dual-primary endpoints progression-free survival (PFS) and overall survival (OS), we propose a gated group sequential design for a seamless Phase II/III trial design with population selection. The investigated design controls the familywise error rate and allows multiple interim analyses to enable early stopping for efficacy or futility. Simulations and an illustrative example suggest that the proposed gated group sequential design can have more power than the commonly used classical group sequential design, and reduces patients' exposure to a less effective treatment if the complementary subgroup has a less significant treatment effect. The proposed design has the potential to save drug development cost and more quickly fulfill unmet medical needs.

preprint, 2022, arXiv

ArcFace: Additive Angular Margin Loss for Deep Face Recognition

Recently, a popular line of research in face recognition is adopting margins in the well-established softmax loss function to maximize class separability. In this paper, we first introduce an Additive Angular Margin Loss (ArcFace), which not only has a clear geometric interpretation but also significantly enhances the discriminative power. Since ArcFace is susceptible to massive label noise, we further propose sub-center ArcFace, in which each class contains $K$ sub-centers and training samples only need to be close to any of the $K$ positive sub-centers. Sub-center ArcFace encourages one dominant sub-class that contains the majority of clean faces and non-dominant sub-classes that include hard or noisy faces. Based on this self-propelled isolation, we boost the performance through automatically purifying raw web faces under massive real-world noise. Besides discriminative feature embedding, we also explore the inverse problem, mapping feature vectors to face images. Without training any additional generator or discriminator, the pre-trained ArcFace model can generate identity-preserved face images for both subjects inside and outside the training data only by using the network gradient and Batch Normalization (BN) priors. Extensive experiments demonstrate that ArcFace can enhance the discriminative feature embedding as well as strengthen the generative face synthesis.
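The additive angular margin can be sketched for a single sample as follows. This is a simplified illustration, assuming the cosine similarities between the normalized embedding and each normalized class center are already computed; the function names and the default scale `s` and margin `m` are illustrative choices not stated in the abstract, and the sketch omits the handling that practical implementations add for angles where `theta + m` exceeds pi.

```python
import math


def arcface_logits(cosines, target, s=64.0, m=0.5):
    """Add an angular margin m to the target-class angle, then scale by s.

    cosines: cos(theta_j) between the embedding and each class center.
    target:  index of the ground-truth class.
    """
    logits = []
    for j, c in enumerate(cosines):
        c = max(-1.0, min(1.0, c))  # keep acos in its valid domain
        if j == target:
            theta = math.acos(c)
            logits.append(s * math.cos(theta + m))  # penalized target logit
        else:
            logits.append(s * c)  # other classes are only rescaled
    return logits


def cross_entropy(logits, target):
    """Numerically stable softmax cross-entropy for one sample."""
    mx = max(logits)
    log_sum = mx + math.log(sum(math.exp(z - mx) for z in logits))
    return log_sum - logits[target]
```

Because the margin lowers only the target logit, the loss stays high until the embedding is closer in angle to its class center than the plain softmax would require, which is the geometric interpretation the abstract refers to.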

preprint, 2022, arXiv

Existence, Local uniqueness and periodicity of bubbling solutions for a critical nonlinear elliptic equation

We revisit the following nonlinear critical elliptic equation \begin{equation*} -\Delta u+Q(y)u=u^{\frac{N+2}{N-2}},\;\;\; u>0\;\;\;\hbox{ in } \mathbb{R}^N, \end{equation*} where $N\geq 5.$ There seem to be no results about the periodicity of bubbling solutions. Here we investigate some related problems. Assuming that $Q(y)$ is periodic in $y_1$ with period 1 and has a local minimum at 0 satisfying $Q(0)=0,$ we prove the existence and local uniqueness of infinitely many bubbling solutions of the problem above. This local uniqueness result implies that some bubbling solutions preserve the symmetry of the potential function $Q(y),$ i.e. the bubbling solution whose blow-up set is $\{(jL,0,...,0):j=0,\pm 1, \pm 2,..., \pm m\}$ must be periodic in $y_{1}$ provided that $L$ is large enough, where $m$ is the number of bubbles, which is large enough but independent of $L.$ Moreover, we also show the non-existence of such bubbling solutions for the problem above if the local minimum of $Q(y)$ is not equal to zero.

preprint, 2022, arXiv

Interplay between jamming and MIPS in persistent self-propelling particles

In living and engineered systems of active particles, self-propulsion induces an unjamming transition from a solid to a fluid phase and phase separation between a gas and a liquid-like phase. We demonstrate an interplay between these two nonequilibrium transitions in systems of persistent active particles. The coexistence and jamming lines in the activity-density plane meet at the jamming transition point in the limit of hard particles or zero activity. This interplay induces anomalous dynamics in the liquid phase and hysteresis at the active jamming transition.

preprint, 2022, arXiv

Killing Two Birds with One Stone: Efficient and Robust Training of Face Recognition CNNs by Partial FC

Learning discriminative deep feature embeddings by using million-scale in-the-wild datasets and margin-based softmax loss is the current state-of-the-art approach for face recognition. However, the memory and computing cost of the Fully Connected (FC) layer scales linearly with the number of identities in the training set. Besides, large-scale training data inevitably suffers from inter-class conflict and long-tailed distribution. In this paper, we propose a sparsely updating variant of the FC layer, named Partial FC (PFC). In each iteration, positive class centers and a random subset of negative class centers are selected to compute the margin-based softmax loss. All class centers are still maintained throughout the whole training process, but only a subset is selected and updated in each iteration. Therefore, the computing requirement, the probability of inter-class conflict, and the frequency of passive updates on tail class centers are dramatically reduced. Extensive experiments across different training data and backbones (e.g. CNN and ViT) confirm the effectiveness, robustness and efficiency of the proposed PFC. The source code is available at https://github.com/deepinsight/insightface/tree/master/recognition.
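The per-iteration selection of class centers described above can be sketched as follows. This is a minimal illustration and not the released implementation: the function name, the `sample_rate` parameter, and the rule of always keeping at least the positive classes are assumptions made for the example.

```python
import random


def sample_class_centers(num_classes, batch_labels, sample_rate, rng=random):
    """Pick the class centers used for one iteration's softmax loss.

    Every class that appears in the batch (a positive center) is kept,
    and the remaining budget is filled with randomly chosen negatives.
    """
    positives = sorted(set(batch_labels))
    # Budget of centers for this iteration, never below the positives.
    num_sampled = max(len(positives), int(num_classes * sample_rate))
    positive_set = set(positives)
    negatives = [c for c in range(num_classes) if c not in positive_set]
    chosen_negatives = rng.sample(negatives, num_sampled - len(positives))
    return positives + sorted(chosen_negatives)
```

Only the returned indices would take part in the loss and receive updates in that iteration, while the full table of centers stays in memory, which matches the abstract's description of maintaining all centers but updating a subset.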

preprint, 2022, arXiv

Minimum-Time Quantum Control and the Quantum Brachistochrone Equation

Minimum-time quantum control protocols can be obtained from the quantum brachistochrone formalism [Carlini, Hosoya, Koike, and Okudaira, Phys. Rev. Lett. 96, 06053, (2006)]. We point out that the original treatment implicitly applied the variational calculus with fixed boundary conditions. We argue that the genuine quantum brachistochrone problem involves a variational problem with a movable endpoint, contrary to the classical brachistochrone problem. This formulation not only simplifies the derivation of the quantum brachistochrone equation but introduces an additional constraint at the endpoint due to the boundary effect. We present the general solution to the full quantum brachistochrone equation and discuss its main features. Using it, we prove that the speed of evolution under constraints is reduced with respect to the unrestricted case. In addition, we find that solving the quantum brachistochrone equation is closely connected to solving the dynamics of the Lagrange multipliers, which is in general governed by nonlinear differential equations. Their numerical integration allows generating time-extremal trajectories. Furthermore, when the restricted operators form a closed subalgebra, the Lagrange multipliers become constant and the optimal Hamiltonian takes a concise form. The new class of analytically solvable models for the quantum brachistochrone problem opens up the possibility of applying it to many-body quantum systems, exploring notions related to geometry such as quantum speed limits, and advancing significantly the quantum state and gate preparation for quantum information processing.

preprint, 2022, arXiv

Multi-channel Attentive Graph Convolutional Network With Sentiment Fusion For Multimodal Sentiment Analysis

With the explosive growth of multimodal reviews on social media platforms, multimodal sentiment analysis has recently gained popularity because of its high relevance to these social media posts. Although most previous studies design various fusion frameworks for learning an interactive representation of multiple modalities, they fail to incorporate sentimental knowledge into inter-modality learning. This paper proposes a Multi-channel Attentive Graph Convolutional Network (MAGCN), consisting of two main components: cross-modality interactive learning and sentimental feature fusion. For cross-modality interactive learning, we exploit the self-attention mechanism combined with densely connected graph convolutional networks to learn inter-modality dynamics. For sentimental feature fusion, we utilize multi-head self-attention to merge sentimental knowledge into inter-modality feature representations. Extensive experiments are conducted on three widely-used datasets. The experimental results demonstrate that the proposed model achieves competitive performance on accuracy and F1 scores compared to several state-of-the-art approaches.

preprint, 2022, arXiv

On Federated Learning with Energy Harvesting Clients

Catering to the proliferation of Internet of Things devices and distributed machine learning at the edge, we propose an energy harvesting federated learning (EHFL) framework in this paper. The introduction of EH implies that a client's availability to participate in any FL round cannot be guaranteed, which complicates the theoretical analysis. We derive novel convergence bounds that capture the impact of time-varying device availabilities due to the random EH characteristics of the participating clients, for both parallel and local stochastic gradient descent (SGD) with non-convex loss functions. The results suggest that having a uniform client scheduling that maximizes the minimum number of clients throughout the FL process is desirable, which is further corroborated by the numerical experiments using a real-world FL task and a state-of-the-art EH scheduler.

preprint, 2022, arXiv

Post-quantum Multi-stage Secret Sharing Schemes using Inhomogeneous Linear Recursion and Ajtai's Function

Secret sharing was first proposed independently by Shamir and Blakley in 1979. To avoid deficiencies of the original schemes, researchers presented improved schemes, among which the multi-secret sharing scheme (MSS) is significant. There are three categories of MSSs; in this work we focus on the multi-stage secret sharing scheme (MSSS), which recovers secrets in any order. By observing inhomogeneous linear recursions (ILRs) in the literature, we derive a general formula and divide ILRs into two types according to the different variables in them. Utilizing these two kinds of ILRs, we propose four verifiable MSSSs with Ajtai's function, which is a lattice-based function. Our schemes have the following advantages. Firstly, our schemes can detect cheating by the dealer and participants, and are multi-use. Secondly, we have several ways to restore secrets. Thirdly, we can turn our schemes into other types of MSSs due to the universality of our method. Fourthly, since we utilize a lattice-based function to mask shares, our schemes can resist attacks from quantum computers with computational security. Finally, although our schemes consume more memory than some known schemes, they consume much less time, which makes them more suitable for settings with limited computing power.

preprint, 2022, arXiv

Pre-training strategies and datasets for facial representation learning

What is the best way to learn a universal face representation? Recent work on Deep Learning in the area of face analysis has focused on supervised learning for specific tasks of interest (e.g. face recognition, facial landmark localization etc.) but has overlooked the overarching question of how to find a facial representation that can be readily adapted to several facial analysis tasks and datasets. To this end, we make the following 4 contributions: (a) we introduce, for the first time, a comprehensive evaluation benchmark for facial representation learning consisting of 5 important face analysis tasks. (b) We systematically investigate two ways of large-scale representation learning applied to faces: supervised and unsupervised pre-training. Importantly, we focus our evaluations on the case of few-shot facial learning. (c) We investigate important properties of the training datasets including their size and quality (labelled, unlabelled or even uncurated). (d) To draw our conclusions, we conducted a very large number of experiments. Our main two findings are: (1) Unsupervised pre-training on completely in-the-wild, uncurated data provides consistent and, in some cases, significant accuracy improvements for all facial tasks considered. (2) Many existing facial video datasets seem to have a large amount of redundancy. We will release code, and pre-trained models to facilitate future research.

preprint2022arXiv

Precoding and Scheduling for AoI Minimization in MIMO Broadcast Channels

In this paper, we consider a status updating system where updates are generated at a constant rate at $K$ sources and sent to the corresponding recipients through a noise-free broadcast channel. We assume that perfect channel state information (CSI) is available at the transmitter before each transmission, and the transmitter is able to use the CSI to precode the updates. Our objective is to design optimal precoding schemes that minimize the summed average age of information (AoI) at the recipients. Under various assumptions on the size of each update $B$, the number of transmit antennas $M$, and the number of receive antennas $N$ at each user, this paper identifies the corresponding age-optimal precoding and transmission scheduling strategies. Specifically, for the case when $N=1$, a round-robin based updating scheme is shown to be optimal. For two-user systems with $N>B$ or $M\notin[N:2N]$, framed updating schemes are proven to be optimal. For other cases in two-user systems, a framed alternating updating scheme is proven to be $2$-optimal.

preprint2022arXiv

Random Orthogonalization for Federated Learning in Massive MIMO Systems

We propose a novel uplink communication method, coined random orthogonalization, for federated learning (FL) in a massive multiple-input and multiple-output (MIMO) wireless system. The key novelty of random orthogonalization comes from the tight coupling of FL model aggregation and two unique characteristics of massive MIMO - channel hardening and favorable propagation. As a result, random orthogonalization can achieve natural over-the-air model aggregation without requiring transmitter side channel state information, while significantly reducing the channel estimation overhead at the receiver. Theoretical analyses with respect to both communication and machine learning performances are carried out. In particular, an explicit relationship among the convergence rate, the number of clients and the number of antennas is established. Experimental results validate the effectiveness and efficiency of random orthogonalization for FL in massive MIMO.

preprint2022arXiv

Region-Aware Network: Model Human's Top-Down Visual Perception Mechanism for Crowd Counting

Background noise and scale variation are common problems that have long been recognized in crowd counting. Humans glance at a crowd image and instantly know the approximate number of people and where they are by attending to the crowd regions and their degree of congestion with a global receptive field. Hence, in this paper, we propose a novel feedback network with a Region-Aware block, called RANet, that models the human top-down visual perception mechanism. First, we introduce a feedback architecture to generate priority maps that provide a prior on candidate crowd regions in input images. The prior enables RANet to pay more attention to crowd regions. Then we design a Region-Aware block that adaptively encodes contextual information into input images through a global receptive field. More specifically, we scan the whole input image and its priority map in the form of column vectors to obtain a relevance matrix estimating their similarity. The relevance matrix is then used to build global relationships between pixels. Our method outperforms state-of-the-art crowd counting methods on several public datasets.

preprint2022arXiv

Unified BERT for Few-shot Natural Language Understanding

Even as pre-trained language models share a semantic encoder, natural language understanding suffers from a diversity of output schemas. In this paper, we propose UBERT, a unified bidirectional language understanding model based on the BERT framework, which can universally model the training objectives of different NLU tasks through a biaffine network. Specifically, UBERT encodes prior knowledge from various aspects, uniformly constructing learning representations across multiple NLU tasks, which helps it capture common semantic understanding. By using the biaffine network to score pairs of start and end positions in the original text, various classification and extraction structures can be converted into a universal span-decoding approach. Experiments show that UBERT won first prize in the 2022 AIWIN - World Artificial Intelligence Innovation Competition, Chinese insurance few-shot multi-task track, and unifies a wide range of information extraction and linguistic reasoning tasks.

preprint2022arXiv

Variational principle for optimal quantum controls in quantum metrology

We develop a variational principle to determine the quantum controls and initial state that optimize the quantum Fisher information, the quantity characterizing the precision in quantum metrology. When the set of available controls is limited, the exact optimal initial state and the optimal controls are in general dependent on the probe time, a feature missing in the unrestricted case. Yet, for time-independent Hamiltonians with restricted controls, the problem can be approximately reduced to the unconstrained case via Floquet engineering. In particular, we find for magnetometry with a time-independent spin chain containing three-body interactions that, even when the controls are restricted to one- and two-body interactions, the Heisenberg scaling can still be approximately achieved. Our results open the door to investigating quantum metrology under a limited set of available controls, of relevance to many-body quantum metrology in realistic scenarios.

preprint2021arXiv

Document Layout Analysis via Dynamic Residual Feature Fusion

Document layout analysis (DLA) aims to split a document image into different regions of interest and understand the role of each region, and has wide applications such as optical character recognition (OCR) systems and document retrieval. However, building a DLA system is challenging because training data is very limited and efficient models are lacking. In this paper, we propose an end-to-end united network named Dynamic Residual Fusion Network (DRFN) for the DLA task. Specifically, we design a dynamic residual feature fusion module which can fully utilize low-dimensional information and maintain high-dimensional category information. In addition, to deal with the overfitting caused by insufficient data, we propose a dynamic select mechanism for efficient fine-tuning with limited training data. We experiment on two challenging datasets and demonstrate the effectiveness of the proposed module.

preprint2021arXiv

Feature Generation and Hypothesis Verification for Reliable Face Anti-Spoofing

Although existing face anti-spoofing (FAS) methods achieve high accuracy in intra-domain experiments, their performance drops severely in cross-domain scenarios because of poor generalization. Recently, multifarious techniques have been explored, such as domain generalization and representation disentanglement. However, the improvement is still limited by two issues: 1) It is difficult to perfectly map all faces to a shared feature space. If faces from unknown domains are not mapped to the known region in the shared feature space, inaccurate predictions may result. 2) It is hard to completely consider various spoof traces for disentanglement. In this paper, we propose a Feature Generation and Hypothesis Verification framework to alleviate the two issues. First, feature generation networks which generate hypotheses of real faces and known attacks are introduced for the first time in the FAS task. Subsequently, two hypothesis verification modules are applied to judge whether the input face comes from the real-face space and the real-face distribution respectively. Furthermore, some analyses of the relationship between our framework and Bayesian uncertainty estimation are given, which provide theoretical support for reliable defense in unknown domains. Experimental results show our framework achieves promising results and outperforms the state-of-the-art approaches on extensive public datasets.

preprint2021arXiv

Federated Multi-armed Bandits with Personalization

A general framework of personalized federated multi-armed bandits (PF-MAB) is proposed, which is a new bandit paradigm analogous to the federated learning (FL) framework in supervised learning and enjoys the features of FL with personalization. Under the PF-MAB framework, a mixed bandit learning problem that flexibly balances generalization and personalization is studied. A lower bound analysis for the mixed model is presented. We then propose the Personalized Federated Upper Confidence Bound (PF-UCB) algorithm, where the exploration length is chosen carefully to achieve the desired balance of learning the local model and supplying global information for the mixed learning objective. Theoretical analysis proves that PF-UCB achieves an $O(\log(T))$ regret regardless of the degree of personalization, and has a similar instance dependency as the lower bound. Experiments using both synthetic and real-world datasets corroborate the theoretical analysis and demonstrate the effectiveness of the proposed algorithm.

preprint2021arXiv

Magnetic field generation from bubble collisions during first-order phase transition

We study magnetic field generation from the cosmological first-order electroweak phase transition. We calculate the magnetic field induced by the variation of the Higgs phase for two-bubble and three-bubble collisions. Our study shows that electromagnetic currents in the collision direction produce a ring-like magnetic field in the intersecting regions of colliding bubbles, which may seed the primordial magnetic field that is constrained by intergalactic field observations.

preprint2021arXiv

Multi-Spectrally Constrained Transceiver Design against Signal-Dependent Interference

This paper focuses on the joint synthesis of constant envelope transmit signal and receive filter aimed at optimizing radar performance in signal-dependent interference and spectrally contested-congested environments. To ensure the desired Quality of Service (QoS) at each communication system, a precise control of the interference energy injected by the radar in each licensed/shared bandwidth is imposed. Besides, along with an upper bound to the maximum transmitted energy, constant envelope (with either arbitrary or discrete phases) and similarity constraints are forced to ensure compatibility with amplifiers operating in saturation regime and bestow relevant waveform features, respectively. To handle the resulting NP-hard design problems, new iterative procedures (with ensured convergence properties) are devised to account for continuous and discrete phase constraints, capitalizing on the Coordinate Descent (CD) framework. Two heuristic procedures are also proposed to perform valuable initializations. Numerical results are provided to assess the effectiveness of the conceived algorithms in comparison with the existing methods.

preprint2021arXiv

Quantum system dynamics with a weakly nonlinear Josephson junction bath

We investigate the influence of a weakly nonlinear Josephson bath consisting of a chain of Josephson junctions on the dynamics of a small quantum system (LC oscillator). Focusing on the regime where the charging energy is the largest energy scale, we perturbatively calculate the correlation function of the Josephson bath to the leading order in the Josephson energy divided by the charging energy while keeping the cosine potential exactly. When the variation of the charging energy along the chain ensures fast decay of the bath correlation function, the dynamics of the LC oscillator that is weakly and capacitively coupled to the Josephson bath can be solved through the Markovian master equation. We establish a duality relation for the Josephson bath between the regimes of large charging energy and large Josephson energy. The results can be applied to cases where the charging energy is either nonuniformly engineered or disordered in the chain. Furthermore, we find that the Josephson bath may become non-Markovian when the temperature is increased beyond the zero-temperature limit, in that the bath correlation function gets shifted by a constant and does not decay with time.

preprint2021arXiv

Revisiting the Concrete Security of Goldreich's Pseudorandom Generator

Local pseudorandom generators are a class of fundamental cryptographic primitives having very broad applications in theoretical cryptography. Following Couteau et al.'s work in ASIACRYPT 2018, this paper further studies the concrete security of one important class of local pseudorandom generators, i.e., Goldreich's pseudorandom generators. Our first attack is of the guess-and-determine type. Our result significantly improves the state-of-the-art algorithm proposed by Couteau et al., in terms of both asymptotic and concrete complexity, and breaks all the challenge parameters they proposed. For instance, for a parameter set suggested for 128 bits of security, we could solve the instance faster by a factor of about $2^{61}$, thereby destroying the claimed security completely. Our second attack further exploits the extremely sparse structure of the predicate $P_5$ and combines ideas from iterative decoding. This novel attack, named guess-and-decode, substantially improves the guess-and-determine approaches for cryptographic-relevant parameters. All the challenge parameter sets proposed in Couteau et al.'s work in ASIACRYPT 2018 aiming for 80-bit (128-bit) security levels can be solved in about $2^{58}$ ($2^{78}$) operations. We suggest new parameters for achieving 80-bit (128-bit) security with respect to our attacks. We also extend the attack to other promising predicates and investigate their resistance.

preprint2021arXiv

Robust Kalman filter-based dynamic state estimation of natural gas pipeline networks

To obtain accurate transient states of large-scale natural gas pipeline networks in the presence of bad data and non-zero-mean noise, this paper proposes a robust Kalman filter-based dynamic state estimation method using the linearized gas pipeline transient flow equations. First, the dynamic state estimation model is built. Since there are fewer gas pipeline transient flow equations than states, the boundary conditions are used as supplementary constraints to predict the transient states. To increase the measurement redundancy, the zero mass flow rate constraints at the sink nodes are taken as virtual measurements. Second, to ensure stability under bad data conditions, the robust Kalman filter algorithm is proposed by introducing a time-varying scalar matrix to regulate the measurement error variances correctly according to the innovation vector at every time step. Finally, the proposed method is applied to a 30-node gas pipeline network under several kinds of measurement conditions. The simulation shows that the proposed robust dynamic state estimation can decrease the effects of bad data and achieve better estimation results.

preprint2021arXiv

Single production of vectorlike $B$ quarks at the CLIC

Vector-like quarks are predicted in many new physics scenarios beyond the Standard Model (SM) and could be seen as potential signatures of new physics at the TeV energy scale. In this work, we study single production of exotic singlet and doublet vectorlike bottom quarks (VLQ-$B$) at the future Compact Linear Collider (CLIC) via the process $e^{+}e^{-}\to B\bar{b}$ with the decay channel $B\to bZ$ and two types of modes: $Z\to \ell^{+}\ell^{-}$ and $Z\to \nu\bar{\nu}$. We calculate the cross sections of the signal and relevant SM backgrounds. After a fast simulation of the signal and background events, the exclusion limit at 95% confidence level and the $5\sigma$ discovery prospects on the parameters (the coupling strength $\kappa_{B}$ and the VLQ-$B$ mass) are presented for the future CLIC with centre of mass energy $\sqrt{s}=3$ TeV and integrated luminosity of 5 ab$^{-1}$.

preprint2021arXiv

Super-Heisenberg scaling in Hamiltonian parameter estimation in the long-range Kitaev chain

In quantum metrology, nonlinear many-body interactions can enhance the precision of Hamiltonian parameter estimation to surpass the Heisenberg scaling. Here, we consider the estimation of the interaction strength in linear systems with long-range interactions. Using the Kitaev chain as a case study, we establish a transition from the Heisenberg to super-Heisenberg scaling in the quantum Fisher information by varying the interaction range. We further show that quantum control can improve the prefactor of the quantum Fisher information. Our results explore the advantage of optimal quantum control and long-range interactions in many-body quantum metrology.

preprint2021arXiv

Ultra-high pressure disordered eight-coordinated phase of Mg$_2$GeO$_4$: Analogue for super-Earth mantles

Mg2GeO4 is an analogue for the ultra-high pressure behavior of Mg2SiO4, so we have investigated magnesium germanate to 275 GPa and over 2000 K using a laser-heated diamond anvil cell combined with in situ synchrotron X-ray diffraction and density functional theory (DFT) computations. The experimental results are consistent with a novel phase with disordered Mg and Ge, in which germanium adopts eight-fold coordination with oxygen: the cubic Th3P4-type structure. Simulations using the special quasirandom structure (SQS) method suggest partial order in the tetragonal I-42d structure, indistinguishable from I-43d Th3P4 in our experiments. These structures have not been reported before in any oxide. If applicable to silicates, the formation of this highly coordinated and intrinsically disordered phase would have important implications for the interior mineralogy of large, rocky extrasolar planets.

92 published item(s)

$f$-Divergence Regularized RLHF: Two Tales of Sampling and Unified Analyses

Breaking the Computational Barrier: Provably Efficient Actor-Critic for Low-Rank MDPs

Efficient Multi-objective Prompt Optimization via Pure-exploration Bandits

Enhancing Multilingual Counterfactual Generation through Alignment-as-Preference Optimization

GeoSym127K: Scalable Symbolically-verifiable Synthesis for Multimodal Geometric Reasoning

HisTrackMap: Global Vectorized High-Definition Map Construction via History Map Tracking

Judge Circuits

Lean Clients, Full Accuracy: Hybrid Zeroth- and First-Order Split Federated Learning

Order in the Evaluation Court: A Critical Analysis of NLG Evaluation Trends

Unlabeled Data Can Provably Enhance In-Context Learning of Transformers

AID-DTI: Accelerating High-fidelity Diffusion Tensor Imaging with Detail-Preserving Model-based Deep Learning

Simultaneous q-Space Sampling Optimization and Reconstruction for Fast and High-fidelity Diffusion Magnetic Resonance Imaging

Determinate Node Selection for Semi-supervised Classification Oriented Graph Convolutional Networks

A gated group sequential design for seamless Phase II/III trial with subpopulation selection

ArcFace: Additive Angular Margin Loss for Deep Face Recognition

Existence, Local uniqueness and periodicity of bubbling solutions for a critical nonlinear elliptic equation

Interplay between jamming and MIPS in persistent self-propelling particles

Killing Two Birds with One Stone: Efficient and Robust Training of Face Recognition CNNs by Partial FC

Minimum-Time Quantum Control and the Quantum Brachistochrone Equation

Multi-channel Attentive Graph Convolutional Network With Sentiment Fusion For Multimodal Sentiment Analysis

On Federated Learning with Energy Harvesting Clients

Post-quantum Multi-stage Secret Sharing Schemes using Inhomogeneous Linear Recursion and Ajtai's Function

Pre-training strategies and datasets for facial representation learning

Precoding and Scheduling for AoI Minimization in MIMO Broadcast Channels

Random Orthogonalization for Federated Learning in Massive MIMO Systems

Region-Aware Network: Model Human's Top-Down Visual Perception Mechanism for Crowd Counting

Unified BERT for Few-shot Natural Language Understanding

Variational principle for optimal quantum controls in quantum metrology

Document Layout Analysis via Dynamic Residual Feature Fusion

Feature Generation and Hypothesis Verification for Reliable Face Anti-Spoofing

Federated Multi-armed Bandits with Personalization

Magnetic field generation from bubble collisions during first-order phase transition

Multi-Spectrally Constrained Transceiver Design against Signal-Dependent Interference

Quantum system dynamics with a weakly nonlinear Josephson junction bath

Revisiting the Concrete Security of Goldreich's Pseudorandom Generator

Robust Kalman filter-based dynamic state estimation of natural gas pipeline networks

Single production of vectorlike $B$ quarks at the CLIC

Super-Heisenberg scaling in Hamiltonian parameter estimation in the long-range Kitaev chain

Ultra-high pressure disordered eight-coordinated phase of Mg$_2$GeO$_4$: Analogue for super-Earth mantles

A Condition for Multiplicity Structure of Univariate Polynomials

A Real-Time Deep Network for Crowd Counting

Adaptive 3D Face Reconstruction from a Single Image

Decentralized Multi-player Multi-armed Bandits with No Collision Information

Edge-Aware Deep Image Deblurring

Fast Video Crowd Counting with a Temporal Aware Network

Gravitational Waves from first-order phase transition and domain wall

Information Freshness for Timely Detection of Status Changes

Knowledge distillation via adaptive instance normalization

Nonequilibrium Steady State and Heat Transport in Nonlinear Open Quantum Systems: Stochastic Influence Action and Functional Perturbative Analysis

On mu-Symmetric Polynomials

OrgMining 2.0: A Novel Framework for Organizational Model Mining from Event Logs

Stochastic Linear Contextual Bandits with Diverse Contexts

Timely Synchronization with Sporadic Status Changes

Training Binary Neural Networks with Real-to-Binary Convolutions

Hydrodynamical response of plane correlation in Pb+Pb collisions at $\sqrt{s_\text{NN}}$=2.76 TeV

Meta-neural-network for Realtime and Passive Deep-learning-based Object Recognition

Fog Data: Enhancing Telehealth Big Data Through Fog Computing

Inferring the perturbation time from biological time course data

Learning the Interference Graph of a Wireless Network

Non-Asymptotic Achievable Rates for Energy-Harvesting Channels using Save-and-Transmit

Non-Hermitian acoustic metamaterial for the complete control of sound by accessing the exceptional points

Origin of the superconductivity of WTe2 under pressure

Quickest Change Detection with Mismatched Post-Change Models

Adaptive Compressive Tracking via Online Vector Boosting Feature Selection

Anisotropic defect-induced ferromagnetism and transport in Gd-doped GaN two-dimensional electron gasses

Flagellar Kinematics and Swimming of Algal Cells in Viscoelastic Fluids

Infinitely many sign-changing solutions for an elliptic problem with double critical Hardy-Sobolev-Maz'ya terms

Infinitely many solutions to linearly coupled Schrödinger equations with non-symmetric potential

Pion Transverse Momentum Spectrum, Elliptic Flow and Interferometry in the Granular Source Model in Ultra-Relativistic Heavy Ion Collisions

Positive or sign-changing solutions for a critical semilinear nonlocal equation

Relativistic effects on the back-to-back correlation functions of boson-antiboson pairs in high energy heavy ion collisions

Solutions for a nonlocal elliptic equation involving critical growth and Hardy potential

Squeezed correlations of $ϕ$ meson pairs for hydrodynamic sources in high-energy heavy-ion collisions

Theoretical Modeling of Tribochemical Reaction on Pt and Au Contacts: Mechanical Load and Catalysis