Source author record

Ming Zhang

Ming Zhang appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher, unclaimed source record

Catalog footprint

What is connected

91 works

35 topics

4 close collaborators

preprint 2026 arXiv

ADR: An Agentic Detection System for Enterprise Agentic AI Security

We present the Agentic AI Detection and Response (ADR) system, the first large-scale, production-proven enterprise framework for securing AI agents operating through the Model Context Protocol (MCP). We identify three persistent challenges in this domain: (1) limited observability -- existing Endpoint Detection and Response (EDR) tools see file writes but not the agent reasoning, prompts, or causal chains linking intent to execution; (2) insufficient robustness -- static defenses constrained by pre-defined rules fail to generalize across diverse attack techniques and enterprise contexts; and (3) high detection costs -- LLM-based inference is prohibitively expensive at scale. ADR addresses these challenges via three components: the ADR Sensor for high-fidelity agentic telemetry, the ADR Explorer for systematic pre-deployment red teaming and hard-example generation, and the ADR Detector for scalable, two-tier online detection combining fast triage with context-aware reasoning. Deployed at Uber for over ten months, ADR has sustained reliable detection in production with growing adoption reaching over 7,200 unique hosts and processing over 10,000 agent sessions daily, uncovering hundreds of credential exposures across 26 categories and enabling a shift-left prevention layer (97.2% precision, 206 detected credentials). To validate the approach and enable community adoption, we introduce ADR-Bench (302 tasks, 17 techniques, 133 MCP servers), where ADR achieves zero false positives while detecting 67% of attacks -- outperforming three state-of-the-art baselines (ALRPHFS, GuardAgent, LlamaFirewall) by 2--4x in F1-score. On AgentDojo (public prompt injection benchmark), ADR detects all attacks with only three false alarms out of 93 tasks.

preprint 2026 arXiv

Beyond Scaling: Measuring and Predicting the Upper Bound of Knowledge Retention in Language Model Pre-Training

The GPT-4 technical report suggests that downstream performance can be predicted from pre-training signals, but offers little methodological detail on how to quantify this. This work addresses this gap by modeling knowledge retention, the capacity of a pre-trained language model to memorize factual information from its corpus, and introduces a principled method to estimate it prior to training. We propose Size-dependent Mutual Information (SMI), an information-theoretic predictor that integrates knowledge frequency, knowledge specificity, and model size to forecast closed-book question answering (QA) accuracy. SMI is validated through large-scale document retrieval over the disclosed pre-training corpora of 21 public and 3 custom models, combined with a robust multi-template QA evaluation. Experiments show that SMI significantly outperforms repetition-based baselines and achieves $R^2 > 0.7$ in predicting QA accuracy for models above 1B parameters, without additional training. The analysis further reveals diminishing returns from scaling data and model size and provides evidence for an intrinsic upper bound on knowledge retention achievable by pre-training alone, motivating retrieval and other augmentation strategies. The dataset and code are available at https://github.com/yuhui1038/SMI.

preprint 2026 arXiv

CL-bench Life: Can Language Models Learn from Real-Life Context?

Today's AI assistants such as OpenClaw are designed to handle context effectively, making context learning an increasingly important capability for models. As these systems move beyond professional settings into everyday life, the nature of the contexts they must handle also shifts. Real-life contexts are often messy, fragmented, and deeply tied to personal and social experience, such as multi-party conversations, personal archives, and behavioral traces. Yet it remains unclear whether current frontier language models can reliably learn from such contexts and solve tasks grounded in them. To this end, we introduce CL-bench Life, a fully human-curated benchmark comprising 405 context-task pairs and 5,348 verification rubrics, covering common real-life scenarios. Solving tasks in CL-bench Life requires models to reason over complex, messy real-life contexts, calling for strong real-life context learning abilities that go far beyond those evaluated in existing benchmarks. We evaluate ten frontier LMs and find that real-life context learning remains highly challenging: even the best-performing model achieves a task-solving rate of only 19.3%, while the average across models is only 13.8%. Models still struggle to reason over contexts such as messy group chat histories and fragmented behavioral records from everyday life. CL-bench Life provides a crucial testbed for advancing real-life context learning, and progress on it can enable more intelligent and reliable AI assistants in everyday life.

preprint 2026 arXiv

CMDAR: A Chinese Multi-scene Dynamic Audio Reasoning Benchmark with Diverse Challenges

The ability to reason from audio, including speech, environmental sounds, and music, is essential for AI agents to interact effectively in real-world scenarios. Existing benchmarks mainly focus on static or single-scene settings and English audio data, and do not fully capture scenarios where multiple speakers, unfolding events, and heterogeneous audio sources interact. To address these challenges, we introduce CMDAR, a Chinese benchmark for evaluating models on complex, multi-scene, and dynamically evolving audio reasoning tasks. CMDAR comprises 3,000 carefully curated question-answer pairs linked to diverse audio clips, covering five categories of complex reasoning and spanning three question types. We benchmark 26 state-of-the-art audio language models on CMDAR and observe that they exhibit limitations in complex reasoning tasks. In CMDAR-main, Qwen2.5-Omni achieves 76.67% accuracy, whereas GPT-4o Audio reaches 68.47%. However, GPT-4o Audio substantially outperforms Qwen2.5-Omni on the more challenging multiple-choice tasks with multiple audios and on the open-ended tasks. We also provide a detailed analysis and corresponding suggestions for the future development of large audio language models.

preprint 2026 arXiv

Generative structure search for efficient and diverse discovery of molecular and crystal structures

Predicting stable and metastable structures is central to molecular and materials discovery, but remains limited by the cost of searching high-dimensional energy landscapes. Deep generative models offer efficient structure sampling, yet their outputs remain shaped by training data and can underexplore minima that are rare but physically relevant. We introduce generative structure search (GSS), a unified framework that formulates diffusion-based generation and random structure search (RSS) as limiting regimes of a common sampling process driven by learned score fields and physical forces. Coupling these drivers lets GSS use data priors to accelerate sampling while retaining energy-guided exploration of local minima. Across molecular and crystalline systems, GSS recovers diverse metastable structures with more than tenfold lower sampling cost than RSS for broad coverage and remains effective for compositions outside the training distribution. The results establish a physically grounded generative search strategy for discovering structures beyond the reach of data-driven sampling alone.

preprint 2026 arXiv

Impact of Pressure and Apical Oxygen Vacancies on Superconductivity in La$_3$Ni$_2$O$_7$

The bilayer nickelate La$_3$Ni$_2$O$_7$ under pressure has recently emerged as a promising system for high-$T_c$ superconductivity. In this work, we investigate the fate of the superconducting properties in La$_3$Ni$_2$O$_7$ under pressure, focusing on the effects of structural deformation and apical oxygen vacancies. Employing a low-energy effective $t$-$J_{\parallel}$-$J_{\perp}$ model for the $3d_{x^2-y^2}$ orbitals within the slave-boson mean-field approach, we demonstrate that the pairing strength is significantly enhanced in the high-pressure tetragonal $I4/mmm$ phase compared to the ambient-pressure orthorhombic $Amam$ phase. Furthermore, by simulating random configurations of apical oxygen vacancies, we show that oxygen vacancies suppress both pairing strength and superfluid density. These results underscore the critical role of pressure and oxygen stoichiometry in tuning the superconductivity of La$_3$Ni$_2$O$_7$, providing key insights into optimizing its high-$T_c$ behavior.

preprint 2026 arXiv

Interests Burn-down Diffusion Process for Personalized Collaborative Filtering

Generative methods have gained widespread attention in Collaborative Filtering (CF) tasks for their ability to produce high-quality personalized samples aligned with users' interests. Among them, diffusion generative models have attracted increasing attention in the recommendation field. Although pioneering efforts have applied the conventional diffusion process to model diffusive user interests, the incongruity between Gaussian noise and the subtle nature of users' personalized interaction behavior has led to sub-optimal results. To this end, we introduce a diffusion scheme tailored specifically to interaction systems, namely the interests burn-down process. The interests burn-down process delineates the decay of user interests towards candidate items, complemented by its reverse burn-up process that yields personalized recommendations for users. The inherent burn-down nature of this process models the diffusive user interests and matches the requirements of CF tasks. We present a novel recommendation method, StageCF, to illustrate the superiority of this newly proposed diffusion process. Experimental results demonstrate the effectiveness of StageCF against existing generative and diffusion-based baseline methods. Furthermore, comprehensive studies validate the functionality of the interests burn-down process, shedding light on its capacity to generate personalized interactions.

preprint 2026 arXiv

LLMEval-Logic: A Solver-Verified Chinese Benchmark for Logical Reasoning of LLMs with Adversarial Hardening

Evaluating large language models (LLMs) on natural-language logical reasoning is essential because rule-governed tasks require conclusions to follow strictly from stated premises. Many existing logical-reasoning benchmarks are generated by templating natural-language items from sampled formulas, provide only coarse or unaudited formal annotations, and are now quickly saturated by frontier reasoning models. We present LLMEval-Logic, a Chinese logical reasoning benchmark built from realistic situational scenarios. Its pipeline forward-authors and expert-audits natural-language items together with their reference formalizations, verifies annotated answers with Z3, constructs expert rubrics for natural-to-formal grading, and hardens selected items through a closed-loop adversarial workflow. The benchmark is released in two paired subsets: a 246-item Base subset shipped with 1,400 expert-developed rubric atoms, and a 190-item Hard subset with 938 multi-step sub-questions over closed model spaces. Evaluating 14 frontier LLMs on LLMEval-Logic reveals substantial gaps in current models: the best model reaches only 37.5% Hard Item Accuracy, and even with reference symbols the highest joint Z3+Rubric formalization score among evaluated models reaches only 60.16%. Our benchmark is publicly available at https://github.com/llmeval/LLMEval-Logic.

preprint 2026 arXiv

Muse: Towards Reproducible Long-Form Song Generation with Fine-Grained Style Control

Recent commercial systems such as Suno demonstrate strong capabilities in long-form song generation, while academic research remains largely non-reproducible due to the lack of publicly available training data, hindering fair comparison and progress. To this end, we release a fully open-source system for long-form song generation with fine-grained style conditioning, including a licensed synthetic dataset, training and evaluation pipelines, and Muse, an easy-to-deploy song generation model. The dataset consists of 116k fully licensed synthetic songs with automatically generated lyrics and style descriptions paired with audio synthesized by SunoV5. We train Muse via single-stage supervised finetuning of a Qwen-based language model extended with discrete audio tokens using MuCodec, without task-specific losses, auxiliary objectives, or additional architectural components. Our evaluations find that although Muse is trained with a modest data scale and model size, it achieves competitive performance on phoneme error rate, text--music style similarity, and audio aesthetic quality, while enabling controllable segment-level generation across different musical structures. All data, model weights, and training and evaluation pipelines will be publicly released, paving the way for continued progress in controllable long-form song generation research. The project repository is available at https://github.com/yuhui1038/Muse.

preprint 2026 arXiv

OpenNovelty: An LLM-powered Agentic System for Verifiable Scholarly Novelty Assessment

Evaluating novelty is critical yet challenging in peer review, as reviewers must assess submissions against a vast, rapidly evolving literature. This report presents OpenNovelty, an LLM-powered agentic system for transparent, evidence-based novelty analysis. The system operates through four phases: (1) extracting the core task and contribution claims to generate retrieval queries; (2) retrieving relevant prior work based on the extracted queries via a semantic search engine; (3) constructing a hierarchical taxonomy of core-task-related work and performing contribution-level full-text comparisons against each contribution; and (4) synthesizing all analyses into a structured novelty report with explicit citations and evidence snippets. Unlike naive LLM-based approaches, OpenNovelty grounds all assessments in retrieved real papers, ensuring verifiable judgments. We deploy our system on 500+ ICLR 2026 submissions with all reports publicly available on our website, and preliminary analysis suggests it can identify relevant prior work, including closely related papers that authors may overlook. OpenNovelty aims to empower the research community with a scalable tool that promotes fair, consistent, and evidence-backed peer review.

preprint 2026 arXiv

SciCustom: A Framework for Custom Evaluation of Scientific Capabilities in Large Language Models

Large language models (LLMs) are increasingly applied to scientific research, yet existing evaluations often fail to reflect the fine-grained capabilities required in practice. Most benchmarks are manually curated or domain-generic, limiting scalability and alignment with real scientific use cases. In this paper, we propose a new framework named SciCustom to address the problem. It enables the custom construction of benchmarks from large-scale scientific data to evaluate application-specific scientific capabilities in LLMs. SciCustom first organizes scientific knowledge into ontology-grounded knowledge units with controlled granularity and trains a tagger to map large-scale data instances into this knowledge space. Given a custom requirement, relevant knowledge units are identified via voting-based multi-model consensus. These units enable relevance-aware benchmark retrieval via binary search, followed by proxy subset selection and data-grounded benchmark generation for efficient evaluation. Experiments in chemistry and healthcare demonstrate that SciCustom reveals fine-grained differences in LLM scientific capabilities that standard benchmarks overlook, while requiring neither expert annotation nor synthetic question generation. This work provides a scalable and application-aware foundation for benchmarking scientific capabilities in LLMs. The source code is available at https://github.com/yjwtheonly/SciCustom.

preprint 2026 arXiv

What Makes a Good Speech Tokenizer for LLM-Centric Speech Generation? A Systematic Study

Speech-language models (SLMs) offer a promising path toward unifying speech and text understanding and generation. However, challenges remain in achieving effective cross-modal alignment and high-quality speech generation. In this work, we systematically investigate the role of speech tokenizer designs in LLM-centric SLMs, augmented by speech heads and speaker modeling. We compare coupled, semi-decoupled, and fully decoupled speech tokenizers under a fair SLM framework and find that decoupled tokenization significantly improves alignment and synthesis quality. To address the information density mismatch between speech and text, we introduce multi-token prediction (MTP) into SLMs, enabling each hidden state to decode multiple speech tokens. This leads to up to 12$\times$ faster decoding and a substantial drop in word error rate (from 6.07 to 3.01). Furthermore, we propose a speaker-aware generation paradigm and introduce RoleTriviaQA, a large-scale role-playing knowledge QA benchmark with diverse speaker identities. Experiments demonstrate that our methods enhance both knowledge understanding and speaker consistency.

preprint 2025 arXiv

OxygenREC: An Instruction-Following Generative Framework for E-commerce Recommendation

Traditional recommendation systems suffer from inconsistency in multi-stage optimization objectives. Generative Recommendation (GR) mitigates this through an end-to-end framework; however, existing methods still rely on matching mechanisms based on inductive patterns. Although responsive, they lack the ability to uncover complex user intents that require deductive reasoning based on world knowledge. Meanwhile, LLMs show strong deep reasoning capabilities, but their latency and computational costs remain challenging for industrial applications. More critically, there are performance bottlenecks in multi-scenario scalability: as shown in Figure 1, existing solutions require independent training and deployment for each scenario, leading to low resource utilization and high maintenance costs, a challenge unaddressed in the GR literature. To address these, we present OxygenREC, an industrial recommendation system that leverages Fast-Slow Thinking to deliver deep reasoning while meeting the strict latency and multi-scenario requirements of real-world environments. First, we adopt a Fast-Slow Thinking architecture. Slow thinking uses a near-line LLM pipeline to synthesize Contextual Reasoning Instructions, while fast thinking employs a high-efficiency encoder-decoder backbone for real-time generation. Second, to ensure reasoning instructions effectively enhance recommendation generation, we introduce a semantic alignment mechanism with Instruction-Guided Retrieval (IGR) to filter intent-relevant historical behaviors and use a Query-to-Item (Q2I) loss for instruction-item consistency. Finally, to resolve multi-scenario scalability, we transform scenario information into controllable instructions, using unified reward mapping and Soft Adaptive Group Clip Policy Optimization (SA-GCPO) to align policies with diverse business objectives, realizing a train-once-deploy-everywhere paradigm.

preprint 2024 arXiv

RJUA-QA: A Comprehensive QA Dataset for Urology

We introduce RJUA-QA, a novel medical dataset for question answering (QA) and reasoning with clinical evidence, which helps bridge the gap between general large language models (LLMs) and medical-specific LLM applications. RJUA-QA is derived from realistic clinical scenarios and aims to help LLMs generate reliable diagnoses and advice. The dataset contains 2,132 curated Question-Context-Answer pairs, corresponding to about 25,000 diagnostic records and clinical cases. The dataset covers 67 common urological disease categories, where the disease coverage exceeds 97.6% of the population seeking medical services in urology. Each data instance in RJUA-QA comprises: (1) a question mirroring a real patient's inquiry about clinical symptoms and medical conditions, (2) a context including comprehensive expert knowledge, serving as a reference for medical examination and diagnosis, (3) a doctor response offering the diagnostic conclusion and suggested examination guidance, (4) a diagnosed clinical disease as the recommended diagnostic outcome, and (5) clinical advice providing recommendations for medical examination. RJUA-QA is the first medical QA dataset for clinical reasoning over patient inquiries, where expert-level knowledge and experience are required for yielding diagnostic conclusions and medical examination advice. A comprehensive evaluation is conducted to assess the performance of both medical-specific and general LLMs on the RJUA-QA dataset. Our data are publicly available at https://github.com/alipay/RJU_Ant_QA.

preprint 2024 arXiv

TBDD: A New Trust-based, DRL-driven Framework for Blockchain Sharding in IoT

Integrating sharded blockchain with IoT presents a solution for trust issues and optimized data flow. Sharding boosts blockchain scalability by dividing its nodes into parallel shards, yet it is vulnerable to $1\%$ attacks, in which dishonest nodes target a shard to corrupt the entire blockchain. Balancing security with scalability is pivotal for such systems. Deep Reinforcement Learning (DRL) adeptly handles dynamic, complex systems and multi-dimensional optimization. This paper introduces a Trust-based and DRL-driven (TbDd) framework, crafted to counter shard collusion risks and dynamically adjust node allocation, enhancing throughput while maintaining network security. With a comprehensive trust evaluation mechanism, TbDd discerns node types and performs targeted resharding against potential threats. The model maximizes tolerance for dishonest nodes, optimizes node movement frequency, ensures even node distribution in shards, and balances sharding risks. Rigorous evaluations show TbDd's superiority over conventional random-, community-, and trust-based sharding methods in balancing shard risk and reducing cross-shard transactions.

preprint 2023 arXiv

Analytical approximate solutions for scalarized AdS black holes

The spontaneous scalarization of Schwarzschild-AdS black holes is investigated in the Einstein-scalar-Gauss-Bonnet (ESGB) theory. First, we construct scalarized AdS black holes numerically. Second, making use of the homotopy analysis method (HAM), we obtain analytical approximate solutions for scalarized AdS black holes in the ESGB theory. It is found that the scalarized AdS black holes constructed numerically are consistent with the analytical approximate solutions in the whole space.

preprint 2023 arXiv

Channel Measurement for Holographic MIMO: Benefits and Challenges of Spatial Oversampling

In this paper, the channel of an indoor holographic multiple-input multiple-output (MIMO) system is measured. It is demonstrated through experiments for the first time that the spatial oversampling of holographic MIMO systems is able to increase the capacity of a wireless communication system significantly. However, the antenna efficiency is the most crucial challenge preventing us from getting the capacity improvement. An extended EM-compliant channel model is also proposed for holographic MIMO systems, which is able to take the non-isotropic characteristics of the propagation environment, the antenna pattern distortion, the antenna efficiency, and the polarization characteristics into consideration.

preprint 2023 arXiv

Ultrafast X-ray Diffraction Probe of Coherent Spin-state Dynamics in Molecules

We propose an approach to probe coherent spin-state dynamics of molecules using circularly polarized hard x-ray pulses. For the dynamically aligned nitric oxide molecules in a coherent superposition spin-orbit coupled electronic state that can be prepared through stimulated Raman scattering, we demonstrate the capability of ultrafast x-ray diffraction to not only reveal the quantum beating of the coherent spin-state wave packet, but also image the spatial spin density of the molecule. With circularly polarized ultrafast x-ray diffraction signal, we show that the electronic density matrix can be retrieved. The spatio-temporal resolving power of ultrafast x-ray diffraction paves the way for tracking transient spatial wave function in molecular dynamics involving spin degree of freedom.

preprint 2022 arXiv

A Probabilistic Model-Based Robust Waveform Design for MIMO Radar Detection

This paper addresses robust waveform design for multiple-input-multiple-output (MIMO) radar detection. A probabilistic model is proposed to describe the target uncertainty. Considering that waveform design based on maximizing the probability of detection is intractable, the relative entropy between the distributions of the observations under two hypotheses (viz., the target is present/absent) is employed as the design metric. To tackle the resulting non-convex optimization problem, an efficient algorithm based on minorization-maximization (MM) is derived. Numerical results demonstrate that the waveform synthesized by the proposed algorithm is more robust to model mismatches.

preprint 2022 arXiv

Chiral SO(4) spin-valley density wave and degenerate topological superconductivity in magic-angle-twisted bilayer-graphene

Starting from a realistic extended Hubbard model for a $p_{x,y}$-orbital tight-binding model on the honeycomb lattice, we perform a thorough investigation of the possible electron instabilities in the MA-TBG near the van Hove (VH) dopings. Here we focus on the interplay between the approximate SU(2)$\times$SU(2) symmetry and the $D_3$ symmetry, which leads to intriguing quantum states relevant to recent experiments, as revealed by our systematic RPA-based calculations followed by a mean-field minimization of the ground-state energy. At the SU(2)$\times$SU(2) symmetric point, the degenerate inter-valley SDW and VDW are mixed into a new state of matter dubbed the chiral SO(4) spin-valley DW. This state simultaneously hosts three 4-component vectorial spin-valley DW orders with each adopting one wave vector, and the polarization directions of the three DW orders are mutually perpendicular. In the presence of a tiny inter-valley exchange interaction with coefficient $J_H\to 0^{-}$ which breaks the SU(2)$\times$SU(2) symmetry, a pure chiral SDW state is obtained. In the case of $J_H\to 0^{+}$, a nematic VDW+SDW state emerges which possesses a stripy distribution of the charge density, consistent with the recent STM observations. As for SC, while the triplet $p+ip$ and singlet $d+id$ topological SCs are degenerate at $J_H=0$ near the VH dopings, the former (latter) is favored for $J_H\to 0^{-}$ ($J_H\to 0^{+}$). In addition, the two asymmetric doping-dependent behaviors of the obtained pairing phase diagram are consistent with experiments.

preprint 2022 arXiv

Ekar: An Explainable Method for Knowledge Aware Recommendation

This paper studies recommender systems with knowledge graphs, which can effectively address the problems of data sparsity and cold start. Recently, a variety of methods have been developed for this problem, which generally try to learn effective representations of users and items and then match items to users according to their representations. Though these methods have been shown to be quite effective, they lack good explanations, which are critical to recommender systems. In this paper, we take a different route and propose generating recommendations by finding meaningful paths from users to items. Specifically, we formulate the problem as a sequential decision process, where the target user is defined as the initial state, and the edges on the graphs are defined as actions. We shape the rewards according to existing state-of-the-art methods and then train a policy function with policy gradient methods. Experimental results on three real-world datasets show that our proposed method not only provides effective recommendations but also offers good explanations.

preprint 2022 arXiv

Generalized Covariant Entropy Bound in Lanczos-Lovelock Gravity

In this paper, we investigate the generalized covariant entropy bound in the theory where Einstein gravity is perturbed by the higher-order Lovelock terms. After replacing the Bekenstein-Hawking entropy with the Jacobson-Myers entropy and introducing two reasonable physical assumptions, we show that the corresponding generalized covariant entropy bound is satisfied under a higher-order approximation of the perturbation from the higher-order Lovelock terms. Our result implies that the Jacobson-Myers entropy strictly obeys the entropy bound at the perturbation level, and the generalized second law of Lanczos-Lovelock gravity is also satisfied when Einstein gravity is perturbed by the higher-order Lovelock terms.

preprint 2022 arXiv

Higher dimensional Reissner-Nordström black holes supporting static scalar shells

We analytically study the scalarization of higher-dimensional charged Reissner-Nordström (RN) black holes. It is shown that a static massive scalar field which is non-minimally coupled to the Gauss-Bonnet invariant can be supported by a higher-dimensional black hole in the super-critical charge regime $Q/M\ge \bar{C}_d$, where $Q$ and $M$ are the charge and mass of the black hole and $\bar{C}_d$ is a dimensionless quantity that depends on the spacetime dimension. Moreover, we show that the static massive scalar shell can be quite thin in the large-mass regime $\mu M^{\frac{1}{d-3}}\gg 1$, where $\mu$ is the mass of the scalar field.

preprint 2022 arXiv

Joule-Thomson Expansion of Born-Infeld AdS Black Holes in 4D Einstein-Gauss-Bonnet gravity

In this paper, the Joule-Thomson expansion of Born-Infeld AdS black holes in the consistent Aoki-Gorji-Mukohyama theory of 4D Einstein-Gauss-Bonnet gravity is studied in the extended phase space. We further analyze the effect of the parameters $\alpha$ and $\beta$ on the inversion curves and plot the inversion and isenthalpic curves in the $T$-$P$ plane, which can determine the cooling-heating regions.

preprint 2022 arXiv

KGNN: Harnessing Kernel-based Networks for Semi-supervised Graph Classification

This paper studies semi-supervised graph classification, which is an important problem with various applications in social network analysis and bioinformatics. This problem is typically solved by using graph neural networks (GNNs), which, however, rely on a large number of labeled graphs for training and are unable to leverage unlabeled graphs. We address these limitations by proposing the Kernel-based Graph Neural Network (KGNN). A KGNN consists of a GNN-based network as well as a kernel-based network parameterized by a memory network. The GNN-based network performs classification through learning graph representations to implicitly capture the similarity between query graphs and labeled graphs, while the kernel-based network uses graph kernels to explicitly compare each query graph with all the labeled graphs stored in a memory for prediction. The two networks are motivated from complementary perspectives, and thus combining them allows KGNN to use labeled graphs more effectively. We jointly train the two networks by maximizing their agreement on unlabeled graphs via posterior regularization, so that the unlabeled graphs serve as a bridge that lets the two networks enhance each other. Experiments on a range of well-known benchmark datasets demonstrate that KGNN achieves impressive performance over competitive baselines.

preprint2022arXiv

Lifshitz transition enhanced triplet $p_z$-wave superconductivity in hydrogen doped KCr$_3$As$_3$

The recently synthesized air-insensitive hydrogen doped KCr$_3$As$_3$ superconductor has attracted great research interest. This material has, for the first time in the research area of quasi-one-dimensional Cr-based superconductivity (SC), realized tunability through charge doping, which could significantly push the development of this area. Here, based on the band structure from first-principles calculations, we construct a six-band tight-binding (TB) model equipped with multi-orbital Hubbard interactions, and adopt the random-phase-approximation approach to study the hydrogen-doping dependence of the pairing symmetry and superconducting $T_c$. Under the rigid-band approximation, our pairing phase diagram is occupied by the triplet $p_z$-wave pairing throughout the hydrogen-doping regime $x\in (0.4,1)$ in which SC has been experimentally detected. Remarkably, the $x$-dependence of $T_c$ shows a peak at the 3D-quasi-1D Lifshitz transition point, although the total density of states exhibits a dip there. A thorough investigation of the band structure reveals type-II van-Hove singularities (VHSs) in the $\gamma$ band, which favor the formation of the triplet SC. It turns out that the $\gamma$ Fermi surface (FS) comprises two flat quasi-1D FS sheets almost parallel to the $k_z=0$ plane and six almost perpendicular tube-like FS sheets, and the type-II VHS lies just at the boundary between these two FS parts. Furthermore, the $\left|k_z\right|$ of the VH planes reaches its maximum near the Lifshitz-transition point, which pushes the $T_c$ of the $p_z$-wave SC to its maximum. Our results call for more experimental investigation of this intriguing superconductor.

preprint2022arXiv

Neutrino Rocket Jet Model: An Explanation of High-velocity Pulsars and their Spin-down Evolution

The fact that the spatial velocity of pulsars is generally higher than that of their progenitor stars has puzzled astronomers for nearly 50 years. It has been extensively argued that the high pulsar velocity should be acquired during a natal kick process on a timescale of 100 ms to 10 s in the supernova explosion, in which some asymmetrical dynamical mechanism plays a key role. However, a satisfactory picture is generally still lacking. In this study, it is argued that the neutrino rocket model can well account for the high speed as well as the long-term evolution behaviors of pulsars. The neutrinos are emitted from superfluid vortex neutrons through the neutrino cyclotron radiation mechanism. The unique characteristics of left-handed neutrinos and right-handed antineutrinos resulting from the nonconservation of parity in weak interactions play a major role in the spatial asymmetry. The continuous acceleration of pulsars can be naturally explained by this model, which yields a maximum velocity surpassing 1000 km s$^{-1}$. The alignment between the spinning axis and the direction of motion observed for the Crab pulsar (PSR 0531) and the Vela pulsar (PSR 0833) can be well accounted for. The observed correlation between the spin-down rate and the period of long-period pulsars with $P \gtrsim 0.5$ s can also be satisfactorily explained.

preprint2022arXiv

Perspective: Ultrafast Imaging of Molecular Dynamics Using Ultrafast Low-Frequency Lasers, X-ray Free Electron Laser and Electron Pulses

The requirement of high space-time resolution and brightness is a great challenge for imaging atomic motion and making molecular movies. Important breakthroughs in ultrabright tabletop laser, x-ray and electron sources have enabled the direct imaging of evolving molecular structures in chemical processes. Recent experimental advances in preparing ultrafast laser and electron pulses have equipped molecular imaging with femtosecond time resolution. This Perspective presents an overview of versatile imaging methods of molecular dynamics. High-order harmonic generation imaging and photoelectron diffraction imaging are based on laser-induced ionization and rescattering processes. Coulomb explosion imaging retrieves molecular structural information by detecting the momentum vectors of fragmented ions. Diffraction imaging encodes molecular structural and electronic information in reciprocal space. We also present various applications of these ultrafast imaging methods in resolving laser-induced nuclear and electronic dynamics.

preprint2022arXiv

Scalar-hairy Lovelock gravity respects zeroth law

We study the zeroth law for Killing horizon in scalar-hairy Lovelock gravity, and show that the surface gravity of a general Killing horizon in the scalar-hairy Lovelock gravity is constant, provided that the dominant energy condition is obeyed by the matter field.

preprint2022arXiv

Sequence-to-Sequence Voice Reconstruction for Silent Speech in a Tonal Language

Silent Speech Decoding (SSD), based on articulatory neuromuscular activities, has become a prevalent task of Brain-Computer Interface (BCI) in recent years. Many works have been devoted to decoding surface electromyography (sEMG) from articulatory neuromuscular activities. However, restoring silent speech in tonal languages such as Mandarin Chinese is still difficult. This paper proposes an optimized Sequence-to-Sequence (Seq2Seq) approach to synthesize voice from sEMG-based silent speech. We extract duration information to regulate the sEMG-based silent speech using the audio length. Then, we provide a deep-learning model with an encoder-decoder structure and a state-of-the-art vocoder to generate the audio waveform. Experiments based on six Mandarin Chinese speakers demonstrate that the proposed model can successfully decode silent speech in Mandarin Chinese and achieve a character error rate (CER) of 6.41% on average with human evaluation.
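
As a rough illustration of the metric reported above, the sketch below computes a character error rate using the conventional definition: Levenshtein edit distance between a reference and a hypothesis transcript, divided by the reference length. This is an assumption about the metric, not the paper's evaluation protocol, which relies on human listeners and is not detailed in this abstract. The function names `edit_distance` and `character_error_rate` are hypothetical.

```python
def edit_distance(reference, hypothesis):
    """Levenshtein distance between two character sequences (row-by-row DP)."""
    previous = list(range(len(hypothesis) + 1))
    for i, ref_char in enumerate(reference, start=1):
        current = [i]
        for j, hyp_char in enumerate(hypothesis, start=1):
            substitution_cost = 0 if ref_char == hyp_char else 1
            current.append(min(
                previous[j] + 1,                      # deletion
                current[j - 1] + 1,                   # insertion
                previous[j - 1] + substitution_cost,  # match or substitution
            ))
        previous = current
    return previous[-1]


def character_error_rate(reference, hypothesis):
    """CER under the conventional definition: edits / reference length."""
    if not reference:
        raise ValueError("reference transcript must be non-empty")
    return edit_distance(reference, hypothesis) / len(reference)
```

Because Python strings are sequences of Unicode code points, the same function applies directly to Mandarin transcripts, where each Chinese character counts as one unit.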

preprint2022arXiv

Silicon photonic devices for scalable quantum information applications

With high integration density and excellent optical properties, silicon photonics is becoming a promising platform for complete integration and large-scale optical quantum information processing. Scalable quantum information applications need photon generation and detection to be integrated on the same chip, and we have seen that various devices on the silicon photonic chip have been developed for this goal. This paper reviews the relevant research results and state-of-the-art technologies on the silicon photonic chip for scalable quantum applications. Despite the shortcomings, properties of some components have already met the requirements for further expansion. Furthermore, we point out the challenges ahead and further research directions for on-chip scalable quantum information applications.

preprint2022arXiv

Taxonomy and evolution predicting using deep learning in images

Molecular and morphological characters, as important parts of biological taxonomy, are contradictory but need to be integrated. Image recognition of organisms and bioinformatics are both emerging and active research areas, but there is a gap between them. In this work, a multi-branching recognition framework mediated by genetic information bridges this gap, establishing the link between macro-morphology and micro-molecular information of mushrooms. A novel multi-perspective structure is proposed to fuse the feature images from three branching models, which significantly improves the accuracy of recognition by about 10%, up to more than 90%. Further, genetic information is incorporated into the mushroom image recognition task by using genetic distance embeddings as the representation space for predicting image distance and species identification. Semantic overfitting of traditional classification tasks and the granularity of fine-grained image recognition are also discussed in depth for the first time. The generalizability of the model was investigated in fine-grained scenarios using zero-shot learning tasks, which could predict the taxonomic and evolutionary information of unseen samples. We present the first method to map images to DNA: an encoder maps images to genetic distances, and DNA is then decoded through a pre-trained decoder, with a total test accuracy of 87.45% for DNA prediction on 37 species. This study creates a novel recognition framework by systematically studying the mushroom image recognition problem, bridging the gap between macroscopic biological information and microscopic molecular information, which will provide a new reference for intelligent biometrics in the future.

preprint2022arXiv

Transferable Cross-Tokamak Disruption Prediction with Deep Hybrid Neural Network Feature Extractor

Predicting disruptions across different tokamaks is a great obstacle to overcome. Future tokamaks can hardly tolerate disruptions at high performance discharge. The few disruption discharges available at high performance can hardly compose an abundant training set, which makes it difficult for current data-driven methods to obtain an acceptable result. A machine learning method capable of transferring a disruption prediction model trained on one tokamak to another is required to solve the problem. The key is a disruption prediction model containing a feature extractor that is able to extract common disruption precursor traces in tokamak diagnostic data, and a transferable disruption classifier. Based on the concerns above, the paper first presents a deep fusion feature extractor designed specifically for extracting disruption precursor features from common diagnostics on tokamaks according to currently known precursors of disruption, providing a promising foundation for transferable models. The fusion feature extractor is validated by comparison with manual feature extraction on J-TEXT. Based on the feature extractor trained on J-TEXT, the disruption prediction model was transferred to EAST data with a mere 20 discharges from EAST experiments. The performance is comparable with a model trained with 1896 discharges from EAST. From the comparison among other model training scenarios, transfer learning showed its potential in predicting disruptions across different tokamaks.

preprint2021arXiv

Null hypersurface caustics and super-entropic black holes

We obtain a charged, rotating and accelerating black hole solution in the $f(R)$ gravity and calculate thermodynamic quantities of the black hole in the slow acceleration regime. We then find that the black hole can be super-entropic in a certain condition. After investigating the null hypersurface of the black hole, we show that there exist super-entropic black holes whose null hypersurface caustics only form inside the Cauchy horizon.

preprint2021arXiv

Partial FC: Training 10 Million Identities on a Single Machine

Face recognition has been an active and vital topic in the computer vision community for a long time. Previous research mainly focuses on loss functions used for facial feature extraction networks, among which the improvements of softmax-based loss functions greatly promote the performance of face recognition. However, the contradiction between the drastically increasing number of face identities and the shortage of GPU memory is gradually becoming irreconcilable. In this paper, we thoroughly analyze the optimization goal of softmax-based loss functions and the difficulty of training massive identities. We find that the importance of negative classes in the softmax function in face representation learning is not as high as we previously thought. The experiment demonstrates no loss of accuracy when training with only 10% randomly sampled classes for the softmax-based loss functions, compared with training with full classes using state-of-the-art models on mainstream benchmarks. We also implement a very efficient distributed sampling algorithm, taking into account model accuracy and training efficiency, which uses only eight NVIDIA RTX2080Ti GPUs to complete classification tasks with tens of millions of identities. The code of this paper has been made available at https://github.com/deepinsight/insightface/tree/master/recognition/partial_fc.
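
The idea of training a softmax loss on a random subset of classes can be sketched in a few lines. The following is a minimal single-machine illustration in plain Python, not the authors' distributed implementation: it assumes that every class present in the batch is kept and that the remaining slots are filled with randomly drawn negative classes, and it uses a plain softmax cross-entropy rather than any margin-based variant. The function names and the `sample_rate` parameter are hypothetical.

```python
import math
import random


def sample_class_subset(labels, num_classes, sample_rate, rng):
    """Keep all classes that occur in the batch, then add random negatives."""
    positives = sorted(set(labels))
    num_sampled = max(int(num_classes * sample_rate), len(positives))
    positive_set = set(positives)
    negative_pool = [c for c in range(num_classes) if c not in positive_set]
    negatives = rng.sample(negative_pool, num_sampled - len(positives))
    return sorted(positives + negatives)


def sampled_softmax_loss(features, weights, labels, subset):
    """Mean cross-entropy where the softmax runs over the sampled classes only."""
    position = {class_id: i for i, class_id in enumerate(subset)}
    total = 0.0
    for feature, label in zip(features, labels):
        # Logits are dot products with the weight rows of the sampled classes.
        logits = [sum(f * w for f, w in zip(feature, weights[c])) for c in subset]
        peak = max(logits)  # subtract the max for numerical stability
        log_norm = peak + math.log(sum(math.exp(z - peak) for z in logits))
        total += log_norm - logits[position[label]]
    return total / len(labels)
```

With `sample_rate=0.1`, only one tenth of the class weight rows enter each forward pass, which is where the memory saving described in the abstract would come from in this simplified picture.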

preprint2021arXiv

QoS-Driven Video Uplinking in NOMA-Based IoT

In recent years, with the explosive growth of visual sensors and a large number of related video applications in the Internet of Things (IoT), massive video data is generated by IoT devices. Since the volume of video data is far greater than that of traditional data in IoT, it is challenging to ensure high Quality of Service (QoS) for video uplinking in IoT. To address this challenge, we integrate non-orthogonal multiple access (NOMA) and scalable video coding (SVC) in IoT. To improve the video quality, we formulate a power allocation problem to maximize the average QoS in the proposed integrated system. Because the problem is non-convex, we transform it into a monotonic problem based on its hidden monotonicity. Then a power allocation algorithm based on polyblock outer approximation is proposed to solve the problem effectively. Finally, simulation results demonstrate that the proposed algorithm outperforms existing OMA and NOMA based schemes for video uplinking in IoT in terms of QoS and energy efficiency.

preprint2021arXiv

Time-dependent Clearance of Cyclosporine in Adult Renal Transplant Recipients: A Population Pharmacokinetic Perspective

Aim: The pharmacokinetic (PK) properties of cyclosporine (CsA) in renal transplant recipients are patient- and time-dependent. Knowledge of this time-related variability is necessary to maintain or achieve CsA target exposure. Here, we aimed to identify factors explaining variabilities in CsA PK properties and characterise time-dependent clearance (CL/F) by performing a comprehensive analysis of CsA PK factors using population PK (popPK) modelling of long-term follow-up data from our institution. Methods: In total, 3,674 whole-blood CsA concentrations from 183 patients who underwent initial renal transplantation were analysed using nonlinear mixed-effects modelling. The effects of potential covariates were selected according to a previous report and well-accepted theoretical mechanisms. Model-informed individualised therapeutic regimens were also conducted. Results: A two-compartment model adequately described the data and the estimated mean CsA CL/F was 32.6 L h$^{-1}$ (5%). Allometrically scaled body size, haematocrit (HCT) level, CGC haplotype carrier status, and postoperative time may contribute to CsA PK variability. The CsA bioavailability in patients receiving a prednisolone dose (PD) of 80 mg was 20.6% lower than that in patients receiving 20 mg. A significant decrease (52.6%) in CL/F was observed as the HCT increased from 10.5% to 60.5%. The CL/F of the non-CGC haplotype carrier was 14.4% lower than that of the CGC haplotype carrier at 3 months post operation. CsA dose adjustments should be considered in different postoperative periods. Conclusions: By monitoring body size, HCT, PD, and CGC haplotype, changes in CsA CL/F over time could be predicted. Such information could be used to optimise CsA therapy.

preprint2021arXiv

Transverse mode-encoded quantum gate on a silicon photonic chip

As an important degree of freedom (DoF) in integrated photonic circuits, the orthogonal transverse mode provides a promising and flexible way to increase communication capability, for both classical and quantum information processing. To construct large-scale on-chip multimode multi-DoF quantum systems, a transverse mode-encoded controlled-NOT (CNOT) gate is necessary. Here, by designing and integrating transverse mode-dependent directional couplers and attenuators on a silicon photonic chip, we demonstrate the first multimode implementation of a two-qubit quantum gate. With the aid of state preparation and analysis parts, we show the ability of the gate to entangle two separated transverse mode qubits with an average fidelity of $0.89\pm0.02$ and a violation of 10 standard deviations in the quantum nonlocality verification. In addition, a fidelity of $0.82\pm0.01$ was obtained from quantum process tomography used to completely characterize the CNOT gate. Our work paves the way for universal transverse mode-encoded quantum operations and large-scale multimode multi-DoF quantum systems.

91 published item(s)

ADR: An Agentic Detection System for Enterprise Agentic AI Security

Beyond Scaling: Measuring and Predicting the Upper Bound of Knowledge Retention in Language Model Pre-Training

CL-bench Life: Can Language Models Learn from Real-Life Context?

CMDAR: A Chinese Multi-scene Dynamic Audio Reasoning Benchmark with Diverse Challenges

Generative structure search for efficient and diverse discovery of molecular and crystal structures

Impact of Pressure and Apical Oxygen Vacancies on Superconductivity in La$_3$Ni$_2$O$_7$

Interests Burn-down Diffusion Process for Personalized Collaborative Filtering

LLMEval-Logic: A Solver-Verified Chinese Benchmark for Logical Reasoning of LLMs with Adversarial Hardening

Muse: Towards Reproducible Long-Form Song Generation with Fine-Grained Style Control

OpenNovelty: An LLM-powered Agentic System for Verifiable Scholarly Novelty Assessment

SciCustom: A Framework for Custom Evaluation of Scientific Capabilities in Large Language Models

What Makes a Good Speech Tokenizer for LLM-Centric Speech Generation? A Systematic Study

OxygenREC: An Instruction-Following Generative Framework for E-commerce Recommendation

RJUA-QA: A Comprehensive QA Dataset for Urology

TBDD: A New Trust-based, DRL-driven Framework for Blockchain Sharding in IoT

Analytical approximate solutions for scalarized AdS black holes

Channel Measurement for Holographic MIMO: Benefits and Challenges of Spatial Oversampling

Ultrafast X-ray Diffraction Probe of Coherent Spin-state Dynamics in Molecules

A Probabilistic Model-Based Robust Waveform Design for MIMO Radar Detection

Chiral SO(4) spin-valley density wave and degenerate topological superconductivity in magic-angle-twisted bilayer-graphene

Ekar: An Explainable Method for Knowledge Aware Recommendation

Generalized Covariant Entropy Bound in Lanczos-Lovelock Gravity

Higher dimensional Reissner-Nordström black holes supporting static scalar shells

Joule-Thomson Expansion of Born-Infeld AdS Black Holes in 4D Einstein-Gauss-Bonnet gravity

KGNN: Harnessing Kernel-based Networks for Semi-supervised Graph Classification

Lifshitz transition enhanced triplet $p_z$-wave superconductivity in hydrogen doped KCr$_3$As$_3$

Neutrino Rocket Jet Model: An Explanation of High-velocity Pulsars and their Spin-down Evolution

Perspective: Ultrafast Imaging of Molecular Dynamics Using Ultrafast Low-Frequency Lasers, X-ray Free Electron Laser and Electron Pulses

Scalar-hairy Lovelock gravity respects zeroth law

Sequence-to-Sequence Voice Reconstruction for Silent Speech in a Tonal Language

Silicon photonic devices for scalable quantum information applications

Taxonomy and evolution predicting using deep learning in images

Transferable Cross-Tokamak Disruption Prediction with Deep Hybrid Neural Network Feature Extractor

Null hypersurface caustics and super-entropic black holes

Partial FC: Training 10 Million Identities on a Single Machine

QoS-Driven Video Uplinking in NOMA-Based IoT

Time-dependent Clearance of Cyclosporine in Adult Renal Transplant Recipients: A Population Pharmacokinetic Perspective

Transverse mode-encoded quantum gate on a silicon photonic chip

A Lightweight and Accurate Localization Algorithm Using Multiple Inertial Measurement Units

Association and Caching in Relay-Assisted mmWave Networks: From A Stochastic Geometry Perspective

Augmented Bi-path Network for Few-shot Learning

Can shadows reflect phase structures of black holes?

EasyQuant: Post-training Quantization via Scale Optimization

Entropy increases at linear order in scalar-hairy Lovelock gravity

Escape probability of particle from Kerr-Sen black hole

GraphAF: a Flow-based Autoregressive Model for Molecular Graph Generation

Learning to Customize Model Structures for Few-shot Dialogue Generation Tasks

Multi-task Learning via Adaptation to Similar Tasks for Mortality Prediction of Diverse Rare Diseases

New gedanken experiment on higher-dimensional asymptotically AdS Reissner-Nordström black hole

New version of the gedanken experiments to test the weak cosmic censorship in charged dilaton-Lifshitz black holes

Relaxation rate of RNdS black hole

Revisiting collisional Penrose processes in term of escape probabilities for spinning particles

Shadows of the accelerating black holes

Snowmass 2021 LoI: Determination of cosmic ray properties in the local interstellar medium with all-sky anisotropy observations

Stable circular orbits of spinning test particles around accelerating Kerr black hole

Charged black holes in the Einstein-Maxwell-Weyl gravity

Effect of protein binding on exposure of unbound and total mycophenolic acid: a population pharmacokinetic analysis in Chinese adult kidney transplant recipients

Holographic complexity of the electromagnetic black hole

Towards Automated ICD Coding Using Deep Learning

Coexistent physics of massive black holes in the phase transitions

Context-aware Natural Language Generation with Recurrent Neural Networks

Dialogue Session Segmentation by Embedding-Enhanced TextTiling

Less is More: Learning Prominent and Diverse Topics for Data Summarization

On-chip coherent conversion of photonic quantum signals between different degrees of freedom

StalemateBreaker: A Proactive Content-Introducing Approach to Automatic Human-Computer Conversation

Two are Better than One: An Ensemble of Retrieval- and Generation-Based Dialog Systems

Unsupervised Word and Dependency Path Embeddings for Aspect Term Extraction

Visualizing Large-scale and High-dimensional Data

World Knowledge as Indirect Supervision for Document Clustering

High performance FeSe$_{0.5}$Te$_{0.5}$ thin films grown at low temperature by pulsed laser deposition

LINE: Large-scale Information Network Embedding

The East-Asian VLBI Network

"Look Ma, No Hands!" A Parameter-Free Topic Model

$P-V$ criticality of AdS black hole in the Einstein-Maxwell-power-Yang-Mills gravity