Source author record

Qiongkai Xu

Qiongkai Xu appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

ResearcherUnclaimed source record

Computation and Language Artificial Intelligence Cryptography and Security Machine Learning

Catalog footprint

What is connected

3works

4topics

4close collaborators

Actions

Connect this record

Open graph Browse works

Inspect adjacent papers, topics, institutions and collaborators without losing the researcher page.

Building this map preview

BZPEER is loading the nearby papers, people, topics and institutions for this page.

preprint2026arXiv

PatentMind: A Multi-Aspect Reasoning Graph for Patent Similarity Evaluation

Patent similarity evaluation plays a critical role in intellectual property analysis. However, existing methods often overlook the intricate structure of patent documents, which integrate technical specifications, legal boundaries, and application contexts. We introduce PatentMind, a novel framework for patent similarity assessment based on a Multi-Aspect Reasoning Graph (MARG). PatentMind decomposes patents into their three dimensions of technical features, application domains, and claim scopes, then dimension-specific similarity scores are calculated over the MARG. These scores are dynamically weighted through a context-aware reasoning process, which integrates contextual signals to emulate expert-level judgment. To support evaluation, we construct a human-annotated benchmark PatentSimBench, comprising 500 patent pairs. Experimental results demonstrate that the PatentMind-generated scores show a strong correlation ($r=0.938$) with expert annotations, significantly outperforming embedding-based models, patent-specific models, and advanced prompt engineering methods. Beyond computational linguistics, our framework provides a structured and semantically grounded foundation for real-world decision-making, particularly for tasks such as infringement risk assessment, underscoring its broader impact on both patent analytics and evaluation.

preprint2026arXiv

PHAGE: Patent Heterogeneous Attention-Guided Graph Encoder for Representation Learning

Patent claims form a directed dependency structure in which dependent claims inherit and refine the scope of earlier claims; however, existing patent encoders linearize claims as text and discard this hierarchy. Directly encoding this structure into self-attention poses two challenges: claim dependencies mix relation types that differ in semantics and extraction reliability, and the dependency graph is defined over claims while Transformers attend over tokens. PHAGE addresses the first challenge through a deterministic graph construction pipeline that separates near-deterministic legal citations from noisier rule-based technical relations, preserving type distinctions as heterogeneous edges. It addresses the second through a connectivity mask and learnable relation-aware biases that lift claim-level topology into token-level attention, allowing the encoder to differentially weight each relation type. A dual-granularity contrastive objective then aligns representations with both inter-patent taxonomy and intra-patent topology. PHAGE outperforms all baselines on classification, retrieval, and clustering, showing that intra-document claim topology is a stronger inductive bias than inter-document structure and that this bias persists in the encoder weights after training.

preprint2022arXiv

Student Surpasses Teacher: Imitation Attack for Black-Box NLP APIs

Machine-learning-as-a-service (MLaaS) has attracted millions of users to their splendid large-scale models. Although published as black-box APIs, the valuable models behind these services are still vulnerable to imitation attacks. Recently, a series of works have demonstrated that attackers manage to steal or extract the victim models. Nonetheless, none of the previous stolen models can outperform the original black-box APIs. In this work, we conduct unsupervised domain adaptation and multi-victim ensemble to showing that attackers could potentially surpass victims, which is beyond previous understanding of model extraction. Extensive experiments on both benchmark datasets and real-world APIs validate that the imitators can succeed in outperforming the original black-box models on transferred domains. We consider our work as a milestone in the research of imitation attack, especially on NLP APIs, as the superior performance could influence the defense or even publishing strategy of API providers.