Researcher profile

Qing Liu

Qing Liu contributes to research discovery and scholarly infrastructure.

Researcher · Affiliation not imported · Open to collaborate

Trust snapshot

Quick read

Trust 19 - Unverified · Verification L1 · Unclaimed author
5 works
0 followers
5 topics
4 close collaborators

Published work

5 published items

Preprint · 2026 · arXiv

Multi-domain Multi-modal Document Classification Benchmark with a Multi-level Taxonomy

Document classification forms the backbone of modern enterprise content management, yet existing benchmarks remain trapped in oversimplified paradigms -- single domain settings with flat label structures -- that bear little resemblance to the hierarchical, multi-modal, and cross-domain nature of real-world business documents. This gap not only misrepresents practical complexity but also stifles progress toward industrially viable document intelligence. To bridge this gap, we construct the first Multi-level, Multi-domain, Multi-modal document classification Benchmark (MMM-Bench). MMM-Bench includes (1) a deeply hierarchical taxonomy spanning five levels that captures the authentic organizational logic of business documentation; and (2) 5,990 real-world multi-modal documents meticulously curated from 12 commercial domains in Alibaba. Each document is manually annotated with a complete hierarchical path by domain experts. We establish comprehensive baselines on MMM-Bench, covering both open-weight and API-based models. Through systematic experiments, we identify four fundamental challenges within MMM-Bench and propose corresponding insights. To provide a solid foundation for advancing research in multi-level, multi-domain document classification, we release all of the data and the evaluation toolkit of MMM-Bench at https://github.com/MMMDC-Bench/MMMDC-Bench.
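
To make the idea of scoring a complete hierarchical path concrete, here is a minimal Python sketch. It is not taken from the MMM-Bench evaluation toolkit. The function name `path_scores`, the example labels, and the two metrics (exact-path accuracy and per-level prefix accuracy) are assumptions made only for illustration.

```python
from typing import List, Sequence, Tuple


def path_scores(
    gold: Sequence[Sequence[str]],
    pred: Sequence[Sequence[str]],
    depth: int = 5,
) -> Tuple[float, List[float]]:
    """Return (exact-path accuracy, prefix accuracy for levels 1..depth).

    Hypothetical helper: a prediction counts as correct at level k only if
    its first k labels all match the gold path. A path shorter than k
    labels counts as a miss at level k.
    """
    if len(gold) != len(pred):
        raise ValueError("gold and pred must have the same length")
    if not gold:
        raise ValueError("need at least one document")

    n = len(gold)
    exact_hits = 0
    prefix_hits = [0] * depth
    for g, p in zip(gold, pred):
        g, p = list(g), list(p)
        if g == p:
            exact_hits += 1
        for k in range(1, depth + 1):
            if len(g) >= k and len(p) >= k and g[:k] == p[:k]:
                prefix_hits[k - 1] += 1
    return exact_hits / n, [h / n for h in prefix_hits]


# Toy three-level example with made-up labels.
gold = [["finance", "invoice", "vat"], ["legal", "contract", "nda"]]
pred = [["finance", "invoice", "receipt"], ["legal", "contract", "nda"]]
exact, per_level = path_scores(gold, pred, depth=3)
print(exact, per_level)
```

Prefix accuracy can only stay level or fall as depth increases, so a drop at a given level shows where predictions start to diverge from the annotated path.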

Preprint · 2026 · arXiv

On the cohomological representations of finite automorphism groups of singular curves and compact complex spaces

Let G be a finite group acting tamely on a proper reduced curve C over an algebraically closed field. We study the G-module structure on the cohomology groups of a G-equivariant locally free sheaf F on C, and give formulas of Chevalley--Weil type, with values in the Grothendieck ring R_k(G)_Q of finitely generated G-modules. We also give a similar formula for the singular cohomology of compact complex spaces. The focus is on the case where C is nodal. Using the Chevalley--Weil formula, we compute the G-invariant part of the global sections of the pluricanonical bundle ω_C^{\otimes m}. In turn, we use the formula for m=2 to compute the equivariant deformation space of a stable G-curve C. We also obtain numerical criteria for the presence of any given irreducible representation in the space of global sections of ω_C\otimes F, where F is an ample locally free G-sheaf on C. Some new phenomena, pathological compared to the smooth curve case, are discussed.

Preprint · 2026 · arXiv

PrivGemo: Privacy-Preserving Dual-Tower Graph Retrieval for Empowering LLM Reasoning with Memory Augmentation

Knowledge graphs (KGs) provide structured evidence that can ground large language model (LLM) reasoning for knowledge-intensive question answering. However, many practical KGs are private, and sending retrieved triples or exploration traces to closed-source LLM APIs introduces leakage risk. Existing privacy treatments focus on masking entity names, but they still face four limitations: structural leakage under semantic masking, uncontrollable remote interaction, fragile multi-hop and multi-entity reasoning, and limited experience reuse for stability and efficiency. To address these issues, we propose PrivGemo, a privacy-preserving retrieval-augmented framework for KG-grounded reasoning with memory-guided exposure control. PrivGemo uses a dual-tower design to keep raw KG knowledge local while enabling remote reasoning over an anonymized view that goes beyond name masking to limit both semantic and structural exposure. PrivGemo supports multi-hop, multi-entity reasoning by retrieving anonymized long-hop paths that connect all topic entities, while keeping grounding and verification on the local KG. A hierarchical controller and a privacy-aware experience memory further reduce unnecessary exploration and remote interactions. Comprehensive experiments on six benchmarks show that PrivGemo achieves overall state-of-the-art results, outperforming the strongest baseline by up to 17.1%. Furthermore, PrivGemo enables smaller models (e.g., Qwen3-4B) to achieve reasoning performance comparable to that of GPT-4-Turbo.

Preprint · 2026 · arXiv

Unsupervised dense random survival forests identify interpretable patient profiles with heterogeneous treatment benefit

Precision oncology aims to prescribe the optimal cancer treatment to the right patients, maximizing therapeutic benefits. However, identifying patient subgroups that may benefit more from experimental cancer treatments based on randomized clinical trials presents a significant analytical challenge. To address this, we introduce a novel unsupervised machine learning approach based on very dense random survival forests (up to 100,000 trees), equipped with a new splitting rule that explicitly targets treatment-effect heterogeneity. This method is robust, interpretable, and effectively identifies responsive subgroups. Extensive simulations confirm its ability to detect heterogeneous patient responses and distinguish between datasets with and without heterogeneity, while maintaining a stringent Type I error rate of 1%. We further validate its performance using Phase III randomized clinical trial datasets, demonstrating significant patient heterogeneity in treatment response based on baseline characteristics.

Preprint · 2025 · arXiv

Quantitative Morphology of Galactic Cirrus in Deep Optical Imaging

Imaging of optical Galactic cirrus, the spatially resolved form of diffuse Galactic light, provides important insights into the properties of the diffuse interstellar medium (ISM) in the Milky Way. While previous investigations have focused mainly on the intensity characteristics of optical cirrus, their morphological properties remain largely unexplored. In this study, we employ several complementary statistical approaches -- local intensity statistics, angular power spectrum / $Δ$-variance analysis, and wavelet scattering transform analysis -- to characterize the morphology of cirrus in deep optical imaging data. We place our investigation of optical cirrus into a multi-wavelength context by comparing the morphology of cirrus seen with the Dragonfly Telephoto Array to that seen with space-based facilities working at longer wavelengths (Herschel 250 $μm$, WISE 12 $μm$, and Planck radiance), as well as with structures seen in the DHIGLS HI column density map. Our statistical methods quantify the similarities and the differences of cirrus morphology in all these datasets. The morphology of cirrus at visible wavelengths resembles that of far-infrared cirrus more closely than that of mid-infrared cirrus; on small scales, anisotropies in the cosmic infrared background and systematics may lead to differences. Across all dust tracers, cirrus morphology can be well described by a power spectrum with a common power-law index $γ\sim-2.9$. We demonstrate quantitatively that optical cirrus exhibits filamentary, coherent structures across a broad range of angular scales. Our results offer promising avenues for linking the analysis of coherent structures in optical cirrus to the underlying physical processes in the ISM that shape them. Furthermore, we demonstrate that these morphological signatures can be leveraged to distinguish and disentangle cirrus from extragalactic light.
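
A power-law index like the one quoted above is the slope of the power spectrum in log-log space, under the assumption that P(k) is proportional to k^γ. The abstract does not describe how the authors estimate it, so the Python sketch below only shows an ordinary least-squares fit on synthetic, noise-free data. The function name `fit_power_law_index` and the sample values are illustrative assumptions.

```python
import math
from typing import Sequence


def fit_power_law_index(k: Sequence[float], power: Sequence[float]) -> float:
    """Least-squares slope of log(power) against log(k).

    Hypothetical helper: assumes power is proportional to k**gamma and
    that all inputs are strictly positive.
    """
    if len(k) != len(power) or len(k) < 2:
        raise ValueError("need two or more matching (k, power) samples")
    xs = [math.log(v) for v in k]
    ys = [math.log(v) for v in power]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    if sxx == 0.0:
        raise ValueError("k values must not all be identical")
    return sxy / sxx


# Synthetic spectrum with a known index of -2.9 and an arbitrary amplitude.
k = [2.0 ** i for i in range(1, 8)]
power = [3.0 * v ** -2.9 for v in k]
gamma = fit_power_law_index(k, power)
print(round(gamma, 3))
```

On real images the spectrum is noisy and affected by the beam and by systematics on small scales, so a fit like this would normally be restricted to a chosen range of angular scales.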