Researcher profile

Dingyu Yang

Dingyu Yang contributes to research discovery and scholarly infrastructure.

ResearcherAffiliation not importedOpen to collaborate

Trust snapshot

Quick read

Trust 19 - UnverifiedVerification L1Unclaimed author
5works
0followers
4topics
4close collaborators

Actions

Decide how to stay connected

Follow researcher0

Identity and collaboration

How to connect with this researcher

Claiming links this public author record to a researcher profile and unlocks direct collaboration workflows.

Log in to claim

Direct collaboration

Open a focused conversation when the fit is right

Claim this author entity first to unlock direct invitations.

Research graph

See the researcher in context

Open full explorer

Inspect adjacent work, topics, institutions and collaborators without jumping out to a separate graph page.

Building this graph slice

BZPEER is loading the nearby papers, people, topics and institutions for this page.

Published work

5 published item(s)

preprint2026arXiv

DeXOR: Enabling XOR in Decimal Space for Streaming Lossless Compression of Floating-point Data

With streaming floating-point numbers being increasingly prevalent, effective and efficient compression of such data is critical. Compression schemes must be able to exploit the similarity, or smoothness, of consecutive numbers and must be able to contend with extreme conditions, such as high-precision values or the absence of smoothness. We present DeXOR, a novel framework that enables decimal XOR procedure to encode decimal-space longest common prefixes and suffixes, achieving optimal prefix reuse and effective redundancy elimination. To ensure accurate and low-cost decompression even with binary-decimal conversion errors, DeXOR incorporates 1) scaled truncation with error-tolerant rounding and 2) different bit management strategies optimized for decimal XOR. Additionally, a robust exception handler enhances stability by managing floating-point exponents, maintaining high compression ratios under extreme conditions. In evaluations across 22 datasets, DeXOR surpasses state-of-the-art schemes, achieving a 15% higher compression ratio and a 20% faster decompression speed while maintaining a competitive compression speed. DeXOR also offers scalability under varying conditions and exhibits robustness in extreme scenarios where other schemes fail.

preprint2026arXiv

SafeLoad: Efficient Admission Control Framework for Identifying Memory-Overloading Queries in Cloud Data Warehouses

Memory overload is a common form of resource exhaustion in cloud data warehouses. When database queries fail due to memory overload, it not only wastes critical resources such as CPU time but also disrupts the execution of core business processes, as memory-overloading (MO) queries are typically part of complex workflows. If such queries are identified in advance and scheduled to memory-rich serverless clusters, it can prevent resource wastage and query execution failure. Therefore, cloud data warehouses desire an admission control framework with high prediction precision, interpretability, efficiency, and adaptability to effectively identify MO queries. However, existing admission control frameworks primarily focus on scenarios like SLA satisfaction and resource isolation, with limited precision in identifying MO queries. Moreover, there is a lack of publicly available MO-labeled datasets with workloads for training and benchmarking. To tackle these challenges, we propose SafeLoad, the first query admission control framework specifically designed to identify MO queries. Alongside, we release SafeBench, an open-source, industrial-scale benchmark for this task, which includes 150 million real queries. SafeLoad first filters out memory-safe queries using the interpretable discriminative rule. It then applies a hybrid architecture that integrates both a global model and cluster-level models, supplemented by a misprediction correction module to identify MO queries. Additionally, a self-tuning quota management mechanism dynamically adjusts prediction quotas per cluster to improve precision. Experimental results show that SafeLoad achieves state-of-the-art prediction performance with low online and offline time overhead. Specifically, SafeLoad improves precision by up to 66% over the best baseline and reduces wasted CPU time by up to 8.09x compared to scenarios without SafeLoad.

preprint2026arXiv

SVFusion: A CPU-GPU Co-Processing Architecture for Large-Scale Real-Time Vector Search

Approximate Nearest Neighbor Search (ANNS) underpins modern applications such as information retrieval and recommendation. With the rapid growth of vector data, efficient indexing for real-time vector search has become rudimentary. Existing CPU-based solutions support updates but suffer from low throughput, while GPU-accelerated systems deliver high performance but face challenges with dynamic updates and limited GPU memory, resulting in a critical performance gap for continuous, large-scale vector search requiring both accuracy and speed. In this paper, we present SVFusion, a GPU-CPU-disk collaborative framework for real-time vector search that bridges sophisticated GPU computation with online updates. SVFusion leverages a hierarchical vector index architecture that employs CPU-GPU co-processing, along with a workload-aware vector caching mechanism to maximize the efficiency of limited GPU memory. It further enhances performance through real-time coordination with CUDA multi-stream optimization and adaptive resource management, along with concurrency control that ensures data consistency under interleaved queries and updates. Empirical results demonstrate that SVFusion achieves significant improvements in query latency and throughput, exhibiting a 20.9x higher throughput on average and 1.3x to 50.7x lower latency compared to baseline methods, while maintaining high recall for large-scale datasets under various streaming workloads.

preprint2026arXiv

Token Economics for LLM Agents: A Dual-View Study from Computing and Economics

As LLM agents evolve, tokens have emerged as the core economic primitives of Agentic AI. However, their exponential consumption introduces severe computational, collaborative, and security bottlenecks. Current surveys remain fragmented across system optimization, architecture design, and trust, lacking a unified framework to evaluate the fundamental trade-off between output quality and economic cost. To bridge this gap, this survey presents the first comprehensive survey of Token Economics. By unifying computer science and economics, we conceptualize tokens as production factors, exchange mediums, and units of account. We synthesize existing literature across a four-dimensional taxonomy: (1) Micro-level (Single Agent): Optimizing budget-constrained factor substitution via neoclassical firm theory. (2) Meso-level (Multi-Agent Systems): Minimizing collaboration friction using transaction cost and principal-agent theories. (3) Macro-level (Agent Ecosystems): Addressing congestion externalities and pricing via mechanism design. (4) Security: Internalizing adversarial threats as endogenous economic constraints. Finally, we outline frontier directions, including differentiable token budgets and dynamic markets, to lay the theoretical foundation for scalable next-generation agent systems.

preprint2022arXiv

Distributed Processing of k Shortest Path Queries over Dynamic Road Networks

The problem of identifying the k-shortest paths (KSPs for short) in a dynamic road network is essential to many location-based services. Road networks are dynamic in the sense that the weights of the edges in the corresponding graph constantly change over time, representing evolving traffic conditions. Very often such services have to process numerous KSP queries over large road networks at the same time, thus there is a pressing need to identify distributed solutions for this problem. However, most existing approaches are designed to identify KSPs on a static graph in a sequential manner (i.e., the (i+1)-th shortest path is generated based on the i-th shortest path), restricting their scalability and applicability in a distributed setting. We therefore propose KSP-DG, a distributed algorithm for identifying k-shortest paths in a dynamic graph. It is based on partitioning the entire graph into smaller subgraphs, and reduces the problem of determining KSPs into the computation of partial KSPs in relevant subgraphs, which can execute in parallel on a cluster of servers. A distributed two-level index called DTLP is developed to facilitate the efficient identification of relevant subgraphs. A salient feature of DTLP is that it indexes a set of virtual paths that are insensitive to varying traffic conditions, leading to very low maintenance cost in dynamic road networks. This is the first treatment of the problem of processing KSP queries over dynamic road networks. Extensive experiments conducted on real road networks confirm the superiority of our proposal over baseline methods.