Source author record

Song Cao

Song Cao appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher (unclaimed source record)

Artificial Intelligence; Computer Science and Game Theory; Distributed, Parallel, and Cluster Computing; Machine Learning

Catalog footprint

What is connected

3 works

4 topics

4 close collaborators

Preprint, 2026, arXiv

D-Artemis: A Deliberative Cognitive Framework for Mobile GUI Multi-Agents

Graphical User Interface (GUI) agents aim to automate a wide spectrum of human tasks by emulating user interaction. Despite rapid advancements, current approaches are hindered by several critical challenges: a data bottleneck in end-to-end training, the high cost of delayed error detection, and the risk of contradictory guidance. Inspired by the human cognitive loop of Thinking, Alignment, and Reflection, we present D-Artemis, a novel deliberative framework. D-Artemis leverages a fine-grained, app-specific tip retrieval mechanism to inform its decision-making process. It also employs a proactive Pre-execution Alignment stage, where a Thought-Action Consistency (TAC) Check module and an Action Correction Agent (ACA) work in concert to mitigate the risk of execution failures. A post-execution Status Reflection Agent (SRA) completes the cognitive loop, enabling strategic learning from experience. Crucially, D-Artemis enhances the capabilities of general-purpose multimodal large language models (MLLMs) for GUI tasks without the need for training on complex trajectory datasets, demonstrating strong generalization. D-Artemis establishes new state-of-the-art (SOTA) results on two major benchmarks, achieving a 75.8% success rate on AndroidWorld and 96.8% on ScreenSpot-V2. Extensive ablation studies further demonstrate the significant contribution of each component to the framework.

Preprint, 2026, arXiv

MinT: Managed Infrastructure for Training and Serving Millions of LLMs

We present MindLab Toolkit (MinT), a managed infrastructure system for Low-Rank Adaptation (LoRA) post-training and online serving. MinT targets a setting where many trained policies are produced over a small number of expensive base-model deployments. Instead of materializing each policy as a merged full checkpoint, MinT keeps the base model resident and moves exported LoRA adapter revisions through rollout, update, export, evaluation, serving, and rollback, hiding distributed training, serving, scheduling, and data movement behind a service interface. MinT scales this path along three axes. Scale Up extends LoRA RL to frontier-scale dense and MoE architectures, including MLA and DSA attention paths, with training and serving validated beyond 1T total parameters. Scale Down moves only the exported LoRA adapter, which can be under 1% of base-model size in rank-1 settings; adapter-only handoff reduces the measured step by 18.3x on a 4B dense model and 2.85x on a 30B MoE, while concurrent multi-policy GRPO shortens wall time by 1.77x and 1.45x without raising peak memory. Scale Out separates durable policy addressability from CPU/GPU working sets: a tensor-parallel deployment supports 10^6-scale addressable catalogs (measured single-engine sweeps through 100K) and thousand-adapter active waves at cluster scale, with cold loading treated as scheduled service work and packed MoE LoRA tensors improving live engine loading by 8.5-8.7x. MinT thus manages million-scale LoRA policy catalogs while training and serving selected adapter revisions over shared 1T-class base models.
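The "under 1% of base-model size" figure for rank-1 adapters is consistent with how LoRA parameter counts scale. The sketch below assumes the standard LoRA factorization, in which a d_out x d_in weight matrix gets two low-rank factors of shapes d_out x r and r x d_in. The 4096 x 4096 layer and the function name are hypothetical and do not come from MinT.

```python
def lora_param_fraction(d_out: int, d_in: int, rank: int) -> float:
    """Adapter parameters divided by the parameters of the adapted matrix."""
    full_params = d_out * d_in               # dense weight W
    adapter_params = rank * (d_out + d_in)   # factors B (d_out x r) and A (r x d_in)
    return adapter_params / full_params


# Hypothetical square projection layer, rank-1 adapter.
fraction = lora_param_fraction(4096, 4096, rank=1)
print(f"rank-1 adapter is {fraction:.4%} of the adapted matrix")
```

The whole-model ratio also depends on which matrices carry adapters, so this per-matrix fraction is only a rough guide to the size of an exported adapter.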

Preprint, 2026, arXiv

Optimal Allocations under Strongly Pigou-Dalton Criteria: Hidden Layer Structure & Efficient Combinatorial Approach

We investigate optimal social welfare allocations of $m$ items to $n$ agents with binary additive or submodular valuations. For binary additive valuations, we prove that the set of optimal allocations coincides with the set of so-called stable allocations, as long as the employed criterion for evaluating social welfare is strongly Pigou-Dalton (SPD) and symmetric. Many common criteria are SPD and symmetric, such as Nash social welfare, leximax, leximin, Gini index, entropy, and envy sum. We also design efficient algorithms for finding a stable allocation, including an $O(m^2n)$ time algorithm for the case of indivisible items, and an $O(m^2n^5)$ time one for the case of divisible items. The first is faster than the existing algorithms or has a simpler analysis. The latter is the first combinatorial algorithm for that problem. It utilizes a hidden layer partition of items and agents admitted by all stable allocations, and cleverly reduces the case of divisible items to the case of indivisible items. In addition, we show that the profiles of different optimal allocations have a small Chebyshev distance, which is 0 for the case of divisible items under binary additive valuations, and is at most 1 for the case of indivisible items under binary submodular valuations.
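To make the coincidence claim concrete, here is a brute-force check on a tiny instance with indivisible items and binary additive valuations. It is a sketch only, not the paper's algorithm. The instance (`LIKES`), the helper names, and the reading of "profile" as the vector of agent utilities are all assumptions. Nash social welfare is taken as the plain product of utilities, which is safe here because the optimum is positive.

```python
from itertools import product
from math import prod

# Hypothetical instance: LIKES[i] is the set of items agent i values at 1.
LIKES = [{0, 1, 2}, {0}, {2, 3}]
NUM_ITEMS = 4


def profile(assignment):
    """Utility vector: agent i's utility is the number of liked items it receives."""
    return tuple(
        sum(1 for item, owner in enumerate(assignment)
            if owner == agent and item in liked)
        for agent, liked in enumerate(LIKES)
    )


def optimal_profiles(score):
    """Profiles of all allocations that maximise `score`, found by enumeration."""
    profiles = {profile(a) for a in product(range(len(LIKES)), repeat=NUM_ITEMS)}
    best = max(score(p) for p in profiles)
    return {p for p in profiles if score(p) == best}


def chebyshev(p, q):
    """Largest coordinate-wise gap between two utility vectors."""
    return max(abs(a - b) for a, b in zip(p, q))


nash_opt = optimal_profiles(prod)                            # Nash social welfare
leximin_opt = optimal_profiles(lambda p: tuple(sorted(p)))   # leximin order
max_gap = max(chebyshev(p, q) for p in nash_opt for q in nash_opt)

print(nash_opt == leximin_opt, sorted(nash_opt), max_gap)
```

On this instance both criteria select the same two profiles, (1, 1, 2) and (2, 1, 1), and their Chebyshev distance is 1, in line with the stated bound for indivisible items.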
