Source author record

Yutong Huang

Yutong Huang appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher · Unclaimed source record

Artificial Intelligence; Distributed, Parallel, and Cluster Computing; Operating Systems

Catalog footprint

What is connected

2 works

3 topics

4 close collaborators

Preprint, 2026, arXiv

TClone: Low-Latency Forking of Live GUI Environments for Computer-Use Agents

Computer-use agents increasingly operate inside live personal workspaces, where their actions can modify files, applications, GUI state, credentials, and authenticated sessions. This creates a tension between safety and quality: agents need isolation and rollback to avoid damaging user state, but also need fast branching to support speculative execution and parallel search. Existing VMs, containers, and checkpoint/restore systems can isolate or recover workloads, but they do not provide low-latency versioning of a full interactive workspace. We present TClone, a forkable personal workspace system for computer-use agents. TClone enables a live GUI workspace to be snapshotted, forked into isolated branches, rolled back, and selectively committed or merged. Its design separates fast branch creation from durable checkpointing, using sibling containers, copy-on-write memory sharing, filesystem versioning, GUI-local execution, and asynchronous checkpointing. In our end-to-end agent-loop measurement, TClone reduces total task latency by 1.9x and 1.5x over KVM and CRIU. By making workspace versioning a first-class systems primitive, TClone supports safer and higher-quality agent execution over real personal computing environments.

Preprint, 2022, arXiv

Clio: A Hardware-Software Co-Designed Disaggregated Memory System

Memory disaggregation has attracted great attention recently because of its benefits in efficient memory utilization and ease of management. So far, memory disaggregation research has all taken one of two approaches: building/emulating memory nodes using regular servers or building them using raw memory devices with no processing power. The former incurs higher monetary cost and faces tail latency and scalability limitations, while the latter introduces performance, security, and management problems. Server-based memory nodes and memory nodes with no processing power are two extreme approaches. We seek a sweet spot in the middle by proposing a hardware-based memory disaggregation solution that has the right amount of processing power at memory nodes. Furthermore, we take a clean-slate approach by starting from the requirements of memory disaggregation and designing a memory-disaggregation-native system. We built Clio, a disaggregated memory system that virtualizes, protects, and manages disaggregated memory at hardware-based memory nodes. The Clio hardware includes a new virtual memory system, a customized network system, and a framework for computation offloading. In building Clio, we not only co-design OS functionalities, hardware architecture, and the network system, but also co-design compute nodes and memory nodes. Our FPGA prototype of Clio demonstrates that each memory node can achieve 100 Gbps throughput and an end-to-end latency of 2.5 µs at the median and 3.2 µs at the 99th percentile. Clio also scales much better and has orders of magnitude lower tail latency than RDMA. It has 1.1x to 3.4x energy saving compared to CPU-based and SmartNIC-based disaggregated memory systems and is 2.7x faster than software-based SmartNIC solutions.