Source author record

Sun Sun

Sun Sun appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher · Unclaimed source record

Artificial Intelligence, Machine Learning, math.OC, Systems and Control, Performance

Catalog footprint

What is connected

5 works

5 topics

4 close collaborators

preprint · 2026 · arXiv

ASH: Agents that Self-Hone via Embodied Learning

Long-horizon embodied tasks remain a fundamental challenge in AI, as current methods rely on hand-engineered rewards or action-labeled demonstrations, neither of which scales. We introduce ASH, an agentic system that learns an embodied policy from unlabeled, noisy internet video, without reward shaping or expert annotation. ASH follows a self-improvement loop; when it gets stuck, ASH learns an Inverse Dynamics Model (IDM) from its own trajectories, and uses its IDM to extract supervision from relevant internet video. ASH uses unsupervised learning to identify key moments from large-scale internet video and retains them as long-term memory -- allowing it to tackle long-horizon problems. We evaluate ASH on two complementary environments demanding multi-hour planning: Pokemon Emerald, a turn-based RPG, and The Legend of Zelda: The Minish Cap, a real-time action-adventure game. In both games, behavioral cloning, retrieval-augmented and zero-shot foundation-model baselines plateau, while ASH sustains progression across our 8-hour evaluation. ASH reaches an average of $11.2/12$ milestones in Pokemon Emerald and $9.9/12$ in Legend of Zelda, while the strongest baseline gets stuck in both environments at an average of $6.5/12$ and $6.0/12$ milestones, respectively. We demonstrate that self-improving agents are a scalable recipe for long-horizon embodied learning.

preprint · 2022 · arXiv

SoftEdge: Regularizing Graph Classification with Random Soft Edges

Augmented graphs play a vital role in regularizing Graph Neural Networks (GNNs), which learn by exchanging information along graph edges in the form of message passing. Due to their effectiveness, simple edge and node manipulations (e.g., addition and deletion) have been widely used in graph augmentation. Nevertheless, such common augmentation techniques can dramatically change the semantics of the original graph, causing overaggressive augmentation and thus under-fitting in GNN learning. To address this problem arising from dropping or adding graph edges and nodes, we propose SoftEdge, which assigns random weights to a portion of the edges of a given graph for augmentation. The synthetic graph generated by SoftEdge maintains the same nodes and their connectivities as the original graph, thus mitigating the semantic changes of the original graph. We empirically show that this simple method achieves higher accuracy than popular node and edge manipulation approaches and is notably resilient to the accuracy degradation that comes with increasing GNN depth.
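
The edge-weighting idea described in this abstract can be sketched in a few lines. This is a minimal illustration only, not the authors' implementation: the function name `soft_edge_weights`, the `fraction` parameter, and the choice of drawing weights uniformly from (0, 1] are all assumptions made for this example.

```python
import random

def soft_edge_weights(num_edges, fraction=0.2, seed=None):
    """Return per-edge weights where a random subset of edges is 'softened'.

    Hypothetical helper: `fraction` and the (0, 1] weight range are
    assumptions for illustration, not values taken from the paper.
    """
    rng = random.Random(seed)
    # Every edge starts with full weight 1.0 (the unaugmented graph).
    weights = [1.0] * num_edges
    # Pick a portion of the edges, without replacement.
    num_soft = round(fraction * num_edges)
    soft_idx = rng.sample(range(num_edges), num_soft)
    for i in soft_idx:
        # rng.random() is in [0, 1), so 1 - rng.random() is in (0, 1]:
        # the weight stays strictly positive and no edge is removed.
        weights[i] = 1.0 - rng.random()
    return weights, sorted(soft_idx)

# A small graph with five edges; the node set and edge list are unchanged.
edges = [(0, 1), (1, 2), (2, 3), (3, 0), (0, 2)]
weights, soft_idx = soft_edge_weights(len(edges), fraction=0.4, seed=0)
```

Because every weight stays strictly positive, the augmented graph has the same nodes and edges as the original, in contrast to augmentations that drop or add edges. The weights could then be passed to any message-passing layer that accepts per-edge weights.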

preprint · 2016 · arXiv

Distributed Real-Time Power Balancing in Renewable-Integrated Power Grids with Storage and Flexible Loads

The large-scale integration of renewable generation directly affects the reliability of power grids. We investigate the problem of power balancing in a general renewable-integrated power grid with storage and flexible loads. We consider a power grid that is supplied by one conventional generator (CG) and multiple renewable generators (RGs), each co-located with storage, and is connected with external markets. An aggregator operates the power grid to maintain power balance between supply and demand. Aiming at minimizing the long-term system cost, we first propose a real-time centralized power balancing solution, taking into account the uncertainty of the renewable generation, loads, and energy prices. We then provide a distributed implementation algorithm, significantly reducing both computational burden and communication overhead. We demonstrate that our proposed algorithm is asymptotically optimal as the storage capacity increases and the CG ramping constraint loosens. Moreover, the distributed implementation enjoys a fast convergence rate, and enables each RG and the aggregator to make their own decisions. Simulation shows that our proposed algorithm outperforms alternatives and can achieve near-optimal performance for a wide range of storage capacity.

preprint · 2015 · arXiv

Phase Balancing Using Energy Storage in Power Grids under Uncertainty

Phase balancing is essential to safe power system operation. We consider a substation connected to multiple phases, each with single-phase loads, generation, and energy storage. A representative of the substation operates the system and aims to minimize the cost of all phases and to balance loads among phases. We first consider ideal energy storage with lossless charging and discharging, and propose both centralized and distributed real-time algorithms taking into account system uncertainty. The proposed algorithm does not require any system statistics and asymptotically achieves the minimum system cost with large energy storage. We then extend the algorithm to accommodate more realistic non-ideal energy storage that has imperfect charging and discharging. The performance of the proposed algorithm is evaluated through extensive simulation and compared with that of a benchmark greedy algorithm. Simulation shows that our algorithm leads to strong performance over a wide range of storage characteristics.

preprint · 2014 · arXiv

Real-Time Welfare-Maximizing Regulation Allocation in Dynamic Aggregator-EVs System

The concept of vehicle-to-grid (V2G) has gained recent interest as more and more electric vehicles (EVs) are put to use. In this paper, we consider a dynamic aggregator-EVs system, where an aggregator centrally coordinates a large number of dynamic EVs to perform regulation service. We propose a Welfare-Maximizing Regulation Allocation (WMRA) algorithm for the aggregator to fairly allocate the regulation amount among its EVs. Compared to previous works, WMRA accommodates a wide spectrum of vital system characteristics, including EV dynamics, limited EV battery size, EV battery degradation cost, and the cost of using external energy sources for the aggregator. The algorithm operates in real time and does not require any prior knowledge of the statistical information of the system. Theoretically, we demonstrate that WMRA is within O(1/V) of the optimum, where V is a control parameter that depends on the EVs' battery size. In addition, our simulation results indicate that WMRA can substantially outperform a suboptimal greedy algorithm.