Source author record

Haochen Zhang

Haochen Zhang appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher (unclaimed source record)

Computer Vision, Artificial Intelligence, eess.SY, Information Theory, Machine Learning, math.IT, physics.app-ph, physics.med-ph, Systems and Control

Catalog footprint

What is connected

8 works

9 topics

4 close collaborators

preprint, 2026, arXiv

cotomi Act: Learning to Automate Work by Watching You

What if a browser agent could learn your work simply by watching you do it? We present cotomi Act, a browser-based computer-using agent that combines reliable multi-step task execution with persistent organizational knowledge learned from user behavior. For execution, an agent scaffold with adaptive lazy observation, verbal-diff-based history compression, coarse-grained actions, and test-time scaling via best-of-N action selection achieves 80.4% on the 179-task WebArena human-evaluation subset, exceeding the reported 78.2% human baseline. For organizational knowledge, a behavior-to-knowledge pipeline passively observes the user's browsing and progressively abstracts it into artifacts (task boards, wiki) exposed through a shared workspace editable by both user and agent. A controlled proxy evaluation confirms that task success improves as behavior-derived knowledge accumulates. In our live demonstration, attendees interact with the system in a real browser, issuing tasks and observing end-to-end autonomous execution and shared knowledge management.
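The abstract names best-of-N action selection as its test-time scaling step but does not describe the sampler or the scorer. The sketch below only illustrates the general idea: draw N candidate actions, score each, and keep the best. The names `best_of_n`, `toy_propose`, `toy_score`, the action strings, and the scores are all hypothetical stand-ins, not part of cotomi Act.

```python
import random
from typing import Callable, TypeVar

A = TypeVar("A")


def best_of_n(
    propose: Callable[[random.Random], A],
    score: Callable[[A], float],
    n: int,
    rng: random.Random,
) -> A:
    """Sample n candidate actions and return the highest-scoring one."""
    if n < 1:
        raise ValueError("n must be at least 1")
    candidates = [propose(rng) for _ in range(n)]
    return max(candidates, key=score)


# Toy stand-ins (assumptions for illustration only). In a real agent the
# proposer would be a policy model and the scorer a judge or verifier.
ACTIONS = ["click(#submit)", "type(#search, 'laptop')", "scroll(down)"]
TOY_SCORES = {
    "click(#submit)": 0.9,
    "type(#search, 'laptop')": 0.4,
    "scroll(down)": 0.1,
}


def toy_propose(rng: random.Random) -> str:
    return rng.choice(ACTIONS)


def toy_score(action: str) -> float:
    return TOY_SCORES[action]


chosen = best_of_n(toy_propose, toy_score, n=8, rng=random.Random(0))
```

Larger N raises the chance that a good candidate is sampled, at the cost of N proposer calls and N scorer calls per step.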

preprint, 2026, arXiv

Zero-Shot Transfer Capabilities of the Sundial Foundation Model for Leaf Area Index Forecasting

This work investigates the zero-shot forecasting capability of time series foundation models for Leaf Area Index (LAI) forecasting in agricultural monitoring. Using the HiQ dataset (U.S., 2000-2022), we systematically compare statistical baselines, a fully supervised LSTM, and the Sundial foundation model under multiple evaluation protocols. We find that Sundial, in the zero-shot setting, can outperform a fully trained LSTM provided that the input context window is sufficiently long, specifically when it covers more than one or two full seasonal cycles. We show that a general-purpose foundation model can surpass specialized supervised models on remote-sensing time series prediction without any task-specific tuning. These results highlight the strong potential of pretrained time series foundation models to serve as effective plug-and-play forecasters in agricultural and environmental applications.

preprint, 2023, arXiv

A Novel Estimation Method for Temperature of Magnetic Nanoparticles Dominated by Brownian Relaxation Based on Magnetic Particle Spectroscopy

This paper presents a novel method for estimating the temperature of magnetic nanoparticles (MNPs) based on AC magnetization harmonics of MNPs dominated by Brownian relaxation. The difference in the AC magnetization response and magnetization harmonic between the Fokker-Planck equation and the Langevin function was analyzed, and we studied the relationship between the magnetization harmonic and the key factors, such as Brownian relaxation time, temperature, magnetic field strength, core size and hydrodynamic size of MNPs, excitation frequency, and so on. We proposed a compensation function for AC magnetization harmonic with consideration of the key factors and the difference between the Fokker-Planck equation and the Langevin function. Then a temperature estimation model based on the compensation function and the Langevin function was established. By employing the least squares algorithm, the temperature was successfully calculated. The experimental results show that the temperature error is less than 0.035 K in the temperature range from 310 K to 320 K. The temperature estimation model is expected to improve the performance of the magnetic nanoparticle thermometer and be applied to magnetic nanoparticle-mediated hyperthermia.
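As a rough illustration of fitting a temperature by least squares against a Langevin-function model, the sketch below uses only the static Langevin magnetization curve. It leaves out the paper's compensation function, the Fokker-Planck comparison, and the harmonic analysis, none of which are specified in the abstract. The particle moment of 1e-18 A·m², the field values, the search bracket, and the golden-section minimizer are all assumptions chosen for the example.

```python
import math

K_B = 1.380649e-23  # Boltzmann constant in J/K


def langevin(x: float) -> float:
    """Langevin function L(x) = coth(x) - 1/x, with a series for small x."""
    if abs(x) < 1e-3:
        return x / 3.0 - x**3 / 45.0
    return 1.0 / math.tanh(x) - 1.0 / x


def model(b_tesla: float, temp_k: float, moment: float = 1e-18) -> float:
    """Normalized magnetization M/Ms of a particle with the assumed moment."""
    return langevin(moment * b_tesla / (K_B * temp_k))


def estimate_temperature(fields, measured, t_lo=300.0, t_hi=330.0, tol=1e-6):
    """Least-squares temperature estimate via golden-section search.

    Minimizes the sum of squared residuals between the model and the
    measurements over the bracket [t_lo, t_hi] (assumed to contain the answer).
    """
    def sse(temp_k: float) -> float:
        return sum((model(f, temp_k) - m) ** 2 for f, m in zip(fields, measured))

    ratio = (math.sqrt(5.0) - 1.0) / 2.0
    lo, hi = t_lo, t_hi
    while hi - lo > tol:
        left = hi - ratio * (hi - lo)
        right = lo + ratio * (hi - lo)
        if sse(left) < sse(right):
            hi = right
        else:
            lo = left
    return 0.5 * (lo + hi)


# Synthetic, noise-free "measurements" generated at 315 K (illustrative only).
fields_t = [0.002, 0.005, 0.010, 0.020]
synthetic = [model(f, 315.0) for f in fields_t]
estimated = estimate_temperature(fields_t, synthetic)
```

With noise-free synthetic data the estimate should recover the generating temperature; this says nothing about the accuracy reported in the abstract, which depends on the authors' full model and measurements.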

preprint, 2022, arXiv

Characterization of GaN-based HEMTs Down to 4.2 K for Cryogenic Applications

The cryogenic performance of GaN-based HEMTs (high-electron-mobility transistors) is systematically investigated through direct current (DC) and low-frequency noise (LFN) characteristics within the temperature (T) range from 300 K to 4.2 K. The important electrical merits of the device, including drain saturation current (IDsat), on-resistance (RON), transconductance, subthreshold swing (SS), gate leakage current, and Schottky barrier height, are comprehensively characterized, and their temperature-dependent behavior is statistically analyzed. In addition, the LFN of the device shows evident 1/f noise behavior from 10 Hz to 10 kHz in the measured temperature range and can be significantly reduced at cryogenic temperature. These results are of great importance to motivate further studies into GaN-based cryo-devices and systems.

preprint, 2022, arXiv

Resilient Distribution System Restoration with Communication Recovery by Drone Small Cells

Distribution system (DS) restoration after natural disasters often faces the challenge of communication failures to feeder automation (FA) facilities, resulting in a prolonged load pick-up process. This letter discusses the utilization of drone small cells for wireless communication recovery of FA, and proposes an integrated DS restoration strategy with communication recovery. Demonstrative case studies are conducted to validate the proposed model, and its advantages are illustrated by comparison with benchmark strategies.

preprint, 2020, arXiv

Is There Tradeoff between Spatial and Temporal in Video Super-Resolution?

Recent advances of deep learning lead to great success of image and video super-resolution (SR) methods that are based on convolutional neural networks (CNN). For video SR, advanced algorithms have been proposed to exploit the temporal correlation between low-resolution (LR) video frames, and/or to super-resolve a frame with multiple LR frames. These methods pursue higher quality of super-resolved frames, where the quality is usually measured frame by frame in e.g. PSNR. However, frame-wise quality may not reveal the consistency between frames. If an algorithm is applied to each frame independently (which is the case of most previous methods), the algorithm may cause temporal inconsistency, which can be observed as flickering. It is a natural requirement to improve both frame-wise fidelity and between-frame consistency, which are termed spatial quality and temporal quality, respectively. Then we may ask, is a method optimized for spatial quality also optimized for temporal quality? Can we optimize the two quality metrics jointly?

preprint, 2019, arXiv

On The Classification-Distortion-Perception Tradeoff

Signal degradation is ubiquitous and computational restoration of degraded signals has been investigated for many years. Recently, it has been reported that the capability of signal restoration is fundamentally limited by the perception-distortion tradeoff, i.e. the distortion and the perceptual difference between the restored signal and the ideal 'original' signal cannot both be minimized simultaneously. Distortion corresponds to signal fidelity and perceptual difference corresponds to perceptual naturalness, both of which are important metrics in practice. Besides, there is another dimension worthy of consideration, namely the semantic quality, or the utility for recognition purposes, of the restored signal. In this paper, we extend the previous perception-distortion tradeoff to the case of the classification-distortion-perception (CDP) tradeoff, where we introduce the classification error rate of the restored signal in addition to distortion and perceptual difference. Two versions of the CDP tradeoff are considered, one using a predefined classifier and the other dealing with the optimal classifier for the restored signal. For both versions, we can rigorously prove the existence of the CDP tradeoff, i.e. the distortion, perceptual difference, and classification error rate cannot all be minimized simultaneously. Our findings can be useful especially for computer vision research where some low-level vision tasks (signal restoration) serve high-level vision tasks (visual understanding).

preprint, 2019, arXiv

Two-Stream Action Recognition-Oriented Video Super-Resolution

We study the video super-resolution (SR) problem for facilitating video analytics tasks, e.g. action recognition, instead of for visual quality. The popular action recognition methods based on convolutional networks, exemplified by two-stream networks, are not directly applicable to video of low spatial resolution. This can be remedied by performing video SR prior to recognition, which motivates us to improve the SR procedure for recognition accuracy. Tailored for two-stream action recognition networks, we propose two video SR methods for the spatial and temporal streams respectively. On the one hand, we observe that regions with action are more important to recognition, and we propose an optical-flow guided weighted mean-squared-error loss for our spatial-oriented SR (SoSR) network to emphasize the reconstruction of moving objects. On the other hand, we observe that existing video SR methods incur temporal discontinuity between frames, which also worsens the recognition accuracy, and we propose a siamese network for our temporal-oriented SR (ToSR) training that emphasizes the temporal continuity between consecutive frames. We perform experiments using two state-of-the-art action recognition networks and two well-known datasets: UCF101 and HMDB51. Results demonstrate the effectiveness of our proposed SoSR and ToSR in improving recognition accuracy.
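The abstract does not give the exact form of the optical-flow guided weighted mean-squared-error loss. One plausible reading, sketched below, weights each pixel's squared error by 1 plus a multiple of its normalized flow magnitude, so moving regions count more. The function name `flow_weighted_mse`, the parameter `alpha`, and the normalization are assumptions for illustration and may differ from the SoSR loss in the paper.

```python
import math


def flow_weighted_mse(sr, hr, flow_u, flow_v, alpha=1.0):
    """Weighted MSE between a super-resolved frame and its ground truth.

    sr, hr: 2D lists of pixel intensities with the same shape.
    flow_u, flow_v: 2D lists holding the horizontal and vertical optical flow.
    Each pixel gets weight 1 + alpha * (flow magnitude / max flow magnitude),
    which is an assumed weighting, not the one defined in the paper.
    """
    mags = [
        [math.hypot(u, v) for u, v in zip(row_u, row_v)]
        for row_u, row_v in zip(flow_u, flow_v)
    ]
    max_mag = max(max(row) for row in mags)

    total = 0.0
    weight_sum = 0.0
    for sr_row, hr_row, mag_row in zip(sr, hr, mags):
        for s, h, m in zip(sr_row, hr_row, mag_row):
            weight = 1.0 + alpha * (m / max_mag if max_mag > 0 else 0.0)
            total += weight * (s - h) ** 2
            weight_sum += weight
    return total / weight_sum


# Tiny 2x2 example: only the bottom-right pixel is moving.
hr_frame = [[1.0, 2.0], [3.0, 6.0]]
sr_frame = [[1.0, 2.0], [3.0, 4.0]]   # error sits on the moving pixel
u = [[0.0, 0.0], [0.0, 3.0]]
v = [[0.0, 0.0], [0.0, 4.0]]
zeros = [[0.0, 0.0], [0.0, 0.0]]

loss_moving = flow_weighted_mse(sr_frame, hr_frame, u, v)
loss_static = flow_weighted_mse(sr_frame, hr_frame, zeros, zeros)
```

With zero flow everywhere the loss reduces to the plain mean squared error, while the same error on a moving pixel is penalized more heavily.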