Source author record

Hong Li

Hong Li appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher · Unclaimed source record

Computation and Language, Artificial Intelligence, astro-ph.CO, Computer Vision, hep-ph

Catalog footprint

What is connected

3 works

5 topics

4 close collaborators

Preprint, 2026, arXiv

AliCPT Sensitivity to Cosmic Reheating

We present the first assessment of the Ali Cosmic Microwave Background Polarization Telescope's (AliCPT) sensitivity to the reheating epoch after cosmic inflation, based on its ability to detect primordial gravitational waves. We consider three models of inflation: an $\alpha$-attractor T-model, RGI inflation, and QCD-driven warm inflation. Assuming a fiducial value of $r=0.01$, we find that AliCPT-1, in its fully loaded focal plane detector configuration and combined with Planck, can provide measurements of the order of magnitude of the reheating temperature with an accuracy around $10\%$. For QCD-driven warm inflation this can be translated into a constraint on the inflaton coupling to gluons, which can be probed independently in axion search experiments. Our results constitute the first demonstration of AliCPT's ability to probe the initial temperature of the hot big bang and the microphysical parameter connecting cosmic inflation and particle physics.

Preprint, 2026, arXiv

Code as Agent Harness

Recent large language models (LLMs) have demonstrated strong capabilities in understanding and generating code, from competitive programming to repository-level software engineering. In emerging agentic systems, code is no longer only a target output. It increasingly serves as an operational substrate for agent reasoning, acting, environment modeling, and execution-based verification. We frame this shift through the lens of agent harnesses and introduce code as agent harness: a unified view that centers code as the basis for agent infrastructure. To systematically study this perspective, we organize the survey around three connected layers. First, we study the harness interface, where code connects agents to reasoning, action, and environment modeling. Second, we examine harness mechanisms: planning, memory, and tool use for long-horizon execution, together with feedback-driven control and optimization that make the harness reliable and adaptive. Third, we discuss scaling the harness from single-agent systems to multi-agent settings, where shared code artifacts support multi-agent coordination, review, and verification. Across these layers, we summarize representative methods and practical applications of code as agent harness, spanning coding assistants, GUI/OS automation, embodied agents, scientific discovery, personalization and recommendation, DevOps, and enterprise workflows. We further outline open challenges for harness engineering, including evaluation beyond final task success, verification under incomplete feedback, regression-free harness improvement, consistent shared state across multiple agents, human oversight for safety-critical actions, and extensions to multimodal environments. By centering code as the harness of agentic AI, this survey provides a unified roadmap toward executable, verifiable, and stateful AI agent systems.

Preprint, 2026, arXiv

Visual Merit or Linguistic Crutch? A Close Look at DeepSeek-OCR

DeepSeek-OCR utilizes an optical 2D mapping approach to achieve high-ratio vision-text compression, claiming to decode text tokens exceeding ten times the input visual tokens. While this suggests a promising solution for the LLM long-context bottleneck, we investigate a critical question: "Visual merit or linguistic crutch - which drives DeepSeek-OCR's performance?" By employing sentence-level and word-level semantic corruption, we isolate the model's intrinsic OCR capabilities from its language priors. Results demonstrate that without linguistic support, DeepSeek-OCR's performance plummets from approximately 90% to 20%. Comparative benchmarking against 13 baseline models reveals that traditional pipeline OCR methods exhibit significantly higher robustness to such semantic perturbations than end-to-end methods. Furthermore, we find that lower visual token counts correlate with increased reliance on priors, exacerbating hallucination risks. Context stress testing also reveals a total model collapse around 10,000 text tokens, suggesting that current optical compression techniques may paradoxically aggravate the long-context bottleneck. This study empirically defines DeepSeek-OCR's capability boundaries and offers essential insights for future optimizations of the vision-text compression paradigm. We release all data, results and scripts used in this study at https://github.com/dududuck00/DeepSeekOCR.
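The abstract does not spell out how the sentence-level and word-level corruptions are built, so the following is only a rough sketch of one plausible reading, not the authors' procedure. It assumes that sentence-level corruption shuffles word order and that word-level corruption shuffles the letters inside each word; the function names `corrupt_sentence` and `corrupt_words` are hypothetical. In both cases the rendered characters stay the same while the language prior stops being useful, which is the kind of separation between visual recognition and linguistic guessing the study describes.

```python
import random


def corrupt_sentence(text: str, rng: random.Random) -> str:
    """Assumed sentence-level corruption: shuffle the order of the words."""
    words = text.split()
    rng.shuffle(words)
    return " ".join(words)


def corrupt_words(text: str, rng: random.Random) -> str:
    """Assumed word-level corruption: shuffle the letters inside each word."""
    scrambled = []
    for word in text.split():
        chars = list(word)
        rng.shuffle(chars)
        scrambled.append("".join(chars))
    return " ".join(scrambled)


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so the corrupted text is reproducible
    sample = "optical compression maps long text onto a few visual tokens"
    print(corrupt_sentence(sample, rng))
    print(corrupt_words(sample, rng))
```

Rendering the original and corrupted strings to images and comparing OCR accuracy on each would then give a crude measure of how much a model leans on language priors. The actual corruption scheme and metrics are in the repository linked above.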
