Source author record

Kai Yan

Kai Yan appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher | Unclaimed source record

Computer Vision, Computation and Language, Graphics, math.AP

Catalog footprint

What is connected

3 works

4 topics

4 close collaborators

Preprint, 2026, arXiv

Real-Time Neural Hair Denoising

We propose a lightweight real-time method for reconstructing strand-based hair G-Buffers from severely undersampled rasterized inputs. Our pipeline first applies neural spatial reconstruction and temporal accumulation to recover hair coverage, i.e., fractional hair visibility within a pixel, and tangent. It then uses a tangent-guided reconstruction step to complete the position, which is subsequently used for physically based deferred hair shading. We evaluate our method across a diverse set of hairstyles, including straight, wavy, afro, and ponytail styles, under both static and dynamic scenarios. Our method achieves higher hair reconstruction quality than existing hair-specific denoising techniques and general industrial neural reconstruction solutions such as DLSS and FSR.
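Two terms in the abstract above, coverage and temporal accumulation, can be illustrated with a small sketch. This is not the paper's neural method: it swaps in a plain exponential moving average for the accumulation step, and the names `coverage`, `accumulate`, `accumulate_frames` and the blend weight `alpha` are hypothetical, chosen only for illustration.

```python
def coverage(sample_hits):
    """Fractional hair visibility in one pixel: the share of sub-pixel
    samples (1 = hair, 0 = no hair) that landed on a strand."""
    return sum(sample_hits) / len(sample_hits)


def accumulate(history, current, alpha=0.1):
    """Blend the previous estimate with the current noisy frame.
    This is a simple exponential moving average (an assumption,
    standing in for the learned accumulation in the paper)."""
    return (1.0 - alpha) * history + alpha * current


def accumulate_frames(frames, alpha=0.1):
    """Run the moving average over per-frame coverage estimates,
    starting from the first frame."""
    estimate = frames[0]
    for value in frames[1:]:
        estimate = accumulate(estimate, value, alpha)
    return estimate


if __name__ == "__main__":
    # One sample per pixel per frame: each frame sees either 0 or 1,
    # although the underlying coverage of this pixel is 0.25.
    undersampled = [1.0, 0.0, 0.0, 0.0] * 50
    print(accumulate_frames(undersampled, alpha=0.05))
```

With a single sample per pixel, each frame's coverage is either 0 or 1, so averaging over time is what lets a fractional value emerge. A smaller `alpha` gives a smoother estimate at the cost of slower response to motion.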

Preprint, 2026, arXiv

Transport equation theory in the Triebel-Lizorkin spaces and its applications to the ideal fluid flows

In this paper, we develop a general theory for the transport equation within the framework of Triebel-Lizorkin spaces. We first derive commutator estimates in these spaces, dispensing with the conventional divergence-free condition, via the Bony paraproduct decomposition and vector-valued maximal function inequalities. Building on these estimates and combining the method of characteristics with a compactness argument, we then obtain new a priori estimates and prove local well-posedness for the transport equation in Triebel-Lizorkin spaces. The resulting theory is applicable to a wide range of evolution equations, including models for incompressible and compressible ideal fluid flows and shallow water waves, among others. As an illustration, we consider the incompressible ideal magnetohydrodynamics (MHD) system. Employing the general transport theory developed here yields a complete local well-posedness result in the sense of Hadamard, covering both sub-critical and critical regularity regimes, and provides corresponding blow-up criteria for the ideal MHD equations in Triebel-Lizorkin spaces. Our results refine and substantially extend earlier work in this direction.
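For orientation, the objects named in the abstract above can be written in their standard textbook form. The record itself gives no formulas, so the symbols here ($f$, $v$, $g$, $u$, $b$, $P$) are assumed notation and may differ from the paper's.

```latex
% Linear transport equation with velocity field v and source g
\begin{aligned}
  \partial_t f + v \cdot \nabla f &= g, \\
  f|_{t=0} &= f_0 .
\end{aligned}

% Incompressible ideal MHD: velocity u, magnetic field b,
% total pressure P = p + |b|^2 / 2
\begin{aligned}
  \partial_t u + u \cdot \nabla u + \nabla P &= b \cdot \nabla b, \\
  \partial_t b + u \cdot \nabla b &= b \cdot \nabla u, \\
  \nabla \cdot u = \nabla \cdot b &= 0 .
\end{aligned}
```

Both MHD equations have the transport form of the first display with $v = u$, which is presumably why a general transport theory applies to the system.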

Preprint, 2026, arXiv

Visual Merit or Linguistic Crutch? A Close Look at DeepSeek-OCR

DeepSeek-OCR utilizes an optical 2D mapping approach to achieve high-ratio vision-text compression, claiming to decode text tokens exceeding ten times the input visual tokens. While this suggests a promising solution for the LLM long-context bottleneck, we investigate a critical question: "Visual merit or linguistic crutch - which drives DeepSeek-OCR's performance?" By employing sentence-level and word-level semantic corruption, we isolate the model's intrinsic OCR capabilities from its language priors. Results demonstrate that without linguistic support, DeepSeek-OCR's performance plummets from approximately 90% to 20%. Comparative benchmarking against 13 baseline models reveals that traditional pipeline OCR methods exhibit significantly higher robustness to such semantic perturbations than end-to-end methods. Furthermore, we find that lower visual token counts correlate with increased reliance on priors, exacerbating hallucination risks. Context stress testing also reveals a total model collapse around 10,000 text tokens, suggesting that current optical compression techniques may paradoxically aggravate the long-context bottleneck. This study empirically defines DeepSeek-OCR's capability boundaries and offers essential insights for future optimizations of the vision-text compression paradigm. We release all data, results and scripts used in this study at https://github.com/dududuck00/DeepSeekOCR.
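The corruption idea in the abstract above can be sketched as follows. The record does not say how the authors build their corrupted text, so this is one plausible reading: shuffling keeps every word (or sentence) visually intact while removing the context a language prior could lean on. The function names and the `difflib`-based similarity score are assumptions for illustration, not the study's scripts or metric.

```python
import difflib
import random
import re


def corrupt_words(text, seed=0):
    """Word-level corruption: shuffle all words so each one is still
    readable on the page but the sentence no longer makes sense."""
    rng = random.Random(seed)
    words = text.split()
    rng.shuffle(words)
    return " ".join(words)


def corrupt_sentences(text, seed=0):
    """Sentence-level corruption: keep each sentence intact but
    shuffle the order of sentences in the passage."""
    rng = random.Random(seed)
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]
    rng.shuffle(sentences)
    return " ".join(sentences)


def similarity(reference, hypothesis):
    """Rough character-level similarity in [0, 1] between the text
    that was rendered and the text an OCR model returned."""
    return difflib.SequenceMatcher(None, reference, hypothesis).ratio()


if __name__ == "__main__":
    passage = "The cat sat on the mat. It was warm. Nobody moved it."
    print(corrupt_words(passage))
    print(corrupt_sentences(passage))
```

In an evaluation of this kind, one would render both the original and the corrupted text to images, run the OCR model on each, and compare `similarity` scores. A large drop on the corrupted version would suggest the model relies on language priors rather than on what it sees.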
