Source author record

Jiahao Fan

Jiahao Fan appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher · Unclaimed source record

Computer Vision; astro-ph.EP; astro-ph.SR; Computation and Language; cond-mat.mtrl-sci; Multimedia; physics.optics

Catalog footprint

What is connected

5 works

7 topics

4 close collaborators

preprint · 2026 · arXiv

An Efficient Additive Kolmogorov-Arnold Transformer for Point-Level Maize Localization in Unmanned Aerial Vehicle Imagery

High-resolution UAV photogrammetry has become a key technology for precision agriculture, enabling centimeter-level crop monitoring and point-level plant localization. However, point-level maize localization in UAV imagery remains challenging due to (1) extremely small object-to-pixel ratios, typically less than 0.1%, (2) prohibitive computational costs of quadratic attention on ultra-high-resolution images larger than 3000 × 4000 pixels, and (3) agricultural scene-specific complexities such as sparse object distribution and environmental variability that are poorly handled by general-purpose vision models. To address these challenges, we propose the Additive Kolmogorov-Arnold Transformer (AKT), which replaces conventional multilayer perceptrons with Padé Kolmogorov-Arnold Network (PKAN) modules to enhance functional expressivity for small-object feature extraction, and introduces PKAN Additive Attention (PAA) to model multiscale spatial dependencies with reduced computational complexity. In addition, we present the Point-based Maize Localization (PML) dataset, consisting of 1,928 high-resolution UAV images with approximately 501,000 point annotations collected under real field conditions. Extensive experiments show that AKT achieves an average F1-score of 62.8%, outperforming state-of-the-art methods by 4.2%, while reducing FLOPs by 12.6% and improving inference throughput by 20.7%. For downstream tasks, AKT attains a mean absolute error of 7.1 in stand counting and a root mean square error of 1.95-1.97 cm in interplant spacing estimation. These results demonstrate that integrating Kolmogorov-Arnold representation theory with efficient attention mechanisms offers an effective framework for high-resolution agricultural remote sensing.
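The abstract does not say how the PKAN modules parameterize their learnable functions. As a rough illustration only, the sketch below evaluates one scalar rational (Padé-style) function in a commonly used "safe" form, P(x) / (1 + |Q(x)|), whose denominator can never reach zero. The function name `pade_activation`, the coefficient layout, and the choice of the safe form are all assumptions made for this example; they are not taken from the paper.

```python
def pade_activation(x, num_coeffs, den_coeffs):
    """Evaluate a rational function P(x) / (1 + |Q(x)|).

    Hypothetical helper for illustration, not the paper's PKAN module.
    P(x) = sum_i a_i * x**i        for i = 0..m  (num_coeffs = [a_0, ..., a_m])
    Q(x) = sum_j b_j * x**j        for j = 1..n  (den_coeffs = [b_1, ..., b_n])
    The absolute value keeps the denominator >= 1, so there are no poles.
    """
    numerator = sum(a * x ** i for i, a in enumerate(num_coeffs))
    denominator = 1.0 + abs(sum(b * x ** (j + 1) for j, b in enumerate(den_coeffs)))
    return numerator / denominator


# With P(x) = x and Q(x) = 0.5 * x, the input 2.0 maps to 2 / (1 + 1).
value = pade_activation(2.0, [0.0, 1.0], [0.5])
```

In a network, the coefficients would be trainable parameters; with an empty `den_coeffs` the function reduces to an ordinary polynomial.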

preprint · 2022 · arXiv

Temporal coherence of optical fields in the presence of entanglement

In classical coherence theory, coherence time is typically related to the bandwidth of the optical field. Narrowing the bandwidth lengthens the coherence time, which erases the temporal distinguishability of photons caused by time delay in pulsed photon interference. However, this changes in an SU(1,1)-type quantum interferometer, where quantum entanglement is involved. In this paper, we investigate how the temporal coherence of the fields in a pulse-pumped SU(1,1) interferometer changes with the bandwidth of optical filtering. We find that, because of the quantum entanglement, the coherence of the fields does not improve when optical filtering is applied, contrary to classical coherence theory, and that quantum entanglement plays a crucial role in quantum interference in addition to distinguishability.

preprint · 2022 · arXiv

TESS discovery of a sub-Neptune orbiting a mid-M dwarf TOI-2136

We present the discovery of TOI-2136b, a sub-Neptune planet that transits a nearby M4.5V-type star every 7.85 days, identified through photometric measurements from the TESS mission. The host star is located $33$ pc away with a radius of $R_{\ast} = 0.34\pm0.02\ R_{\odot}$, a mass of $0.34\pm0.02\ M_{\odot}$ and an effective temperature of $\rm 3342\pm100\ K$. We estimate its stellar rotation period to be $75\pm5$ days based on archival long-term photometry. We confirm and characterize the planet based on a series of ground-based multi-wavelength photometry, high-angular-resolution imaging observations, and precise radial velocities from CFHT/SPIRou. Our joint analysis reveals that the planet has a radius of $2.19\pm0.17\ R_{\oplus}$ and a mass of $6.4\pm2.4\ M_{\oplus}$. The mass and radius of TOI-2136b are consistent with a broad range of compositions, from water-ice to gas-dominated worlds. TOI-2136b falls close to the radius valley for low-mass stars predicted by thermally driven atmospheric mass loss models, making it an interesting target for future studies of its interior structure and atmospheric properties.
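As a back-of-the-envelope illustration of why the quoted mass and radius leave the composition open, the sketch below converts them to a mean bulk density. It is not the paper's analysis: the Earth mean density of 5.514 g/cm^3 is an assumed reference constant, the function names are hypothetical, and the uncertainty uses simple first-order propagation that treats the mass and radius errors as independent.

```python
import math

# Assumed reference value for Earth's mean density (not from the abstract).
EARTH_MEAN_DENSITY_G_CM3 = 5.514


def bulk_density(mass_earth, radius_earth):
    """Mean density in g/cm^3 for a mass and radius given in Earth units."""
    return EARTH_MEAN_DENSITY_G_CM3 * mass_earth / radius_earth ** 3


def bulk_density_sigma(mass_earth, mass_err, radius_earth, radius_err):
    """First-order uncertainty, assuming independent mass and radius errors."""
    relative = math.sqrt(
        (mass_err / mass_earth) ** 2 + (3.0 * radius_err / radius_earth) ** 2
    )
    return bulk_density(mass_earth, radius_earth) * relative


# Planet mass and radius as quoted in the abstract.
rho = bulk_density(6.4, 2.19)
sigma = bulk_density_sigma(6.4, 2.4, 2.19, 0.17)
print(f"bulk density ~ {rho:.2f} +/- {sigma:.2f} g/cm^3")
```

Under these assumptions the relative uncertainty on the density is above 40%, which fits the abstract's statement that a broad range of compositions remains possible.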

preprint · 2022 · arXiv

Uni-EDEN: Universal Encoder-Decoder Network by Multi-Granular Vision-Language Pre-training

Vision-language pre-training has been an emerging and fast-developing research topic, which transfers multi-modal knowledge from rich-resource pre-training tasks to limited-resource downstream tasks. Unlike existing works that predominantly learn a single generic encoder, we present a pre-trainable Universal Encoder-DEcoder Network (Uni-EDEN) to facilitate both vision-language perception (e.g., visual question answering) and generation (e.g., image captioning). Uni-EDEN is a two-stream Transformer-based structure, consisting of three modules: object and sentence encoders that separately learn the representations of each modality, and a sentence decoder that enables both multi-modal reasoning and sentence generation via inter-modal interaction. Considering that the linguistic representations of each image can span different granularities in this hierarchy including, from simple to comprehensive, an individual label, a phrase, and a natural sentence, we pre-train Uni-EDEN through multi-granular vision-language proxy tasks: Masked Object Classification (MOC), Masked Region Phrase Generation (MRPG), Image-Sentence Matching (ISM), and Masked Sentence Generation (MSG). In this way, Uni-EDEN is endowed with the power of both multi-modal representation extraction and language modeling. Extensive experiments demonstrate the compelling generalizability of Uni-EDEN by fine-tuning it to four vision-language perception and generation downstream tasks.

preprint · 2021 · arXiv

Liganded Xene as a Prototype of Two-Dimensional Stiefel-Whitney Insulators

Two-dimensional (2D) Stiefel-Whitney insulators (SWIs), which are characterized by the second Stiefel-Whitney class, form a new class of topological phases with zero Berry curvature. As a novel topological state, the SWI has been well studied in theory but seldom realized in realistic materials. Here we propose that a large class of liganded Xenes, i.e., hydrogenated and halogenated 2D group-IV honeycomb lattices, are 2D SWIs. The nontrivial topology of liganded Xenes is identified by the bulk topological invariant and the existence of protected corner states. Moreover, the large and tunable band gap (up to 3.5 eV) of liganded Xenes will facilitate the experimental characterization of the 2D SWI phase. Our findings not only provide abundant realistic material candidates that are experimentally feasible, but also draw more fundamental research interest towards the topological physics associated with the Stiefel-Whitney class in the absence of Berry curvature.
