Source author record

Kai Zou

Kai Zou appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher, unclaimed source record

Computer Vision; physics.optics; quant-ph; Distributed, Parallel, and Cluster Computing; Machine Learning; physics.app-ph; physics.ins-det

Catalog footprint

What is connected

5 works

7 topics

4 close collaborators

Preprint, 2026, arXiv

Advancing Aesthetic Image Generation via Composition Transfer

Composition is a cornerstone of visual aesthetics, influencing the appeal of an image. While its principles operate independently of specific content, in practice, composition is often coupled with semantics. As a result, existing methods often enhance composition either through implicit learning or by semantics-based layout control, rather than explicitly modeling composition itself. To address this gap, we introduce Composer, a framework rooted in aesthetic theory, designed to model composition in a semantic-agnostic manner. First, it supports composition transfer by extracting key composition-aware representations from a reference image and leveraging a tailored conditional guidance module to control composition based on pre-trained diffusion models. Second, when users specify only text themes without a composition reference, Composer supports theme-driven composition retrieval by leveraging the in-context learning capabilities of Large Vision-Language Models (LVLMs), achieving explicit composition planning. To enhance composition in a reference-free mode, we conduct text-to-composition fine-tuning on the trained control module to enable implicit composition planning. Furthermore, we curated a high-quality dataset comprising 2 million image-text pairs using state-of-the-art generative models to support model training. Experimental results demonstrate that Composer significantly enhances aesthetic quality in text-to-image tasks and facilitates personalized composition control and transfer, offering users precision and flexibility in the creative process.

Preprint, 2022, arXiv

Fractal superconducting nanowires detect infrared single photons with 84% system detection efficiency, 1.02 polarization sensitivity, and 20.8 ps timing resolution

The near-unity system detection efficiency (SDE) and excellent timing resolution of superconducting nanowire single-photon detectors (SNSPDs), combined with their other merits, have enabled many classical and quantum photonic applications. However, the prevalent design based on meandering nanowires makes SDE dependent on the polarization states of the incident photons; for unpolarized light, the major merit of high SDE is compromised, which could be detrimental for photon-starved applications. Here, we create SNSPDs with an arced fractal geometry that almost completely eliminates this polarization dependence of the SDE, and we experimentally demonstrate $84 \pm 3\%$ SDE, $1.02^{+0.06}_{-0.02}$ polarization sensitivity at the wavelength of 1575 nm, and 20.8 ps timing jitter in a 0.1-W closed-cycle Gifford-McMahon cryocooler, at the base temperature of 2.0 K. This demonstration provides a novel, practical device structure of SNSPDs, allowing for operation in the visible, near-, and mid-infrared spectral ranges, and paves the way for polarization-insensitive single-photon detection with high SDE and high timing resolution.
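To make the polarization sensitivity figure concrete, here is a minimal sketch. It assumes the convention common in the SNSPD literature that polarization sensitivity is the ratio of the highest to the lowest SDE measured across input polarization states. The abstract does not spell out this definition, and the function name and sample readings are hypothetical.

```python
def polarization_sensitivity(sde_by_polarization):
    """Return max(SDE) / min(SDE) over the sampled polarization states.

    Assumed definition: a value of 1.0 means the detector responds
    identically to every sampled polarization state.
    """
    values = list(sde_by_polarization)
    if not values:
        raise ValueError("need at least one SDE measurement")
    lowest = min(values)
    if lowest <= 0:
        raise ValueError("SDE values must be positive")
    return max(values) / lowest


# Hypothetical readings, not data from the paper.
meander_like = [0.90, 0.60, 0.30]   # strongly polarization dependent
fractal_like = [0.84, 0.83, 0.84]   # nearly polarization independent

print(polarization_sensitivity(meander_like))
print(polarization_sensitivity(fractal_like))
```

Under this assumed definition, a value close to 1 corresponds to the nearly polarization-independent behaviour the abstract reports.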

Preprint, 2022, arXiv

GX-Plug: a Middleware for Plugging Accelerators to Distributed Graph Processing

Recently, research communities have highlighted the necessity of formulating a scalability continuum for large-scale graph processing, one that gains the scale-out benefits of distributed graph systems and the scale-up benefits of high-performance accelerators. To this end, we propose a middleware, called GX-Plug, that eases the integration of both. As a middleware, GX-Plug is versatile in supporting different runtime environments, computation models, and programming models. Moreover, to improve the middleware's performance, we study a series of techniques, including pipeline shuffle, synchronization caching and skipping, and workload balancing, for intra-, inter-, and beyond-iteration optimizations, respectively. Experiments show that our middleware efficiently plugs accelerators into representative distributed graph systems, e.g., GraphX and PowerGraph, with an acceleration ratio of up to 20x.

Preprint, 2022, arXiv

Unsupervised Temporal Video Grounding with Deep Semantic Clustering

Temporal video grounding (TVG) aims to localize a target segment in a video according to a given sentence query. Although prior work has made solid progress on this task, it relies heavily on abundant video-query paired data, which is expensive and time-consuming to collect in real-world scenarios. In this paper, we explore whether a video grounding model can be learned without any paired annotations. To the best of our knowledge, this paper is the first to address TVG in an unsupervised setting. Since there is no paired supervision, we propose a novel Deep Semantic Clustering Network (DSCNet) that leverages all semantic information from the whole query set to compose the possible activity in each video for grounding. Specifically, we first develop a language semantic mining module, which extracts implicit semantic features from the whole query set. Then, these language semantic features serve as guidance to compose the activity in the video via a video-based semantic aggregation module. Finally, we use a foreground attention branch to filter out redundant background activities and refine the grounding results. To validate the effectiveness of DSCNet, we conduct experiments on both the ActivityNet Captions and Charades-STA datasets. The results demonstrate that DSCNet achieves competitive performance and even outperforms most weakly-supervised approaches.

Preprint, 2020, arXiv

A platform for high performance photon correlation measurements

A broad range of scientific and industrial disciplines require precise optical measurements at very low light levels. Single-photon detectors combining high efficiency and high time resolution are pivotal in such experiments. By using relatively thick films of NbTiN (8-11 nm) and improving the pattern fidelity of the nanostructure of the superconducting nanowire single-photon detectors (SNSPDs), we fabricated devices demonstrating superior performance over all previously reported detectors in the combination of efficiency and time resolution. Our findings show that small variations in the nanowire width, on the order of a few nanometers, can lead to a significant penalty on their temporal response. Addressing these issues, we consistently achieved high time resolution (best device 7.7 ps, other devices $\sim$10-16 ps) simultaneously with high system detection efficiencies (80-90%) in the wavelength range of 780-1000 nm, as well as in the telecom bands (1310-1550 nm). The use of thicker films allowed us to fabricate large-area multi-pixel devices with homogeneous pixel performance. We first fabricated and characterized a $100 \times 100\ \mu\mathrm{m}^2$ 16-pixel detector and showed there was little variation among individual pixels. Additionally, to showcase the power of our platform, we fabricated and characterized 4-pixel multimode fiber-coupled detectors and carried out photon correlation experiments on a nanowire quantum dot, resulting in $g^{(2)}(0)$ values lower than 0.04. The multi-pixel detectors alleviate the need for beamsplitters and can be used for higher-order correlations, with promising prospects not only in the field of quantum optics but also in bio-imaging applications, such as fluorescence microscopy and positron emission tomography.
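As a rough illustration of how a second-order correlation value at zero delay can be read off a coincidence histogram, the sketch below uses one common estimator: the coincidence count in the zero-delay peak divided by the mean count of the neighbouring side peaks. It assumes pulsed excitation and already-integrated peak areas, neither of which the abstract states. The function name and the numbers are hypothetical.

```python
def g2_zero(center_peak_counts, side_peak_counts):
    """Estimate the zero-delay second-order correlation.

    Assumed estimator (pulsed excitation): zero-delay peak area divided
    by the mean area of the uncorrelated side peaks.
    """
    side = list(side_peak_counts)
    if not side:
        raise ValueError("need at least one side peak")
    mean_side = sum(side) / len(side)
    if mean_side <= 0:
        raise ValueError("side peaks must contain counts")
    return center_peak_counts / mean_side


# Hypothetical integrated peak areas, not data from the paper.
estimate = g2_zero(center_peak_counts=3, side_peak_counts=[98, 102, 101, 99])
print(estimate)
```

A value well below 0.5 is the usual indicator of a single-photon emitter, so the sub-0.04 values quoted in the abstract point to very low multi-photon emission.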
