Source author record

Wei-Yu Chen

Wei-Yu Chen appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

ResearcherUnclaimed source record

Computer Vision Distributed, Parallel, and Cluster Computing physics.optics eess.IV Machine Learning Programming Languages

Catalog footprint

What is connected

5works

6topics

4close collaborators

Actions

Connect this record

Open graph Browse works

Inspect adjacent papers, topics, institutions and collaborators without losing the researcher page.

Building this map preview

BZPEER is loading the nearby papers, people, topics and institutions for this page.

preprint2026arXiv

Heterogeneous Model Fusion for Privacy-Aware Multi-Camera Surveillance via Synthetic Domain Adaptation

We propose HeroCrystal, a novel privacy-preserving framework for multi-camera domain-adaptive object detection, addressing challenges such as data privacy, class imbalance, and heterogeneous architectures. Our framework consists of three key stages. In the Generated Stage, we introduce a one-shot, target-aware diffusion-based generation module that learns visual style from a single target-domain image while leveraging prompt-based control to synthesize specific object instances. Unlike conventional style transfer-based methods that require large target datasets and ignore semantic-level discrepancies, our approach enables privacy-preserving augmentation to reduce ethical concerns, and introduces controllable rare object generation to mitigate long-tailed category degradation. In the Federated Stage, we employ probabilistic Faster R-CNN on the client side to improve localization accuracy, and a dynamic model contrastive strategy to suppress domain-specific bias. The server side performs model fusion across heterogeneous architectures without accessing raw data. Finally, in the Distilled Stage, we propose an inconsistent categories integration algorithm to resolve label inconsistency and architecture heterogeneity across clients. Extensive experiments on multiple cross-domain detection benchmarks demonstrate that our method outperforms existing multi-source domain adaptation and federated learning baselines under multi-class, privacy-preserving settings. Our method improves mAP by +2.1% over prior privacy-preserving approaches and achieves a new state-of-the-art mAP of 33.4%, highlighting the effectiveness of HeroCrystal in enabling practical multi-camera AI surveillance systems. The source code is publicly available at https://github.com/ccuvislab/HeroCrystal.

preprint2022arXiv

Enhancing Speckle Statistics for Imaging inside Scattering Media

We exploit memory effect speckle correlations for the imaging of incoherent linear (single-photon) fluorescent sources behind scattering tissue. While memory effect-based imaging techniques have been heavily studied in the past, for thick scattering layers and complex illumination patterns these correlations are weak, limiting the practice applicability of the idea. In this work, we introduce a Spatial Light Modulator (SLM) between the tissue sample and the imaging sensor and capture multiple modulations of the speckle pattern. We show that by correctly designing the modulation pattern and the reconstruction algorithm we can greatly enhance statistical correlations in the data. We exploit this to demonstrate the reconstruction of mega-pixel wide fluorescent patterns behind scattering tissue.

preprint2021arXiv

C-for-Metal: High Performance SIMD Programming on Intel GPUs

The SIMT execution model is commonly used for general GPU development. CUDA and OpenCL developers write scalar code that is implicitly parallelized by compiler and hardware. On Intel GPUs, however, this abstraction has profound performance implications as the underlying ISA is SIMD and important hardware capabilities cannot be fully utilized. To close this performance gap we introduce C-For-Metal (CM), an explicit SIMD programming framework designed to deliver close-to-the-metal performance on Intel GPUs. The CM programming language and its vector/matrix types provide an intuitive interface to exploit the underlying hardware features, allowing fine-grained register management, SIMD size control and cross-lane data sharing. Experimental results show that CM applications from different domains outperform the best-known SIMT-based OpenCL implementations, achieving up to 2.7x speedup on the latest Intel GPU.

preprint2021arXiv

In-plane subwavelength near field optical capsule for lab-on-a-chip optical nano-tweezer

In this letter, we propose a new proof-of-concept of optical nano-tweezer on the basis of a pair of dielectric rectangular rods capable of generating a novel class of controlled finite-volume near field light capsules. The finite-difference time-domain simulations of light spatial structure and optical trapping forces of the gold nanoparticle immersed in water demonstrate the physical concept of an in-plane subwavelength optical capsule, integrated with the microfluidic mesoscale device. It is shown that refractive index and distance between dielectric rectangular rods can control the shape and axial position of the optical capsule. Such an in-plane wavelength-scaled structure provides a new path for manipulating absorbing nano-particles including bio-particles in a compact planar architecture and should thus open promising perspectives in lab-on-a-chip domains.

preprint2020arXiv

A Closer Look at Few-shot Classification

Few-shot classification aims to learn a classifier to recognize unseen classes during training with limited labeled examples. While significant progress has been made, the growing complexity of network designs, meta-learning algorithms, and differences in implementation details make a fair comparison difficult. In this paper, we present 1) a consistent comparative analysis of several representative few-shot classification algorithms, with results showing that deeper backbones significantly reduce the performance differences among methods on datasets with limited domain differences, 2) a modified baseline method that surprisingly achieves competitive performance when compared with the state-of-the-art on both the \miniI and the CUB datasets, and 3) a new experimental setting for evaluating the cross-domain generalization ability for few-shot classification algorithms. Our results reveal that reducing intra-class variation is an important factor when the feature backbone is shallow, but not as critical when using deeper backbones. In a realistic cross-domain evaluation setting, we show that a baseline method with a standard fine-tuning practice compares favorably against other state-of-the-art few-shot learning algorithms.