Source author record

Weimin Wang

Weimin Wang appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher · Unclaimed source record

Computer Vision, Information Theory, math.IT, physics.optics, physics.plasm-ph, cond-mat.soft, eess.IV, physics.app-ph, physics.class-ph, Robotics, Systems and Control

Catalog footprint

What is connected

17 works

11 topics

4 close collaborators

preprint · 2026 · arXiv

Learning with Semantic Priors: Stabilizing Point-Supervised Infrared Small Target Detection via Hierarchical Knowledge Distillation

Single-frame Infrared Small Target Detection (ISTD) aims to localize weak targets under heavy background clutter, yet dense pixel-wise annotations are expensive. Point supervision with online label evolution reduces annotation cost; however, lightweight CNN detectors often lack sufficient semantics, leading to noisy pseudo-masks and unstable optimization. To address this, we propose a hierarchical VFM-driven knowledge distillation framework that uses a frozen Vision Foundation Model (VFM) during training. We formulate point-supervised learning as a bilevel optimization process: the inner loop adapts a VFM-embedded teacher on reweighted training samples, while the outer loop transfers validation-guided knowledge to a lightweight student to mitigate pseudo-label noise and training-set bias. We further introduce Semantic-Conditioned Affine Modulation (SCAM) to inject VFM semantics into CNN features at multiple layers. In addition, a dynamic collaborative learning strategy with cluster-level sample reweighting enhances robustness to imperfect pseudo-masks. Experiments on diverse challenging cases across multiple ISTD backbones demonstrate consistent improvements in detection accuracy and training stability. Our code is available at https://github.com/yuanhang-yao/semantic-prior.

preprint · 2023 · arXiv

Symmetry breaking and mechanical filter make a pseudo-gimbal-less two-dimensional MEMS scanning mirror with multiple scanning modes

Miniaturized two-dimensional scanning mirrors based on microelectromechanical systems (MEMS) technology have great potential in the automotive industry, consumer electronics, biomedicine, and other fields. Because it offers high frequency and large scan angles, resonant scanning is the mainstream among MEMS actuation mechanisms, for example the harmonic resonant electromagnetic scanner and the parametric resonant electrostatic scanner. Although the electrostatic scanner has the advantages of low power consumption and IC process compatibility, some shortcomings of parametric resonance, including the doubled frequency of the driver electronics and the need for additional feedback control or a frequency stabilization system, limit its further application. In this paper, the symmetry of the coplanar electrostatic comb actuator is broken, and a harmonic resonant electrostatic scanner with excellent performance is realized. Furthermore, by adopting a mechanical filter, two-dimensional scanning can be achieved with one set of actuators, which avoids the problem that two sets of electrostatic actuators (one per dimension) must be insulated from each other through complicated and expensive processes. A two-dimensional MEMS scanner based on symmetry breaking and a mechanical filter is proposed and demonstrated. Multiple scanning modes can be achieved through selective control of a set of four identical actuators.

preprint · 2022 · arXiv

Enhancing Local Feature Learning for 3D Point Cloud Processing using Unary-Pairwise Attention

We present a simple but effective attention named the unary-pairwise attention (UPA) for modeling the relationship between 3D point clouds. Our idea is motivated by the analysis that the standard self-attention (SA) that operates globally tends to produce almost the same attention maps for different query positions, revealing difficulties for learning query-independent and query-dependent information jointly. Therefore, we reformulate the SA and propose query-independent (Unary) and query-dependent (Pairwise) components to facilitate the learning of both terms. In contrast to the SA, the UPA ensures query dependence via operating locally. Extensive experiments show that the UPA outperforms the SA consistently on various point cloud understanding tasks including shape classification, part segmentation, and scene segmentation. Moreover, simply equipping the popular PointNet++ method with the UPA even outperforms or is on par with the state-of-the-art attention-based approaches. In addition, the UPA systematically boosts the performance of both standard and modern networks when it is integrated into them as a compositional module.

preprint · 2022 · arXiv

Enhancing Local Feature Learning Using Diffusion for 3D Point Cloud Understanding

Learning from point clouds is challenging due to the lack of connectivity information, i.e., edges. Although existing edge-aware methods can improve performance by modeling edges, how edges contribute to the improvement is unclear. In this study, we propose a method that automatically learns to enhance or suppress edges while keeping its working mechanism clear. First, we derive theoretically how edge enhancement and suppression work. Second, we experimentally verify the edge enhancement and suppression behavior. Third, we empirically show that this behavior improves performance. In general, we observe that the proposed method achieves competitive performance in point cloud classification and segmentation tasks.

preprint · 2022 · arXiv

Enhancing Local Geometry Learning for 3D Point Cloud via Decoupling Convolution

Modeling the local surface geometry is challenging in 3D point cloud understanding due to the lack of connectivity information. Most prior works model local geometry using various convolution operations. We observe that the convolution can be equivalently decomposed as a weighted combination of a local and a global component. With this observation, we explicitly decouple these two components so that the local one can be enhanced and facilitate the learning of local surface geometry. Specifically, we propose the Laplacian Unit (LU), a simple yet effective architectural unit that can enhance the learning of local geometry. Extensive experiments demonstrate that networks equipped with LUs achieve competitive or superior performance on typical point cloud understanding tasks. Moreover, by establishing a connection to the mean curvature flow, we further investigate LU in terms of curvature to interpret its adaptive smoothing and sharpening effect. The code will be available.

preprint · 2022 · arXiv

Surgical Skill Assessment via Video Semantic Aggregation

Automated video-based assessment of surgical skills is a promising task for assisting young surgical trainees, especially in resource-poor areas. Existing works often resort to a joint CNN-LSTM framework in which LSTMs model long-term relationships over spatially pooled short-term CNN features. However, this practice inevitably neglects the differences among semantic concepts such as tools, tissues, and background in the spatial dimension, impeding the subsequent temporal relationship modeling. In this paper, we propose a novel skill assessment framework, Video Semantic Aggregation (ViSA), which discovers different semantic parts and aggregates them across spatiotemporal dimensions. The explicit discovery of semantic parts provides an explanatory visualization that helps in understanding the neural network's decisions. It also enables us to further incorporate auxiliary information such as kinematic data to improve representation learning and performance. Experiments on two datasets show the competitiveness of ViSA compared to state-of-the-art methods. Source code is available at: bit.ly/MICCAI2022ViSA.

preprint · 2022 · arXiv

Weakly Supervised Silhouette-based Semantic Scene Change Detection

This paper presents a novel semantic scene change detection scheme that requires only weak supervision. A straightforward approach for this task is to train a semantic change detection network directly on a large-scale dataset in an end-to-end manner. However, this makes a task-specific dataset indispensable, and building one is usually labor-intensive and time-consuming. To avoid this problem, we propose to train this kind of network from existing datasets by dividing the task into change detection and semantic extraction. In addition, differences in camera viewpoint, for example between images of the same scene captured from a vehicle-mounted camera at different times, usually pose a challenge to the change detection task. To address this challenge, we propose a new siamese network structure that introduces a correlation layer. We also collect and annotate a publicly available dataset for semantic change detection to evaluate the proposed method. The experimental results verify both the robustness of the proposed networks to viewpoint differences in the change detection task and their effectiveness for semantic change detection. Our code and dataset are available at https://kensakurada.github.io/pscd.

preprint · 2021 · arXiv

Weighted boxes fusion: Ensembling boxes from different object detection models

In this work, we present a novel method for combining predictions of object detection models: weighted boxes fusion. Our algorithm utilizes the confidence scores of all proposed bounding boxes to construct the averaged boxes. We tested the method on several datasets and evaluated it in the context of the Open Images and COCO Object Detection tracks, achieving top results in these challenges. The source code is publicly available at https://github.com/ZFTurbo/Weighted-Boxes-Fusion
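
The confidence-weighted averaging described in this abstract can be sketched roughly as follows. This is a minimal illustration of fusing one cluster of overlapping boxes, not the API of the linked repository: the function name `fuse_cluster`, the `(x1, y1, x2, y2)` box layout, and the choice of the mean confidence as the fused score are assumptions made here, and the step that groups boxes into clusters (for example by IoU) is left out.

```python
def fuse_cluster(boxes, scores):
    """Fuse one cluster of overlapping boxes by confidence-weighted averaging.

    boxes:  list of (x1, y1, x2, y2) tuples (assumed layout)
    scores: list of confidence scores, one per box
    Returns (fused_box, fused_score).
    """
    if not boxes or len(boxes) != len(scores):
        raise ValueError("boxes and scores must be non-empty and equal in length")
    total = sum(scores)
    if total <= 0:
        raise ValueError("scores must sum to a positive value")
    # Each coordinate is the average of that coordinate over the cluster,
    # weighted by the confidence of the box it comes from.
    fused_box = tuple(
        sum(s * b[k] for b, s in zip(boxes, scores)) / total
        for k in range(4)
    )
    # Assumption: the fused confidence is the mean of the member confidences.
    fused_score = total / len(scores)
    return fused_box, fused_score


if __name__ == "__main__":
    box, score = fuse_cluster([(0, 0, 10, 10), (2, 2, 12, 12)], [0.75, 0.25])
    print(box, score)
```

Unlike non-maximum suppression, which keeps a single box and discards the rest, this kind of averaging lets every box in the cluster contribute in proportion to its confidence.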

preprint · 2020 · arXiv

3D Object Detection Method Based on YOLO and K-Means for Image and Point Clouds

Lidar-based 3D object detection and classification are essential for autonomous driving (AD). A Lidar sensor can provide a 3D point cloud reconstruction of the surrounding environment. However, real-time detection in 3D point clouds still requires a strong algorithm. This paper proposes a 3D object detection method based on point clouds and images that consists of three parts: (1) Lidar-camera calibration and undistorted image transformation; (2) YOLO-based detection and point cloud extraction; and (3) k-means-based point cloud segmentation, with experimental evaluation on depth images. In our approach, the camera captures images for real-time 2D object detection with YOLO, and the resulting bounding boxes are passed to a node that performs 3D object detection on the Lidar point cloud. Checking whether the 2D coordinates projected from each 3D point fall inside an object's bounding box enables high-speed 3D object recognition on the GPU. Accuracy and precision improve after k-means clustering of the point cloud. Our detection method is also faster than PointNet.
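
The bounding-box check in this abstract can be illustrated with a short sketch. It assumes an ideal pinhole camera, an undistorted image, and points already transformed into the camera frame by the Lidar-camera calibration. The function names and the intrinsic parameters `fx`, `fy`, `cx`, `cy` are hypothetical and are not taken from the paper; the k-means step that follows is not shown.

```python
def project_point(point, fx, fy, cx, cy):
    """Project a 3D point (camera frame) to pixel coordinates, pinhole model.

    Returns None for points at or behind the image plane (z <= 0).
    """
    x, y, z = point
    if z <= 0:
        return None
    return (fx * x / z + cx, fy * y / z + cy)


def points_in_box(points, box, fx, fy, cx, cy):
    """Keep the 3D points whose projection falls inside a 2D detection box.

    box: (x1, y1, x2, y2) in pixels, e.g. from a 2D detector (assumed layout).
    """
    x1, y1, x2, y2 = box
    kept = []
    for p in points:
        uv = project_point(p, fx, fy, cx, cy)
        if uv is not None and x1 <= uv[0] <= x2 and y1 <= uv[1] <= y2:
            kept.append(p)
    return kept


if __name__ == "__main__":
    cloud = [(0, 0, 10), (1, 0, 10), (5, 0, 10), (0, 0, -1)]
    print(points_in_box(cloud, (40, 40, 70, 70), 100, 100, 50, 50))
```

The kept points form a frustum behind the 2D box, so they may still include background. That is presumably why the abstract adds a clustering step afterwards.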

preprint · 2020 · arXiv

SOIC: Semantic Online Initialization and Calibration for LiDAR and Camera

This paper presents a novel semantic-based online extrinsic calibration approach, SOIC (so, I see), for Light Detection and Ranging (LiDAR) and camera sensors. Previous online calibration methods usually need prior knowledge of rough initial values for optimization. The proposed approach removes this limitation by converting the initialization problem to a Perspective-n-Point (PnP) problem with the introduction of semantic centroids (SCs). The closed-form solution of this PnP problem has been well researched and can be found with existing PnP methods. Since the semantic centroid of the point cloud usually does not accurately match that of the corresponding image, the accuracy of the parameters is not improved even after a nonlinear refinement process. Thus, a cost function based on the constraint of the correspondence between semantic elements from both point cloud and image data is formulated. Subsequently, optimal extrinsic parameters are estimated by minimizing the cost function. We evaluate the proposed method with either ground-truth (GT) or predicted semantics on the KITTI dataset. Experimental results and comparisons with the baseline method verify the feasibility of the initialization strategy and the accuracy of the calibration approach. In addition, we release the source code at https://github.com/--/SOIC.
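
One way to read the semantic-centroid idea is sketched below. It assumes a semantic centroid is simply the mean position of all points (or pixels) that share a label; the function name `semantic_centroids` and this definition are illustrative assumptions, not the released SOIC code.

```python
def semantic_centroids(points, labels):
    """Return {label: centroid}, the mean position of the points with that label.

    points: list of coordinate tuples (3D LiDAR points or 2D pixel positions)
    labels: list of semantic labels, one per point
    """
    sums = {}
    counts = {}
    for p, lab in zip(points, labels):
        if lab not in sums:
            sums[lab] = [0.0] * len(p)
            counts[lab] = 0
        for k, v in enumerate(p):
            sums[lab][k] += v
        counts[lab] += 1
    return {lab: tuple(v / counts[lab] for v in s) for lab, s in sums.items()}


if __name__ == "__main__":
    cloud = [(0, 0, 0), (2, 2, 2), (1, 0, 0)]
    print(semantic_centroids(cloud, ["car", "car", "person"]))
```

Under this reading, running the same function on the labeled point cloud and on the labeled image yields one 3D centroid and one 2D centroid per shared class. Those pairs could then serve as the 3D-2D correspondences passed to an existing PnP solver for the initial extrinsic estimate.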

preprint · 2020 · arXiv

YOLO and K-Means Based 3D Object Detection Method on Image and Point Cloud

Lidar-based 3D object detection and classification are essential for automated driving (AD). A Lidar sensor can provide a 3D point cloud reconstruction of the surrounding environment, but detection in 3D point clouds remains a difficult algorithmic challenge. This paper consists of three parts: (1) Lidar-camera calibration; (2) YOLO-based detection and point cloud extraction; and (3) k-means-based point cloud segmentation. In our approach, the camera captures images for real-time 2D object detection with YOLO, and the bounding boxes are passed to a node that performs 3D object detection on the Lidar point cloud. Checking whether the 2D coordinates projected from each 3D point fall inside an object's bounding box, followed by k-means clustering, enables high-speed 3D object recognition on the GPU.

preprint · 2015 · arXiv

600-T Magnetic Fields due to Cold Electron Flow in a simple Cu-Coil irradiated by High Power Laser pulses

A new, simple mechanism that uses cold electron flow to produce a strong magnetic field is proposed. A 600-T magnetic field is generated in free space at a laser intensity of 5.7 × 10^15 W cm^-2. Theoretical analysis indicates that the magnetic field strength is proportional to the laser intensity. Such a strong magnetic field offers a new experimental test bed for studying laser-plasma physics, in particular fast-ignition laser fusion research and laboratory astrophysics.

preprint · 2015 · arXiv

Analysis of Information Theoretic Limitation for Linear Time Invariant Feedback Systems

Information-theoretic fundamental limitations in feedback control systems have been an important topic for decades. In this paper, a new Bode-like fundamental inequality for causal feedback control systems is developed. This inequality relates directed information, mutual information, and Bode sensitivity functions. It recovers previously known results for feedback systems as special cases.

preprint · 2015 · arXiv

Conformal Mapping for Multiple Terminals

Conformal mapping is an important mathematical tool in many physical and engineering fields, especially in electrostatics, fluid mechanics, classical mechanics, and transformation optics. However, in existing textbooks and literature, it is only adopted to solve problems that have just two terminals. Two terminals with an electric potential difference, pressure difference, optical path difference, etc., can be mapped conformally onto a solvable structure, e.g., a rectangle, where the two terminals are mapped onto two opposite edges of the rectangle. Here we show a conformal mapping method for multiple terminals, a situation that is more common in practical applications. Through accurate analysis of the boundary conditions, additional terminals or boundaries are folded into the interior of the mapped rectangle, so that the solution is not influenced. The method is described for several typical situations, and two application examples are detailed. The first example is an electrostatic actuator with three electrodes. A previous study dealt with this problem by approximately treating the three electrodes as two electrodes. With the proposed method, a more precise result is achieved in our paper. The second example is a light beam splitter designed by transformation optics, a field that has recently attracted growing interest around the world. The splitter has three ports, one for input and two for output. With the proposed method, a solution that is relatively simple and precise compared with previously reported results is obtained.

preprint · 2015 · arXiv

Dynamic control of defective gap mode through defect location

A 1D model is developed for the defective gap mode (DGM) with two types of boundary conditions: conducting mesh and conducting sleeve. For a periodically modulated system without defects, the normalized width of the spectral gaps equals the modulation factor, which is consistent with previous studies. For a periodic system with local defects introduced by the boundary conditions, the model shows that the conducting-mesh-induced DGM is always well confined by spectral gaps while the conducting-sleeve-induced DGM is not. The defect location can be a useful tool to dynamically control the frequency and spatial periodicity of the DGM inside spectral gaps. This controllability can be applied to optical microcavities and waveguides in photonic crystals and to the interaction between gap eigenmodes and energetic particles in fusion plasmas.

preprint · 2015 · arXiv

Information Rate Decomposition for Feedback Systems with Output Disturbance

This technical note considers the problem of resource allocation in a linear feedback control system with output disturbance. By decomposing the information rate in the feedback communication channel, the channel resource allocation is thoroughly analyzed. The results show that a certain amount of the resource is used to transmit the output disturbance and that this allocation is independent of the feedback controller design.

preprint · 2013 · arXiv

Generation of Diffraction-Free Optical Beams Using Wrinkled Membranes

We report the first demonstration of wrinkled membranes as optical focusing devices that are low-cost, lightweight, and flexible. Our device consists of concentric wrinkle rings on a gold-PDMS bilayer membrane, which converts collimated illumination into diffraction-free focused beams. Beam diameters of 300-400 μm have been observed in the visible range. By comparing the theoretically calculated and experimentally measured focal spot profiles, we predict a focal spot size as small as around 50 μm if fabrication eccentricity can be eliminated.
