Researcher profile

Ming Du

Ming Du contributes to research discovery and scholarly infrastructure.

Researcher | Affiliation not imported | Open to collaborate

Trust snapshot

Quick read

Trust 19 - Unverified | Verification L1 | Unclaimed author
5 works
0 followers
8 topics
4 close collaborators

Published work

5 published items

preprint | 2026 | arXiv

CVEvolve: Autonomous Algorithm Discovery for Unstructured Scientific Data Processing

Scientific data processing often requires task-specific algorithms or AI models, creating a barrier for domain scientists who need to analyze their data but may not have extensive computing or image-processing expertise. This barrier is especially pronounced when data are noisy, have a high dynamic range, are sparsely labeled, or are only loosely specified. We introduce CVEvolve, an autonomous agentic harness with a zero-code interface for scientific data-processing algorithm discovery. CVEvolve combines a multi-round search strategy with tools for code execution, evaluation implementation, history management, holdout testing, and optional inspection of scientific data and visual outputs. The search alternates between discovery and improvement actions, and uses lineage-aware stochastic candidate sampling to balance exploration and exploitation. We demonstrate CVEvolve on x-ray fluorescence microscopy image registration, Bragg peak detection, and high-energy diffraction microscopy image segmentation. Across these tasks, CVEvolve discovers algorithms that improve over baseline methods, while holdout test tracking helps identify candidates that generalize better than later over-optimized alternatives. These results show that zero-code, autonomous LLM-powered algorithm development can help domain scientists turn unstructured scientific image data into practical algorithms and downstream scientific discoveries.

preprint | 2022 | arXiv

A Wavelet Transform and self-supervised learning-based framework for bearing fault diagnosis with limited labeled data

Traditional supervised bearing fault diagnosis methods rely on massive labelled data, yet annotations may be very time-consuming or infeasible. The fault diagnosis approach that utilizes limited labelled data is becoming increasingly popular. In this paper, a Wavelet Transform (WT) and self-supervised learning-based bearing fault diagnosis framework is proposed to address the lack of supervised samples issue. Adopting the WT and cubic spline interpolation technique, original measured vibration signals are converted to the time-frequency maps (TFMs) with a fixed scale as inputs. The Vision Transformer (ViT) is employed as the encoder for feature extraction, and the self-distillation with no labels (DINO) algorithm is introduced in the proposed framework for self-supervised learning with limited labelled data and sufficient unlabeled data. Two rolling bearing fault datasets are used for validations. In the case of both datasets only containing 1% labelled samples, utilizing the feature vectors extracted by the trained encoder without fine-tuning, over 90% average diagnosis accuracy can be obtained based on the simple K-Nearest Neighbor (KNN) classifier. Furthermore, the superiority of the proposed method is demonstrated in comparison with other self-supervised fault diagnosis methods.

preprint | 2022 | arXiv

Efficient Reachability Ratio Computation for 2-hop Labeling Scheme

As one of the fundamental graph operations, reachability query processing has been extensively studied over the past decades. Many approaches design 2-hop labels to accelerate queries. Because the index size cannot be bounded when all nodes are used to construct 2-hop labels, researchers proposed using a subset of important nodes to construct 2-hop labels (partial 2-hop labels) that cover as much reachability information as possible. This can yield better query performance with limited index size and index construction time. However, partial 2-hop labels do not always perform well on different graphs. In this paper, we focus on the problem of how to efficiently compute the reachability ratio, so as to help users determine whether partial 2-hop labels should be used to answer reachability queries for a given graph. Intuitively, the reachability ratio is the number of reachable queries that can be answered by partial 2-hop labels divided by the total number of reachable queries in the given graph. We discuss the difficulties of reachability ratio computation and propose an incremental-partition algorithm for it. Extensive experimental results show that our algorithm computes the reachability ratio efficiently, and show how overall query performance is affected by different partial 2-hop labels. Based on the experimental results, we report our findings on whether partial 2-hop labels should be used for reachability query processing on a given graph.
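
To make the definition in the abstract above concrete, here is a minimal brute-force sketch of the reachability ratio on a small directed graph. It is not the paper's incremental-partition algorithm; it simply counts pairs by exhaustive search. The function names, the adjacency-dict graph format, and the interpretation that a pair (u, v) is "answered" when some selected hop node lies on a path from u to v (or is u or v itself) are all assumptions made for illustration.

```python
from collections import deque


def reachable_from(graph, source):
    """Return the set of nodes reachable from source, excluding source itself."""
    seen = {source}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for nxt in graph.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    seen.discard(source)
    return seen


def reachability_ratio(graph, hop_nodes):
    """Fraction of reachable (u, v) pairs that pass through a selected hop node.

    Assumption: a pair is covered if some hop node h satisfies
    (u == h or u reaches h) and (h == v or h reaches v).
    """
    nodes = set(graph) | {v for targets in graph.values() for v in targets}
    reach = {u: reachable_from(graph, u) for u in nodes}
    hops = set(hop_nodes)
    total = 0
    covered = 0
    for u in nodes:
        # Hop nodes usable as a midpoint when starting from u.
        out_label = ({u} | reach[u]) & hops
        for v in reach[u]:
            total += 1
            if any(h == v or v in reach[h] for h in out_label):
                covered += 1
    return covered / total if total else 0.0


# Hypothetical example graph: a -> b -> c and a -> d.
example = {"a": ["b", "d"], "b": ["c"]}
ratio = reachability_ratio(example, {"b"})
```

In this toy graph there are four reachable pairs, and choosing only `b` as a hop node covers three of them; the pair (a, d) is missed because no path from `a` to `d` passes through `b`. A low ratio of this kind is the signal the abstract describes for deciding whether partial labels are worth using.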

preprint | 2022 | arXiv

Searching for Apparel Products from Images in the Wild

In this age of social media, people often look at what others are wearing. In particular, Instagram and Twitter influencers often provide images of themselves wearing different outfits, and their followers are often inspired to buy similar clothes. We propose a system to automatically find the closest visually similar clothes in the online Catalog (street-to-shop searching). The problem is challenging since the original images are taken under different pose and lighting conditions. The system initially localizes high-level descriptive regions (top, bottom, wristwear, ...) using multiple CNN detectors such as YOLO and SSD that are trained specifically for the apparel domain. It then classifies these regions into more specific categories such as t-shirts, tunics, or dresses. Finally, a feature embedding learned using a multi-task function is recovered for every item and then compared with corresponding items in the online Catalog database and ranked according to distance. We validate our approach component-wise using benchmark datasets and end-to-end using human evaluation.
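
The final retrieval step described in the abstract above (compare an item's embedding with catalog embeddings and rank by distance) can be sketched as follows. This is only an illustration: the abstract does not state which distance is used, so Euclidean distance is an assumption, and the function names, item identifiers, and two-dimensional embeddings are hypothetical.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length embedding vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def rank_catalog(query_embedding, catalog, top_k=3):
    """Return up to top_k catalog item ids, nearest embedding first.

    catalog maps item id -> embedding (assumed format for this sketch).
    """
    scored = [
        (euclidean(query_embedding, embedding), item_id)
        for item_id, embedding in catalog.items()
    ]
    scored.sort()  # ascending distance; ties broken by item id
    return [item_id for _, item_id in scored[:top_k]]


# Hypothetical catalog restricted to one detected category (e.g. tops).
catalog = {
    "tee_red": [1.0, 0.0],
    "tee_blue": [0.9, 0.1],
    "dress": [0.0, 1.0],
}
ranking = rank_catalog([1.0, 0.0], catalog)
```

Restricting the comparison to catalog items of the same detected category, as the abstract's "corresponding items" suggests, would amount to filtering `catalog` before calling `rank_catalog`.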

preprint | 2020 | arXiv

Relative merits and limiting factors for x-ray and electron microscopy of thick, hydrated organic materials (revised)

Electron and x-ray microscopes allow one to image the entire, unlabeled structure of hydrated materials at a resolution well beyond what visible light microscopes can achieve. However, both approaches involve ionizing radiation, so that radiation damage must be considered as one of the limits to imaging. Drawing upon earlier work, we describe here a unified approach to estimating the image contrast (and thus the required exposure and corresponding radiation dose) in both x-ray and electron microscopy. This approach accounts for factors such as plural and inelastic scattering, and (in electron microscopy) the use of energy filters to obtain so-called "zero loss" images. As expected, it shows that electron microscopy offers lower dose for specimens thinner than about 1 micron (such as for studies of macromolecules, viruses, bacteria and archaebacteria, and thin sectioned material), while x-ray microscopy offers superior characteristics for imaging thicker specimens such as whole eukaryotic cells, thick-sectioned tissues, and organs. The required radiation dose scales strongly as a function of the desired spatial resolution, allowing one to understand the limits of live and frozen hydrated specimen imaging. Finally, we consider the factors limiting x-ray microscopy of thicker materials, suggesting that specimens as thick as a whole mouse brain can be imaged with x-ray microscopes without significant image degradation should appropriate image reconstruction methods be identified. The as-published article [Ultramicroscopy 184, 293–309 (2018); doi:10.1016/j.ultramic.2017.10.003] had some minor mistakes that we correct here, with all changes from the as-published article shown in blue.