Source author record

Min Xu

Min Xu appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

ResearcherUnclaimed source record

Catalog footprint

What is connected

44works

26topics

4close collaborators

Actions

Connect this record

Open graph Browse works

Inspect adjacent papers, topics, institutions and collaborators without losing the researcher page.

Building this map preview

BZPEER is loading the nearby papers, people, topics and institutions for this page.

preprint2026arXiv

FinDeepResearch: Evaluating Deep Research Agents in Rigorous Financial Analysis

Deep Research (DR) agents, powered by advanced Large Language Models (LLMs), have recently garnered increasing attention for their capability in conducting complex research tasks. However, existing literature lacks a rigorous and systematic evaluation of DR Agent's capabilities in critical research analysis. To address this gap, we first propose HisRubric, a novel evaluation framework with a hierarchical analytical structure and a fine-grained grading rubric for rigorously assessing DR agents' capabilities in corporate financial analysis. This framework mirrors the professional analyst's workflow, progressing from data recognition to metric calculation, and finally to strategic summarization and interpretation. Built on this framework, we construct a FinDeepResearch benchmark that comprises 64 listed companies from 8 financial markets across 4 languages, encompassing a total of 15,808 grading items. We further conduct extensive experiments on the FinDeepResearch using 16 representative methods, including 6 DR agents, 5 LLMs equipped with both deep reasoning and search capabilities, and 5 LLMs with deep reasoning capabilities only. The results reveal the strengths and limitations of these approaches across diverse capabilities, financial markets, and languages, offering valuable insights for future research and development. The benchmark and evaluation code is publicly available at https://OpenFinArena.com/.

preprint2026arXiv

Imaging-anchored Multiomics in Cardiovascular Disease: Integrating Cardiac Imaging, Bulk, Single-cell, and Spatial Transcriptomics

Cardiovascular disease arises from interactions between inherited risk, molecular programmes, and tissue-scale remodelling that are observed clinically through imaging. Health systems now routinely generate large volumes of cardiac MRI, CT and echocardiography together with bulk, single-cell and spatial transcriptomics, yet these data are still analysed in separate pipelines. This review examines joint representations that link cardiac imaging phenotypes to transcriptomic and spatially resolved molecular states. An imaging-anchored perspective is adopted in which echocardiography, cardiac MRI and CT define a spatial phenotype of the heart, and bulk, single-cell and spatial transcriptomics provide cell-type- and location-specific molecular context. The biological and technical characteristics of these modalities are first summarised, and representation-learning strategies for each are outlined. Multimodal fusion approaches are reviewed, with emphasis on handling missing data, limited sample size, and batch effects. Finally, integrative pipelines for radiogenomics, spatial molecular alignment, and image-based prediction of gene expression are discussed, together with common failure modes, practical considerations, and open challenges. Spatial multiomics of human myocardium and atherosclerotic plaque, single-cell and spatial foundation models, and multimodal medical foundation models are collectively bringing imaging-anchored multiomics closer to large-scale cardiovascular translation.

preprint2026arXiv

Noise2Params: Unification and Parameter Determination from Noise via a Probabilistic Event Camera Model

Accurate, unified models for event cameras (ECs) remain elusive, hampering calibration and algorithm design. We develop a foundational probabilistic model for EC event detection, grounded in photon statistics, that unifies the description of static scene noise events and step response curves (S-curves) within a single analytical framework. Three formulations of the probability distributions are derived, spanning all intensity regimes: exact Poisson, saddle-point, and Gaussian. The model reveals the underlying connection between these otherwise disparate EC behaviors and clarifies the interpretation of S-curves, which we show is more nuanced than selecting a fixed probability threshold. Based on this model, we propose Noise2Params, a method for determining camera-specific values of the log-contrast threshold $B$, the lux-to-photon conversion factor $α$, and the leakage term $θ$ (found to be intensity dependent), via error minimization against observed noise-event distributions. Noise2Params requires only recordings of static, uniform scenes, offering an experimentally accessible alternative to approaches that demand specialized dynamic light sources. We further support the validity the model by training convolutional neural networks (CNNs) on synthetic noise images generated from our distributions and evaluating their ability to reconstruct static scenes from experimental data. We further demonstrate the utility of our model by showing that CNNs incorporating synthetic data outperform those trained solely on experimental data. Our framework provides a quantitative foundation for EC calibration, noise-aware algorithm design, and applications in photon-limited regimes.

preprint2026arXiv

Unsupervised SE(3) Disentanglement for in situ Macromolecular Morphology Identification from Cryo-Electron Tomography

Cryo-electron tomography (cryo-ET) provides direct 3D visualization of macromolecules inside the cell, enabling analysis of their in situ morphology. This morphology can be regarded as an SE(3)-invariant, denoised volumetric representation of subvolumes extracted from tomograms. Inferring morphology is therefore an inverse problem of estimating both a template morphology and its SE(3) transformation. Existing expectation-maximization based solution to this problem often misses rare but important morphologies and requires extensive manual hyperparameter tuning. Addressing this issue, we present a disentangled deep representation learning framework that separates SE(3) transformations from morphological content in the representation space. The framework includes a novel multi-choice learning module that enables this disentanglement for highly noisy cryo-ET data, and the learned morphological content is used to generate template morphologies. Experiments on simulated and real cryo-ET datasets demonstrate clear improvements over prior methods, including the discovery of previously unidentified macromolecular morphologies.

preprint2023arXiv

Medical SAM Adapter: Adapting Segment Anything Model for Medical Image Segmentation

The Segment Anything Model (SAM) has recently gained popularity in the field of image segmentation due to its impressive capabilities in various segmentation tasks and its prompt-based interface. However, recent studies and individual experiments have shown that SAM underperforms in medical image segmentation, since the lack of the medical specific knowledge. This raises the question of how to enhance SAM's segmentation capability for medical images. In this paper, instead of fine-tuning the SAM model, we propose the Medical SAM Adapter (Med-SA), which incorporates domain-specific medical knowledge into the segmentation model using a light yet effective adaptation technique. In Med-SA, we propose Space-Depth Transpose (SD-Trans) to adapt 2D SAM to 3D medical images and Hyper-Prompting Adapter (HyP-Adpt) to achieve prompt-conditioned adaptation. We conduct comprehensive evaluation experiments on 17 medical image segmentation tasks across various image modalities. Med-SA outperforms several state-of-the-art (SOTA) medical image segmentation methods, while updating only 2\% of the parameters. Our code is released at https://github.com/KidsWithTokens/Medical-SAM-Adapter.

preprint2022arXiv

A generic framework for coded caching and distributed computation schemes

Several network communication problems are highly related such as coded caching and distributed computation. The centralized coded caching focuses on reducing the network burden in peak times in a wireless network system and the coded distributed computation studies the tradeoff between computation and communication in distributed system. In this paper, motivated by the study of the only rainbow $3$-term arithmetic progressions set, we propose a unified framework for constructing coded caching schemes. This framework builds bridges between coded caching schemes and lots of combinatorial objects due to the freedom of the choices of families and operations. We prove that any scheme based on a placement delivery array (PDA) can be represented by a rainbow scheme under this framework and lots of other known schemes can also be included in this framework. Moreover, we also present a new coded caching scheme with linear subpacketization and near constant rate using the only rainbow $3$-term arithmetic progressions set. Next, we modify the framework to be applicable to the distributed computing problem. We present a new transmission scheme in the shuffle phase and show that in certain cases it could have a lower communication load than the schemes based on PDAs or resolvable designs with the same number of files.

preprint2022arXiv

An Efficient Multitask Neural Network for Face Alignment, Head Pose Estimation and Face Tracking

While Convolutional Neural Networks (CNNs) have significantly boosted the performance of face related algorithms, maintaining accuracy and efficiency simultaneously in practical use remains challenging. The state-of-the-art methods employ deeper networks for better performance, which makes it less practical for mobile applications because of more parameters and higher computational complexity. Therefore, we propose an efficient multitask neural network, Alignment & Tracking & Pose Network (ATPN) for face alignment, face tracking and head pose estimation. Specifically, to achieve better performance with fewer layers for face alignment, we introduce a shortcut connection between shallow-layer and deep-layer features. We find the shallow-layer features are highly correspond to facial boundaries that can provide the structural information of face and it is crucial for face alignment. Moreover, we generate a cheap heatmap based on the face alignment result and fuse it with features to improve the performance of the other two tasks. Based on the heatmap, the network can utilize both geometric information of landmarks and appearance information for head pose estimation. The heatmap also provides attention clues for face tracking. The face tracking task also saves us the face detection procedure for each frame, which also significantly boost the real-time capability for video-based tasks. We experimentally validate ATPN on four benchmark datasets, WFLW, 300VW, WIDER Face and 300W-LP. The experimental results demonstrate that it achieves better performance with much less parameters and lower computational complexity compared to other light models.

preprint2022arXiv

BenchPress: A Deep Active Benchmark Generator

We develop BenchPress, the first ML benchmark generator for compilers that is steerable within feature space representations of source code. BenchPress synthesizes compiling functions by adding new code in any part of an empty or existing sequence by jointly observing its left and right context, achieving excellent compilation rate. BenchPress steers benchmark generation towards desired target features that has been impossible for state of the art synthesizers (or indeed humans) to reach. It performs better in targeting the features of Rodinia benchmarks in 3 different feature spaces compared with (a) CLgen - a state of the art ML synthesizer, (b) CLSmith fuzzer, (c) SRCIROR mutator or even (d) human-written code from GitHub. BenchPress is the first generator to search the feature space with active learning in order to generate benchmarks that will improve a downstream task. We show how using BenchPress, Grewe's et al. CPU vs GPU heuristic model can obtain a higher speedup when trained on BenchPress's benchmarks compared to other techniques. BenchPress is a powerful code generator: Its generated samples compile at a rate of 86%, compared to CLgen's 2.33%. Starting from an empty fixed input, BenchPress produces 10x more unique, compiling OpenCL benchmarks than CLgen, which are significantly larger and more feature diverse.

preprint2022arXiv

Boosting Active Learning via Improving Test Performance

Central to active learning (AL) is what data should be selected for annotation. Existing works attempt to select highly uncertain or informative data for annotation. Nevertheless, it remains unclear how selected data impacts the test performance of the task model used in AL. In this work, we explore such an impact by theoretically proving that selecting unlabeled data of higher gradient norm leads to a lower upper-bound of test loss, resulting in better test performance. However, due to the lack of label information, directly computing gradient norm for unlabeled data is infeasible. To address this challenge, we propose two schemes, namely expected-gradnorm and entropy-gradnorm. The former computes the gradient norm by constructing an expected empirical loss while the latter constructs an unsupervised loss with entropy. Furthermore, we integrate the two schemes in a universal AL framework. We evaluate our method on classical image classification and semantic segmentation tasks. To demonstrate its competency in domain applications and its robustness to noise, we also validate our method on a cellular imaging analysis task, namely cryo-Electron Tomography subtomogram classification. Results demonstrate that our method achieves superior performance against the state of the art. Our source code is available at https://github.com/xulabs/aitom/blob/master/doc/projects/al_gradnorm.md.

preprint2022arXiv

Color Space-based HoVer-Net for Nuclei Instance Segmentation and Classification

Nuclei segmentation and classification is the first and most crucial step that is utilized for many different microscopy medical analysis applications. However, it suffers from many issues such as the segmentation of small objects, imbalance, and fine-grained differences between types of nuclei. In this paper, multiple different contributions were done tackling these problems present. Firstly, the recently released "ConvNeXt" was used as the encoder for HoVer-Net model since it leverages the key components of transformers that make them perform well. Secondly, to enhance the visual differences between nuclei, a multi-channel color space-based approach is used to aid the model in extracting distinguishing features. Thirdly, Unified Focal loss (UFL) was used to tackle the background-foreground imbalance. Finally, Sharpness-Aware Minimization (SAM) was used to ensure generalizability of the model. Overall, we were able to outperform the current state-of-the-art (SOTA), HoVer-Net, on the preliminary test set of the CoNiC Challenge 2022 by 12.489% mPQ+.

preprint2022arXiv

Estimating Instance-dependent Bayes-label Transition Matrix using a Deep Neural Network

In label-noise learning, estimating the transition matrix is a hot topic as the matrix plays an important role in building statistically consistent classifiers. Traditionally, the transition from clean labels to noisy labels (i.e., clean-label transition matrix (CLTM)) has been widely exploited to learn a clean label classifier by employing the noisy data. Motivated by that classifiers mostly output Bayes optimal labels for prediction, in this paper, we study to directly model the transition from Bayes optimal labels to noisy labels (i.e., Bayes-label transition matrix (BLTM)) and learn a classifier to predict Bayes optimal labels. Note that given only noisy data, it is ill-posed to estimate either the CLTM or the BLTM. But favorably, Bayes optimal labels have less uncertainty compared with the clean labels, i.e., the class posteriors of Bayes optimal labels are one-hot vectors while those of clean labels are not. This enables two advantages to estimate the BLTM, i.e., (a) a set of examples with theoretically guaranteed Bayes optimal labels can be collected out of noisy data; (b) the feasible solution space is much smaller. By exploiting the advantages, we estimate the BLTM parametrically by employing a deep neural network, leading to better generalization and superior classification performance.

preprint2022arXiv

Objects in Semantic Topology

A more realistic object detection paradigm, Open-World Object Detection, has arisen increasing research interests in the community recently. A qualified open-world object detector can not only identify objects of known categories, but also discover unknown objects, and incrementally learn to categorize them when their annotations progressively arrive. Previous works rely on independent modules to recognize unknown categories and perform incremental learning, respectively. In this paper, we provide a unified perspective: Semantic Topology. During the life-long learning of an open-world object detector, all object instances from the same category are assigned to their corresponding pre-defined node in the semantic topology, including the `unknown' category. This constraint builds up discriminative feature representations and consistent relationships among objects, thus enabling the detector to distinguish unknown objects out of the known categories, as well as making learned features of known objects undistorted when learning new categories incrementally. Extensive experiments demonstrate that semantic topology, either randomly-generated or derived from a well-trained language model, could outperform the current state-of-the-art open-world object detectors by a large margin, e.g., the absolute open-set error is reduced from 7832 to 2546, exhibiting the inherent superiority of semantic topology on open-world object detection.

preprint2022arXiv

SHREC 2021: Classification in cryo-electron tomograms

Cryo-electron tomography (cryo-ET) is an imaging technique that allows three-dimensional visualization of macro-molecular assemblies under near-native conditions. Cryo-ET comes with a number of challenges, mainly low signal-to-noise and inability to obtain images from all angles. Computational methods are key to analyze cryo-electron tomograms. To promote innovation in computational methods, we generate a novel simulated dataset to benchmark different methods of localization and classification of biological macromolecules in tomograms. Our publicly available dataset contains ten tomographic reconstructions of simulated cell-like volumes. Each volume contains twelve different types of complexes, varying in size, function and structure. In this paper, we have evaluated seven different methods of finding and classifying proteins. Seven research groups present results obtained with learning-based methods and trained on the simulated dataset, as well as a baseline template matching (TM), a traditional method widely used in cryo-ET research. We show that learning-based approaches can achieve notably better localization and classification performance than TM. We also experimentally confirm that there is a negative relationship between particle size and performance for all methods.

preprint2022arXiv

Sparse Local Patch Transformer for Robust Face Alignment and Landmarks Inherent Relation Learning

Heatmap regression methods have dominated face alignment area in recent years while they ignore the inherent relation between different landmarks. In this paper, we propose a Sparse Local Patch Transformer (SLPT) for learning the inherent relation. The SLPT generates the representation of each single landmark from a local patch and aggregates them by an adaptive inherent relation based on the attention mechanism. The subpixel coordinate of each landmark is predicted independently based on the aggregated feature. Moreover, a coarse-to-fine framework is further introduced to incorporate with the SLPT, which enables the initial landmarks to gradually converge to the target facial landmarks using fine-grained features from dynamically resized local patches. Extensive experiments carried out on three popular benchmarks, including WFLW, 300W and COFW, demonstrate that the proposed method works at the state-of-the-art level with much less computational complexity by learning the inherent relation between facial landmarks. The code is available at the project website.

preprint2021arXiv

A Visual Analytics Approach to Facilitate the Proctoring of Online Exams

Online exams have become widely used to evaluate students' performance in mastering knowledge in recent years, especially during the pandemic of COVID-19. However, it is challenging to conduct proctoring for online exams due to the lack of face-to-face interaction. Also, prior research has shown that online exams are more vulnerable to various cheating behaviors, which can damage their credibility. This paper presents a novel visual analytics approach to facilitate the proctoring of online exams by analyzing the exam video records and mouse movement data of each student. Specifically, we detect and visualize suspected head and mouse movements of students in three levels of detail, which provides course instructors and teachers with convenient, efficient and reliable proctoring for online exams. Our extensive evaluations, including usage scenarios, a carefully-designed user study and expert interviews, demonstrate the effectiveness and usability of our approach.

preprint2021arXiv

Characterizing the Landscape of COVID-19 Themed Cyberattacks and Defenses

COVID-19 (Coronavirus) hit the global society and economy with a big surprise. In particular, work-from-home has become a new norm for employees. Despite the fact that COVID-19 can equally attack innocent people and cybercriminals, it is ironic to see surges in cyberattacks leveraging COVID-19 as a theme, dubbed COVID-19 themed cyberattacks or COVID-19 attacks for short, which represent a new phenomenon that has yet to be systematically understood. In this paper, we make the first step towards fully characterizing the landscape of these attacks, including their sophistication via the Cyber Kill Chain model. We also explore the solution space of defenses against these attacks.

preprint2021arXiv

Data-Driven Characterization and Detection of COVID-19 Themed Malicious Websites

COVID-19 has hit hard on the global community, and organizations are working diligently to cope with the new norm of "work from home". However, the volume of remote work is unprecedented and creates opportunities for cyber attackers to penetrate home computers. Attackers have been leveraging websites with COVID-19 related names, dubbed COVID-19 themed malicious websites. These websites mostly contain false information, fake forms, fraudulent payments, scams, or malicious payloads to steal sensitive information or infect victims' computers. In this paper, we present a data-driven study on characterizing and detecting COVID-19 themed malicious websites. Our characterization study shows that attackers are agile and are deceptively crafty in designing geolocation targeted websites, often leveraging popular domain registrars and top-level domains. Our detection study shows that the Random Forest classifier can detect COVID-19 themed malicious websites based on the lexical and WHOIS features defined in this paper, achieving a 98% accuracy and 2.7% false-positive rate.

preprint2021arXiv

Inference on the History of a Randomly Growing Tree

The spread of infectious disease in a human community or the proliferation of fake news on social media can be modeled as a randomly growing tree-shaped graph. The history of the random growth process is often unobserved but contains important information such as the source of the infection. We consider the problem of statistical inference on aspects of the latent history using only a single snapshot of the final tree. Our approach is to apply random labels to the observed unlabeled tree and analyze the resulting distribution of the growth process, conditional on the final outcome. We show that this conditional distribution is tractable under a shape-exchangeability condition, which we introduce here, and that this condition is satisfied for many popular models for randomly growing trees such as uniform attachment, linear preferential attachment and uniform attachment on a $D$-regular tree. For inference of the root under shape-exchangeability, we propose O(n log n) time algorithms for constructing confidence sets with valid frequentist coverage as well as bounds on the expected size of the confidence sets. We also provide efficient sampling algorithms that extend our methods to a wide class of inference problems.

preprint2021arXiv

Self-supervised Pretraining of Visual Features in the Wild

Recently, self-supervised learning methods like MoCo, SimCLR, BYOL and SwAV have reduced the gap with supervised methods. These results have been achieved in a control environment, that is the highly curated ImageNet dataset. However, the premise of self-supervised learning is that it can learn from any random image and from any unbounded dataset. In this work, we explore if self-supervision lives to its expectation by training large models on random, uncurated images with no supervision. Our final SElf-supERvised (SEER) model, a RegNetY with 1.3B parameters trained on 1B random images with 512 GPUs achieves 84.2% top-1 accuracy, surpassing the best self-supervised pretrained model by 1% and confirming that self-supervised learning works in a real world setting. Interestingly, we also observe that self-supervised models are good few-shot learners achieving 77.9% top-1 with access to only 10% of ImageNet. Code: https://github.com/facebookresearch/vissl

preprint2021arXiv

Single-View 3D Object Reconstruction from Shape Priors in Memory

Existing methods for single-view 3D object reconstruction directly learn to transform image features into 3D representations. However, these methods are vulnerable to images containing noisy backgrounds and heavy occlusions because the extracted image features do not contain enough information to reconstruct high-quality 3D shapes. Humans routinely use incomplete or noisy visual cues from an image to retrieve similar 3D shapes from their memory and reconstruct the 3D shape of an object. Inspired by this, we propose a novel method, named Mem3D, that explicitly constructs shape priors to supplement the missing information in the image. Specifically, the shape priors are in the forms of "image-voxel" pairs in the memory network, which is stored by a well-designed writing strategy during training. We also propose a voxel triplet loss function that helps to retrieve the precise 3D shapes that are highly related to the input image from shape priors. The LSTM-based shape encoder is introduced to extract information from the retrieved 3D shapes, which are useful in recovering the 3D shape of an object that is heavily occluded or in complex environments. Experimental results demonstrate that Mem3D significantly improves reconstruction quality and performs favorably against state-of-the-art methods on the ShapeNet and Pix3D datasets.

preprint2021arXiv

SSFG: Stochastically Scaling Features and Gradients for Regularizing Graph Convolutional Networks

Graph convolutional networks have been successfully applied in various graph-based tasks. In a typical graph convolutional layer, node features are updated by aggregating neighborhood information. Repeatedly applying graph convolutions can cause the oversmoothing issue, i.e., node features at deep layers converge to similar values. Previous studies have suggested that oversmoothing is one of the major issues that restrict the performance of graph convolutional networks. In this paper, we propose a stochastic regularization method to tackle the oversmoothing problem. In the proposed method, we stochastically scale features and gradients (SSFG) by a factor sampled from a probability distribution in the training procedure. By explicitly applying a scaling factor to break feature convergence, the oversmoothing issue is alleviated. We show that applying stochastic scaling at the gradient level is complementary to that applied at the feature level to improve the overall performance. Our method does not increase the number of trainable parameters. When used together with ReLU, our SSFG can be seen as a stochastic ReLU activation function. We experimentally validate our SSFG regularization method on three commonly used types of graph networks. Extensive experimental results on seven benchmark datasets for four graph-based tasks demonstrate that our SSFG regularization is effective in improving the overall performance of the baseline graph networks.

preprint2020arXiv

Experimental Analysis of Legendre Decomposition in Machine Learning

In this technical report, we analyze Legendre decomposition for non-negative tensor in theory and application. In theory, the properties of dual parameters and dually flat manifold in Legendre decomposition are reviewed, and the process of tensor projection and parameter updating is analyzed. In application, a series of verification experiments and clustering experiments with parameters on submanifold were carried out, hoping to find an effective lower dimensional representation of the input tensor. The experimental results show that the parameters on submanifold have no ability to be directly used as low-rank representations. Combined with analysis, we connect Legendre decomposition with neural networks and low-rank representation applications, and put forward some promising prospects.

preprint2020arXiv

Few shot domain adaptation for in situ macromolecule structural classification in cryo-electron tomograms

Motivation: Cryo-Electron Tomography (cryo-ET) visualizes structure and spatial organization of macromolecules and their interactions with other subcellular components inside single cells in the close-to-native state at sub-molecular resolution. Such information is critical for the accurate understanding of cellular processes. However, subtomogram classification remains one of the major challenges for the systematic recognition and recovery of the macromolecule structures in cryo-ET because of imaging limits and data quantity. Recently, deep learning has significantly improved the throughput and accuracy of large-scale subtomogram classification. However often it is difficult to get enough high-quality annotated subtomogram data for supervised training due to the enormous expense of labeling. To tackle this problem, it is beneficial to utilize another already annotated dataset to assist the training process. However, due to the discrepancy of image intensity distribution between source domain and target domain, the model trained on subtomograms in source domainmay perform poorly in predicting subtomogram classes in the target domain. Results: In this paper, we adapt a few shot domain adaptation method for deep learning based cross-domain subtomogram classification. The essential idea of our method consists of two parts: 1) take full advantage of the distribution of plentiful unlabeled target domain data, and 2) exploit the correlation between the whole source domain dataset and few labeled target domain data. Experiments conducted on simulated and real datasets show that our method achieves significant improvement on cross domain subtomogram classification compared with baseline methods.

preprint2020arXiv

Improving Lesion Segmentation for Diabetic Retinopathy using Adversarial Learning

Diabetic Retinopathy (DR) is a leading cause of blindness in working age adults. DR lesions can be challenging to identify in fundus images, and automatic DR detection systems can offer strong clinical value. Of the publicly available labeled datasets for DR, the Indian Diabetic Retinopathy Image Dataset (IDRiD) presents retinal fundus images with pixel-level annotations of four distinct lesions: microaneurysms, hemorrhages, soft exudates and hard exudates. We utilize the HEDNet edge detector to solve a semantic segmentation task on this dataset, and then propose an end-to-end system for pixel-level segmentation of DR lesions by incorporating HEDNet into a Conditional Generative Adversarial Network (cGAN). We design a loss function that adds adversarial loss to segmentation loss. Our experiments show that the addition of the adversarial loss improves the lesion segmentation performance over the baseline.

preprint2020arXiv

Improving Utility and Security of the Shuffler-based Differential Privacy

When collecting information, local differential privacy (LDP) alleviates privacy concerns of users because their private information is randomized before being sent it to the central aggregator. LDP imposes large amount of noise as each user executes the randomization independently. To address this issue, recent work introduced an intermediate server with the assumption that this intermediate server does not collude with the aggregator. Under this assumption, less noise can be added to achieve the same privacy guarantee as LDP, thus improving utility for the data collection task. This paper investigates this multiple-party setting of LDP. We analyze the system model and identify potential adversaries. We then make two improvements: a new algorithm that achieves a better privacy-utility tradeoff; and a novel protocol that provides better protection against various attacks. Finally, we perform experiments to compare different methods and demonstrate the benefits of using our proposed method.

preprint2016arXiv

Modelling Temporal Information Using Discrete Fourier Transform for Recognizing Emotions in User-generated Videos

With the widespread of user-generated Internet videos, emotion recognition in those videos attracts increasing research efforts. However, most existing works are based on framelevel visual features and/or audio features, which might fail to model the temporal information, e.g. characteristics accumulated along time. In order to capture video temporal information, in this paper, we propose to analyse features in frequency domain transformed by discrete Fourier transform (DFT features). Frame-level features are firstly extract by a pre-trained deep convolutional neural network (CNN). Then, time domain features are transferred and interpolated into DFT features. CNN and DFT features are further encoded and fused for emotion classification. By this way, static image features extracted from a pre-trained deep CNN and temporal information represented by DFT features are jointly considered for video emotion recognition. Experimental results demonstrate that combining DFT features can effectively capture temporal information and therefore improve emotion recognition performance. Our approach has achieved a state-of-the-art performance on the largest video emotion dataset (VideoEmotion-8 dataset), improving accuracy from 51.1% to 62.6%.

preprint2016arXiv

Shift equivalence in the generalized factor order

We provide a geometric condition that guarantees strong Wilf equivalence in the generalized factor order. This provides a powerful tool for proving specific and general Wilf equivalence results, and several such examples are given.

preprint2016arXiv

The super spanning connectivity of arrangement graph

A $k$-container $C(u, v)$ of a graph $G$ is a set of $k$ internally disjoint paths between $u$ and $v$. A $k$-container $C(u, v)$ of $G$ is a $k^*$-container if it is a spanning subgraph of $G$. A graph $G$ is $k^*$-connected if there exists a $k^*$-container between any two different vertices of G. A $k$-regular graph $G$ is super spanning connected if $G$ is $i^*$-container for all $1\le i\le k$. In this paper, we prove that the arrangement graph $A_{n, k}$ is super spanning connected if $n\ge 4$ and $n-k\ge 2$.

preprint2015arXiv

Automatic tracking of protein vesicles

With the advance of fluorescence imaging technologies, recently cell biologists are able to record the movement of protein vesicles within a living cell. Automatic tracking of the movements of these vesicles become key for qualitative analysis of dynamics of theses vesicles. In this thesis, we formulate such tracking problem as video object tracking problem, and design a dynamic programming method for tracking single object. Our experiments on simulation data show that the method can identify a track with high accuracy which is robust to the choose of tracking parameters and presence of high level noise. We then extend this method to the tracking multiple objects using the track elimination strategy. In multiple object tracking, the above approach often fails to correctly identify a track when two tracks cross. We solve this problem by incorporating the Kalman filter into the dynamic programming framework. Our experiments on simulated data show that the tracking accuracy is significantly improved.

preprint2015arXiv

Gene selection for cancer classification using a hybrid of univariate and multivariate feature selection methods

Various approaches to gene selection for cancer classification based on microarray data can be found in the literature and they may be grouped into two categories: univariate methods and multivariate methods. Univariate methods look at each gene in the data in isolation from others. They measure the contribution of a particular gene to the classification without considering the presence of the other genes. In contrast, multivariate methods measure the relative contribution of a gene to the classification by taking the other genes in the data into consideration. Multivariate methods select fewer genes in general. However, the selection process of multivariate methods may be sensitive to the presence of irrelevant genes, noises in the expression and outliers in the training data. At the same time, the computational cost of multivariate methods is high. To overcome the disadvantages of the two types of approaches, we propose a hybrid method to obtain gene sets that are small and highly discriminative. We devise our hybrid method from the univariate Maximum Likelihood method (LIK) and the multivariate Recursive Feature Elimination method (RFE). We analyze the properties of these methods and systematically test the effectiveness of our proposed method on two cancer microarray datasets. Our experiments on a leukemia dataset and a small, round blue cell tumors dataset demonstrate the effectiveness of our hybrid method. It is able to discover sets consisting of fewer genes than those reported in the literature and at the same time achieve the same or better prediction accuracy.

preprint2015arXiv

Global Gene Expression Analysis Using Machine Learning Methods

Microarray is a technology to quantitatively monitor the expression of large number of genes in parallel. It has become one of the main tools for global gene expression analysis in molecular biology research in recent years. The large amount of expression data generated by this technology makes the study of certain complex biological problems possible and machine learning methods are playing a crucial role in the analysis process. At present, many machine learning methods have been or have the potential to be applied to major areas of gene expression analysis. These areas include clustering, classification, dynamic modeling and reverse engineering. In this thesis, we focus our work on using machine learning methods to solve the classification problems arising from microarray data. We first identify the major types of the classification problems; then apply several machine learning methods to solve the problems and perform systematic tests on real and artificial datasets. We propose improvement to existing methods. Specifically, we develop a multivariate and a hybrid feature selection method to obtain high classification performance for high dimension classification problems. Using the hybrid feature selection method, we are able to identify small sets of features that give predictive accuracy that is as good as that from other methods which require many more features.

preprint2015arXiv

Integrative analysis of gene expression and phenotype data

The linking genotype to phenotype is the fundamental aim of modern genetics. We focus on study of links between gene expression data and phenotype data through integrative analysis. We propose three approaches. 1) The inherent complexity of phenotypes makes high-throughput phenotype profiling a very difficult and laborious process. We propose a method of automated multi-dimensional profiling which uses gene expression similarity. Large-scale analysis show that our method can provide robust profiling that reveals different phenotypic aspects of samples. This profiling technique is also capable of interpolation and extrapolation beyond the phenotype information given in training data. It can be used in many applications, including facilitating experimental design and detecting confounding factors. 2) Phenotype association analysis problems are complicated by small sample size and high dimensionality. Consequently, phenotype-associated gene subsets obtained from training data are very sensitive to selection of training samples, and the constructed sample phenotype classifiers tend to have poor generalization properties. To eliminate these obstacles, we propose a novel approach that generates sequences of increasingly discriminative gene cluster combinations. Our experiments on both simulated and real datasets show robust and accurate classification performance. 3) Many complex phenotypes, such as cancer, are the product of not only gene expression, but also gene interaction. We propose an integrative approach to find gene network modules that activate under different phenotype conditions. Using our method, we discovered cancer subtype-specific network modules, as well as the ways in which these modules coordinate. In particular, we detected a breast-cancer specific tumor suppressor network module with a hub gene, PDGFRL, which may play an important role in this module.

preprint2014arXiv

Connectivity-preserving Geometry Images

We propose connectivity-preserving geometry images (CGIMs), which map a three-dimensional mesh onto a rectangular regular array of an image, such that the reconstructed mesh produces no sampling errors, but merely round-off errors. We obtain a V-matrix with respect to the original mesh, whose elements are vertices of the mesh, which intrinsically preserves the vertex-set and the connectivity of the original mesh in the sense of allowing round-off errors. We generate a CGIM array by using the Cartesian coordinates of corresponding vertices of the V-matrix. To reconstruct a mesh, we obtain a vertex-set and an edge-set by collecting all the elements with different pixels, and all different pairwise adjacent elements from the CGIM array respectively. Compared with traditional geometry images, CGIMs achieve minimum reconstruction errors with an efficient parametrization-free algorithm via elementary permutation techniques. We apply CGIMs to lossy compression of meshes, and the experimental results show that CGIMs perform well in reconstruction precision and detail preservation.

preprint2014arXiv

Efficient Hybrid Inline and Out-of-line Deduplication for Backup Storage

Backup storage systems often remove redundancy across backups via inline deduplication, which works by referring duplicate chunks of the latest backup to those of existing backups. However, inline deduplication degrades restore performance of the latest backup due to fragmentation, and complicates deletion of ex- pired backups due to the sharing of data chunks. While out-of-line deduplication addresses the problems by forward-pointing existing duplicate chunks to those of the latest backup, it introduces additional I/Os of writing and removing duplicate chunks. We design and implement RevDedup, an efficient hybrid inline and out-of-line deduplication system for backup storage. It applies coarse-grained inline deduplication to remove duplicates of the latest backup, and then fine-grained out-of-line reverse deduplication to remove duplicates from older backups. Our reverse deduplication design limits the I/O overhead and prepares for efficient deletion of expired backups. Through extensive testbed experiments using synthetic and real-world datasets, we show that RevDedup can bring high performance to the backup, restore, and deletion operations, while maintaining high storage efficiency comparable to conventional inline deduplication.

preprint2014arXiv

Faithful Variable Screening for High-Dimensional Convex Regression

We study the problem of variable selection in convex nonparametric regression. Under the assumption that the true regression function is convex and sparse, we develop a screening procedure to select a subset of variables that contains the relevant variables. Our approach is a two-stage quadratic programming method that estimates a sum of one-dimensional convex functions, followed by one-dimensional concave regression fits on the residuals. In contrast to previous methods for sparse additive models, the optimization is finite dimensional and requires no tuning parameters for smoothness. Under appropriate assumptions, we prove that the procedure is faithful in the population setting, yielding no false negatives. We give a finite sample statistical analysis, and introduce algorithms for efficiently carrying out the required quadratic programs. The approach leads to computational and statistical advantages over fitting a full model, and provides an effective, practical approach to variable screening in convex regression.

preprint2012arXiv

Conditional Sparse Coding and Grouped Multivariate Regression

We study the problem of multivariate regression where the data are naturally grouped, and a regression matrix is to be estimated for each group. We propose an approach in which a dictionary of low rank parameter matrices is estimated across groups, and a sparse linear combination of the dictionary elements is estimated to form a model within each group. We refer to the method as conditional sparse coding since it is a coding procedure for the response vectors Y conditioned on the covariate vectors X. This approach captures the shared information across the groups while adapting to the structure within each group. It exploits the same intuition behind sparse coding that has been successfully developed in computer vision and computational neuroscience. We propose an algorithm for conditional sparse coding, analyze its theoretical properties in terms of predictive accuracy, and present the results of simulation and brain imaging experiments that compare the new technique to reduced rank regression.

preprint2012arXiv

Efficient Active Algorithms for Hierarchical Clustering

Advances in sensing technologies and the growth of the internet have resulted in an explosion in the size of modern datasets, while storage and processing power continue to lag behind. This motivates the need for algorithms that are efficient, both in terms of the number of measurements needed and running time. To combat the challenges associated with large datasets, we propose a general framework for active hierarchical clustering that repeatedly runs an off-the-shelf clustering algorithm on small subsets of the data and comes with guarantees on performance, measurement complexity and runtime complexity. We instantiate this framework with a simple spectral clustering algorithm and provide concrete results on its performance, showing that, under some assumptions, this algorithm recovers all clusters of size ?(log n) using O(n log^2 n) similarities and runs in O(n log^3 n) time for a dataset of n objects. Through extensive experimentation we also demonstrate that this framework is practically alluring.

preprint2012arXiv

The Forwarding Indices of Graphs -- a Survey

A routing $R$ of a given connected graph $G$ of order $n$ is a collection of $n(n-1)$ simple paths connecting every ordered pair of vertices of $G$. The vertex-forwarding index $ξ(G,R)$ of $G$ with respect to $R$ is defined as the maximum number of paths in $R$ passing through any vertex of $G$. The vertex-forwarding index $ξ(G)$ of $G$ is defined as the minimum $ξ(G,R)$ over all routing $R$'s of $G$. Similarly, the edge-forwarding index $ π(G,R)$ of $G$ with respect to $R$ is the maximum number of paths in $R$ passing through any edge of $G$. The edge-forwarding index $π(G)$ of $G$ is the minimum $π(G,R)$ over all routing $R$'s of $G$. The vertex-forwarding index or the edge-forwarding index corresponds to the maximum load of the graph. Therefore, it is important to find routings minimizing these indices and thus has received much research attention in the past ten years and more. In this paper we survey some known results on these forwarding indices, further research problems and several conjectures.

preprint2011arXiv

Electronic structure of BaNi$_2$As$_2$

BaNi$_2$As$_2$, with a first order phase transition around 131 K, is studied by the angle-resolved photoemission spectroscopy. The measured electronic structure is compared to the local density approximation calculations, revealing similar large electronlike bands around $\bar{M}$ and differences along $\barΓ$-$\bar{X}$. We further show that the electronic structure of BaNi$_2$As$_2$ is distinct from that of the sibling iron pnictides. Particularly, there is no signature of band folding, indicating no collinear SDW related magnetic ordering. Moreover, across the strong first order phase transition, the band shift exhibits a hysteresis, which is directly related to the significant lattice distortion in BaNi$_2$As$_2$.

preprint2011arXiv

High-dimensional covariance estimation based on Gaussian graphical models

Undirected graphs are often used to describe high dimensional distributions. Under sparsity conditions, the graph can be estimated using $\ell_1$-penalization methods. We propose and study the following method. We combine a multiple regression approach with ideas of thresholding and refitting: first we infer a sparse undirected graphical model structure via thresholding of each among many $\ell_1$-norm penalized regression functions; we then estimate the covariance matrix and its inverse using the maximum likelihood estimator. We show that under suitable conditions, this approach yields consistent estimation in terms of graphical structure and fast convergence rates with respect to the operator and Frobenius norm for the covariance matrix and its inverse. We also derive an explicit bound for the Kullback Leibler divergence.

preprint2011arXiv

Identifying codes of lexicographic product of graphs

Gravier et al. investigated the identifying codes of Cartesian product of two graphs. In this paper we consider the identifying codes of lexicographic product G[H] of a connected graph G and an arbitrary graph H, and obtain the minimum cardinality of identifying codes of G[H] in terms of some parameters of G and H.

preprint2011arXiv

On the metric dimension of line graphs

Let $G$ be a (di)graph. A set $W$ of vertices in $G$ is a \emph{resolving set} of $G$ if every vertex $u$ of $G$ is uniquely determined by its vector of distances to all the vertices in $W$. The \emph{metric dimension} $μ(G)$ of $G$ is the minimum cardinality of all the resolving sets of $G$. Cáceres et al. \cite{Ca2} computed the metric dimension of the line graphs of complete bipartite graphs. Recently, Bailey and Cameron \cite{Ba} computed the metric dimension of the line graphs of complete graphs. In this paper we study the metric dimension of the line graph $L(G)$ of $G$. In particular, we show that $μ(L(G))=|E(G)|-|V(G)|$ for a strongly connected digraph $G$ except for directed cycles, where $V(G)$ is the vertex set and $E(G)$ is the edge set of $G$. As a corollary, the metric dimension of de Brujin digraphs and Kautz digraphs is given. Moreover, we prove that $\lceil\log_2Δ(G)\rceil\leqμ(L(G))\leq |V(G)|-2$ for a simple connected graph $G$ with at least five vertices, where $Δ(G)$ is the maximum degree of $G$. Finally, we obtain the metric dimension of the line graph of a tree in terms of its parameters.

preprint2010arXiv

Forest Density Estimation

We study graph estimation and density estimation in high dimensions, using a family of density estimators based on forest structured undirected graphical models. For density estimation, we do not assume the true distribution corresponds to a forest; rather, we form kernel density estimates of the bivariate and univariate marginals, and apply Kruskal's algorithm to estimate the optimal forest on held out data. We prove an oracle inequality on the excess risk of the resulting estimator relative to the risk of the best forest. For graph estimation, we consider the problem of estimating forests with restricted tree sizes. We prove that finding a maximum weight spanning forest with restricted tree size is NP-hard, and develop an approximation algorithm for this problem. Viewing the tree size as a complexity parameter, we then select a forest using data splitting, and prove bounds on excess risk and structure selection consistency of the procedure. Experiments with simulated data and microarray data indicate that the methods are a practical alternative to Gaussian graphical models.

preprint2010arXiv

High-resolution angle-resolved photoemission spectroscopy study of the electronic structure of EuFe2As2

We report the high-resolution angle-resolved photoemission spectroscopy studies of electronic structure of EuFe2As2. The paramagnetic state data are found to be consistent with density-functional calculations. In the antiferromagnetic ordering state of Fe, our results show that the band splitting, folding, and hybridization evolve with temperature, which cannot be explained by a simple folding picture. Detailed measurements reveal that a tiny electron Fermi pocket and a tiny hole pocket are formed near (pi,pi) in the (0,0)-(pi,pi) direction, which qualitatively agree with the results of quantum oscillations, considering kz variation of Fermi surface. Furthermore, no noticeable change within the energy resolution is observed across the antiferromagnetic transition of Eu2+ ordering, suggesting weak coupling between Eu sublattice and FeAs sublattice.

Min Xu

What is connected

Connect this record

See the researcher in context

Building this map preview

44 published item(s)

FinDeepResearch: Evaluating Deep Research Agents in Rigorous Financial Analysis

Imaging-anchored Multiomics in Cardiovascular Disease: Integrating Cardiac Imaging, Bulk, Single-cell, and Spatial Transcriptomics

Noise2Params: Unification and Parameter Determination from Noise via a Probabilistic Event Camera Model

Unsupervised SE(3) Disentanglement for in situ Macromolecular Morphology Identification from Cryo-Electron Tomography

Medical SAM Adapter: Adapting Segment Anything Model for Medical Image Segmentation

A generic framework for coded caching and distributed computation schemes

An Efficient Multitask Neural Network for Face Alignment, Head Pose Estimation and Face Tracking

BenchPress: A Deep Active Benchmark Generator

Boosting Active Learning via Improving Test Performance

Color Space-based HoVer-Net for Nuclei Instance Segmentation and Classification

Estimating Instance-dependent Bayes-label Transition Matrix using a Deep Neural Network

Objects in Semantic Topology

SHREC 2021: Classification in cryo-electron tomograms

Sparse Local Patch Transformer for Robust Face Alignment and Landmarks Inherent Relation Learning

A Visual Analytics Approach to Facilitate the Proctoring of Online Exams

Characterizing the Landscape of COVID-19 Themed Cyberattacks and Defenses

Data-Driven Characterization and Detection of COVID-19 Themed Malicious Websites

Inference on the History of a Randomly Growing Tree

Self-supervised Pretraining of Visual Features in the Wild

Single-View 3D Object Reconstruction from Shape Priors in Memory

SSFG: Stochastically Scaling Features and Gradients for Regularizing Graph Convolutional Networks

Experimental Analysis of Legendre Decomposition in Machine Learning

Few shot domain adaptation for in situ macromolecule structural classification in cryo-electron tomograms

Improving Lesion Segmentation for Diabetic Retinopathy using Adversarial Learning

Improving Utility and Security of the Shuffler-based Differential Privacy

Modelling Temporal Information Using Discrete Fourier Transform for Recognizing Emotions in User-generated Videos

Shift equivalence in the generalized factor order

The super spanning connectivity of arrangement graph

Automatic tracking of protein vesicles

Gene selection for cancer classification using a hybrid of univariate and multivariate feature selection methods

Global Gene Expression Analysis Using Machine Learning Methods

Integrative analysis of gene expression and phenotype data

Connectivity-preserving Geometry Images

Efficient Hybrid Inline and Out-of-line Deduplication for Backup Storage

Faithful Variable Screening for High-Dimensional Convex Regression

Conditional Sparse Coding and Grouped Multivariate Regression

Efficient Active Algorithms for Hierarchical Clustering

The Forwarding Indices of Graphs -- a Survey

Electronic structure of BaNi$_2$As$_2$

High-dimensional covariance estimation based on Gaussian graphical models

Identifying codes of lexicographic product of graphs

On the metric dimension of line graphs

Forest Density Estimation

High-resolution angle-resolved photoemission spectroscopy study of the electronic structure of EuFe2As2