Trust snapshot

Quick read

Trust 21 - Emerging
Verification L1
Unclaimed author
42 works
0 followers
27 topics
4 close collaborators

Published work

42 published item(s)

preprint 2026 arXiv

Effect of accretion on scalar superradiant instability

Superradiance can lead to the formation of a black hole (BH) condensate system. We thoroughly investigate the effect of accretion on the evolution of this system and on the gravitational wave (GW) signals it emits in the presence of multiple superradiance modes. Assuming that the product of the BH mass and the scalar mass is small, we obtain analytical approximations of all important quantities, which can be directly applied to phenomenological studies. In addition, we confirm that accretion could significantly enhance the GW emission and reduce its duration, and show that the GW beat signature is similarly modified.

preprint 2026 arXiv

Local Group Velocity Distribution inside Superradiant Condensates

Superradiance enables scalar fields to extract energy and angular momentum from a rotating black hole (BH), leading to the formation of a BH-condensate system. Previous studies mainly focus on the phase velocity, which propagates in the azimuthal direction. In this work, we show that the superradiant scalar condensate presents a nontrivial group velocity distribution. In the region sufficiently far from the BH, the condensate exhibits a radial velocity magnitude that approaches $(r_g\mu/2)\sin(2\omega t - 2\phi)$, while the polar and azimuthal velocity magnitudes asymptotically decline as $\propto 1/r$.

preprint 2025 arXiv

Learnable Query Aggregation with KV Routing for Cross-view Geo-localisation

Cross-view geo-localisation (CVGL) aims to estimate the geographic location of a query image by matching it with images from a large-scale database. However, the significant view-point discrepancies present considerable challenges for effective feature aggregation and alignment. To address these challenges, we propose a novel CVGL system that incorporates three key improvements. Firstly, we leverage the DINOv2 backbone with a convolution adapter fine-tuning to enhance model adaptability to cross-view variations. Secondly, we propose a multi-scale channel reallocation module to strengthen the diversity and stability of spatial representations. Finally, we propose an improved aggregation module that integrates a Mixture-of-Experts (MoE) routing into the feature aggregation process. Specifically, the module dynamically selects expert subspaces for the keys and values in a cross-attention framework, enabling adaptive processing of heterogeneous input domains. Extensive experiments on the University-1652 and SUES-200 datasets demonstrate that our method achieves competitive performance with fewer trained parameters.

preprint 2024 arXiv

CANAMRF: An Attention-Based Model for Multimodal Depression Detection

Multimodal depression detection is an important research topic that aims to predict human mental states using multimodal data. Previous methods treat different modalities equally and fuse them with naïve mathematical operations without measuring their relative importance, and so cannot obtain effective multimodal representations for downstream depression tasks. To address this concern, we present a Cross-modal Attention Network with Adaptive Multi-modal Recurrent Fusion (CANAMRF) for multimodal depression detection. CANAMRF consists of a multimodal feature extractor, an Adaptive Multimodal Recurrent Fusion module, and a Hybrid Attention Module. In experiments on two benchmark datasets, CANAMRF demonstrates state-of-the-art performance, underscoring the effectiveness of our proposed approach.

preprint 2023 arXiv

Study Duration Prediction for Clinical Trials with Time-to-Event Endpoints Using Mixture Distributions Accounting for Heterogeneous Population

In the era of precision medicine, more and more clinical trials are now driven or guided by biomarkers, which are patient characteristics objectively measured and evaluated as indicators of normal biological processes, pathogenic processes, or pharmacologic responses to therapeutic interventions. With the overarching objective to optimize and personalize disease management, biomarker-guided clinical trials increase efficiency by appropriately utilizing prognostic or predictive biomarkers in the design. However, the efficiency gain is often not quantitatively compared to the traditional all-comers design, in which a faster enrollment rate is expected (e.g., due to no restriction to biomarker-positive patients), potentially leading to a shorter duration. To accurately predict biomarker-guided trial duration, we propose a general framework using mixture distributions accounting for heterogeneous population. Extensive simulations are performed to evaluate the impact of heterogeneous population and the dynamics of biomarker characteristics and disease on the study duration. Several influential parameters including median survival time, enrollment rate, biomarker prevalence and effect size are identified. Re-assessments of two publicly available trials are conducted to empirically validate the prediction accuracy and to demonstrate the practical utility. The R package detest is developed to implement the proposed method and is publicly available on CRAN.
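To make the idea of a mixture distribution for a heterogeneous population concrete, here is a minimal Python sketch. It is not the detest package's method or API. It assumes, purely for illustration, that biomarker-positive and biomarker-negative patients each have exponential survival, and the names `mixture_survival` and `mixture_median` are hypothetical.

```python
import math


def mixture_survival(t: float, prevalence: float,
                     median_pos: float, median_neg: float) -> float:
    """Survival probability at time t for a two-component exponential mixture.

    Assumption for illustration: each subgroup has exponential survival,
    parameterised by its median survival time.
    """
    if not 0.0 <= prevalence <= 1.0:
        raise ValueError("prevalence must lie in [0, 1]")
    if median_pos <= 0 or median_neg <= 0:
        raise ValueError("median survival times must be positive")
    rate_pos = math.log(2) / median_pos  # exponential hazard from the median
    rate_neg = math.log(2) / median_neg
    return (prevalence * math.exp(-rate_pos * t)
            + (1.0 - prevalence) * math.exp(-rate_neg * t))


def mixture_median(prevalence: float, median_pos: float,
                   median_neg: float, tol: float = 1e-9) -> float:
    """Time at which the mixture survival drops to 0.5, found by bisection."""
    # Survival is 1 at t = 0 and at most 0.5 at the larger subgroup median,
    # so the mixture median is bracketed by [0, max(median_pos, median_neg)].
    lo, hi = 0.0, max(median_pos, median_neg)
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if mixture_survival(mid, prevalence, median_pos, median_neg) > 0.5:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0
```

Calling `mixture_median(0.3, 12.0, 24.0)` gives the overall median survival, in the same time unit as the inputs, when 30% of patients belong to the subgroup with a 12-unit median. The result lies between the two subgroup medians and moves as the prevalence changes, which is one way biomarker prevalence can affect a predicted study duration.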

preprint 2022 arXiv

A general scheme of differential imaging employing weak measurement

We propose and experimentally realize a general scheme of differential imaging employing the idea of weak measurement. We show that the weak coupling between the system of interest and a two-level ancilla can introduce a two-beam circuit after an arbitrary pre-selection of the ancilla. By choosing the post-selection orthogonal to the pre-selection measurement, an effective imaging platform based on differential operations is achieved. Experimental results on both the Sagnac interferometer and ultra-thin Wollaston prism demonstrate that our imaging scheme successfully yields the boundary information of complex geometric configurations.

preprint 2022 arXiv

A robust and lightweight deep attention multiple instance learning algorithm for predicting genetic alterations

Deep-learning models based on whole-slide digital pathology images (WSIs) have become increasingly popular for predicting molecular biomarkers. Instance-based models have been the mainstream strategy for predicting genetic alterations using WSIs, although bag-based models along with self-attention mechanism-based algorithms have been proposed for other digital pathology applications. In this paper, we propose a novel Attention-based Multiple Instance Mutation Learning (AMIML) model for predicting gene mutations. AMIML comprises successive 1-D convolutional layers, a decoder, and a residual weight connection to facilitate further integration of a lightweight attention mechanism to detect the most predictive image patches. Using data for 24 clinically relevant genes from four cancer cohorts in The Cancer Genome Atlas (TCGA) studies (UCEC, BRCA, GBM and KIRC), we compared AMIML with one popular instance-based model and four recently published bag-based models (e.g., CHOWDER and HE2RNA). AMIML demonstrated excellent robustness, not only outperforming all five baseline algorithms in the vast majority of the tested genes (17 out of 24), but also providing near-best performance for the other seven genes. Conversely, the performance of the published baseline algorithms varied across different cancers and genes. In addition, compared to the published models for genetic alterations, AMIML provided a significant improvement for predicting a wide range of genes (e.g., KMT2C, TP53, and SETD2 for KIRC; ERBB2, BRCA1, and BRCA2 for BRCA; JAK1, POLE, and MTOR for UCEC) and produced outstanding predictive models for other clinically relevant gene mutations that have not been reported in the current literature. Furthermore, with the flexible and interpretable attention-based MIL pooling mechanism, AMIML could further zero in on and detect predictive image patches.

preprint 2022 arXiv

Are We Ready for Robust and Resilient SLAM? A Framework For Quantitative Characterization of SLAM Datasets

Reliability of SLAM systems is considered one of the critical requirements in modern autonomous systems. This has directed efforts toward developing many state-of-the-art systems, creating challenging datasets, and introducing rigorous metrics to measure SLAM performance. However, the link between datasets and performance in the robustness/resilience context has rarely been explored. To fill this void, characterization of the operating conditions of SLAM systems is essential to provide an environment for quantitative measurement of robustness and resilience. In this paper, we argue that for proper evaluation of SLAM performance, the characterization of SLAM datasets serves as a critical first step. The study starts by reviewing previous efforts for quantitative characterization of SLAM datasets. Then, the problem of perturbation characterization is discussed and the linkage to SLAM robustness/resilience is established. After that, we propose a novel, generic and extendable framework for quantitative analysis and comparison of SLAM datasets. Additionally, a description of different characterization parameters is provided. Finally, we demonstrate the application of our framework by presenting the characterization results of three SLAM datasets (KITTI, EuroC-MAV, and TUM-VI), highlighting the level of insight achieved by the proposed framework.

preprint 2022 arXiv

Cell-average based neural network method for high dimensional parabolic differential equations

In this paper, we introduce the cell-average based neural network (CANN) method to solve high-dimensional parabolic partial differential equations. The method is based on the integral or weak formulation of partial differential equations. A feedforward network is trained to map the solution cell averages between neighboring time levels. Initial values and the approximate solution at $t=\Delta t$ obtained by a high-order numerical method are taken as the inputs and outputs of the network, respectively. We use supervised training combined with a simple backpropagation algorithm to train the network parameters. We find that the neural network can be trained to optimality for high-dimensional problems, that the CFL condition does not strictly restrict the CANN method, and that the trained network can be used to solve the same problem with different initial values. For the high-dimensional parabolic equations, convergence is observed, and the errors are shown to be related to the spatial mesh size but independent of the time step size.

preprint 2022 arXiv

Condition-Invariant and Compact Visual Place Description by Convolutional Autoencoder

Visual place recognition (VPR) in condition-varying environments is still an open problem. Popular solutions are CNN-based image descriptors, which have been shown to outperform traditional image descriptors based on hand-crafted visual features. However, current CNN-based descriptors have two drawbacks: a) their high dimension and b) their lack of generalization, leading to low efficiency and poor performance in applications. In this paper, we propose to use a convolutional autoencoder (CAE) to tackle this problem. We employ a high-level layer of a pre-trained CNN to generate features, and train a CAE to map the features to a low-dimensional space to improve the condition invariance property of the descriptor and reduce its dimension at the same time. We verify our method on three challenging datasets involving significant illumination changes, and our method is shown to be superior to the state-of-the-art. For the benefit of the community, we make the source code public.

preprint 2022 arXiv

Deep Polarimetric HDR Reconstruction

This paper proposes a novel learning-based high-dynamic-range (HDR) reconstruction method using a polarization camera. We build on a previous observation that polarization filters with different orientations attenuate natural light differently, and we treat the multiple images acquired by the polarization camera as a set acquired under different exposure times, which leads to a solution for the HDR reconstruction problem. We propose a deep HDR reconstruction framework with a feature masking mechanism that uses polarimetric cues available from the polarization camera, called Deep Polarimetric HDR Reconstruction (DPHR). The proposed DPHR uses polarimetric information to propagate valid features through the network more effectively and regress the missing pixels. We demonstrate through both qualitative and quantitative evaluations that the proposed DPHR performs favorably against state-of-the-art HDR reconstruction algorithms.

preprint 2022 arXiv

DePS: An improved deep learning model for de novo peptide sequencing

De novo peptide sequencing from mass spectrometry data is an important method for protein identification. Recently, various deep learning approaches have been applied to de novo peptide sequencing, and DeepNovoV2 is one of the representative models. In this study, we propose an enhanced model, DePS, which can improve the accuracy of de novo peptide sequencing even with missing signal peaks or a large number of noisy peaks in tandem mass spectrometry data. It is shown that, on the same test set as DeepNovoV2, the DePS model achieved excellent results of 74.22%, 74.21% and 41.68% for amino acid recall, amino acid precision and peptide recall, respectively. Furthermore, the results suggest that DePS outperforms DeepNovoV2 on the cross-species dataset.

preprint 2022 arXiv

Diagnostic Communication and Visual System based on Vehicle UDS Protocol

Unified Diagnostic Services (UDS) is a diagnostic communication protocol used in electronic control units (ECUs) within automotive electronics, which is specified in ISO 14229-1. It is derived from ISO 14230-3 (KWP2000) and the now obsolete ISO 15765-3 (Diagnostic Communication over Controller Area Network (DoCAN)). 'Unified' in this context means that it is an international and not a company-specific standard. By now this communication protocol is used in all new ECUs made by Tier 1 suppliers of Original Equipment Manufacturers (OEMs), and is incorporated into other standards, such as AUTOSAR. The ECUs in modern vehicles control nearly all functions, including electronic fuel injection (EFI), engine control, the transmission, anti-lock braking system, door locks, braking, window operation, and more.
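As a rough illustration of what a UDS exchange looks like at the byte level, the Python sketch below builds a ReadDataByIdentifier request and classifies the reply. It is not the system described in this paper. It relies only on general ISO 14229-1 conventions: service ID 0x22, a positive response ID equal to the request ID plus 0x40, and negative responses that start with 0x7F. The function names are hypothetical, and transport framing (for example over CAN) is deliberately left out.

```python
from typing import Dict

READ_DATA_BY_IDENTIFIER = 0x22   # UDS service ID
POSITIVE_RESPONSE_OFFSET = 0x40  # positive response SID = request SID + 0x40
NEGATIVE_RESPONSE_SID = 0x7F     # first byte of every negative response


def build_read_request(did: int) -> bytes:
    """Build a ReadDataByIdentifier request for one 16-bit data identifier."""
    if not 0 <= did <= 0xFFFF:
        raise ValueError("data identifier must fit in 16 bits")
    return bytes([READ_DATA_BY_IDENTIFIER]) + did.to_bytes(2, "big")


def parse_response(request: bytes, response: bytes) -> Dict[str, object]:
    """Classify a raw UDS response relative to the request that caused it."""
    if not request or not response:
        raise ValueError("request and response must be non-empty")
    sid = request[0]
    if response[0] == NEGATIVE_RESPONSE_SID:
        # Negative response layout: 0x7F, rejected SID, negative response code.
        if len(response) < 3 or response[1] != sid:
            raise ValueError("malformed negative response")
        return {"ok": False, "nrc": response[2]}
    if response[0] == sid + POSITIVE_RESPONSE_OFFSET:
        # For ReadDataByIdentifier the DID is echoed before the data record.
        return {"ok": True, "payload": response[1:]}
    raise ValueError("response does not match request")
```

For example, `build_read_request(0xF190)` yields the three bytes `22 F1 90`, where 0xF190 is used here as an example data identifier. A reply beginning with `62` would be parsed as positive, and a reply of `7F 22 31` as a negative response carrying code 0x31.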

preprint 2022 arXiv

Efficient inference of parental origin effects using case-control mother-child genotype data

Parental origin effects play an important role in mammal development and disorder. Case-control mother-child pair genotype data can be used to detect parental origin effects and is often convenient to collect in practice. Most existing methods for assessing parental origin effects do not incorporate any covariates, which may be required to control for confounding factors. We propose to model the parental origin effects through a logistic regression model, with predictors including maternal and child genotypes, parental origins, and covariates. The parental origins may not be fully inferred from genotypes of a target genetic marker, so we propose to use genotypes of markers tightly linked to the target marker to increase inference efficiency. A computationally robust statistical inference procedure is developed based on a modified profile likelihood in a retrospective way. A computationally feasible expectation-maximization algorithm is devised to estimate all unknown parameters involved in the modified profile likelihood. This algorithm differs from the conventional expectation-maximization algorithm in the sense that it is based on a modified instead of the original profile likelihood function. The convergence of the algorithm is established under some mild regularity conditions. This expectation-maximization algorithm also allows convenient handling of missing child genotypes. Large sample properties, including weak consistency, asymptotic normality, and asymptotic efficiency, are established for the proposed estimator under some mild regularity conditions. Finite sample properties are evaluated through extensive simulation studies and the application to a real dataset.

preprint 2022 arXiv

FedDUAP: Federated Learning with Dynamic Update and Adaptive Pruning Using Shared Data on the Server

Despite achieving remarkable performance, Federated Learning (FL) suffers from two critical challenges, i.e., limited computational resources and low training efficiency. In this paper, we propose a novel FL framework, i.e., FedDUAP, with two original contributions, to exploit the insensitive data on the server and the decentralized data in edge devices to further improve the training efficiency. First, a dynamic server update algorithm is designed to exploit the insensitive data on the server, in order to dynamically determine the optimal steps of the server update for improving the convergence and accuracy of the global model. Second, a layer-adaptive model pruning method is developed to perform unique pruning operations adapted to the different dimensions and importance of multiple layers, to achieve a good balance between efficiency and effectiveness. By integrating the two original techniques together, our proposed FL model, FedDUAP, significantly outperforms baseline approaches in terms of accuracy (up to 4.8% higher), efficiency (up to 2.8 times faster), and computational cost (up to 61.9% smaller).

preprint 2022 arXiv

FinBERT-MRC: financial named entity recognition using BERT under the machine reading comprehension paradigm

Financial named entity recognition (FinNER) from literature is a challenging task in the field of financial text information extraction, which aims to extract a large amount of financial knowledge from unstructured texts. It is widely accepted to use sequence tagging frameworks to implement FinNER tasks. However, such sequence tagging models cannot fully take advantage of the semantic information in the texts. Instead, we formulate the FinNER task as a machine reading comprehension (MRC) problem and propose a new model termed FinBERT-MRC. This formulation introduces significant prior information by utilizing well-designed queries, and extracts the start index and end index of target entities without decoding modules such as conditional random fields (CRF). We conduct experiments on a publicly available Chinese financial dataset, ChFinAnn, and a real-world business dataset, AdminPunish. The FinBERT-MRC model achieves average F1 scores of 92.78% and 96.80% on the two datasets, respectively, with average F1 gains of +3.94% and +0.89% over some sequence tagging models including BiLSTM-CRF, BERT-Tagger, and BERT-CRF. The source code is available at https://github.com/zyz0000/FinBERT-MRC.

preprint 2022 arXiv

Fire Together Wire Together: A Dynamic Pruning Approach with Self-Supervised Mask Prediction

Dynamic model pruning is a recent direction that allows for the inference of a different sub-network for each input sample during deployment. However, current dynamic methods rely on learning a continuous channel gating through regularization by inducing sparsity loss. This formulation introduces complexity in balancing different losses (e.g., task loss, regularization loss). In addition, regularization-based methods lack transparent tradeoff hyperparameter selection to realize a computational budget. Our contribution is two-fold: 1) decoupled task and pruning losses; 2) simple hyperparameter selection that enables FLOPs reduction estimation before training. Inspired by the Hebbian theory in neuroscience, "neurons that fire together wire together", we propose to predict a mask to process k filters in a layer based on the activation of its previous layer. We pose the problem as a self-supervised binary classification problem. Each mask predictor module is trained to predict if the log-likelihood for each filter in the current layer belongs to the top-k activated filters. The value k is dynamically estimated for each input based on a novel criterion using the mass of heatmaps. We show experiments on several neural architectures, such as VGG, ResNet and MobileNet, on the CIFAR and ImageNet datasets. On CIFAR, we reach similar accuracy to SOTA methods with 15% and 24% higher FLOPs reduction. Similarly, on ImageNet, we achieve a lower drop in accuracy with up to 13% improvement in FLOPs reduction.
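The top-k formulation can be pictured with a small Python sketch. It only shows how a binary target mask over a layer's filters could be derived from per-filter activation scores. It is not the paper's mask predictor, and it takes k as a given input instead of reproducing the heatmap-mass criterion. The function names and the use of one scalar score per filter are assumptions made for illustration.

```python
from typing import List


def top_k_mask(activations: List[float], k: int) -> List[int]:
    """Return a 0/1 mask marking the k filters with the largest activation."""
    if not 0 <= k <= len(activations):
        raise ValueError("k must lie between 0 and the number of filters")
    # Sort filter indices by activation, largest first; ties keep index order
    # because Python's sort is stable.
    ranked = sorted(range(len(activations)), key=lambda i: -activations[i])
    keep = set(ranked[:k])
    return [1 if i in keep else 0 for i in range(len(activations))]


def kept_fraction(mask: List[int]) -> float:
    """Fraction of filters that remain active under the mask."""
    if not mask:
        raise ValueError("mask must not be empty")
    return sum(mask) / len(mask)
```

With scores `[0.1, 0.9, 0.4, 0.7]` and `k = 2`, the mask keeps the second and fourth filters. A mask like this could serve as the binary label in a self-supervised classification setup, since it is computed from the network's own activations and needs no extra annotation.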

preprint 2022 arXiv

Following Closely: A Robust Monocular Person Following System for Mobile Robot

Monocular person following (MPF) is a capability that supports many useful applications of a mobile robot. However, existing MPF solutions are not completely satisfactory. Firstly, they often fail to track the target at a close distance, either because they are based on a visual servo or because they need the robot to observe the full body. Secondly, their target Re-IDentification (Re-ID) abilities are weak in cases of target appearance change and highly similar appearance of distracting people. To remove the assumption of full-body observation, we propose a width-based tracking module, which relies on the target width, a quantity that can be observed even at a close distance. For handling issues related to appearance variation, we use a global CNN (convolutional neural network) descriptor to represent the target and a ridge regression model to learn a target appearance model online. We adopt a sampling strategy for online classifier learning, in which both long-term and short-term samples are involved. We evaluate our method on two datasets: a public person following dataset and a custom-built one with challenging target appearance and target distance. Our method achieves state-of-the-art (SOTA) results on both datasets. For the benefit of the community, we make the dataset and the source code public.

preprint 2022 arXiv

Highlight Specular Reflection Separation based on Tensor Low-rank and Sparse Decomposition Using Polarimetric Cues

This paper is concerned with specular reflection removal based on a tensor low-rank decomposition framework with the help of polarization information. Our method is motivated by the observation that the specular highlight of an image is sparsely distributed while the remaining diffuse reflection can be well approximated by a linear combination of several distinct colors using a low-rank and sparse decomposition framework. Unlike current solutions, our tensor low-rank decomposition keeps the spatial structure of specular and diffuse information, which enables us to recover the diffuse image under strong specular reflection or in saturated regions. We further define and impose a new polarization regularization term as a constraint on color channels. This regularization boosts the performance of the method in recovering an accurate diffuse image by handling the color distortion, a common problem of chromaticity-based methods, especially in the case of strong specular reflection. Through comprehensive experiments on both synthetic and real polarization images, we demonstrate that our method significantly improves the accuracy of specular highlight removal and outperforms competing methods in recovering the diffuse image, especially in regions of strong specular reflection or in saturated areas.

preprint 2022 arXiv

Improving Feature Extraction from Histopathological Images Through A Fine-tuning ImageNet Model

Due to the lack of annotated pathological images, transfer learning has been the predominant approach in the field of digital pathology. Pre-trained neural networks based on the ImageNet database are often used to extract "off the shelf" features, achieving great success in predicting tissue types, molecular features, and clinical outcomes. We hypothesize that fine-tuning the pre-trained models using histopathological images could further improve feature extraction and downstream prediction performance. We used 100,000 annotated HE image patches for colorectal cancer (CRC) to fine-tune a pre-trained Xception model via a two-step approach. The features extracted from the fine-tuned Xception (FTX2048) model and the ImageNet-pretrained (IMGNET2048) model were compared through: (1) tissue classification for HE images from CRC, the same image type that was used for fine-tuning; (2) prediction of immune-related gene expression; and (3) prediction of gene mutations for lung adenocarcinoma (LUAD). Five-fold cross-validation was used for model performance evaluation. The extracted features from the fine-tuned FTX2048 exhibited significantly higher accuracy for predicting tissue types of CRC compared to the off-the-shelf features taken directly from Xception based on the ImageNet database. In particular, FTX2048 markedly improved the accuracy for stroma from 87% to 94%. Similarly, features from FTX2048 boosted the prediction of transcriptomic expression of immune-related genes in LUAD. For the genes that had significant relationships with image features, the features from the fine-tuned model improved the prediction for the majority of the genes. In addition, features from FTX2048 improved prediction of mutation for 5 out of the 9 most frequently mutated genes in LUAD.

preprint 2022 arXiv

Mapping While Following: 2D LiDAR SLAM in Indoor Dynamic Environments with a Person Tracker

2D LiDAR SLAM (Simultaneous Localization and Mapping) is widely used in indoor environments due to its stability and flexibility. However, its mapping procedure is usually operated by a joystick in static environments, while indoor environments are often dynamic, with moving objects such as people. The generated map, with noisy points due to the dynamic objects, is usually incomplete and distorted. To address this problem, we propose a framework for 2D-LiDAR-based SLAM without manual control that effectively excludes dynamic objects (people) and simplifies the process for a robot to map an environment. The framework includes three parts: people tracking, filtering and following. We verify our proposed framework in experiments with two classic 2D-LiDAR-based SLAM algorithms in indoor environments. The results show that this framework is effective in handling dynamic objects and reducing the mapping error.

preprint 2022 arXiv

MonoDistill: Learning Spatial Features for Monocular 3D Object Detection

3D object detection is a fundamental and challenging task for 3D scene understanding, and monocular-based methods can serve as an economical alternative to stereo-based or LiDAR-based methods. However, accurately detecting objects in 3D space from a single image is extremely difficult due to the lack of spatial cues. To mitigate this issue, we propose a simple and effective scheme to introduce the spatial information from LiDAR signals to monocular 3D detectors, without introducing any extra cost in the inference phase. In particular, we first project the LiDAR signals into the image plane and align them with the RGB images. After that, we use the resulting data to train a 3D detector (LiDAR Net) with the same architecture as the baseline model. Finally, this LiDAR Net can serve as the teacher to transfer the learned knowledge to the baseline model. Experimental results show that the proposed method can significantly boost the performance of the baseline model and ranks first among all monocular-based methods on the KITTI benchmark. In addition, extensive ablation studies are conducted, which further demonstrate the effectiveness of each part of our design and illustrate what the baseline model has learned from the LiDAR Net. Our code will be released at https://github.com/monster-ghost/MonoDistill.

preprint 2022 arXiv

Online Mutual Adaptation of Deep Depth Prediction and Visual SLAM

Accurate depth prediction by a convolutional neural network (CNN) is a major challenge for its wide use in practical visual simultaneous localization and mapping (SLAM) applications, such as enhanced camera tracking and dense mapping. This paper sets out to answer the following question: Can we tune a depth prediction CNN with the help of a visual SLAM algorithm, even if the CNN is not trained for the current operating environment, in order to benefit the SLAM performance? To this end, we propose a novel online adaptation framework consisting of two complementary processes: a SLAM algorithm that is used to generate keyframes to fine-tune the depth prediction and another algorithm that uses the online adapted depth to improve map quality. Once the potentially noisy map points are removed, we perform global photometric bundle adjustment (BA) to improve the overall SLAM performance. Experimental results on both benchmark datasets and a real robot in our own experimental environments show that our proposed method improves the overall SLAM accuracy. While regularization has been shown to be effective in multi-task classification problems, we present experimental results and an ablation study to show the effectiveness of regularization in preventing catastrophic forgetting in the online adaptation of depth prediction, a single-task regression problem. In addition, we compare our online adaptation framework against state-of-the-art pre-trained depth prediction CNNs to show that our online adapted depth prediction CNN outperforms depth prediction CNNs that have been trained on a large collection of datasets.

preprint 2022 arXiv

Optimal Checkpointing for Adjoint Multistage Time-Stepping Schemes

We consider checkpointing strategies that minimize the number of recomputations needed when performing discrete adjoint computations using multistage time-stepping schemes, which require computing several substeps within one complete time step. For this case, we propose two algorithms that can generate optimal checkpointing schedules under weak assumptions. The first is an extension of the seminal Revolve algorithm adapted to multistage schemes. The second algorithm, named CAMS, is developed based on dynamic programming, and it requires the least number of recomputations when compared with other algorithms. The CAMS algorithm is made publicly available in a library with bindings to C and Python. Numerical results illustrate that the proposed algorithms can deliver up to two times the speedup compared with that of classical Revolve. Moreover, we discuss a tailored implementation of an adjoint computation that is arguably better suited for mature scientific computing libraries by avoiding the central control assumed by the original checkpointing strategy. The proposed algorithms have been adopted by the PETSc TSAdjoint library. Their performance has been demonstrated with a large-scale PDE-constrained optimization problem on a leadership-class supercomputer.

preprint 2022 arXiv

Optimize Deep Learning Models for Prediction of Gene Mutations Using Unsupervised Clustering

Deep learning has become the mainstream methodological choice for analyzing and interpreting whole-slide digital pathology images (WSIs). It is commonly assumed that tumor regions carry most predictive information. In this paper, we propose an unsupervised clustering-based multiple-instance learning method and apply it to develop deep-learning models for prediction of gene mutations using WSIs from three cancer types in The Cancer Genome Atlas (TCGA) studies (CRC, LUAD, and HNSCC). We show that unsupervised clustering of image patches can help identify predictive patches, exclude patches lacking predictive information, and therefore improve prediction of gene mutations in all three cancer types, compared with the WSI-based method without selection of image patches and with models based only on tumor regions. Additionally, our proposed algorithm outperformed two recently published baseline algorithms leveraging unsupervised clustering to assist model prediction. The unsupervised-clustering-based approach for mutation prediction allows identification of the spatial regions related to mutation of a specific gene via the resolved probability scores, highlighting the heterogeneity of a predicted genotype in the tumor microenvironment. Finally, our study also demonstrates that selection of tumor regions of WSIs is not always the best way to identify patches for prediction of gene mutations, and other tissue types in the tumor microenvironment may provide better prediction ability for gene mutations than tumor tissues.

preprint2022arXiv

Perspective Phase Angle Model for Polarimetric 3D Reconstruction

Current polarimetric 3D reconstruction methods, including those in the well-established shape from polarization literature, are all developed under the orthographic projection assumption. In the case of a large field of view, however, this assumption does not hold and may result in significant reconstruction errors. To address this problem, we present the perspective phase angle (PPA) model, which is applicable to perspective cameras. Compared with the orthographic model, the proposed PPA model accurately describes the relationship between polarization phase angle and surface normal under perspective projection. In addition, the PPA model makes it possible to estimate surface normals from a single-view phase angle map and does not suffer from the so-called $π$-ambiguity problem. Experiments on real data show that the PPA model is more accurate for surface normal estimation with a perspective camera than the orthographic model.

preprint2022arXiv

PL-VINS: Real-Time Monocular Visual-Inertial SLAM with Point and Line Features

Leveraging line features to improve localization accuracy of point-based visual-inertial SLAM (VINS) is gaining interest as they provide additional constraints on scene structure. However, real-time performance when incorporating line features in VINS has not been addressed. This paper presents PL-VINS, a real-time optimization-based monocular VINS method with point and line features, developed based on the state-of-the-art point-based VINS-Mono. We observe that current works use the LSD algorithm to extract line features; however, LSD is designed for scene shape representation instead of the pose estimation problem, which becomes the bottleneck for the real-time performance due to its high computational cost. In this paper, a modified LSD algorithm is presented by studying a hidden parameter tuning and length rejection strategy. The modified LSD can run at least three times as fast as LSD. Further, by representing space lines with the Plücker coordinates, the residual error in line estimation is modeled in terms of the point-to-line distance, which is then minimized by iteratively updating the minimum four-parameter orthonormal representation of the Plücker coordinates. Experiments in a public benchmark dataset show that the localization error of our method is 12-16% less than that of VINS-Mono at the same pose update frequency. The source code of our method is available at: https://github.com/cnqiangfu/PL-VINS.

preprint2022arXiv

Prognostic Significance of Tumor-Infiltrating Lymphocytes Using Deep Learning on Pathology Images in Colorectal Cancers

Purpose: Tumor-infiltrating lymphocytes (TILs) have significant prognostic value in cancers. However, very few automated, deep-learning-based TIL scoring algorithms have been developed for colorectal cancers (CRC). Methods: We developed an automated, multiscale LinkNet workflow for quantifying cellular-level TILs for CRC tumors using H&E-stained images. The predictive performance of the automatic TIL scores (TIL) for disease progression and overall survival was evaluated using two international datasets, including 554 CRC patients from The Cancer Genome Atlas (TCGA) and 1130 CRC patients from Molecular and Cellular Oncology (MCO). Results: The LinkNet model provided outstanding precision (0.9508), recall (0.9185), and overall F1 score (0.9347). Clear dose-response relationships were observed between TILs and the risk of disease progression or death in both TCGA and MCO cohorts. Both univariate and multivariate Cox regression analyses for the TCGA data demonstrated that patients with high TILs had a significant (approx. 75%) reduction of risk for disease progression. In both MCO and TCGA studies, the TIL-high group was significantly associated with improved overall survival in univariate analysis (30% and 54% reduction in risk, respectively). However, potential confounding was observed in the MCO dataset. The favorable effects of high TILs were consistently observed in different subgroups according to known risk factors. Conclusion: A deep-learning workflow for automatic TIL quantification based on LinkNet was successfully developed.

preprint2022arXiv

Sparse Optical Flow-Based Line Feature Tracking

In this paper we propose a novel sparse optical flow (SOF)-based line feature tracking method for the camera pose estimation problem. This method is inspired by the point-based SOF algorithm and developed based on the observation that two adjacent images in time-varying image sequences satisfy brightness invariance. Based on this observation, we redefine the goal of line feature tracking: track the two endpoints of a line feature instead of the entire line, based on gray value matching instead of descriptor matching. To achieve this goal, an efficient two endpoint tracking (TET) method is presented: first, describe a given line feature with its two endpoints; next, track the two endpoints based on SOF to obtain two new tracked endpoints by minimizing a pixel-level grayscale residual function; finally, connect the two tracked endpoints to generate a new line feature. The correspondence is established between the given and the new line feature. Compared with current descriptor-based methods, our TET method does not need to compute descriptors or detect line features repeatedly, and therefore has an obvious advantage in computation. Experiments on several public benchmark datasets show that our method yields highly competitive accuracy with an obvious advantage in speed.

preprint2022arXiv

The PETSc Community Is the Infrastructure

The communities who develop and support open source scientific software packages are crucial to the utility and success of such packages. Moreover, these communities form an important part of the human infrastructure that enables scientific progress. This paper discusses aspects of the PETSc (Portable Extensible Toolkit for Scientific Computation) community, its organization, and technical approaches that enable community members to help each other efficiently.

preprint2022arXiv

Trust-region approximation of extreme trajectories in power system dynamics

In this work we present a novel technique, based on a trust-region optimization algorithm and second-order trajectory sensitivities, to compute the extreme trajectories of power system dynamic simulations given a bounded set that represents parametric uncertainty. We show how this method, while remaining computationally efficient compared with sampling-based techniques, overcomes the limitations of previous sensitivity-based techniques to approximate the bounds of the trajectories, when the local approximation loses validity because of the nonlinearity. In addition, we show how this method can be adapted to account for those cases in which the initial conditions depend on the uncertain parameter. To conclude, we present several numerical experiments that showcase the accuracy and scalability of the technique, including a demonstration on the IEEE New England test system.

preprint2021arXiv

A Comprehensive Review of Computer-aided Whole-slide Image Analysis: from Datasets to Feature Extraction, Segmentation, Classification, and Detection Approaches

With the development of computer-aided diagnosis (CAD) and image scanning technology, Whole-slide Image (WSI) scanners are widely used in the field of pathological diagnosis. Therefore, WSI analysis has become the key to modern digital pathology. Since 2004, WSI has been used more and more in CAD. Since machine vision methods are usually based on semi-automatic or fully automatic computers, they are highly efficient and labor-saving. The combination of WSI and CAD technologies for segmentation, classification, and detection helps histopathologists obtain more stable and quantitative analysis results, save labor costs and improve diagnosis objectivity. This paper reviews the methods of WSI analysis based on machine learning. Firstly, the development status of WSI and CAD methods is introduced. Secondly, we discuss publicly available WSI datasets and evaluation metrics for segmentation, classification, and detection tasks. Then, the latest developments of machine learning in WSI segmentation, classification, and detection are reviewed. Finally, the existing methods are studied, the applicability of the analysis methods is analyzed, and the application prospects of the analysis methods in this field are forecast.

preprint2021arXiv

A new type-II lepidocrocite-type TiO2/GaSe heterostructure: Electronic and optical properties, bandgap engineering, interaction with ultrafast laser pulses

Recently, van der Waals heterostructures have attracted interest both theoretically and experimentally for their potential applications in photoelectronic devices, photovoltaic devices, plasmonic devices and photocatalysis. Inspired by this, we design a lepidocrocite-type TiO2/GaSe heterostructure. Via first-principles simulations, we show that such a heterostructure is a direct bandgap semiconductor with a strong and broad optical absorption, ranging from visible light to the UV region, exhibiting its potential application in photoelectronic and photovoltaic devices. With the planar-averaged electron density difference and Bader charge analysis, the heterostructure shows a strong capacity for enhancing the charge redistribution, especially at the interface, prolonging the lifetime of excitons, and hence improving photocatalytic performance. By applying biaxial strain and interlayer coupling, the heterostructure exhibits a direct-indirect bandgap transition and shows a potential for mechanical sensors due to the smooth and linear variation of bandgaps. Furthermore, our result indicates that a lower interlayer distance leads to a stronger charge redistribution. The calculation of irradiating the heterostructure with ultrafast laser pulses further reveals a semiconductor-metal transition for the heterostructure. Moreover, we find an enhanced induced plasmonic current in the heterostructure under both x-polarized and z-polarized laser, which is beneficial to the design of plasmonic devices. Our research provides valuable insight into applying the lepidocrocite-type TiO2/GaSe heterostructure in photoelectronic, photovoltaic, photocatalytic, mechanical sensing and plasmonic realms.

preprint2021arXiv

Polarimetric Monocular Dense Mapping Using Relative Deep Depth Prior

This paper is concerned with polarimetric dense map reconstruction based on a polarization camera with the help of relative depth information as a prior. In general, polarization imaging is able to reveal information about surface normals such as azimuth and zenith angles, which can support the development of solutions to the problem of dense reconstruction, especially in texture-poor regions. However, polarimetric shape cues are ambiguous due to two types of polarized reflection (specular/diffuse). Although methods have been proposed to address this issue, they either are offline and therefore not practical in robotics applications, or use incomplete polarimetric cues, leading to sub-optimal performance. In this paper, we propose an online reconstruction method that uses full polarimetric cues available from the polarization camera. With our online method, we can propagate sparse depth values both along and perpendicular to iso-depth contours. Through comprehensive experiments on challenging image sequences, we demonstrate that our method is able to significantly improve the accuracy of the depth map as well as increase its density, especially in regions of poor texture.

preprint2020arXiv

Accurate $p$-Value Calculation for Generalized Fisher's Combination Tests Under Dependence

Combining dependent tests of significance has broad applications, but the $p$-value calculation is challenging. Current moment-matching methods (e.g., Brown's approximation) for Fisher's combination test tend to significantly inflate the type I error rate at significance levels below 0.05, which could lead to significant false discoveries in big data analyses. This paper provides several more accurate and computationally efficient $p$-value calculation methods for a general family of Fisher-type statistics, referred to as the GFisher. The GFisher covers Fisher's combination, Good's statistic, Lancaster's statistic, weighted Z-score combination, etc. It allows a flexible weighting scheme, as well as an omnibus procedure that automatically adapts proper weights and degrees of freedom to the given data. The new $p$-value calculation methods are based on novel ideas of moment-ratio matching and joint-distribution surrogating. Systematic simulations show that they are accurate under the multivariate Gaussian distribution, and robust under the generalized linear model and the multivariate $t$-distribution, down to at least the $10^{-6}$ level. We illustrate the usefulness of the GFisher and the new $p$-value calculation methods in analyzing both simulated and real data of gene-based SNP-set association studies in genetics. The relevant computation has been implemented in the R package GFisher.

preprint2020arXiv

Gated Path Selection Network for Semantic Segmentation

Semantic segmentation is a challenging task that needs to handle large scale variations, deformations and different viewpoints. In this paper, we develop a novel network named Gated Path Selection Network (GPSNet), which aims to learn adaptive receptive fields. In GPSNet, we first design a two-dimensional multi-scale network, SuperNet, which densely incorporates features from growing receptive fields. To dynamically select desirable semantic context, a gate prediction module is further introduced. In contrast to previous works that focus on optimizing sample positions on the regular grids, GPSNet can adaptively capture free-form dense semantic contexts. The derived adaptive receptive fields are data-dependent and flexible enough to model different object geometric transformations. On two representative semantic segmentation datasets, i.e., Cityscapes and ADE20K, we show that the proposed approach consistently outperforms previous methods and achieves competitive performance without bells and whistles.

preprint2020arXiv

Implicit Semantic Data Augmentation for Deep Networks

In this paper, we propose a novel implicit semantic data augmentation (ISDA) approach to complement traditional augmentation techniques like flipping, translation or rotation. Our work is motivated by the intriguing property that deep networks are surprisingly good at linearizing features, such that certain directions in the deep feature space correspond to meaningful semantic transformations, e.g., adding sunglasses or changing backgrounds. As a consequence, translating training samples along many semantic directions in the feature space can effectively augment the dataset to improve generalization. To implement this idea effectively and efficiently, we first perform an online estimate of the covariance matrix of deep features for each class, which captures the intra-class semantic variations. Then random vectors are drawn from a zero-mean normal distribution with the estimated covariance to augment the training data in that class. Importantly, instead of augmenting the samples explicitly, we can directly minimize an upper bound of the expected cross-entropy (CE) loss on the augmented training set, leading to a highly efficient algorithm. In fact, we show that the proposed ISDA amounts to minimizing a novel robust CE loss, which adds negligible extra computational cost to a normal training procedure. Although simple, ISDA consistently improves the generalization performance of popular deep models (ResNets and DenseNets) on a variety of datasets, e.g., CIFAR-10, CIFAR-100 and ImageNet. Code for reproducing our results is available at https://github.com/blackfeather-wang/ISDA-for-Deep-Networks.

preprint2020arXiv

Material and debris transport patterns in Moreton Bay, Australia: The influence of Lagrangian coherent structures

Coastal tidal estuaries are vital to the exchange of energy and material between inland waters and the open ocean. Debris originating from the land and ocean enters this environment and is transported by currents (river outflow and tide), wind, waves and density gradients. Understanding and predicting the source and fate of such debris has considerable environmental, economic and visual importance. We show that this issue can be addressed using the Lagrangian coherent structures (LCS) technique, which is highly robust to hydrodynamic model uncertainties. Here we present a comprehensive study showing the utility of this approach to describe the fate of floating material in a coastal tidal embayment. An example is given from Moreton Bay, a semi-enclosed subtropical embayment with high morphologic, ecological and economic significance to Southeast Queensland, Australia. Transport barriers visualised by the LCS create pathways and barriers for material transport in the embayment. It was found that the wind field modified both the rate of attraction and the location of the transport barriers. One of the key outcomes is the demonstration of the significant role of islands in partitioning the transport of material and mixing within the embayment. The distribution of the debris sources along the shoreline is explained by the location of the LCS relative to the shoreline. Therefore, extraction of LCS can help to predict sources and fate of anthropogenic marine debris and thus serve as a useful tool for effective management of vulnerable regions and marine protected areas.

preprint2020arXiv

Regularized finite difference methods for the logarithmic Klein-Gordon equation

We propose and analyze two regularized finite difference methods for the logarithmic Klein-Gordon equation (LogKGE). Due to the blowup phenomena caused by the logarithmic nonlinearity of the LogKGE, it is difficult to construct numerical schemes and establish their error bounds. In order to avoid singularity, we present a regularized logarithmic Klein-Gordon equation (RLogKGE) with a small regularization parameter $0<\varepsilon\ll1$. In addition, two finite difference methods are adopted to solve the RLogKGE, and rigorous error bounds are estimated in terms of the mesh size $h$, the time step $τ$, and the small regularization parameter $\varepsilon$. Finally, numerical experiments are carried out to verify our error estimates of the two numerical methods and the convergence of the RLogKGE to the LogKGE with the linear convergence order $O(\varepsilon)$.

preprint2020arXiv

Two regularized energy-preserving finite difference methods for the logarithmic Klein-Gordon equation

We present and analyze two regularized finite difference methods which preserve the energy of the logarithmic Klein-Gordon equation (LogKGE). In order to avoid the singularity caused by the logarithmic nonlinearity of the LogKGE, we propose a regularized logarithmic Klein-Gordon equation (RLogKGE) with a small regularization parameter $0<\varepsilon\ll1$ to approximate the LogKGE with the convergence order $O(\varepsilon)$. By adopting the energy method, the inverse inequality, and the cut-off technique of the nonlinearity to bound the numerical solution, the error bound $O(h^{2}+\frac{τ^{2}}{\varepsilon^{2}})$ of the two schemes is established in terms of the mesh size $h$, the time step $τ$ and the parameter $\varepsilon$. Numerical results are reported to support our conclusions.

preprint2019arXiv

Lightweight Monocular Depth Estimation Model by Joint End-to-End Filter pruning

Convolutional neural networks (CNNs) have emerged as the state of the art in multiple vision tasks including depth estimation. However, memory and computing power requirements remain challenges for these models. Monocular depth estimation has significant use in robotics and virtual reality, which require deployment on low-end devices. Training a small model from scratch results in a significant drop in accuracy and does not benefit from pre-trained large models. Motivated by the literature on model pruning, we propose a lightweight monocular depth model obtained from a large trained model. This is achieved by removing the least important features with a novel joint end-to-end filter pruning. We propose to learn a binary mask for each filter to decide whether to drop the filter or not. These masks are trained jointly to exploit relations between filters at different layers as well as redundancy within the same layer. We show that we can achieve around a 5x compression rate with a small drop in accuracy on the KITTI driving dataset. We also show that masking can improve accuracy over the baseline with fewer parameters, even without enforcing compression loss.

preprint2019arXiv

Prescribed $Q$-curvature flow on closed manifolds of even dimension

On a closed Riemannian manifold $(M,g_0)$ of even dimension $n \geqslant 4$, the well-known prescribed $Q$-curvature problem asks whether or not there is a metric $g$ conformal to $g_0$ such that its $Q$-curvature, associated with the GJMS operator $\mathbf P_g$, is equal to a given function $f$. Letting $g = e^{2u}g_0$, this problem is equivalent to solving \[ \mathbf P_{g_0} u+Q_{g_0} = f e^{nu}, \] where $Q_{g_0}$ denotes the $Q$-curvature of $g_0$. The primary objective of the paper is to introduce the following negative gradient flow of the time-dependent metric $g(t)$ conformal to $g_0$, \[ \frac{\partial g (t)}{\partial t}= -2\Big(Q_{g (t)} - \frac{\int_M f Q_{g(t)} dμ_{g(t)} }{\int_M f^2 dμ_{g(t)} }f \Big)g(t) \quad \text{ for } t >0, \] to study the problem of prescribing $Q$-curvature. Since $\int_M Q_g dμ_g$ is conformally invariant, our analysis depends on the size of $\int_M Q_{g_0} dμ_{g_0}$, which is assumed to satisfy \[ \int_M Q_{g_0} dμ_{g_0} \ne k (n-1)! \, {\rm vol}(\mathbb S^n) \quad \text{ for all } \; k = 2,3,\dots \] The contribution of the paper is twofold. First, we identify suitable conditions on $f$ such that the gradient flow defined above exists for all time and converges, as time goes to infinity, sequentially or uniformly. Second, we show that various existence theorems for the prescribed $Q$-curvature problem can be derived from the convergence of the flow.