Trust snapshot

Quick read

Trust 21 - Emerging
Verification L1
Unclaimed author
78 works
0 followers
37 topics
4 close collaborators

Published work

78 published items

preprint · 2026 · arXiv

A Readiness-Driven Runtime for Pipeline-Parallel Training under Runtime Variability

Pipeline parallelism is a key technique for scaling large-model training, but modern workloads exhibit runtime variability in computation and communication. Existing pipeline systems typically consume static, profiled, or adaptively generated schedules as pre-committed execution orders. When realized task readiness diverges from the pre-committed order, stages may wait for not-yet-ready work even though other executable work is available, creating stage misalignment, idle bubbles, and reduced utilization. We present Runtime-Readiness-First Pipeline (RRFP), a readiness-driven runtime for pipeline-parallel training. RRFP changes how schedules are consumed at runtime: instead of treating a schedule as a sequence that stages must wait to follow, it treats the schedule as a non-binding hint order for ranking currently ready work. To support this model, RRFP combines message-driven asynchronous communication, lightweight tensor-parallel coordination for collective consistency, and ready-set arbitration for low-overhead dispatch. We implement RRFP in a Megatron-based training framework and evaluate it on language-only and multimodal workloads at up to 128 GPUs. RRFP improves over fixed-order pipeline baselines across all settings. Using the BFW hint, RRFP achieves up to 1.77$\times$ speedup on language-only workloads and up to 2.77$\times$ on multimodal workloads. In cross-framework comparisons, RRFP with the default BF hint outperforms the faster available external system by up to 1.84$\times$ while preserving training correctness.
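
The difference between following a schedule as a fixed order and treating it as a non-binding hint can be illustrated with a toy dispatcher. This is only a minimal sketch of the idea as the abstract states it, not RRFP's implementation: the task names (`F0`, `B0`, ...), the `done`/`ready` sets, and both function names are hypothetical.

```python
from typing import Optional, Sequence, Set


def next_fixed_order(schedule: Sequence[str], done: Set[str], ready: Set[str]) -> Optional[str]:
    """Fixed-order consumption: only the first unfinished task may run.

    Returns None (the stage stalls) when that task is not ready yet.
    """
    for task in schedule:
        if task not in done:
            return task if task in ready else None
    return None


def next_readiness_first(hint: Sequence[str], done: Set[str], ready: Set[str]) -> Optional[str]:
    """Hint-order consumption: run the best-ranked task that is ready right now."""
    for task in hint:
        if task not in done and task in ready:
            return task
    return None


# Hypothetical stage state: F1's input has not arrived, but B0 and F2 are ready.
hint = ["F0", "F1", "B0", "F2", "B1"]
done = {"F0"}
ready = {"B0", "F2"}

fixed_choice = next_fixed_order(hint, done, ready)      # stalls waiting for F1
rrfp_choice = next_readiness_first(hint, done, ready)   # picks B0 instead
print(fixed_choice, rrfp_choice)
```

In this toy state the fixed-order dispatcher idles while the hint-ranked dispatcher finds executable work, which is the kind of bubble the abstract describes.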

preprint · 2026 · arXiv

Asking Back: Interaction-Layer Antidistillation Watermarks

Detecting unauthorized knowledge distillation from a deployed LLM API is hard because the defender controls neither the attacker's training pipeline nor the next-token logits. Existing defenses operate on the teacher's output tokens -- biasing the next-token distribution (green-list watermarks, cryptographic schemes, antidistillation sampling) or rewriting outputs after generation. Recent work shows a paraphrasing attacker can strip these signals without losing the underlying knowledge. We propose interaction-layer antidistillation watermarks, which move the trace one layer higher, into the teacher's interaction behavior: the defender wraps the teacher with a system prompt that intermittently induces a behavioral marker -- an explicit follow-up question, a low-frequency variant, or a declarative restatement. An oblivious distiller inherits the behavior, and the defender audits via black-box queries with a human-validated LLM-as-judge (Cohen's kappa = 0.84/0.78 on strong/style rubrics). Across 63 LoRA-distilled students under a Llama-3.3-70B-Instruct teacher (35,343 judged samples), behavioral watermarks transfer at 88.9% (Gemma) / 80.9% (OLMo) / 45.2% (Qwen) relative fidelity (H1, H2). Under non-adaptive DIPPER paraphrasing, robustness decomposes into a teacher-self ceiling (about 66.4%) and student-relative retention of 21-112%, with OLMo preserving the watermark above the teacher itself (H3, F-Amp). Low-density (about 20%) explicit and implicit declarative variants transfer above per-family baseline (H4, F-Style). An N=20 in-lab study (pre-registered Latin-square) shows all marker variants within 0.22 Likert step of baseline; TOST, Friedman, and Bonferroni-Wilcoxon support H5. The interaction layer is a viable design locus for antidistillation watermarking, complementary to token-, model-, and reasoning-trace-layer defenses.

preprint · 2026 · arXiv

Autoregressive Visual Generation Needs a Prologue

In this work, we propose Prologue, an approach to bridging the reconstruction-generation gap in autoregressive (AR) image generation. Instead of modifying visual tokens to satisfy both reconstruction and generation, Prologue generates a small set of prologue tokens prepended to the visual token sequence. These prologue tokens are trained exclusively with the AR cross-entropy (CE) loss, while visual tokens remain dedicated to reconstruction. This decoupled design lets us optimize generation through the AR model's true distribution without affecting reconstruction quality, which we further formalize from an ELBO perspective. On ImageNet 256x256, Prologue-Base reduces gFID from 21.01 to 10.75 without classifier-free guidance while keeping reconstruction almost unchanged; Prologue-Large reaches a competitive rFID of 0.99 and gFID of 1.46 using a standard AR model without auxiliary semantic supervision. Interestingly, driven only by AR gradients, prologue tokens exhibit emergent semantic structure: linear probing on 16 prologue tokens reaches 35.88% Top-1, far above the 23.71% of the first 16 tokens from a standard tokenizer; resampling with fixed prologue tokens preserves a similar high-level semantic layout. Our results suggest a new direction: generation quality can be improved by introducing a separate learned generative representation while leaving the original representation intact.

preprint · 2026 · arXiv

Doppler-Resilient LEO Satellite OFDM Transmission with Affine Frequency Domain Pilot

Orthogonal frequency division multiplexing (OFDM) based low Earth orbit (LEO) satellite communication systems suffer from severe Doppler shifts, while Doppler-resilient affine frequency-division multiplexing (AFDM) transmission incurs high processing complexity in data detection. In this paper, we explore the channel estimation gain of an affine frequency (AF) domain pilot to enhance OFDM transmission under high mobility. Specifically, we propose a novel AF domain pilot embedding scheme for satellite-ground downlink OFDM systems to capture the channel characteristics. By exploiting the autoregressive (AR) property of adjacent channels, a long short-term memory (LSTM) based predictor is designed to replace the conventional interpolation operation in OFDM channel estimation. Simulation results show that the proposed transmission scheme significantly outperforms the conventional OFDM scheme in terms of bit error rate (BER) under high Doppler scenarios, thus paving a new way for the design of next generation non-terrestrial network (NTN) communication systems.

preprint · 2026 · arXiv

ExecuTorch -- A Unified PyTorch Solution to Run AI Models On-Device

Local execution of AI on edge devices is important for low latency and offline operation. However, deploying models on diverse hardware remains fragmented, often requiring model conversion or complete reimplementation outside the PyTorch ecosystem where the model was originally authored. We introduce ExecuTorch, a unified PyTorch-native deployment framework for edge AI. ExecuTorch enables seamless deployment of machine learning models across heterogeneous compute environments. It scales from embedded microcontrollers to complex system-on-chips (SoCs) with dedicated accelerators, powering devices ranging from wearables and smartphones to large compute clusters. ExecuTorch preserves PyTorch semantics while allowing customization, support for optimizations like quantization, and pluggable execution "backends". These features together enable fast experimentation, allowing researchers to validate deployment behavior entirely within PyTorch, bridging the gap between research and production.

preprint · 2026 · arXiv

From Mirage to Grounding: Towards Reliable Multimodal Circuit-to-Verilog Code Generation

Multimodal large language models (MLLMs) are increasingly used to translate visual artifacts into code, from UI mockups into HTML to scientific plots into Python scripts. A circuit diagram can be viewed as a visual domain-specific language for hardware: it encodes timing, topology, and bit-level semantics that are invisible to casual inspection yet safety critical once fabricated in silicon. Translating such diagrams into register-transfer-level (RTL) code therefore represents an extreme reliability test for vision-to-code generation. We reveal a phenomenon we call Mirage: replacing a circuit diagram with a blank image leaves Pass@k unchanged or even higher, because models bypass the visual input and instead exploit identifier semantics in the module header to retrieve canonical RTL templates. This constitutes a new, highly covert class of defect in AI-assisted code generation that directly undermines MLLMs' trustworthiness. To quantify the effect, we construct C2VEVAL and evaluate eight MLLMs under a paired Normal/Anony protocol in which Anony mode anonymizes all identifiers in both the diagram and the module header; Anony-mode scores drop sharply across all models, confirming that high Normal-mode accuracy is largely a Mirage. We then propose VeriGround (4B), trained with identifier anonymization, refusal augmentation, and D-ORPO (Decision-Focused ORPO) preference alignment that up-weights pivotal generate-or-refuse tokens. VeriGround achieves Functional Pass@1 of 46.11%/42.51% (Normal/Anony) with a False Refusal Rate of only 1.20%/0.00%, while maintaining >92% Refusal Rate on blank images. With only 4B parameters, VeriGround performs on par with GPT-5.4 under Normal and significantly outperforms all baselines under Anony, confirming genuine visual grounding.

preprint · 2026 · arXiv

Performance Benchmarks for 2-View and 3-View Fiber-Projection Fine-Grained Particle Detectors

Fine-grained scintillator detectors are critical for precision measurements in nuclear and particle physics, where accurate reconstruction of interaction vertices and secondary particle directions enables separation of signal from background events. A well-known design choice is the fiber readout geometry: traditional 2-View systems use orthogonal X and Y fibers, while next-generation 3-View designs add a third Z-fiber layer that provides unambiguous 3D voxel identification. The 2-View approach suffers from combinatorial ghost hits, that is, false 3D candidates arising from fiber projection ambiguities, which degrade reconstruction performance in high-multiplicity events. This paper presents comprehensive simulation benchmarks quantifying the performance difference between 2-View and 3-View geometries across key metrics. We find that the 3-View geometry reduces ghost hits by 30-90% depending on event topology, provides robust vertex resolution across complex topologies, and maintains superior angular resolution for shower direction reconstruction. These benchmarks inform the design optimization of future detectors and provide quantitative guidance for reconstruction algorithm development across a broad range of experiments including neutrino physics, rare kaon/pion decays, and collider calorimetry.

preprint · 2026 · arXiv

Regularized Centered Emphatic Temporal Difference Learning

Off-policy temporal-difference (TD) learning with function approximation faces a structural tradeoff among stability, projection geometry, and variance control. Emphatic TD (ETD) improves the off-policy projection geometry through follow-on emphasis, but the follow-on trace can have high variance. We revisit this tradeoff through Bellman-error centering. Although centering naturally removes a common drift term from TD errors, we show that a naive centered emphatic extension introduces an auxiliary coupling that can destroy the positive-definiteness of the ETD key matrix. We propose Regularized Emphatic Temporal-Difference Learning (RETD), which preserves the follow-on trace and regularizes only the auxiliary centering recursion, corresponding to lifting the lower-right block of the coupled key matrix from \(1\) to \(1+c\). We derive the RETD core matrix, prove convergence under a conservative sufficient regularization condition, and evaluate the method on diagnostic linear off-policy prediction tasks. The experiments show that RETD avoids the instability of naive centered emphatic learning, preserves favorable emphatic geometry, and exhibits a robust intermediate regime for the regularization parameter \(c\) across the diagnostics.

preprint · 2026 · arXiv

Self-Supervised Spatial And Zero-Shot Angular Super-Resolution by Spatial-Angular Implicit Representation For Rotating-View SNR-Efficient Diffusion MRI

Rotating-view thick-slice acquisition is highly SNR-efficient for mesoscale diffusion MRI (dMRI) but requires numerous rotating views to satisfy Nyquist sampling, resulting in long scan time. We propose a self-supervised Spatial-Angular Implicit Neural Representation (SA-INR) that reconstructs high-resolution dMRI from a single view per diffusion direction, representing a massive acceleration. Our model, an MLP conditioned on a b=0 structural prior and the b-direction via FiLM, is trained end-to-end on the anisotropic input. The framework not only accurately reconstructs the trained b-directions (spatial SR) but also learns a continuous q-space representation, enabling high-fidelity "zero-shot" synthesis of unseen b-directions (angular SR). On simulated data, our method achieved high fidelity for both trained (34.82 dB) and unseen (33.08 dB) directions. Most importantly, the synthesized angular data also improved the quantitative accuracy of downstream DTI model fitting. Our SA-INR framework breaks the classical sampling limits, paving the way for fast, quantitative high-resolution dMRI.

preprint · 2026 · arXiv

Sub-Laplacian generalized curvature dimension inequalities on Riemannian foliations

We develop a Bochner theory and Bakry-Emery calculus for horizontal Laplacians associated with general Riemannian foliations. No bundle-like assumption on the metric, nor any total geodesicity or minimality condition on the leaves is imposed. Using a metric connection adapted to the horizontal-vertical splitting, we derive explicit Bochner formulas for the horizontal Laplacian acting on horizontal and vertical gradients, as well as a unified identity for the full gradient. These formulas involve horizontal Ricci curvature, torsion and vertical mean curvature terms intrinsic to the foliated structure. From these identities, we establish generalized curvature dimension inequalities, extending earlier results in sub-Riemannian geometry. As applications, we obtain horizontal Laplacian comparison theorems, Bonnet-Myers type compactness results with explicit diameter bounds, stochastic completeness, first eigenvalue estimates and gradient and regularization estimates for the horizontal heat semigroup. The framework applies, in particular, to contact manifolds and Carnot groups of arbitrary step.

preprint · 2026 · arXiv

Taming the Entropy Cliff: Variable Codebook Size Quantization for Autoregressive Visual Generation

Most discrete visual tokenizers rely on a default design: every position in the sequence shares the same codebook. Researchers try to scale the codebook size $K$ to get better reconstruction performance. Such a constant-codebook design hits a fundamental information-theoretic limit. We observe that the per-position conditional entropy of the training set decays so quickly along the sequence that, after a few positions, the conditional distribution becomes essentially deterministic. On ImageNet with $K=16384$, this happens within only 2 out of 256 positions, turning the remaining 254 into a memorization problem. We call this phenomenon the Entropy Cliff and formalize it with a simple expression: $t^{*} = \lceil \log_2 N / \log_2 K \rceil$. Interestingly, this phenomenon is not observed in language, as its natural structure keeps the effective entropy per position well below the codebook capacity. To address this, we propose Variable Codebook Size Quantization (VCQ), where the codebook size $K_t$ grows monotonically along the sequence from $K_{\min}=2$ to $K_{\max}$, leaving the loss function, parameter count, and AR training procedure unchanged. With a vanilla autoregressive Transformer and standard next-token prediction, a base version of VCQ reduces gFID w/o CFG from 27.98 to 14.80 on ImageNet $256\times256$ over the baseline. Scaled up, it reaches gFID 1.71 with 684M autoregressive parameters, without any extra training techniques such as semantic regularization or causal alignment. The extreme information bottleneck at $K_{\min}=2$ naturally induces a coarse-to-fine semantic hierarchy: a linear probe on only the first 10 tokens reaches 43.8% top-1 accuracy on ImageNet, compared to 27.1% for uniform codebooks. Ultimately, these results show that what matters is not only the total capacity of the codebook, but also how that capacity is distributed and organized.
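
The Entropy Cliff expression $t^{*} = \lceil \log_2 N / \log_2 K \rceil$ can be checked numerically. The sketch below assumes that $N$ is the number of training images and uses the ImageNet-1k training set size of 1,281,167; the abstract does not define $N$ explicitly, so that reading, along with the function and constant names, is an assumption made for illustration.

```python
import math


def entropy_cliff_position(n_samples: int, codebook_size: int) -> int:
    """Evaluate t* = ceil(log2 N / log2 K) from the abstract's expression."""
    return math.ceil(math.log2(n_samples) / math.log2(codebook_size))


# Assumption: N is the ImageNet-1k training set size.
N_IMAGENET_TRAIN = 1_281_167

t_star = entropy_cliff_position(N_IMAGENET_TRAIN, 16384)
t_star_binary = entropy_cliff_position(N_IMAGENET_TRAIN, 2)
print(t_star, t_star_binary)
```

Under this assumption, $K=16384$ gives $t^{*}=2$, consistent with the abstract's statement that the cliff occurs within 2 of 256 positions. A codebook of size 2 at every position would push the same bound out to 21 positions.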

preprint · 2026 · arXiv

Test-time generative augmentation for medical image segmentation

Medical image segmentation is critical for clinical diagnosis, treatment planning, and monitoring, yet segmentation models often struggle with uncertainties stemming from occlusions, ambiguous boundaries, and variations in imaging devices. Traditional test-time augmentation (TTA) techniques typically rely on predefined geometric and photometric transformations, limiting their adaptability and effectiveness in complex medical scenarios. In this study, we introduce Test-Time Generative Augmentation (TTGA), a novel augmentation strategy specifically tailored for medical image segmentation at inference time. Unlike conventional augmentation strategies that suffer from excessive randomness or limited flexibility, TTGA leverages a domain-fine-tuned generative model to produce contextually relevant and diverse augmentations tailored to the characteristics of each test image. Built upon diffusion model inversion, a masked null-text inversion method is proposed to enable region-specific augmentations during sampling. Furthermore, a dual denoising pathway is designed to balance precise identity preservation with controlled variability. We demonstrate the efficacy of TTGA through extensive experiments across three distinct segmentation tasks spanning nine datasets. Our results consistently demonstrate that TTGA not only improves segmentation accuracy (with DSC gains ranging from 0.1% to 2.3% over the baseline) but also offers pixel-wise error estimation (with DSC gains ranging from 1.1% to 29.0% over the baseline). The source code and demonstration are available at: https://github.com/maxiao0234/TTGA.

preprint · 2026 · arXiv

The RoboSense Challenge: Sense Anything, Navigate Anywhere, Adapt Across Platforms

Autonomous systems are increasingly deployed in open and dynamic environments -- from city streets to aerial and indoor spaces -- where perception models must remain reliable under sensor noise, environmental variation, and platform shifts. However, even state-of-the-art methods often degrade under unseen conditions, highlighting the need for robust and generalizable robot sensing. The RoboSense 2025 Challenge is designed to advance robustness and adaptability in robot perception across diverse sensing scenarios. It unifies five complementary research tracks spanning language-grounded decision making, socially compliant navigation, sensor configuration generalization, cross-view and cross-modal correspondence, and cross-platform 3D perception. Together, these tasks form a comprehensive benchmark for evaluating real-world sensing reliability under domain shifts, sensor failures, and platform discrepancies. RoboSense 2025 provides standardized datasets, baseline models, and unified evaluation protocols, enabling large-scale and reproducible comparison of robust perception methods. The challenge attracted 143 teams from 85 institutions across 16 countries, reflecting broad community engagement. By consolidating insights from 23 winning solutions, this report highlights emerging methodological trends, shared design principles, and open challenges across all tracks, marking a step toward building robots that can sense reliably, act robustly, and adapt across platforms in real-world environments.

preprint · 2026 · arXiv

Towards Fine-Grained and Verifiable Concept Bottleneck Models

Concept Bottleneck Models (CBMs) offer interpretable alternatives to black-box predictors by introducing human-relatable concepts before the final output. However, existing CBMs struggle to verify whether predicted concepts correspond to the correct visual evidence, limiting their reliability. We propose a fine-grained CBM framework that grounds each concept in localized visual evidence, enabling direct inspection of where and how concepts are encoded. This design allows users to interpret predictions and verify that the model learns intended concepts rather than spurious correlations. Experiments on medical imaging benchmarks show that our learned concept space is information-complete and achieves predictive performance comparable to standard CBMs, while substantially improving transparency. Unlike post-hoc attribution methods, our framework validates both the presence and correctness of concept representations, bridging interpretability with verifiability. Our approach enhances the trustworthiness of CBMs and establishes a principled mechanism for human-model interaction at the concept level, paving the way toward more reliable and clinically actionable concept-based learning systems.

preprint · 2023 · arXiv

Quantifying quantum coherence of multiple-charge states in tunable Josephson junctions

Coherence and tunneling play central roles in quantum phenomena. In a tunneling event, the time that a particle spends inside the barrier has been fiercely debated. This problem becomes more complex when tunneling repeatedly occurs back and forth, and when it involves many particles. Here we report the measurement of the coherence time of various charge states tunneling in a nanowire-based tunable Josephson junction, including single charges, multiple charges, and Cooper pairs. We studied all the charge tunneling processes using Landau-Zener-Stückelberg-Majorana (LZSM) interferometry, and observed high-quality interference patterns. In particular, the coherence time of the charge states was extracted from the interference fringes in Fourier space. In addition, our measurements show the break-up of Cooper pairs, from a macroscopic quantum coherent state to individual particle states. Besides the fundamental research interest, our results also establish LZSM interferometry as a powerful technique to explore the coherence time of charges in hybrid devices.

preprint · 2022 · arXiv

A Novel Automated Classification and Segmentation for COVID-19 using 3D CT Scans

Medical image classification and segmentation based on deep learning (DL) are urgent research topics for diagnosing variant viruses in the current COVID-19 situation. In COVID-19 computed tomography (CT) images of the lungs, ground glass turbidity is the most common finding that requires specialist diagnosis. Given this, some researchers have proposed DL models that can stand in for professional diagnostic specialists in clinics that lack expertise. However, although DL methods perform impressively in medical image processing, limited datasets make it challenging to reach human-level diagnostic accuracy. In addition, deep learning algorithms face the challenge of classifying and segmenting medical images in three or more dimensions while maintaining high accuracy. Our model classifies patients' CT images into three types with high accuracy: Normal, Pneumonia and COVID. Two datasets are then used for segmentation, one of which contains only a limited amount of data (20 cases). Our system combines the classification model and the segmentation model into a fully integrated diagnostic model built on ResNet50 and the 3D U-Net algorithm. Fed with different datasets, the model segments the infected area in COVID images according to the classification results. Our model achieves 94.52% accuracy in classifying lung lesions into 3 types: COVID, Pneumonia and Normal. For future medical use, embedding the model into medical facilities might be an efficient way of assisting or substituting for doctors in diagnosis, and could thereby help address a broader range of problems posed by variant viruses in the COVID-19 situation.

preprint · 2022 · arXiv

A Version of Hörmander's Theorem for Markovian Rough Paths

We consider a rough differential equation of the form \(dY_t=\sum_i V_i(Y_t)d\boldsymbol{X}^i_t+V_0(Y_t)dt \), where \(\boldsymbol{X}_t \) is a Markovian rough path. We demonstrate that if the vector fields \((V_i)_{0\leq i\leq d} \) satisfy Hörmander's bracket generating condition, then \(Y_t\) admits a smooth density with a Gaussian type upper bound, provided that the generator of \(X_t\) satisfies certain non-degeneracy conditions. The main new ingredient of this paper is the study of the non-degeneracy property of the Jacobian process of \(X_t\).

preprint · 2022 · arXiv

AI-based Medical e-Diagnosis for Fast and Automatic Ventricular Volume Measurement in the Patients with Normal Pressure Hydrocephalus

Based on CT and MRI images acquired from normal pressure hydrocephalus (NPH) patients, we use machine learning methods to establish a multi-modal, high-performance automatic ventricle segmentation method for efficient and accurate automatic measurement of the ventricular volume. First, we extract the brain CT and MRI images of 143 definite NPH patients. Second, we manually label the ventricular volume (VV) and intracranial volume (ICV). Then, we use a machine learning method to extract features and establish an automatic ventricle segmentation model. Finally, we verify the reliability of the model and achieve automatic measurement of VV and ICV. In CT images, the Dice similarity coefficient (DSC), Intraclass Correlation Coefficient (ICC), Pearson correlation, and Bland-Altman analysis of the automatic and manual segmentation results of the VV were 0.95, 0.99, 0.99, and 4.2$\pm$2.6 respectively. The results for ICV were 0.96, 0.99, 0.99, and 6.0$\pm$3.8 respectively. The whole process took 3.4$\pm$0.3 seconds. In MRI images, the DSC, ICC, Pearson correlation, and Bland-Altman analysis of the automatic and manual segmentation results of the VV were 0.94, 0.99, 0.99, and 2.0$\pm$0.6 respectively. The results for ICV were 0.93, 0.99, 0.99, and 7.9$\pm$3.8 respectively. The whole process took 1.9$\pm$0.1 seconds. We have established a multi-modal and high-performance automatic ventricle segmentation method to achieve efficient and accurate automatic measurement of the ventricular volume of NPH patients. This can help clinicians quickly and accurately understand the condition of NPH patients' ventricles.
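
For readers unfamiliar with the Dice similarity coefficient reported above, it is defined as DSC = 2|A ∩ B| / (|A| + |B|) for two segmentations A and B. The following is a minimal sketch using the standard definition on sets of voxel coordinates. The function name and the toy masks are hypothetical and are not the authors' evaluation code.

```python
from typing import Iterable, Tuple

Voxel = Tuple[int, ...]


def dice_coefficient(auto_mask: Iterable[Voxel], manual_mask: Iterable[Voxel]) -> float:
    """Dice similarity coefficient between two sets of foreground voxels."""
    a, b = set(auto_mask), set(manual_mask)
    if not a and not b:
        return 1.0  # convention: two empty segmentations agree perfectly
    return 2 * len(a & b) / (len(a) + len(b))


# Toy 2D example: each mask has four foreground pixels, three of them shared.
auto_seg = {(0, 0), (0, 1), (1, 0), (1, 1)}
manual_seg = {(0, 1), (1, 0), (1, 1), (2, 1)}

dsc = dice_coefficient(auto_seg, manual_seg)
print(dsc)
```

A DSC of 1.0 means the automatic and manual masks coincide exactly, so the reported values of 0.93 to 0.96 indicate high overlap.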

preprint · 2022 · arXiv

An improved approach to manufacture CNT reinforced magnesium AZ91 composites with increased strength and ductility

Multiwalled carbon nanotubes (MWCNTs) are decorated with Pt nanoparticles by a "layer-by-layer" approach using poly(sodium 4-styrene sulfonate) (PSS) and poly(diallyl dimethylammonium chloride) (PDDA). Transmission electron microscopy (TEM) images and Energy Dispersive X-Ray (EDX) analysis of the samples confirm Pt deposition on the surfaces of CNTs. Dispersibility and dispersion stability of MWCNTs in the solvents are enhanced when MWCNTs are coated with Pt nanoparticles. Mg AZ91 composites reinforced with MWCNTs are then produced by a melt stirring process. Compression tests of the composites show that adding 0.05 wt% Pt-coated MWCNTs to AZ91 improves the composite's mechanical properties compared to pure AZ91 and pristine MWCNT/AZ91. Fracture surface analysis of the composite using a scanning electron microscope (SEM) shows individually pulled-out MWCNTs in the case of the Pt-coated MWCNT/AZ91 composites. We attribute this finding to the uniform dispersion of Pt-coated MWCNTs in Mg due to the improved wettability of Pt-coated MWCNTs in Mg melts. Molecular dynamics (MD) simulations of the interaction between Pt-coated MWCNTs and Mg support this interpretation.

preprint · 2022 · arXiv

Asymmetric Fraunhofer pattern in Josephson junctions from heterodimensional superlattice V$_5$S$_8$

Introduction of spin-orbit coupling (SOC) in a Josephson junction (JJ) gives rise to unusual Josephson effects. We investigate JJs based on a newly discovered heterodimensional superlattice V$_5$S$_8$ with a special form of SOC. The unique homointerface of our JJs enables elimination of extrinsic effects due to interfaces and disorder. We observe asymmetric Fraunhofer patterns with respect to both the perpendicular magnetic field and the current. The asymmetry is influenced by an in-plane magnetic field. Analysis of the pattern points to a nontrivial spatial distribution of the Josephson current that is intrinsic to the SOC in V$_5$S$_8$.

preprint · 2022 · arXiv

Automatic Fine-grained Glomerular Lesion Recognition in Kidney Pathology

Recognition of glomerular lesions is key to diagnosis and treatment planning in kidney pathology; however, coexisting glomerular structures such as mesangial regions make this task more difficult. In this paper, we introduce a scheme to recognize fine-grained glomerular lesions from whole slide images. First, a focal instance structural similarity loss is proposed to drive the model to locate all types of glomeruli precisely. Then an Uncertainty Aided Apportionment Network is designed to carry out the fine-grained visual classification without bounding-box annotations. This double-branch structure extracts common features of the child class from the parent class and produces the uncertainty factor for reconstituting the training dataset. Results of slide-wise evaluation illustrate the effectiveness of the entire scheme, with an 8-22% improvement in mean Average Precision compared with notable detection methods. The comprehensive results clearly demonstrate the effectiveness of the proposed method.

preprint · 2022 · arXiv

BashExplainer: Retrieval-Augmented Bash Code Comment Generation based on Fine-tuned CodeBERT

Developers use shell commands for many tasks, such as file system management, network control, and process management. Bash is one of the most commonly used shells and plays an important role in Linux system development and maintenance. Due to the language flexibility of Bash code, developers who are not familiar with Bash often have difficulty understanding the purpose and functionality of Bash code. In this study, we study the Bash code comment generation problem and propose an automatic method, BashExplainer, based on a two-stage training strategy. In the first stage, we train a Bash encoder by fine-tuning CodeBERT on our constructed Bash code corpus. In the second stage, we first retrieve the most similar code from the code repository for the target code based on semantic and lexical similarity. Then we use the trained Bash encoder to generate two vector representations. Finally, we fuse these two vector representations via the fusion layer and generate the code comment through the decoder. To show the competitiveness of our proposed method, we construct a high-quality corpus by combining the corpus shared in the previous NL2Bash study and the corpus shared in the NLC2CMD competition. This corpus contains 10,592 Bash codes and corresponding comments. Then we select ten baselines from previous studies on automatic code comment generation, which cover information retrieval methods, deep learning methods, and hybrid methods.

preprint2022arXiv

Continuously Doping Bi$_2$Sr$_2$CaCu$_2$O$_{8+δ}$ into Electron-Doped Superconductor by CaH$_2$ Annealing Method

As a typical hole-doped cuprate superconductor, Bi$_2$Sr$_2$CaCu$_2$O$_{8+δ}$ (Bi2212) carrier doping is mostly determined by its oxygen content. Traditional doping methods can regulate its doping level within the range of hole doping. Here we report the first application of CaH$_2$ annealing method in regulating the doping level of Bi2212. By continuously controlling the anneal time, a series of differently doped samples can be obtained. The combined experimental results of x-ray diffraction, scanning transmission electron microscopy, resistance and Hall measurements demonstrate that the CaH$_2$ induced topochemical reaction can effectively change the oxygen content of Bi2212 within a very wide range, even switching from hole doping to electron doping. We also found evidence of a low-$T_c$ superconducting phase in the electron doping side.

preprint2022arXiv

CS$^2$: A Controllable and Simultaneous Synthesizer of Images and Annotations with Minimal Human Intervention

The destitution of image data and corresponding expert annotations limit the training capacities of AI diagnostic models and potentially inhibit their performance. To address such a problem of data and label scarcity, generative models have been developed to augment the training datasets. Previously proposed generative models usually require manually adjusted annotations (e.g., segmentation masks) or need pre-labeling. However, studies have found that these pre-labeling based methods can induce hallucinating artifacts, which might mislead the downstream clinical tasks, while manual adjustment could be onerous and subjective. To avoid manual adjustment and pre-labeling, we propose a novel controllable and simultaneous synthesizer (dubbed CS$^2$) in this study to generate both realistic images and corresponding annotations at the same time. Our CS$^2$ model is trained and validated using high resolution CT (HRCT) data collected from COVID-19 patients to realize an efficient infections segmentation with minimal human intervention. Our contributions include 1) a conditional image synthesis network that receives both style information from reference CT images and structural information from unsupervised segmentation masks, and 2) a corresponding segmentation mask synthesis network to automatically segment these synthesized images simultaneously. Our experimental studies on HRCT scans collected from COVID-19 patients demonstrate that our CS$^2$ model can lead to realistic synthesized datasets and promising segmentation results of COVID infections compared to the state-of-the-art nnUNet trained and fine-tuned in a fully supervised manner.

preprint2022arXiv

Data and Physics Driven Learning Models for Fast MRI -- Fundamentals and Methodologies from CNN, GAN to Attention and Transformers

Research studies have shown no qualms about using data driven deep learning models for downstream tasks in medical image analysis, e.g., anatomy segmentation and lesion detection, disease diagnosis and prognosis, and treatment planning. However, deep learning models are not the sovereign remedy for medical image analysis when the upstream imaging is not being conducted properly (with artefacts). This has been manifested in MRI studies, where the scanning is typically slow, prone to motion artefacts, with a relatively low signal to noise ratio, and poor spatial and/or temporal resolution. Recent studies have witnessed substantial growth in the development of deep learning techniques for propelling fast MRI. This article aims to (1) introduce the deep learning based data driven techniques for fast MRI including convolutional neural network and generative adversarial network based methods, (2) survey the attention and transformer based models for speeding up MRI reconstruction, and (3) detail the research in coupling physics and data driven models for MRI acceleration. Finally, we will demonstrate through a few clinical applications, explain the importance of data harmonisation and explainable models for such fast MRI techniques in multicentre and multi-scanner studies, and discuss common pitfalls in current research and recommendations for future research directions.

preprint2022arXiv

Data Harmonisation for Information Fusion in Digital Healthcare: A State-of-the-Art Systematic Review, Meta-Analysis and Future Research Directions

Removing the bias and variance of multicentre data has always been a challenge in large scale digital healthcare studies, which requires the ability to integrate clinical features extracted from data acquired by different scanners and protocols to improve stability and robustness. Previous studies have described various computational approaches to fuse single modality multicentre datasets. However, these surveys rarely focused on evaluation metrics and lacked a checklist for computational data harmonisation studies. In this systematic review, we summarise the computational data harmonisation approaches for multi-modality data in the digital healthcare field, including harmonisation strategies and evaluation metrics based on different theories. In addition, a comprehensive checklist that summarises common practices for data harmonisation studies is proposed to guide researchers to report their research findings more effectively. Last but not least, flowcharts presenting possible ways for methodology and metric selection are proposed and the limitations of different methods have been surveyed for future research.

preprint2022arXiv

Delta Tuning: A Comprehensive Study of Parameter Efficient Methods for Pre-trained Language Models

Despite the success, the process of fine-tuning large-scale PLMs brings prohibitive adaptation costs. In fact, fine-tuning all the parameters of a colossal model and retaining separate instances for different tasks are practically infeasible. This necessitates a new branch of research focusing on the parameter-efficient adaptation of PLMs, dubbed as delta tuning in this paper. In contrast with the standard fine-tuning, delta tuning only fine-tunes a small portion of the model parameters while keeping the rest untouched, largely reducing both the computation and storage costs. Recent studies have demonstrated that a series of delta tuning methods with distinct tuned parameter selection could achieve performance on a par with full-parameter fine-tuning, suggesting a new promising way of stimulating large-scale PLMs. In this paper, we first formally describe the problem of delta tuning and then comprehensively review recent delta tuning approaches. We also propose a unified categorization criterion that divides existing delta tuning methods into three groups: addition-based, specification-based, and reparameterization-based methods. Though initially proposed as an efficient method to steer large models, we believe that some of the fascinating evidence discovered along with delta tuning could help further reveal the mechanisms of PLMs and even deep neural networks. To this end, we discuss the theoretical principles underlying the effectiveness of delta tuning and propose frameworks to interpret delta tuning from the perspective of optimization and optimal control, respectively. Furthermore, we provide a holistic empirical study of representative methods, where results on over 100 NLP tasks demonstrate a comprehensive performance comparison of different approaches. The experimental results also cover the analysis of combinatorial, scaling and transferable properties of delta tuning.
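
To make the cost argument concrete, the sketch below counts trainable parameters for one illustrative reparameterization-based scheme: a low-rank update B @ A added to a frozen weight matrix. The abstract does not specify this scheme, and the layer size and rank are hypothetical values chosen only for illustration.

```python
def full_finetune_params(d_out: int, d_in: int) -> int:
    """Parameters updated when the whole weight matrix is fine-tuned."""
    return d_out * d_in


def low_rank_delta_params(d_out: int, d_in: int, rank: int) -> int:
    """Parameters updated when only a low-rank delta B @ A is trained.

    B has shape (d_out, rank) and A has shape (rank, d_in); the original
    weight matrix stays frozen. This is an illustrative assumption, not
    the specific method studied in the paper.
    """
    return d_out * rank + rank * d_in


def trainable_fraction(d_out: int, d_in: int, rank: int) -> float:
    """Fraction of the layer's parameters that the delta actually tunes."""
    return low_rank_delta_params(d_out, d_in, rank) / full_finetune_params(d_out, d_in)


if __name__ == "__main__":
    # Hypothetical layer size and rank.
    print(trainable_fraction(1024, 1024, 8))
```

Under these assumed sizes the delta touches under 2% of the layer's weights, which is the kind of reduction in computation and storage the abstract refers to.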

preprint2022arXiv

Delving into the Frequency: Temporally Consistent Human Motion Transfer in the Fourier Space

Human motion transfer refers to synthesizing photo-realistic and temporally coherent videos that enable one person to imitate the motion of others. However, current synthetic videos suffer from the temporal inconsistency in sequential frames that significantly degrades the video quality, yet is far from solved by existing methods in the pixel domain. Recently, some works on DeepFake detection try to distinguish the natural and synthetic images in the frequency domain because of the frequency insufficiency of image synthesizing methods. Nonetheless, there is no work to study the temporal inconsistency of synthetic videos from the aspects of the frequency-domain gap between natural and synthetic videos. In this paper, we propose to delve into the frequency space for temporally consistent human motion transfer. First of all, we make the first comprehensive analysis of natural and synthetic videos in the frequency domain to reveal the frequency gap in both the spatial dimension of individual frames and the temporal dimension of the video. To close the frequency gap between the natural and synthetic videos, we propose a novel Frequency-based human MOtion TRansfer framework, named FreMOTR, which can effectively mitigate the spatial artifacts and the temporal inconsistency of the synthesized videos. FreMOTR explores two novel frequency-based regularization modules: 1) the Frequency-domain Appearance Regularization (FAR) to improve the appearance of the person in individual frames and 2) Temporal Frequency Regularization (TFR) to guarantee the temporal consistency between adjacent frames. Finally, comprehensive experiments demonstrate that the FreMOTR not only yields superior performance in temporal consistency metrics but also improves the frame-level visual quality of synthetic videos. In particular, the temporal consistency metrics are improved by nearly 30% over the state-of-the-art model.

preprint2022arXiv

DocBed: A Multi-Stage OCR Solution for Documents with Complex Layouts

Digitization of newspapers is of interest for many reasons including preservation of history, accessibility and search ability, etc. While digitization of documents such as scientific articles and magazines is prevalent in literature, one of the main challenges for digitization of newspaper lies in its complex layout (e.g. articles spanning multiple columns, text interrupted by images) analysis, which is necessary to preserve human read-order. This work provides a major breakthrough in the digitization of newspapers on three fronts: first, releasing a dataset of 3000 fully-annotated, real-world newspaper images from 21 different U.S. states representing an extensive variety of complex layouts for document layout analysis; second, proposing layout segmentation as a precursor to existing optical character recognition (OCR) engines, where multiple state-of-the-art image segmentation models and several post-processing methods are explored for document layout segmentation; third, providing a thorough and structured evaluation protocol for isolated layout segmentation and end-to-end OCR.

preprint2022arXiv

DRAG: Dynamic Region-Aware GCN for Privacy-Leaking Image Detection

The daily practice of sharing images on social media raises a severe issue about privacy leakage. To address the issue, privacy-leaking image detection has been studied recently, with the goal of automatically identifying images that may leak privacy. Recent advances on this task benefit from focusing on crucial objects via pretrained object detectors and modeling their correlation. However, these methods have two limitations: 1) they neglect other important elements like scenes, textures, and objects beyond the capacity of pretrained object detectors; 2) the correlation among objects is fixed, but a fixed correlation is not appropriate for all the images. To overcome the limitations, we propose the Dynamic Region-Aware Graph Convolutional Network (DRAG) that dynamically identifies crucial regions including objects and other important elements, and models their correlation adaptively for each input image. To identify crucial regions, we cluster spatially-correlated feature channels into several region-aware feature maps. Further, we dynamically model the correlation with the self-attention mechanism and explore the interaction among the regions with a graph convolutional network. The DRAG achieved an accuracy of 87% on the largest dataset for privacy-leaking image detection, which is 10 percentage points higher than the state of the art. A further case study demonstrates that it identifies crucial regions containing not only objects but other important elements like textures.

preprint2022arXiv

Explainable AI (XAI) in Biomedical Signal and Image Processing: Promises and Challenges

Artificial intelligence has become pervasive across disciplines and fields, and biomedical image and signal processing is no exception. The growing and widespread interest on the topic has triggered a vast research activity that is reflected in an exponential research effort. Through study of massive and diverse biomedical data, machine and deep learning models have revolutionized various tasks such as modeling, segmentation, registration, classification and synthesis, outperforming traditional techniques. However, the difficulty in translating the results into biologically/clinically interpretable information is preventing their full exploitation in the field. Explainable AI (XAI) attempts to fill this translational gap by providing means to make the models interpretable and providing explanations. Different solutions have been proposed so far and are gaining increasing interest from the community. This paper aims at providing an overview on XAI in biomedical data processing and points to an upcoming Special Issue on Deep Learning in Biomedical Image and Signal Processing of the IEEE Signal Processing Magazine that is going to appear in March 2022.

preprint2022arXiv

Explainable COVID-19 Infections Identification and Delineation Using Calibrated Pseudo Labels

The upheaval brought by the arrival of the COVID-19 pandemic has continued to bring fresh challenges over the past two years. During this COVID-19 pandemic, there has been a need for rapid identification of infected patients and specific delineation of infection areas in computed tomography (CT) images. Although deep supervised learning methods have been established quickly, the scarcity of both image-level and pixel-level labels as well as the lack of explainable transparency still hinder the applicability of AI. Can we identify infected patients and delineate the infections with extremely minimal supervision? Semi-supervised learning has demonstrated promising performance under limited labelled data and sufficient unlabelled data. Inspired by semi-supervised learning, we propose a model-agnostic calibrated pseudo-labelling strategy and apply it under a consistency regularization framework to generate explainable identification and delineation results. We demonstrate the effectiveness of our model with the combination of limited labelled data and sufficient unlabelled data or weakly-labelled data. Extensive experiments have shown that our model can efficiently utilize limited labelled data and provide explainable classification and segmentation results for decision-making in clinical routine. The code is available at https://github.com/ayanglab/XAI COVID-19.

preprint2022arXiv

Fast MRI Reconstruction: How Powerful Transformers Are?

Magnetic resonance imaging (MRI) is a widely used non-radiative and non-invasive method for clinical interrogation of organ structures and metabolism, with an inherently long scanning time. Methods by k-space undersampling and deep learning based reconstruction have been popularised to accelerate the scanning process. This work focuses on investigating how powerful transformers are for fast MRI by exploiting and comparing different novel network architectures. In particular, a generative adversarial network (GAN) based Swin transformer (ST-GAN) was introduced for the fast MRI reconstruction. To further preserve the edge and texture information, edge enhanced GAN based Swin transformer (EES-GAN) and texture enhanced GAN based Swin transformer (TES-GAN) were also developed, where a dual-discriminator GAN structure was applied. We compared our proposed GAN based transformers, standalone Swin transformer and other convolutional neural networks based GAN model in terms of the evaluation metrics PSNR, SSIM and FID. We showed that transformers work well for the MRI reconstruction from different undersampling conditions. The utilisation of GAN's adversarial structure improves the quality of images reconstructed when undersampled for 30% or higher. The code is publicly available at https://github.com/ayanglab/SwinGANMR.

preprint2022arXiv

Faster Diffusion Cardiac MRI with Deep Learning-based breath hold reduction

Diffusion Tensor Cardiac Magnetic Resonance (DT-CMR) enables us to probe the microstructural arrangement of cardiomyocytes within the myocardium in vivo and non-invasively, which no other imaging modality allows. This innovative technology could revolutionise the ability to perform cardiac clinical diagnosis, risk stratification, prognosis and therapy follow-up. However, DT-CMR is currently inefficient with over six minutes needed to acquire a single 2D static image. Therefore, DT-CMR is currently confined to research but not used clinically. We propose to reduce the number of repetitions needed to produce DT-CMR datasets and subsequently de-noise them, decreasing the acquisition time by a linear factor while maintaining acceptable image quality. Our proposed approach, based on Generative Adversarial Networks, Vision Transformers, and Ensemble Learning, performs significantly and considerably better than previous proposed approaches, bringing single breath-hold DT-CMR closer to reality.

preprint2022arXiv

Fitting AGN/galaxy X-ray-to-radio SEDs with CIGALE and improvement of the code

Modern and future surveys effectively provide a panchromatic view for large numbers of extragalactic objects. Consistently modeling these multiwavelength survey data is a critical but challenging task for extragalactic studies. The Code Investigating GALaxy Emission (CIGALE) is an efficient PYTHON code for spectral energy distribution (SED) fitting of galaxies and active galactic nuclei (AGNs). Recently, a major extension of CIGALE (named X-CIGALE) has been developed to account for AGN/galaxy X-ray emission and improve AGN modeling at UV-to-IR wavelengths. Here, we apply X-CIGALE to different samples, including COSMOS spectroscopic type 2 AGNs, CDF-S X-ray detected normal galaxies, SDSS quasars, and COSMOS radio objects. From these tests, we identify several weaknesses of X-CIGALE and improve the code accordingly. These improvements are mainly related to AGN intrinsic X-ray anisotropy, X-ray binary emission, AGN accretion-disk SED shape, and AGN radio emission. These updates improve the fit quality and allow new interpretation of the results, based on which we discuss physical implications. For example, we find that AGN intrinsic X-ray anisotropy is moderate, and can be modeled as $L_X(θ) \propto 1+\cos θ$, where $θ$ is the viewing angle measured from the AGN axis. We merge the new code into the major branch of CIGALE, and publicly release this new version as CIGALE v2022.0 on https://cigale.lam.fr
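
As a small worked example of the anisotropy relation quoted above, the sketch below evaluates $L_X(θ) \propto 1+\cos θ$ at a few viewing angles. Normalising to the face-on value ($θ = 0$) is an assumption made here for readability; the function name is hypothetical and is not part of CIGALE.

```python
import math


def relative_xray_luminosity(theta_deg: float) -> float:
    """Apparent X-ray luminosity at viewing angle theta, relative to face-on.

    Uses L_X(theta) proportional to 1 + cos(theta), normalised so that
    theta = 0 (looking along the AGN axis) gives 1. The normalisation is an
    illustrative choice, not something stated in the abstract.
    """
    theta = math.radians(theta_deg)
    return (1.0 + math.cos(theta)) / 2.0


if __name__ == "__main__":
    for angle in (0, 30, 60, 90):
        print(angle, round(relative_xray_luminosity(angle), 3))
```

Under this relation an edge-on view ($θ = 90°$) receives half the face-on X-ray flux, which is consistent with the abstract's description of the anisotropy as moderate.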

preprint2022arXiv

Fuzzy Attention Neural Network to Tackle Discontinuity in Airway Segmentation

Airway segmentation is crucial for the examination, diagnosis, and prognosis of lung diseases, while its manual delineation is unduly burdensome. To alleviate this time-consuming and potentially subjective manual procedure, researchers have proposed methods to automatically segment airways from computerized tomography (CT) images. However, some small-sized airway branches (e.g., bronchus and terminal bronchioles) significantly aggravate the difficulty of automatic segmentation by machine learning models. In particular, the variance of voxel values and the severe data imbalance in airway branches make the computational module prone to discontinuous and false-negative predictions, especially for cohorts with different lung diseases. Attention mechanisms have shown the capacity to segment complex structures, while fuzzy logic can reduce the uncertainty in feature representations. Therefore, the integration of deep attention networks and fuzzy theory, given by the fuzzy attention layer, should be an escalated solution for better generalization and robustness. This paper presents an efficient method for airway segmentation, comprising a novel fuzzy attention neural network and a comprehensive loss function to enhance the spatial continuity of airway segmentation. The deep fuzzy set is formulated by a set of voxels in the feature map and a learnable Gaussian membership function. Different from the existing attention mechanism, the proposed channel-specific fuzzy attention addresses the issue of heterogeneous features in different channels. Furthermore, a novel evaluation metric is proposed to assess both the continuity and completeness of airway structures. The efficiency, generalization and robustness of the proposed method have been proved by training on normal lung disease while testing on datasets of lung cancer, COVID-19 and pulmonary fibrosis.

preprint2022arXiv

HDL: Hybrid Deep Learning for the Synthesis of Myocardial Velocity Maps in Digital Twins for Cardiac Analysis

Synthetic digital twins based on medical data accelerate the acquisition, labelling and decision making procedure in digital healthcare. A core part of digital healthcare twins is model-based data synthesis, which permits the generation of realistic medical signals without having to cope with the modelling complexity of anatomical and biochemical phenomena producing them in reality. Unfortunately, algorithms for cardiac data synthesis have been so far scarcely studied in the literature. An important imaging modality in the cardiac examination is three-directional CINE multi-slice myocardial velocity mapping (3Dir MVM), which provides a quantitative assessment of cardiac motion in three orthogonal directions of the left ventricle. The long acquisition time and complex acquisition procedure make it more urgent to produce synthetic digital twins of this imaging modality. In this study, we propose a hybrid deep learning (HDL) network, especially for synthetic 3Dir MVM data. Our algorithm features a hybrid UNet and a Generative Adversarial Network with a foreground-background generation scheme. The experimental results show that from temporally down-sampled magnitude CINE images (six times), our proposed algorithm can still successfully synthesise high temporal resolution 3Dir MVM CMR data (PSNR=42.32) with precise left ventricle segmentation (DICE=0.92). These performance scores indicate that our proposed HDL algorithm can be implemented in real-world digital twins for myocardial velocity mapping data simulation. To the best of our knowledge, this work is the first one in the literature investigating digital twins of the 3Dir MVM CMR, which has shown great potential for improving the efficiency of clinical studies via synthesised cardiac data.

preprint2022arXiv

ME-Net: Multi-Encoder Net Framework for Brain Tumor Segmentation

Glioma is the most common and aggressive brain tumor. Magnetic resonance imaging (MRI) plays a vital role to evaluate tumors for the arrangement of tumor surgery and the treatment of subsequent procedures. However, the manual segmentation of the MRI image is strenuous, which limits its clinical application. With the development of deep learning, a large number of automatic segmentation methods have been developed, but most of them stay in 2D images, which leads to subpar performance. Moreover, the serious voxel imbalance between the brain tumor and the background as well as the different sizes and locations of the brain tumor makes the segmentation of 3D images a challenging problem. Aiming at segmenting 3D MRI, we propose a model for brain tumor segmentation with multiple encoders. The structure contains four encoders and one decoder. The four encoders correspond to the four modalities of the MRI image, perform one-to-one feature extraction, and then merge the feature maps of the four modalities into the decoder. This method reduces the difficulty of feature extraction and greatly improves model performance. We also introduced a new loss function named "Categorical Dice", and set different weights for different segmented regions at the same time, which solved the problem of voxel imbalance. We evaluated our approach using the online BraTS 2020 Challenge verification. Our proposed method can achieve promising results in the validation set compared to the state-of-the-art approaches with Dice scores of 0.70249, 0.88267, and 0.73864 for the intact tumor, tumor core, and enhanced tumor, respectively.
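
The abstract names a "Categorical Dice" loss with per-region weights but does not define it. The sketch below shows a generic class-weighted Dice loss over flattened voxel labels as one plausible reading; the function names, label values and weights are hypothetical, and the authors' actual formulation may differ.

```python
from typing import Dict, Sequence


def dice_score(pred: Sequence[int], target: Sequence[int], label: int,
               eps: float = 1e-6) -> float:
    """Dice overlap for one label over flattened voxel label lists."""
    pred_mask = [p == label for p in pred]
    target_mask = [t == label for t in target]
    intersection = sum(p and t for p, t in zip(pred_mask, target_mask))
    total = sum(pred_mask) + sum(target_mask)
    # eps keeps the ratio defined when a label is absent from both volumes.
    return (2.0 * intersection + eps) / (total + eps)


def weighted_dice_loss(pred: Sequence[int], target: Sequence[int],
                       weights: Dict[int, float]) -> float:
    """One minus the weight-normalised average of per-label Dice scores."""
    weight_sum = sum(weights.values())
    weighted = sum(w * dice_score(pred, target, label)
                   for label, w in weights.items())
    return 1.0 - weighted / weight_sum


if __name__ == "__main__":
    # Hypothetical labels: 0 = background, 1 and 2 = tumor sub-regions.
    prediction = [0, 0, 1, 1, 2, 0]
    reference = [0, 0, 1, 2, 2, 0]
    print(weighted_dice_loss(prediction, reference, {0: 0.1, 1: 1.0, 2: 1.0}))
```

Giving the background a small weight keeps the many easy background voxels from dominating the loss, which is the voxel-imbalance concern the abstract raises.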

preprint2022arXiv

Nitrogen decoration of basal plane dislocations in 4H-SiC

Basal-plane dislocations (BPDs) pose a great challenge to the reliability of bipolar power devices based on the 4H silicon carbide (4H-SiC). It is well established that heavy nitrogen (N) doping promotes the conversion of BPDs to threading edge dislocations (TEDs) and improves the reliability of 4H-SiC-based bipolar power devices. However, the interaction between N and BPDs, and the effect of N on the electronic properties of BPDs are still ambiguous, which significantly hinder the understanding on the electron-transport mechanism of 4H-SiC-based bipolar power devices. Combining molten-alkali etching and the Kelvin probe force microscopy (KPFM) analysis, we demonstrate that BPDs create acceptor-like states in undoped 4H-SiC, while acting as donors in N-doped 4H-SiC. First-principles calculations verify that BPDs create occupied defect states above the valence band maximum (VBM) and unoccupied defect states under the conduction-band minimum (CBM) of undoped 4H-SiC. The electron transfer from the defect states of intrinsic defects and native impurities to the unoccupied defect states of BPDs gives rise to the acceptor-like behavior of BPDs in undoped 4H-SiC. Defect formation energies indicate that N atoms can spontaneously decorate BPDs during the N doping of 4H-SiC. The binding between N and BPD is strong against decomposition. The accumulation of N dopants at the core of BPDs results in the accumulation of donor-like states at the core of BPDs in N-doped 4H-SiC. This work not only enriches the understanding on the electronic behavior of BPDs in N-doped 4H-SiC, but also helps understand the electron transport mechanism of 4H-SiC-based bipolar power devices.

preprint2022arXiv

Online Attentive Kernel-Based Temporal Difference Learning

With rising uncertainty in the real world, online Reinforcement Learning (RL) has been receiving increasing attention due to its fast learning capability and improving data efficiency. However, online RL often suffers from complex Value Function Approximation (VFA) and catastrophic interference, creating difficulty for the deep neural network to be applied to an online RL algorithm in a fully online setting. Therefore, a simpler and more adaptive approach is introduced to evaluate value function with the kernel-based model. Sparse representations are superior at handling interference, indicating that competitive sparse representations should be learnable, non-prior, non-truncated and explicit when compared with current sparse representation methods. Moreover, in learning sparse representations, attention mechanisms are utilized to represent the degree of sparsification, and a smooth attentive function is introduced into the kernel-based VFA. In this paper, we propose an Online Attentive Kernel-Based Temporal Difference (OAKTD) algorithm using two-timescale optimization and provide convergence analysis of our proposed algorithm. Experimental evaluations showed that OAKTD outperformed several Online Kernel-based Temporal Difference (OKTD) learning algorithms in addition to the Temporal Difference (TD) learning algorithm with Tile Coding on public Mountain Car, Acrobot, CartPole and Puddle World tasks.

preprint2022arXiv

Region-Based Evidential Deep Learning to Quantify Uncertainty and Improve Robustness of Brain Tumor Segmentation

Despite recent advances in the accuracy of brain tumor segmentation, the results still suffer from low reliability and robustness. Uncertainty estimation is an efficient solution to this problem, as it provides a measure of confidence in the segmentation results. The current uncertainty estimation methods based on quantile regression, Bayesian neural network, ensemble, and Monte Carlo dropout are limited by their high computational cost and inconsistency. In order to overcome these challenges, Evidential Deep Learning (EDL) was developed in recent work but primarily for natural image classification. In this paper, we proposed a region-based EDL segmentation framework that can generate reliable uncertainty maps and robust segmentation results. We used the Theory of Evidence to interpret the output of a neural network as evidence values gathered from input features. Following Subjective Logic, evidence was parameterized as a Dirichlet distribution, and predicted probabilities were treated as subjective opinions. To evaluate the performance of our model on segmentation and uncertainty estimation, we conducted quantitative and qualitative experiments on the BraTS 2020 dataset. The results demonstrated the top performance of the proposed method in quantifying segmentation uncertainty and robustly segmenting tumors. Furthermore, our proposed new framework maintained the advantages of low computational cost and easy implementation and showed the potential for clinical application.
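
The step from evidence to a Dirichlet distribution and a subjective opinion can be sketched as follows. The abstract gives no equations, so this uses the formulation common in the EDL literature (alpha = evidence + 1, uncertainty = K / S for K classes and Dirichlet strength S); treating that as the paper's exact parameterisation is an assumption, and the function name is hypothetical.

```python
from typing import List, Tuple


def dirichlet_opinion(evidence: List[float]) -> Tuple[List[float], List[float], float]:
    """Map non-negative per-class evidence to a subjective-logic opinion.

    Returns (expected_probabilities, belief_masses, uncertainty), using the
    common EDL convention alpha_k = evidence_k + 1.
    """
    if any(e < 0 for e in evidence):
        raise ValueError("evidence must be non-negative")
    num_classes = len(evidence)
    alpha = [e + 1.0 for e in evidence]
    strength = sum(alpha)  # Dirichlet strength S
    probs = [a / strength for a in alpha]  # mean of the Dirichlet
    beliefs = [e / strength for e in evidence]
    uncertainty = num_classes / strength  # beliefs and uncertainty sum to 1
    return probs, beliefs, uncertainty


if __name__ == "__main__":
    # Hypothetical evidence for one voxel over three classes.
    print(dirichlet_opinion([9.0, 0.0, 0.0]))
    # With no evidence at all the opinion is fully uncertain.
    print(dirichlet_opinion([0.0, 0.0, 0.0]))
```

A single forward pass yields both the class probabilities and an uncertainty value, which is why this family of methods avoids the repeated sampling that ensembles and Monte Carlo dropout require.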

preprint2022arXiv

Spectral Energy Distributions in Three Deep-Drilling Fields of the Vera C. Rubin Observatory Legacy Survey of Space and Time: Source Classification and Galaxy Properties

W-CDF-S, ELAIS-S1, and XMM-LSS will be three Deep-Drilling Fields (DDFs) of the Vera C. Rubin Observatory Legacy Survey of Space and Time (LSST), but their extensive multi-wavelength data have not been fully utilized as done in the COSMOS field, another LSST DDF. To prepare for future science, we fit source spectral energy distributions (SEDs) from X-ray to far-infrared in these three fields mainly to derive galaxy stellar masses and star-formation rates. We use CIGALE v2022.0, a code that has been regularly developed and evaluated, for the SED fitting. Our catalog includes 0.8 million sources covering $4.9~\mathrm{deg^2}$ in W-CDF-S, 0.8 million sources covering $3.4~\mathrm{deg^2}$ in ELAIS-S1, and 1.2 million sources covering $4.9~\mathrm{deg^2}$ in XMM-LSS. Besides fitting normal galaxies, we also select candidates that may host active galactic nuclei (AGNs) or are experiencing recent star-formation variations and use models specifically designed for these sources to fit their SEDs; this increases the utility of our catalog for various projects in the future. We calibrate our measurements by comparison with those in well-studied smaller regions and briefly discuss the implications of our results. We also perform detailed tests of the completeness and purity of SED-selected AGNs. Our data can be retrieved from a public website.

preprint2022arXiv

Swin Deformable Attention U-Net Transformer (SDAUT) for Explainable Fast MRI

Fast MRI aims to reconstruct a high fidelity image from partially observed measurements. Exuberant development in fast MRI using deep learning has been witnessed recently. Meanwhile, novel deep learning paradigms, e.g., Transformer based models, are fast-growing in natural language processing and promptly developed for computer vision and medical image analysis due to their prominent performance. Nevertheless, due to the complexity of the Transformer, its application to fast MRI may not be straightforward. The main obstacle is that the computational cost of the self-attention layer, which is the core part of the Transformer, can be expensive for high resolution MRI inputs. In this study, we propose a new Transformer architecture for solving fast MRI that couples Shifted Windows Transformer with U-Net to reduce the network complexity. We incorporate deformable attention to construe the explainability of our reconstruction model. We empirically demonstrate that our method achieves consistently superior performance on the fast MRI task. Besides, compared to state-of-the-art Transformer models, our method has fewer network parameters while revealing explainability. The code is publicly available at https://github.com/ayanglab/SDAUT.

preprint2022arXiv

Swin Transformer for Fast MRI

Magnetic resonance imaging (MRI) is an important non-invasive clinical tool that can produce high-resolution and reproducible images. However, a long scanning time is required for high-quality MR images, which leads to exhaustion and discomfort of patients, inducing more artefacts due to voluntary movements of the patients and involuntary physiological movements. To accelerate the scanning process, methods by k-space undersampling and deep learning based reconstruction have been popularised. This work introduced SwinMR, a novel Swin transformer based method for fast MRI reconstruction. The whole network consisted of an input module (IM), a feature extraction module (FEM) and an output module (OM). The IM and OM were 2D convolutional layers and the FEM was composed of a cascade of residual Swin transformer blocks (RSTBs) and 2D convolutional layers. The RSTB consisted of a series of Swin transformer layers (STLs). The shifted windows multi-head self-attention (W-MSA/SW-MSA) of STL was performed in shifted windows rather than the multi-head self-attention (MSA) of the original transformer in the whole image space. A novel multi-channel loss was proposed by using the sensitivity maps, which was proved to preserve more textures and details. We performed a series of comparative studies and ablation studies in the Calgary-Campinas public brain MR dataset and conducted a downstream segmentation experiment in the Multi-modal Brain Tumour Segmentation Challenge 2017 dataset. The results demonstrate our SwinMR achieved high-quality reconstruction compared with other benchmark methods, and it shows great robustness with different undersampling masks, under noise interruption and on different datasets. The code is publicly available at https://github.com/ayanglab/SwinMR.

preprint2022arXiv

The nuclear-spin-forbidden rovibrational transitions of water from first principles

The water molecule occurs in two nuclear-spin isomers that differ by the value of the total nuclear spin of the hydrogen atoms, i.e., $I=0$ for para-H$_2$O and $I=1$ for ortho-H$_2$O. Spectroscopic transitions between rovibrational states of ortho and para water are extremely weak due to the tiny hyperfine nuclear-spin-rotation interaction of only $\sim30$ kHz and so far were not observed. We report the first comprehensive theoretical investigation of the hyperfine effects and ortho-para transitions in H$_2$$^{16}$O due to nuclear-spin-rotation and spin-spin interactions. We also present the details of our newly developed general variational approach to the simulation of hyperfine effects in polyatomic molecules. Our results for water suggest that the strongest ortho-para transitions with room-temperature intensities on the order of $10^{-31}$ cm/molecule are about an order of magnitude larger than previously predicted values and should be detectable in the mid-infrared $ν_2$ and near-infrared $2ν_1+ν_2$ and $ν_1+ν_2+ν_3$ bands by current spectroscopy experiments.

preprint2022arXiv

Towards Reliable and Explainable AI Model for Solid Pulmonary Nodule Diagnosis

Lung cancer has the highest mortality rate of deadly cancers in the world. Early detection is essential to treatment of lung cancer. However, detection and accurate diagnosis of pulmonary nodules depend heavily on the experience of radiologists and can be a heavy workload for them. Computer-aided diagnosis (CAD) systems have been developed to assist radiologists in nodule detection and diagnosis, greatly easing the workload while increasing diagnosis accuracy. Recent developments in deep learning have greatly improved the performance of CAD systems. However, lack of model reliability and interpretability remains a major obstacle for its large-scale clinical application. In this work, we proposed a multi-task explainable deep-learning model for pulmonary nodule diagnosis. Our neural model can not only predict lesion malignancy but also identify relevant manifestations. Further, the location of each manifestation can also be visualized for visual interpretability. Our proposed neural model achieved a test AUC of 0.992 on LIDC public dataset and a test AUC of 0.923 on our in-house dataset. Moreover, our experimental results proved that by incorporating manifestation identification tasks into the multi-task model, the accuracy of the malignancy classification can also be improved. This multi-task explainable model may provide a scheme for better interaction with the radiologists in a clinical environment.

preprint2022arXiv

Unsupervised Image Registration Towards Enhancing Performance and Explainability in Cardiac And Brain Image Analysis

Magnetic Resonance Imaging (MRI) typically recruits multiple sequences (defined here as "modalities"). As each modality is designed to offer different anatomical and functional clinical information, there are evident disparities in the imaging content across modalities. Inter- and intra-modality affine and non-rigid image registration is an essential medical image analysis process in clinical imaging, for example before imaging biomarkers are derived and clinically evaluated across different MRI modalities, time phases and slices. Although commonly needed in real clinical scenarios, affine and non-rigid image registration is not extensively investigated using a single unsupervised model architecture. In our work, we present an unsupervised deep learning registration methodology which can accurately model affine and non-rigid transformations, simultaneously. Moreover, inverse-consistency is a fundamental inter-modality registration property that is not considered in deep learning registration algorithms. To address inverse-consistency, our methodology performs bi-directional cross-modality image synthesis to learn modality-invariant latent representations, and involves two factorised transformation networks and an inverse-consistency loss to learn topology-preserving anatomical transformations. Overall, our model (named "FIRE") shows improved performances against the reference standard baseline method on multi-modality brain 2D and 3D MRI and intra-modality cardiac 4D MRI data experiments.

preprint2022arXiv

Unsupervised Tissue Segmentation via Deep Constrained Gaussian Network

Tissue segmentation is the mainstay of pathological examination, whereas the manual delineation is unduly burdensome. To assist this time-consuming and subjective manual step, researchers have devised methods to automatically segment structures in pathological images. Recently, automated machine and deep learning based methods dominate tissue segmentation research studies. However, most machine and deep learning based approaches are supervised and developed using a large number of training samples, in which the pixelwise annotations are expensive and sometimes can be impossible to obtain. This paper introduces a novel unsupervised learning paradigm by integrating an end-to-end deep mixture model with a constrained indicator to acquire accurate semantic tissue segmentation. This constraint aims to centralise the components of deep mixture models during the calculation of the optimisation function. In so doing, the redundant or empty class issues, which are common in current unsupervised learning methods, can be greatly reduced. By validation on both public and in-house datasets, the proposed deep constrained Gaussian network achieves significantly (Wilcoxon signed-rank test) better performance (with the average Dice scores of 0.737 and 0.735, respectively) on tissue segmentation with improved stability and robustness, compared to other existing unsupervised segmentation approaches. Furthermore, the proposed method presents a similar performance (p-value > 0.05) compared to the fully supervised U-Net.
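The Dice scores quoted above measure the overlap between a predicted mask and a reference mask. As a rough illustration only (this is not the authors' evaluation code), a minimal sketch for flat binary masks is given below. The function name `dice_score` and the convention that two empty masks score 1.0 are assumptions made for this example.

```python
def dice_score(pred, target):
    """Dice coefficient 2*|A∩B| / (|A| + |B|) for two flat binary masks.

    `pred` and `target` are equal-length sequences of 0/1 values.
    Returning 1.0 when both masks are empty is an assumed convention.
    """
    if len(pred) != len(target):
        raise ValueError("masks must have the same length")
    # Count pixels labelled foreground in both masks.
    intersection = sum(1 for p, t in zip(pred, target) if p and t)
    # Count foreground pixels in each mask separately.
    total = sum(1 for p in pred if p) + sum(1 for t in target if t)
    if total == 0:
        return 1.0
    return 2.0 * intersection / total
```

A score of 1.0 means the masks agree exactly and 0.0 means no foreground overlap.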

preprint2021arXiv

A Multi-Band Forced-Photometry Catalog in the ELAIS-S1 Field

The ELAIS-S1 field will be an LSST Deep Drilling field, and it also has extensive multiwavelength coverage. To improve the utility of the existing data, we use The Tractor to perform forced-photometry measurements in this field. We compile data in 16 bands from the DeepDrill, VIDEO, DES, ESIS, and VOICE surveys. Using a priori information from the high-resolution fiducial images in VIDEO, we model the images in other bands and generate a forced-photometry catalog. This technique enables consistency throughout different surveys, deblends sources from low-resolution images, extends photometric measurements to a fainter magnitude regime, and improves photometric-redshift estimates. Our catalog contains over 0.8 million sources covering a $3.4~\mathrm{deg^2}$ area in the VIDEO footprint and is available at 10.5281/zenodo.4540178.

preprint2021arXiv

Electro-optically modulated polarization mode conversion in lithium niobate ridge waveguides

Lithium niobate on insulator (LNOI) waveguides, as an emerging technology, have proven to offer a promising platform for integrated optics, due to their strong optical confinement comparable to silicon on insulator (SOI) waveguides, while possessing the versatile properties of lithium niobate, such as high electro-optic coefficients. In this paper, we show that mode hybridization, a phenomenon widely found in vertically asymmetric waveguides, can be efficiently modulated in an LNOI ridge waveguide by electro-optic effect, leading to a polarization mode converter with 97% efficiency. Moreover, the proposed device does not require tapering or periodic poling, thereby greatly simplifying the fabrication process. It can also be actively switched by external fields. Such a platform facilitates technological progress of photonic circuits and sensors.

preprint2021arXiv

Light dark bosons in the JUNO-TAO neutrino detector

This work presents a sensitivity study of a reactor liquid scintillator detector to three kinds of dark bosons with masses below 1 MeV, namely dark photons, axion-like particles and light scalar bosons. The JUNO-TAO detector with the Taishan nuclear reactor is taken as a reference. With a proposed 180 days of data taking, the sensitivity to the dark bosons can reach $\sim10^{-5}$ at 95% C.L. for the optimized signal to background ratio for the electron coupling constant $g_X$ through inverse Compton-like scattering. Similar calculations are completed for axion-like particles and scalar bosons. The background systematic uncertainty is the main limiting factor for further sensitivity improvement. Several remarks are made on the controversial analysis for the NEOS experiment. Additionally, the differential and the inverse differential cross sections have been derived for all three boson types and their interactions with electrons in liquid scintillator.

preprint2021arXiv

Marangoni Convection-Driven Laser Fountains and Waves on Free Surfaces of Liquids

It is well accepted that an outward Marangoni convection from a low surface tension region will make the surface depressed. Here, we report that this established perception is only valid for thin liquid films. Using surface laser heating, we show that in deep liquids a laser beam actually pulls up the fluid above the free surface, generating fountains with different shapes. With decreasing liquid depth, a transition from fountain to indentation with fountain-in-indentation is observed. Further, high-speed imaging reveals a transient surface process before steady elevation is formed, and this dynamic deformation is subsequently utilized to resonantly excite giant surface waves by a modulated laser beam. Computational fluid dynamics models reveal the underlying flow patterns and quantify the depth-dependent and time-resolved surface deformations. Our discoveries and techniques have upended the century-old perception and opened up a new regime of interdisciplinary research and applications of Marangoni-induced interface phenomena and optocapillary fluidic surfaces, that is, the control of fluids with light.

preprint2021arXiv

Spin-orbit coupling suppression and singlet-state blocking of spin-triplet Cooper pairs

An inhomogeneous magnetic exchange field at a superconductor/ferromagnet interface converts spin-singlet Cooper pairs to a spin-aligned (i.e. spin-polarized) triplet state. Although the decay envelope of such triplet pairs within ferromagnetic materials is well studied, little is known about their decay in non-magnetic metals and superconductors, and in particular in the presence of spin-orbit coupling (SOC). Here we investigate devices in which triplet supercurrents are injected into the s-wave superconductor Nb. In the normal state of Nb, triplet supercurrents decay over a distance of 5 nm, which is an order of magnitude smaller than the decay of spin singlet pairs due to the SOC interacting with the spin associated with triplet pairs. In the superconducting state of Nb, triplet supercurrents are not able to couple with the singlet wavefunction and are thus blocked by the absence of available equilibrium states in the singlet gap. The results offer new insight into the dynamics between s-wave singlet and s-wave triplet states.

preprint2021arXiv

The viewing angle in AGN SED models, a data-driven analysis

The validity of the unified active galactic nuclei (AGN) model has been challenged in the last decade, especially when different types of AGNs are considered to only differ in the viewing angle to the torus. We aim to assess the importance of the viewing angle in classifying different types of Seyfert galaxies in spectral energy distribution (SED) modelling. We retrieve photometric data from publicly available astronomical databases, CDS and NED, to model SEDs with X-CIGALE in a sample of 13 173 Seyfert galaxies located at redshift range from $z=0$ to $z=3.5$, with a median redshift of $z\approx0.2$. We assess whether the estimated viewing angle from the SED models reflects different Seyfert classifications. Two AGN models with either a smooth or clumpy torus structure are adopted in this paper. We find that the viewing angle in Type-1 AGNs is better constrained than in Type-2 AGNs. Limiting the viewing angles representing these two types of AGNs does not affect the physical parameter estimates such as star-formation rate (SFR) or AGN fractional contribution ($f_{\rm{AGN}}$). In addition, the viewing angle is not the most discriminating physical parameter to differentiate Seyfert types. We suggest that the observed and intrinsic AGN disc luminosity can: i) be used in $z<0.5$ studies to distinguish between Type-1 and Type-2 AGNs, and ii) explain the probable evolutionary path between these AGN types. Finally, we propose the use of X-CIGALE for AGN galaxy classification tasks. All data from the 13 173 SED fits are available at https://doi.org/10.5281/zenodo.5221764

preprint2021arXiv

Thermal annealing enhancement of Josephson critical currents in ferromagnetic CoFeB

The electrical and structural properties of Co40Fe40B20 (CoFeB) alloy are tunable with thermal annealing. This is key in the optimization of CoFeB-based spintronic devices, where the advantageously low magnetic coercivity, high spin polarization, and controllable magnetocrystalline anisotropy are utilised. So far, there has been no report on superconducting devices based on CoFeB. Here, we report Nb/CoFeB/Nb Josephson devices and demonstrate an enhancement of the critical current by up to 700% following thermal annealing due to increased structural ordering of the CoFeB. The results demonstrate that CoFeB is a promising material for the development of superconducting spintronic devices.

preprint2021arXiv

Unbox the Black-box for the Medical Explainable AI via Multi-modal and Multi-centre Data Fusion: A Mini-Review, Two Showcases and Beyond

Explainable Artificial Intelligence (XAI) is an emerging research topic of machine learning aimed at unboxing how AI systems' black-box choices are made. This research field inspects the measures and models involved in decision-making and seeks solutions to explain them explicitly. Many of the machine learning algorithms cannot manifest how and why a decision has been cast. This is particularly true of the most popular deep neural network approaches currently in use. Consequently, our confidence in AI systems can be hindered by the lack of explainability in these black-box models. The XAI becomes more and more crucial for deep learning powered applications, especially for medical and healthcare studies, although in general these deep neural networks can return an arresting dividend in performance. The insufficient explainability and transparency in most existing AI systems can be one of the major reasons that successful implementation and integration of AI tools into routine clinical practice are uncommon. In this study, we first surveyed the current progress of XAI and in particular its advances in healthcare applications. We then introduced our solutions for XAI leveraging multi-modal and multi-centre data fusion, and subsequently validated them in two showcases following real clinical scenarios. Comprehensive quantitative and qualitative analyses can prove the efficacy of our proposed XAI solutions, from which we can envisage successful applications in a broader range of clinical questions.

preprint2020arXiv

Annealing Genetic GAN for Minority Oversampling

The key to overcoming class imbalance problems is to capture the distribution of the minority class accurately. Generative Adversarial Networks (GANs) have shown some potential to tackle class imbalance problems due to their capability of reproducing data distributions given ample training data samples. However, the scarce samples of one or more classes still pose a great challenge for GANs to learn accurate distributions for the minority classes. In this work, we propose an Annealing Genetic GAN (AGGAN) method, which aims to reproduce the distributions closest to the ones of the minority classes using only limited data samples. Our AGGAN renovates the training of GANs as an evolutionary process that incorporates the mechanism of simulated annealing. In particular, the generator uses different training strategies to generate multiple offspring and retain the best. Then, we use the Metropolis criterion in the simulated annealing to decide whether we should update the best offspring for the generator. As the Metropolis criterion allows a certain chance to accept the worse solutions, it enables our AGGAN to steer away from the local optimum. According to both theoretical analysis and experimental studies on multiple imbalanced image datasets, we prove that the proposed training strategy can enable our AGGAN to reproduce the distributions of minority classes from scarce samples and provide an effective and robust solution for the class imbalance problem.
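The Metropolis criterion mentioned in this abstract is the standard acceptance rule from simulated annealing. The sketch below shows the generic rule only, and is not the AGGAN implementation. It assumes a scalar fitness where higher is better. The function name `metropolis_accept`, its parameters, and the handling of a non-positive temperature are hypothetical choices for illustration.

```python
import math
import random


def metropolis_accept(current_score, candidate_score, temperature, rng=random):
    """Generic Metropolis acceptance rule (higher score is assumed to be better).

    A candidate that is at least as good is always accepted. A worse candidate
    is accepted with probability exp((candidate - current) / temperature), so
    worse solutions become less likely to pass as the temperature cools.
    """
    if candidate_score >= current_score:
        return True
    if temperature <= 0:
        # Assumed convention: at zero temperature only improvements pass.
        return False
    acceptance_probability = math.exp((candidate_score - current_score) / temperature)
    return rng.random() < acceptance_probability
```

Occasionally accepting a worse candidate is what lets the search leave a local optimum; lowering `temperature` over time makes the rule progressively greedier.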

preprint2020arXiv

Deep Attentive Wasserstein Generative Adversarial Networks for MRI Reconstruction with Recurrent Context-Awareness

The performance of traditional compressive sensing-based MRI (CS-MRI) reconstruction is affected by its slow iterative procedure and noise-induced artefacts. Although many deep learning-based CS-MRI methods have been proposed to mitigate the problems of traditional methods, they have not been able to achieve more robust results at higher acceleration factors. Most of the deep learning-based CS-MRI methods still cannot fully mine the information from the k-space, which leads to unsatisfactory results in the MRI reconstruction. In this study, we propose a new deep learning-based CS-MRI reconstruction method to fully utilise the relationship among sequential MRI slices by coupling Wasserstein Generative Adversarial Networks (WGAN) with Recurrent Neural Networks. Further development of an attentive unit enables our model to reconstruct more accurate anatomical structures for the MRI data. By experimenting on different MRI datasets, we have demonstrated that our method can not only achieve better results compared to the state of the art but can also effectively reduce residual noise generated during the reconstruction process.

preprint2020arXiv

Direct Quantification for Coronary Artery Stenosis Using Multiview Learning

The quantification of the coronary artery stenosis is of significant clinical importance in coronary artery disease diagnosis and intervention treatment. It aims to quantify the morphological indices of the coronary artery lesions such as minimum lumen diameter, reference vessel diameter, lesion length, and these indices are the reference of the interventional stent placement. In this study, we propose a direct multiview quantitative coronary angiography (DMQCA) model as an automatic clinical tool to quantify the coronary artery stenosis from X-ray coronary angiography images. The proposed DMQCA model consists of a multiview module with two attention mechanisms, a key-frame module, and a regression module, to achieve direct accurate multiple-index estimation. The multi-view module comprehensively learns the Spatio-temporal features of coronary arteries through a three-dimensional convolution. The attention mechanisms of each view focus on the subtle feature of the lesion region and capture the important context information. The key-frame module learns the subtle features of the stenosis through successive dilated residual blocks. The regression module finally generates the indices estimation from multiple features.

preprint2020arXiv

Enhanced carrier mobility in anisotropic 2D tetrahex-carbon through strain engineering

A recently predicted two dimensional (2D) carbon allotrope, tetrahex-carbon consisting of tetragonal and hexagonal rings, has drawn research interest due to its unique mechanical and electronic properties. Tetrahex-C shows ultrahigh strength, negative Poisson ratio, a direct band gap and high carrier mobility. In this work, we employ first-principles density-functional theory calculations to explore the directional dependence of electronic properties such as carrier effective mass and mobility in tetrahex-C. Tetrahex-C demonstrates strong anisotropy in the effective mass of charge carriers and therefore its mobility (electric conductance) exhibits a strong orientation preference. More interestingly, we find that such unique anisotropic carrier effective mass and mobility can be controlled by simple uniaxial strain. The orientation dependence of effective mass can be dramatically rotated by 90 degrees through applying uniaxial tensile strain beyond ~ 7% (11%) in the armchair direction for the hole (electron). As a result, the intrinsic carrier mobility in tetrahex-C is significantly enhanced. The results are useful for potential electronic and mechanical applications in tetrahex-C.

preprint2020arXiv

GHAST: Breaking Confirmation Delay Barrier in Nakamoto Consensus via Adaptive Weighted Blocks

Initiated from Nakamoto's Bitcoin system, blockchain technology has demonstrated great capability of building secure consensus among decentralized parties at Internet-scale, i.e., without relying on any centralized trusted party. Nowadays, blockchain systems find applications in various fields. But the performance is increasingly becoming a bottleneck, especially when permissionless participation is retained for full decentralization. In this work, we present a new consensus protocol named GHAST (Greedy Heaviest Adaptive Sub-Tree) which organizes blocks in a Tree-Graph structure (i.e., a directed acyclic graph (DAG) with a tree embedded) that allows fast and concurrent block generation. GHAST protocol simultaneously achieves a logarithmically bounded liveness guarantee and low confirmation latency. More specifically, for maximum latency $d$ and adversarial computing power bounded away from 50%, GHAST guarantees confirmation with confidence $\ge 1-\varepsilon$ after a time period of $O(d\cdot \log(1/\varepsilon))$. When there is no observable attack, GHAST only needs $3d$ time to achieve confirmation at the same confidence level as six-block-confirmation in Bitcoin, while it takes roughly $360d$ in Bitcoin.
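To make the two latency figures in this abstract concrete, the following sketch evaluates the shape of the $O(d\cdot\log(1/\varepsilon))$ bound and the ratio between the quoted $360d$ and $3d$ confirmation times. It is arithmetic on the numbers stated above, not an implementation of GHAST. The constant factor `c` hidden by the big-O notation is unknown and set to 1 purely as an assumption, and both function names are hypothetical.

```python
import math


def confirmation_time_bound(d, epsilon, c=1.0):
    """Evaluate c * d * log(1/epsilon), the shape of the stated worst-case bound.

    `c` stands in for the constant hidden by the O(.) notation; its real value
    is not given in the abstract, so 1.0 is only a placeholder assumption.
    """
    if d <= 0 or not (0.0 < epsilon < 1.0):
        raise ValueError("need d > 0 and 0 < epsilon < 1")
    return c * d * math.log(1.0 / epsilon)


def no_attack_speedup(bitcoin_multiple=360.0, ghast_multiple=3.0):
    """Ratio of the quoted no-attack confirmation times (360d versus 3d)."""
    return bitcoin_multiple / ghast_multiple
```

With the quoted figures the ratio is 120, independent of $d$, and the worst-case bound grows only logarithmically as the tolerated failure probability $\varepsilon$ shrinks.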

preprint2020arXiv

Gleason Score Prediction using Deep Learning in Tissue Microarray Image

Prostate cancer (PCa) is one of the most common cancers in men around the world. The most accurate method to evaluate lesion levels of PCa is microscopic inspection of stained biopsy tissue and estimation of the Gleason score of tissue microarray (TMA) images by expert pathologists. However, it is time-consuming for pathologists to identify the cellular and glandular patterns for Gleason grading in large TMA images. We used the Gleason2019 Challenge dataset to build a convolutional neural network (CNN) model to segment TMA images into regions of different Gleason grades and predict the Gleason score according to the grading segmentation. We used a pre-trained model of prostate segmentation to increase the accuracy of the Gleason grade segmentation. The model achieved a mean Dice of 75.6% on the test cohort and ranked 4th in the Gleason2019 Challenge with a score of 0.778 combining Cohen's kappa and the F1-score.

preprint2020arXiv

JWST/MIRI Simulated Imaging: Insights into Obscured Star-Formation and AGN for Distant Galaxies in Deep Surveys

The JWST MIRI instrument will revolutionize extragalactic astronomy with unprecedented sensitivity and angular resolution in mid-IR. Here, we assess the potential of MIRI photometry to constrain galaxy properties in the Cosmic Evolution Early Release Science (CEERS) survey. We derive estimated MIRI fluxes from the spectral energy distributions (SEDs) of real sources that fall in a planned MIRI pointing. We also obtain MIRI fluxes for hypothetical AGN-galaxy mixed models varying the AGN fractional contribution to the total IR luminosity ($\rm frac_{AGN}$). Based on these model fluxes, we simulate CEERS imaging (3.6-hour exposure) in 6 bands from F770W to F2100W using MIRISIM, and reduce these data using JWST PIPELINE. We perform PSF-matched photometry with TPHOT, and fit the source SEDs with X-CIGALE, simultaneously modeling photometric redshift and other physical properties. Adding the MIRI data, the accuracy of both redshift and $\rm frac_{AGN}$ is generally improved by factors of $\gtrsim 2$ for all sources at $z\lesssim 3$. Notably, for pure-galaxy inputs ($\rm frac_{AGN}=0$), the accuracy of $\rm frac_{AGN}$ is improved by $\sim 100$ times thanks to MIRI. The simulated CEERS MIRI data are slightly more sensitive to AGN detections than the deepest X-ray survey, based on the empirical $L_{\rm X}$-$L_{\rm 6μm}$ relation. Like X-ray observations, MIRI can also be used to constrain the AGN accretion power (accuracy $\approx 0.3$ dex). Our work demonstrates that MIRI will be able to place strong constraints on the mid-IR luminosities from star formation and AGN, and thereby facilitate studies of the galaxy/AGN co-evolution.

preprint2020arXiv

Magnitude and Spatial Distribution Control of the Supercurrent in Bi2O2Se-Based Josephson Junction

Many proposals for exploring topological quantum computation are based on superconducting quantum devices constructed on materials with strong spin-orbit coupling (SOC). For these devices, full control over both the magnitude and the spatial distribution of the supercurrent is highly desirable, but has been elusive up to now. We constructed proximity-type Josephson junctions on nanoplates of Bi2O2Se, a newly emerging semiconductor with strong SOC. Through electrical gating, we show that the supercurrent can be fully turned ON and OFF, and its real-space pathways can be configured either through the bulk or along the edges. Our work demonstrates Bi2O2Se as a promising platform for constructing multifunctional hybrid superconducting devices as well as for searching for topological superconductivity.

preprint2020arXiv

Molecular Patterning and Directed Self-Assembly of Gold Nanoparticles on GaAs

The ability to create micro/nano patterns of organic self-assembled monolayers on semiconductor surfaces is crucial for fundamental studies and applications in a number of emerging fields in nanoscience. Here, we demonstrate the patterning of thiol molecular SAMs on oxide-free GaAs surface by dip-pen nanolithography and micro-contact printing, facilitated by a process of surface etching and passivation of the GaAs. A quantitative analysis on the molecular diffusion on GaAs was conducted by examining the writing of nanoscale dot and line patterns by DPN, which agrees well with surface diffusion models. The functionality of the patterned thiol molecules was demonstrated by directed self-assembly of gold nanoparticles onto a template of 4-Aminothiophenol SAM on GaAs. The highly selective assembly of the Au NPs was evidenced with atomic force microscopy and scanning electron microscopy. The ability to precisely control the assembly of Au NPs on oxide-free semiconductor surfaces using molecular templates may lead to an efficient bottom-up method for the fabrication of nano-plasmonic structures.

preprint2020arXiv

Non-uniform Sampled Motion Planning for Continuous-time STL

This paper presents an offline motion planner for linear cyber-physical systems that satisfy a continuous-time Signal Temporal Logic (STL) specification, in which controls are applied in a Zeroth-order Hold (ZOH) manner. The motion planning problem is formulated as a Mixed-integer Program (MIP) with nonuniform control updates. We develop a novel method to obtain bounds of Control Barrier Functions (CBF) and linear predicates to render both spatial and temporal requirements. The theoretical results are validated in numerical examples.

preprint2020arXiv

Real-Time Edge Intelligence in the Making: A Collaborative Learning Framework via Federated Meta-Learning

Many IoT applications at the network edge demand intelligent decisions in a real-time manner. The edge device alone, however, often cannot achieve real-time edge intelligence due to its constrained computing resources and limited local data. To tackle these challenges, we propose a platform-aided collaborative learning framework where a model is first trained across a set of source edge nodes by a federated meta-learning approach, and then it is rapidly adapted to learn a new task at the target edge node, using a few samples only. Further, we investigate the convergence of the proposed federated meta-learning algorithm under mild conditions on node similarity and the adaptation performance at the target edge. To combat against the vulnerability of meta-learning algorithms to possible adversarial attacks, we further propose a robust version of the federated meta-learning algorithm based on distributionally robust optimization, and establish its convergence under mild conditions. Experiments on different datasets demonstrate the effectiveness of the proposed Federated Meta-Learning based framework.

preprint2020arXiv

Self-formed 2D/3D Heterostructure on the Edge of 2D Ruddlesden-Popper Hybrid Perovskites Responsible for Intriguing Optoelectronic Properties and Higher Cell Efficiency

The observation of low energy edge photoluminescence and its beneficial effect on the solar cell efficiency of Ruddlesden-Popper perovskites has unleashed an intensive research effort to reveal its origin. This effort, however, has been met with more challenges as the underlying material structure has still not been identified; new modellings and observations also do not seem to converge. Using 2D (BA)2(MA)2Pb3Br10 as an example, we show that 3D MAPbBr3 is formed due to the loss of BA on the edge. This self-formed MAPbBr3 can explain the reported edge emission under various conditions, while the reported intriguing optoelectronic properties such as fast exciton trapping from the interior 2D perovskite, rapid exciton dissociation and long carrier lifetime can be understood via the self-formed 2D/3D lateral perovskite heterostructure. The 3D perovskite is identified by submicron infrared spectroscopy, the emergence of XRD signature from freezer-milled nanometer-sized 2D perovskite and its photoluminescence response to external hydrostatic pressure. The revelation of this edge emission mystery and the identification of a self-formed 2D/3D heterostructure provide a new approach to engineering 2D perovskites for high-performance optoelectronic devices.

preprint2020arXiv

Simultaneous Left Atrium Anatomy and Scar Segmentations via Deep Learning in Multiview Information with Attention

Three-dimensional late gadolinium enhanced (LGE) cardiac MR (CMR) of left atrial scar in patients with atrial fibrillation (AF) has recently emerged as a promising technique to stratify patients, to guide ablation therapy and to predict treatment success. This requires a segmentation of the high-intensity scar tissue and also a segmentation of the left atrium (LA) anatomy, the latter usually being derived from a separate bright-blood acquisition. Performing both segmentations automatically from a single 3D LGE CMR acquisition would eliminate the need for an additional acquisition and avoid subsequent registration issues. In this paper, we propose a joint segmentation method based on a multiview two-task (MVTT) recursive attention model working directly on 3D LGE CMR images to segment the LA (and proximal pulmonary veins) and to delineate the scar on the same dataset. Using our MVTT recursive attention model, both the LA anatomy and scar can be segmented accurately (mean Dice score of 93% for the LA anatomy and 87% for the scar segmentations) and efficiently (~0.27 seconds to simultaneously segment the LA anatomy and scars directly from the 3D LGE CMR dataset with 60-68 2D slices). Compared to conventional unsupervised learning and other state-of-the-art deep learning based methods, the proposed MVTT model achieved excellent results, leading to an automatic generation of a patient-specific anatomical model combined with scar segmentation for patients in AF.

preprint2020arXiv

Valley polarization and valleyresistance in monolayer transition metal dichalcogenides superlattice

Manipulating the valley degree of freedom to encode information for potential valleytronic devices has ignited a new direction in solid-state physics. A significant, fundamental challenge in the field of valleytronics is how to generate and regulate valley-polarized currents in practical ways. Here, we discover a new mechanism of producing valley polarization in a monolayer transition metal dichalcogenides superlattice, in which valley-resolved gaps are formed at the supercell Brillouin zone boundaries and centers due to the intervalley scattering. When the energy of the incident electron is in the gaps, the available states are valley polarized, thus providing a valley-polarized current from the superlattice. We show that the direction and strength of the valley polarization may further be tuned by varying the potential applied to the superlattice. The transmission can have a net valley polarization of 55% for a 4-period heterojunction. Moreover, two such valley filters in series may function as an electrostatically controlled giant valleyresistance device, representing a zero magnetic field counterpart to the familiar giant magnetoresistance device.

preprint2020arXiv

Weakly Supervised Deep Learning for COVID-19 Infection Detection and Classification from CT Images

An outbreak of a novel coronavirus disease (i.e., COVID-19) has been recorded in Wuhan, China since late December 2019, which subsequently became pandemic around the world. Although COVID-19 is an acutely treated disease, it can also be fatal with a risk of fatality of 4.03% in China and the highest of 13.04% in Algeria and 12.67% in Italy (as of 8th April 2020). The onset of serious illness may result in death as a consequence of substantial alveolar damage and progressive respiratory failure. Although laboratory testing, e.g., using reverse transcription polymerase chain reaction (RT-PCR), is the gold standard for clinical diagnosis, the tests may produce false negatives. Moreover, under the pandemic situation, a shortage of RT-PCR testing resources may also delay subsequent clinical decisions and treatment. Under such circumstances, chest CT imaging has become a valuable tool for both diagnosis and prognosis of COVID-19 patients. In this study, we propose a weakly supervised deep learning strategy for detecting and classifying COVID-19 infection from CT images. The proposed method can minimise the requirements of manual labelling of CT images but still be able to obtain accurate infection detection and distinguish COVID-19 from non-COVID-19 cases. Based on the promising results obtained qualitatively and quantitatively, we can envisage a wide deployment of our developed technique in large-scale clinical studies.

preprint2020arXiv

X-CIGALE: fitting AGN/galaxy SEDs from X-ray to infrared

CIGALE is a powerful multiwavelength spectral energy distribution (SED) fitting code for extragalactic studies. However, the current version of CIGALE is not able to fit X-ray data, which often provide unique insights into AGN intrinsic power. We develop a new X-ray module for CIGALE, allowing it to fit SEDs from the X-ray to infrared (IR). We also improve the AGN fitting of CIGALE from UV-to-IR wavelengths. We implement a modern clumpy two-phase torus model, SKIRTOR. To account for moderately extincted type 1 AGNs, we implement polar-dust extinction. We publicly release the source code (named X-CIGALE). We test X-CIGALE with X-ray detected AGNs in SDSS, COSMOS, and AKARI-NEP. The fitting quality (as indicated by reduced $\chi^2$) is good in general, indicating that X-CIGALE is capable of modelling the observed SED from X-ray to IR. We discuss constrainability and degeneracy of model parameters in the fitting of AKARI-NEP, for which excellent mid-IR photometric coverage is available. We also test fitting a sample of AKARI-NEP galaxies for which only X-ray upper limits are available from Chandra observations, and find that the upper limit can effectively constrain the AGN SED contribution for some systems. Finally, using X-CIGALE, we assess the ability of Athena to constrain the AGN activity in future extragalactic studies.

preprint2019arXiv

Auxetic tetrahex-carbon with ultrahigh strength and direct band gap

Tetrahex-carbon is a recently predicted two-dimensional (2D) carbon allotrope which is composed of tetragonal and hexagonal rings. Unlike flat graphene, this new 2D carbon structure is buckled, and possesses a direct band gap of ~2.6 eV and high carrier mobility with anisotropic features. In this work, we employ first-principles density-functional theory calculations to explore mechanical properties of tetrahex-C under uniaxial tensile strain. We find that tetrahex-C demonstrates ultrahigh ideal strength, outperforming both graphene and penta-graphene. It shows superior ductility and sustains uniaxial tensile strain up to 20% (16%) until phonon instability occurs, and the corresponding maximal strength is 38.3 N/m (37.8 N/m) in the zigzag (armchair) direction. It shows an intrinsic negative Poisson's ratio. This exotic in-plane Poisson's ratio takes place when axial strain reaches a threshold value of 7% (5%) in the zigzag (armchair) direction. We also find that tetrahex-C holds a direct band gap of 2.64 eV at the center of the Brillouin zone. This direct-band-gap feature remains intact upon strain application with no direct-indirect gap transition. The ultrahigh ideal strength, negative Poisson's ratio and integrity of the direct gap under strain in tetrahex-C suggest it may have potential applications in nanomechanics and nanoelectronics.

preprint2019arXiv

Haptic Teleoperation of UAVs through Control Barrier Functions

This paper presents a novel approach to haptic teleoperation. Specifically, we use control barrier functions (CBFs) to generate force feedback to help human operators safely fly quadrotor UAVs. CBFs take a control signal as input and output a control signal that is as close as possible to the initial control signal, while also meeting specified safety constraints. In the proposed method, we generate haptic force feedback based on the difference between a command issued by the human operator and the safe command returned by a CBF. In this way, if the user issues an unsafe control command, the haptic feedback will help guide the user towards the safe input command that is closest to their current command. We conducted a within-subject user study, in which 12 participants flew a simulated UAV in a virtual hallway environment. Participants completed the task with our proposed CBF-based haptic feedback, no haptic feedback, and haptic feedback generated via parametric risk fields, which is a state-of-the-art method described in the literature. The results of this study show that CBF-based haptic feedback can improve a human operator's ability to safely fly a UAV and reduce the operator's perceived workload, without sacrificing task efficiency.
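The idea of deriving force feedback from the gap between the operator's command and the CBF-filtered command can be sketched in one dimension. Everything here is an illustrative assumption rather than the paper's formulation: the single-integrator dynamics, the wall position `x_max`, the class-K gain `alpha`, the proportional force law, and the function names.

```python
# Minimal 1-D sketch of CBF-based haptic feedback (assumed toy model).
# Dynamics: x_dot = u. Safe set: h(x) = x_max - x >= 0 (stay behind a wall).

def safe_command(u_operator, x, x_max, alpha):
    """Return the command closest to u_operator that satisfies the CBF condition.

    The condition h_dot >= -alpha * h reduces to u <= alpha * (x_max - x),
    so the minimally modified command is a simple clip from above.
    """
    return min(u_operator, alpha * (x_max - x))


def haptic_force(u_operator, u_safe, gain):
    """Force proportional to the correction applied by the safety filter (assumed law)."""
    return gain * (u_safe - u_operator)


u_op = 1.0                                   # operator pushes toward the wall
u_safe = safe_command(u_op, x=0.9, x_max=1.0, alpha=2.0)
force = haptic_force(u_op, u_safe, gain=5.0)
```

When the operator's command is already safe, `safe_command` returns it unchanged and the force is zero. When it is unsafe, the force points from the operator's command toward the safe one. A quadrotor would need a multi-dimensional filter, typically posed as a quadratic program.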

preprint2019arXiv

MRI Brain Tumor Segmentation using Random Forests and Fully Convolutional Networks

In this paper, we propose a novel learning-based method for automated segmentation of brain tumors in multimodal MRI images, which incorporates two sets of features: machine-learned and hand-crafted. Fully convolutional networks (FCN) form the machine-learned features, and texton-based features are considered as hand-crafted features. Random forest (RF) is used to classify the MRI image voxels into normal brain tissues and different parts of tumors, i.e. edema, necrosis and enhancing tumor. The method was evaluated on the BRATS 2017 challenge dataset. The results show that the proposed method provides promising segmentations. The mean Dice overlap measure for automatic brain tumor segmentation against ground truth is 0.86, 0.78 and 0.66 for whole tumor, core and enhancing tumor, respectively.
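For readers unfamiliar with the Dice overlap measure reported above, a minimal sketch follows. It uses the standard definition 2|A∩B| / (|A| + |B|) on flat binary masks. The function name, the toy masks, and the convention of returning 1.0 for two empty masks are assumptions, not details taken from the paper.

```python
# Dice overlap between a predicted and a ground-truth binary mask (toy sketch).

def dice(pred, truth):
    """Compute 2|A∩B| / (|A| + |B|) for two equal-length 0/1 sequences."""
    if len(pred) != len(truth):
        raise ValueError("masks must have the same number of voxels")
    intersection = sum(1 for p, t in zip(pred, truth) if p and t)
    total = sum(1 for p in pred if p) + sum(1 for t in truth if t)
    # Assumed convention: two empty masks count as a perfect match.
    return 1.0 if total == 0 else 2.0 * intersection / total


pred_mask = [1, 1, 0, 0, 1, 0]    # hypothetical flattened prediction
truth_mask = [1, 0, 0, 0, 1, 1]   # hypothetical flattened ground truth
score = dice(pred_mask, truth_mask)
```

Here two voxels overlap and each mask has three foreground voxels, so the score is 4/6. For a multi-class segmentation such as whole tumor, core and enhancing tumor, the measure would be computed once per class mask.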

preprint2019arXiv

Probing black hole accretion tracks, scaling relations and radiative efficiencies from stacked X-ray active galactic nuclei

The masses of supermassive black holes at the centres of local galaxies appear to be tightly correlated with the mass and velocity dispersions of their galactic hosts. However, the local Mbh-Mstar relation inferred from dynamically measured inactive black holes is up to an order of magnitude higher than some estimates from active black holes, and recent work suggests that this discrepancy arises from selection bias on the sample of dynamical black hole mass measurements. In this work we combine X-ray measurements of the mean black hole accretion luminosity as a function of stellar mass and redshift with empirical models of galaxy stellar mass growth, integrating over time to predict the evolving Mbh-Mstar relation. The implied relation is nearly independent of redshift, indicating that stellar and black hole masses grow, on average, at similar rates. Matching the de-biased local Mbh-Mstar relation requires a mean radiative efficiency ~0.15, in line with theoretical expectations for accretion onto spinning black holes. However, matching the "raw" observed relation for inactive black holes requires a mean radiative efficiency around 0.02, far below theoretical expectations. This result provides independent evidence for selection bias in dynamically estimated black hole masses, a conclusion that is robust to uncertainties in bolometric corrections, obscured active black hole fractions, and kinetic accretion efficiency. For our fiducial assumptions, our results favour moderate-to-rapid spins of typical supermassive black holes, to achieve a mean radiative efficiency ~0.12-0.20. Our approach has similarities to the classic Soltan analysis, but by using galaxy-based data instead of integrated quantities we are able to focus on regimes where observational uncertainties are minimized.
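The link between accretion luminosity, radiative efficiency and black hole growth that underlies this kind of analysis can be sketched with the standard relation L = ε·Ṁ_acc·c², with a fraction (1 − ε) of the accreted mass retained by the black hole. The sketch is a simplification and not the paper's model: it ignores kinetic efficiency, the function name and the sample luminosity are hypothetical, and the two efficiencies are simply the values quoted in the abstract.

```python
# Growth rate implied by a given luminosity and radiative efficiency (toy sketch).

SPEED_OF_LIGHT = 2.998e8  # m/s


def bh_growth_rate(luminosity_w, efficiency):
    """Black hole mass growth rate in kg/s for a bolometric luminosity in watts.

    Assumes L = eps * Mdot_acc * c**2 and that the hole keeps (1 - eps) of the
    accreted rest mass; kinetic (jet/wind) losses are neglected.
    """
    if not 0.0 < efficiency < 1.0:
        raise ValueError("efficiency must lie strictly between 0 and 1")
    return (1.0 - efficiency) * luminosity_w / (efficiency * SPEED_OF_LIGHT**2)


sample_luminosity = 1e38  # hypothetical luminosity in watts
# Ratio of mass growth at the two efficiencies quoted in the abstract.
growth_ratio = bh_growth_rate(sample_luminosity, 0.02) / bh_growth_rate(sample_luminosity, 0.15)
```

Under these assumptions the same luminosity history builds several times more black hole mass at an efficiency of 0.02 than at 0.15. This illustrates why a higher observed Mbh-Mstar normalisation demands a lower mean efficiency.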