Source author record

Dong Yang

Dong Yang appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher (unclaimed source record)

Catalog footprint

What is connected

51 works

23 topics

4 close collaborators

preprint, 2026, arXiv

Improved LLM Agents for Financial Document Question Answering

Large language models (LLMs) have shown impressive capabilities on numerous natural language processing tasks. However, LLMs still struggle with numerical question answering for financial documents that include tabular and textual data. Recent works have shown the effectiveness of critic agents (i.e., self-correction) for this task given oracle labels. Building upon this framework, this paper examines the effectiveness of the traditional critic agent when oracle labels are not available, and shows, through experiments, that this critic agent's performance deteriorates in this scenario. With this in mind, we present an improved critic agent, along with the calculator agent which outperforms the previous state-of-the-art approach (program-of-thought) and is safer. Furthermore, we investigate how our agents interact with each other, and how this interaction affects their performance.
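As a rough illustration of why a dedicated calculator can be safer than executing model-generated programs, the sketch below evaluates arithmetic through a whitelist of syntax-tree nodes instead of `eval`. The function name `safe_calculate` and the operator whitelist are assumptions made for this example; the abstract does not describe how the paper's calculator agent is implemented.

```python
import ast
import operator

# Hypothetical whitelist: only these arithmetic operators are allowed.
_BINARY_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
}
_UNARY_OPS = {ast.UAdd: operator.pos, ast.USub: operator.neg}


def safe_calculate(expression: str) -> float:
    """Evaluate a plain arithmetic expression without eval/exec."""

    def _eval(node):
        if isinstance(node, ast.Expression):
            return _eval(node.body)
        # Accept numeric literals only (bool is excluded on purpose).
        if (
            isinstance(node, ast.Constant)
            and isinstance(node.value, (int, float))
            and not isinstance(node.value, bool)
        ):
            return node.value
        if isinstance(node, ast.BinOp) and type(node.op) in _BINARY_OPS:
            return _BINARY_OPS[type(node.op)](_eval(node.left), _eval(node.right))
        if isinstance(node, ast.UnaryOp) and type(node.op) in _UNARY_OPS:
            return _UNARY_OPS[type(node.op)](_eval(node.operand))
        # Names, calls, attribute access and everything else are rejected.
        raise ValueError(f"unsupported expression element: {type(node).__name__}")

    return _eval(ast.parse(expression, mode="eval"))


# Example: percentage change between two reported figures.
growth = safe_calculate("(1250 - 1000) / 1000 * 100")
```

Anything other than numbers and the four basic operators, such as a function call or an import, raises `ValueError` instead of being executed.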

preprint, 2026, arXiv

Kinetic-Optimal Scheduling with Moment Correction for Metric-Induced Discrete Flow Matching in Zero-Shot Text-to-Speech

Metric-induced discrete flow matching (MI-DFM) exploits token-latent geometry for discrete generation, but its practical use is limited by two issues: heuristic schedulers requiring hyperparameter search, and finite-step path-tracking error from its first-order continuous-time Markov chain (CTMC) solver. We address both issues. First, we derive a kinetic-optimal scheduler for prescribed scalar-parameterized probability paths, and instantiate it for MI-DFM as a training-free numerical schedule that traverses the path at constant Fisher-Rao speed. Second, we introduce a finite-step moment correction that adjusts the jump probability while preserving the CTMC jump destination distribution. We validate the resulting method, GibbsTTS, on codec-based zero-shot text-to-speech (TTS). Under controlled comparisons with a unified architecture and large-scale dataset, GibbsTTS achieves the best objective naturalness and is preferred in subjective evaluations over masked discrete generative baselines. Additionally, in comparison with the evaluated state-of-the-art TTS systems, GibbsTTS shows strong speaker similarity, achieving the highest similarity on three of four test sets and ranking second on the fourth. Project page: https://ydqmkkx.github.io/GibbsTTSProject

preprint, 2024, arXiv

MvKSR: Multi-view Knowledge-guided Scene Recovery for Hazy and Rainy Degradation

High-quality imaging is crucial for ensuring safety supervision and intelligent deployment in fields like transportation and industry. It enables precise and detailed monitoring of operations, facilitating timely detection of potential hazards and efficient management. However, adverse weather conditions, such as atmospheric haziness and precipitation, can have a significant impact on image quality. When the atmosphere contains dense haze or water droplets, the incident light scatters, leading to degraded captured images. This degradation is evident in the form of image blur and reduced contrast, increasing the likelihood of incorrect assessments and interpretations by intelligent imaging systems (IIS). To address the challenge of restoring degraded images in hazy and rainy conditions, this paper proposes a novel multi-view knowledge-guided scene recovery network (termed MvKSR). Specifically, guided filtering is performed on the degraded image to separate high/low-frequency components. Subsequently, an encoder-decoder-based multi-view feature coarse extraction module (MCE) is used to coarsely extract features from different views of the degraded image. The multi-view feature fine fusion module (MFF) will learn and infer the restoration of degraded images through mixed supervision under different views. Additionally, we suggest an atrous residual block to handle global restoration and local repair in hazy/rainy/mixed scenes. Extensive experimental results demonstrate that MvKSR outperforms other state-of-the-art methods in terms of efficiency and stability for restoring degraded scenarios in IIS.

preprint, 2022, arXiv

A low frequency model for the aeroacoustic scattering of cylindrical tube rows in cross-flow

Heat exchanger tube rows can influence the thermoacoustic instability behaviour of combustion systems since they act as both acoustic scatterers and unsteady heat sinks. Therefore, with careful tuning of their thermoacoustic properties, heat exchangers have the potential to act as passive control devices. In this work, we focus on (only) the acoustic scattering behaviour of heat exchanger tubes. We present a comparison of existing acoustic models for tube rows and slits, models for the latter having the advantage of incorporating frequency dependence. We then propose a new model that enables the adaptation of slit models for tube rows. This model is validated against experiments and Linearised Navier Stokes Equations (LNSE) predictions for the transmission and reflection coefficients, including phase information. The model predictions show very good agreement with the experimental and numerical validations, especially for low frequencies (Strouhal number < 0.5, based on tube radius and excitation frequency), with mean differences less than 2% for the transmission coefficients (the reflection coefficient errors are somewhat larger since their magnitudes are very close to zero).

preprint, 2022, arXiv

Auto-FedRL: Federated Hyperparameter Optimization for Multi-institutional Medical Image Segmentation

Federated learning (FL) is a distributed machine learning technique that enables collaborative model training while avoiding explicit data sharing. The inherent privacy-preserving property of FL algorithms makes them especially attractive to the medical field. However, in case of heterogeneous client data distributions, standard FL methods are unstable and require intensive hyperparameter tuning to achieve optimal performance. Conventional hyperparameter optimization algorithms are impractical in real-world FL applications as they involve numerous training trials, which are often not affordable with limited compute budgets. In this work, we propose an efficient reinforcement learning (RL)-based federated hyperparameter optimization algorithm, termed Auto-FedRL, in which an online RL agent can dynamically adjust hyperparameters of each client based on the current training progress. Extensive experiments are conducted to investigate different search strategies and RL agents. The effectiveness of the proposed method is validated on a heterogeneous data split of the CIFAR-10 dataset as well as two real-world medical image segmentation datasets for COVID-19 lesion segmentation in chest CT and pancreas segmentation in abdominal CT.

preprint, 2022, arXiv

HyperSegNAS: Bridging One-Shot Neural Architecture Search with 3D Medical Image Segmentation using HyperNet

Semantic segmentation of 3D medical images is a challenging task due to the high variability of the shape and pattern of objects (such as organs or tumors). Given the recent success of deep learning in medical image segmentation, Neural Architecture Search (NAS) has been introduced to find high-performance 3D segmentation network architectures. However, because of the massive computational requirements of 3D data and the discrete optimization nature of architecture search, previous NAS methods require a long search time or necessary continuous relaxation, and commonly lead to sub-optimal network architectures. While one-shot NAS can potentially address these disadvantages, its application in the segmentation domain has not been well studied in the expansive multi-scale multi-path search space. To enable one-shot NAS for medical image segmentation, our method, named HyperSegNAS, introduces a HyperNet to assist super-net training by incorporating architecture topology information. Such a HyperNet can be removed once the super-net is trained and introduces no overhead during architecture search. We show that HyperSegNAS yields better performing and more intuitive architectures compared to the previous state-of-the-art (SOTA) segmentation networks; furthermore, it can quickly and accurately find good architecture candidates under different computing constraints. Our method is evaluated on public datasets from the Medical Segmentation Decathlon (MSD) challenge, and achieves SOTA performances.

preprint, 2022, arXiv

Multi-stage Moving Target Defense: A Security-enhanced D-FACTS Implementation Approach

In recent studies, moving target defense (MTD) has been applied to detect false data injection (FDI) attacks using distributed flexible AC transmission system (D-FACTS) devices. However, the inherent conflict between the security goals of MTD (i.e., detecting FDI attacks) and the economic goals of D-FACTS devices (i.e., reducing power losses) would impede the application of MTD in real systems. Moreover, the detection capabilities of existing MTDs are often insufficient. This paper proposes a multi-stage MTD (MMTD) approach to resolve these two issues by adding a group of designed security-oriented schemes before D-FACTS' economic-oriented scheme to detect FDI attacks. We keep these security-oriented schemes for a very short time interval and then revert to the economic-oriented scheme for the remaining time to ensure the economic requirements. We prove that a designed MMTD can significantly improve the detection capability compared to existing one-stage MTDs. We find the supremum of MMTD's detection capability and study its relationship with system topology and D-FACTS deployment. Meanwhile, a greedy algorithm is proposed to search the MMTD strategy to reach this supremum. Simulation results show that the proposed MMTD can achieve the supremum against FDI attacks while outperforming current MTD strategies on economic indicators.

preprint, 2022, arXiv

Self-Supervised Pre-Training of Swin Transformers for 3D Medical Image Analysis

Vision Transformers (ViTs) have shown great performance in self-supervised learning of global and local representations that can be transferred to downstream applications. Inspired by these results, we introduce a novel self-supervised learning framework with tailored proxy tasks for medical image analysis. Specifically, we propose: (i) a new 3D transformer-based model, dubbed Swin UNEt TRansformers (Swin UNETR), with a hierarchical encoder for self-supervised pre-training; (ii) tailored proxy tasks for learning the underlying pattern of human anatomy. We demonstrate successful pre-training of the proposed model on 5,050 publicly available computed tomography (CT) images from various body organs. The effectiveness of our approach is validated by fine-tuning the pre-trained models on the Beyond the Cranial Vault (BTCV) Segmentation Challenge with 13 abdominal organs and segmentation tasks from the Medical Segmentation Decathlon (MSD) dataset. Our model is currently the state-of-the-art (i.e. ranked 1st) on the public test leaderboards of both MSD and BTCV datasets. Code: https://monai.io/research/swin-unetr

preprint, 2022, arXiv

Swin UNETR: Swin Transformers for Semantic Segmentation of Brain Tumors in MRI Images

Semantic segmentation of brain tumors is a fundamental medical image analysis task involving multiple MRI imaging modalities that can assist clinicians in diagnosing the patient and successively studying the progression of the malignant entity. In recent years, Fully Convolutional Neural Networks (FCNNs) approaches have become the de facto standard for 3D medical image segmentation. The popular "U-shaped" network architecture has achieved state-of-the-art performance benchmarks on different 2D and 3D semantic segmentation tasks and across various imaging modalities. However, due to the limited kernel size of convolution layers in FCNNs, their performance of modeling long-range information is sub-optimal, and this can lead to deficiencies in the segmentation of tumors with variable sizes. On the other hand, transformer models have demonstrated excellent capabilities in capturing such long-range information in multiple domains, including natural language processing and computer vision. Inspired by the success of vision transformers and their variants, we propose a novel segmentation model termed Swin UNEt TRansformers (Swin UNETR). Specifically, the task of 3D brain tumor semantic segmentation is reformulated as a sequence to sequence prediction problem wherein multi-modal input data is projected into a 1D sequence of embedding and used as an input to a hierarchical Swin transformer as the encoder. The swin transformer encoder extracts features at five different resolutions by utilizing shifted windows for computing self-attention and is connected to an FCNN-based decoder at each resolution via skip connections. We have participated in BraTS 2021 segmentation challenge, and our proposed model ranks among the top-performing approaches in the validation phase. Code: https://monai.io/research/swin-unetr

preprint, 2022, arXiv

UNetFormer: A Unified Vision Transformer Model and Pre-Training Framework for 3D Medical Image Segmentation

Vision Transformers (ViTs) have recently become popular due to their outstanding modeling capabilities, in particular for capturing long-range information, and scalability to dataset and model sizes which has led to state-of-the-art performance in various computer vision and medical image analysis tasks. In this work, we introduce a unified framework consisting of two architectures, dubbed UNetFormer, with a 3D Swin Transformer-based encoder and Convolutional Neural Network (CNN) and transformer-based decoders. In the proposed model, the encoder is linked to the decoder via skip connections at five different resolutions with deep supervision. The design of the proposed architecture allows for meeting a wide range of trade-off requirements between accuracy and computational cost. In addition, we present a methodology for self-supervised pre-training of the encoder backbone via learning to predict randomly masked volumetric tokens using contextual information of visible tokens. We pre-train our framework on a cohort of 5,050 CT images, gathered from publicly available CT datasets, and present a systematic investigation of various components such as masking ratio and patch size that affect the representation learning capability and performance of downstream tasks. We validate the effectiveness of our pre-training approach by fine-tuning and testing our model on liver and liver tumor segmentation task using the Medical Segmentation Decathlon (MSD) dataset and achieve state-of-the-art performance in terms of various segmentation metrics. To demonstrate its generalizability, we train and test the model on the BraTS 21 dataset for brain tumor segmentation using MRI images and outperform other methods in terms of Dice score. Code: https://github.com/Project-MONAI/research-contributions

preprint, 2022, arXiv

Upper bounds on the leakage of private data and operational approach to markovianity

We quantify the consequences of a private key leakage and private randomness generated during quantum key distribution. We provide simple lower bounds on the one-way distillable key after the leakage has been detected. We also show that the distributed private randomness does not drop by more than twice the number of qubits of the traced-out system. We further focus on irreducible private states, showing that their two-way distillable key is non-lockable. We then strengthen this result by referring to the idea of recovery maps. We further consider the action of a special case of side-channels on some of the private states. Finally, we connect the topic of (non)markovian dynamics with that of hacking. In particular, we show that an invertible map is non-CP-divisible if and only if there exists a state whose key, as witnessed by a particular privacy witness, increases in time. This complements the recent result of J. Kolodyński et al. [Phys. Rev. A 101, 020303(R) (2020)] where the log-negativity was connected with the (non)markovianity of the dynamics.

preprint, 2022, arXiv

VerSe: A Vertebrae Labelling and Segmentation Benchmark for Multi-detector CT Images

Vertebral labelling and segmentation are two fundamental tasks in an automated spine processing pipeline. Reliable and accurate processing of spine images is expected to benefit clinical decision-support systems for diagnosis, surgery planning, and population-based analysis on spine and bone health. However, designing automated algorithms for spine processing is challenging predominantly due to considerable variations in anatomy and acquisition protocols and due to a severe shortage of publicly available data. Addressing these limitations, the Large Scale Vertebrae Segmentation Challenge (VerSe) was organised in conjunction with the International Conference on Medical Image Computing and Computer Assisted Intervention (MICCAI) in 2019 and 2020, with a call for algorithms towards labelling and segmentation of vertebrae. Two datasets containing a total of 374 multi-detector CT scans from 355 patients were prepared and 4505 vertebrae have individually been annotated at voxel-level by a human-machine hybrid algorithm (https://osf.io/nqjyw/, https://osf.io/t98fz/). A total of 25 algorithms were benchmarked on these datasets. In this work, we present the results of this evaluation and further investigate the performance-variation at vertebra-level, scan-level, and at different fields-of-view. We also evaluate the generalisability of the approaches to an implicit domain shift in data by evaluating the top performing algorithms of one challenge iteration on data from the other iteration. The principal takeaway from VerSe: the performance of an algorithm in labelling and segmenting a spine scan hinges on its ability to correctly identify vertebrae in cases of rare anatomical variations. The content and code concerning VerSe can be accessed at: https://github.com/anjany/verse.

preprint, 2022, arXiv

Warm Start Active Learning with Proxy Labels & Selection via Semi-Supervised Fine-Tuning

Which volume to annotate next is a challenging problem in building medical imaging datasets for deep learning. One of the promising methods to approach this question is active learning (AL). However, AL has been a hard nut to crack in terms of which AL algorithm and acquisition functions are most useful for which datasets. The problem is further exacerbated by the question of which volumes to label first when there is zero labeled data to start with. This is known as the cold start problem in AL. We propose two novel strategies for AL specifically for 3D image segmentation. First, we tackle the cold start problem by proposing a proxy task and then utilizing uncertainty generated from the proxy task to rank the unlabeled data to be annotated. Second, we craft a two-stage learning framework for each active iteration where the unlabeled data is also used in the second stage as a semi-supervised fine-tuning strategy. We show the promise of our approach on two well-known large public datasets from the Medical Segmentation Decathlon. The results indicate that the initial selection of data and semi-supervised framework both showed significant improvement for several AL strategies.
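The idea of ranking unlabeled volumes by uncertainty can be sketched as follows. This is a minimal illustration and not the paper's method: it assumes each volume is summarized by a flat list of per-voxel foreground probabilities from some proxy model, and it uses mean binary entropy as the uncertainty score. The helper names `binary_entropy`, `volume_uncertainty` and `rank_for_annotation` are hypothetical.

```python
import math


def binary_entropy(p: float) -> float:
    """Entropy (in nats) of a Bernoulli variable with probability p."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -(p * math.log(p) + (1.0 - p) * math.log(1.0 - p))


def volume_uncertainty(voxel_probs) -> float:
    """Assumed score: mean per-voxel entropy of the predicted foreground probability."""
    return sum(binary_entropy(p) for p in voxel_probs) / len(voxel_probs)


def rank_for_annotation(volumes: dict) -> list:
    """Return volume names ordered from most to least uncertain."""
    return sorted(volumes, key=lambda name: volume_uncertainty(volumes[name]), reverse=True)


# Toy predictions for three unlabeled volumes (made-up numbers).
volumes = {
    "vol_a": [0.5, 0.5, 0.6],    # model is unsure everywhere
    "vol_b": [0.9, 0.95, 0.99],  # mostly confident
    "vol_c": [0.0, 1.0, 1.0],    # fully confident
}
ranking = rank_for_annotation(volumes)
```

Volumes at the front of `ranking` would be sent to annotators first under this assumed scoring rule.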

preprint, 2021, arXiv

Diminishing Uncertainty within the Training Pool: Active Learning for Medical Image Segmentation

Active learning is a unique abstraction of machine learning techniques where the model/algorithm could guide users for annotation of a set of data points that would be beneficial to the model, unlike passive machine learning. The primary advantage is that active learning frameworks select data points that can accelerate the learning process of a model and can reduce the amount of data needed to achieve full accuracy as compared to a model trained on a randomly acquired data set. Multiple frameworks for active learning combined with deep learning have been proposed, and the majority of them are dedicated to classification tasks. Herein, we explore active learning for the task of segmentation of medical imaging data sets. We investigate our proposed framework using two datasets: (1) MRI scans of the hippocampus, (2) CT scans of pancreas and tumors. This work presents a query-by-committee approach for active learning where a joint optimizer is used for the committee. At the same time, we propose three new strategies for active learning: (1) increasing frequency of uncertain data to bias the training data set; (2) using mutual information among the input images as a regularizer for acquisition to ensure diversity in the training dataset; (3) adaptation of Dice log-likelihood for Stein variational gradient descent (SVGD). The results indicate an improvement in terms of data reduction by achieving full accuracy while only using 22.69% and 48.85% of the available data for each dataset, respectively.

preprint, 2020, arXiv

3D Semi-Supervised Learning with Uncertainty-Aware Multi-View Co-Training

While making a tremendous impact in various fields, deep neural networks usually require large amounts of labeled data for training which are expensive to collect in many applications, especially in the medical domain. Unlabeled data, on the other hand, is much more abundant. Semi-supervised learning techniques, such as co-training, could provide a powerful tool to leverage unlabeled data. In this paper, we propose a novel framework, uncertainty-aware multi-view co-training (UMCT), to address semi-supervised learning on 3D data, such as volumetric data from medical imaging. In our work, co-training is achieved by exploiting multi-viewpoint consistency of 3D data. We generate different views by rotating or permuting the 3D data and utilize asymmetrical 3D kernels to encourage diversified features in different sub-networks. In addition, we propose an uncertainty-weighted label fusion mechanism to estimate the reliability of each view's prediction with Bayesian deep learning. As one view requires the supervision from other views in co-training, our self-adaptive approach computes a confidence score for the prediction of each unlabeled sample in order to assign a reliable pseudo label. Thus, our approach can take advantage of unlabeled data during training. We show the effectiveness of our proposed semi-supervised method on several public datasets from medical image segmentation tasks (NIH pancreas & LiTS liver tumor dataset). Meanwhile, a fully-supervised method based on our approach achieved state-of-the-art performances on both the LiTS liver tumor segmentation and the Medical Segmentation Decathlon (MSD) challenge, demonstrating the robustness and value of our framework, even when fully supervised training is feasible.

preprint, 2020, arXiv

C2FNAS: Coarse-to-Fine Neural Architecture Search for 3D Medical Image Segmentation

3D convolution neural networks (CNN) have been proved very successful in parsing organs or tumours in 3D medical images, but it remains sophisticated and time-consuming to choose or design proper 3D networks given different task contexts. Recently, Neural Architecture Search (NAS) is proposed to solve this problem by searching for the best network architecture automatically. However, the inconsistency between search stage and deployment stage often exists in NAS algorithms due to memory constraints and large search space, which could become more serious when applying NAS to some memory and time consuming tasks, such as 3D medical image segmentation. In this paper, we propose coarse-to-fine neural architecture search (C2FNAS) to automatically search a 3D segmentation network from scratch without inconsistency on network size or input size. Specifically, we divide the search procedure into two stages: 1) the coarse stage, where we search the macro-level topology of the network, i.e. how each convolution module is connected to other modules; 2) the fine stage, where we search at micro-level for operations in each cell based on previous searched macro-level topology. The coarse-to-fine manner divides the search procedure into two consecutive stages and meanwhile resolves the inconsistency. We evaluate our method on 10 public datasets from Medical Segmentation Decathlon (MSD) challenge, and achieve state-of-the-art performance with the network searched using one dataset, which demonstrates the effectiveness and generalization of our searched models.

preprint, 2020, arXiv

Enhanced MRI Reconstruction Network using Neural Architecture Search

The accurate reconstruction of under-sampled magnetic resonance imaging (MRI) data using modern deep learning technology requires significant effort to design the necessary complex neural network architectures. The cascaded network architecture for MRI reconstruction has been widely used, while it suffers from the "vanishing gradient" problem when the network becomes deep. In addition, homogeneous architecture degrades the representation capacity of the network. In this work, we present an enhanced MRI reconstruction network using a residual in residual basic block. For each cell in the basic block, we use the differentiable neural architecture search (NAS) technique to automatically choose the optimal operation among eight variants of the dense block. This new heterogeneous network is evaluated on two publicly available datasets and outperforms all current state-of-the-art methods, which demonstrates the effectiveness of our proposed method.

preprint, 2020, arXiv

Enhancing Foreground Boundaries for Medical Image Segmentation

Object segmentation plays an important role in modern medical image analysis, which benefits clinical study, disease diagnosis, and surgery planning. Given the various modalities of medical images, the automated or semi-automated segmentation approaches have been used to identify and parse organs, bones, tumors, and other regions-of-interest (ROI). However, these contemporary segmentation approaches tend to fail to predict the boundary areas of ROI, because of the fuzzy appearance contrast caused during the imaging procedure. To further improve the segmentation quality of boundary areas, we propose a boundary enhancement loss to enforce additional constraints on optimizing machine learning models. The proposed loss function is light-weighted and easy to implement without any pre- or post-processing. Our experimental results validate that our loss function is better than, or at least comparable to, other state-of-the-art loss functions in terms of segmentation accuracy.

preprint, 2020, arXiv

PC-U Net: Learning to Jointly Reconstruct and Segment the Cardiac Walls in 3D from CT Data

The 3D volumetric shape of the heart's left ventricle (LV) myocardium (MYO) wall provides important information for diagnosis of cardiac disease and invasive procedure navigation. Many cardiac image segmentation methods have relied on detection of region-of-interest as a pre-requisite for shape segmentation and modeling. With segmentation results, a 3D surface mesh and a corresponding point cloud of the segmented cardiac volume can be reconstructed for further analyses. Although state-of-the-art methods (e.g., U-Net) have achieved decent performance on cardiac image segmentation in terms of accuracy, these segmentation results can still suffer from imaging artifacts and noise, which will lead to inaccurate shape modeling results. In this paper, we propose a PC-U net that jointly reconstructs the point cloud of the LV MYO wall directly from volumes of 2D CT slices and generates its segmentation masks from the predicted 3D point cloud. Extensive experimental results show that by incorporating a shape prior from the point cloud, the segmentation masks are more accurate than the state-of-the-art U-Net results in terms of Dice's coefficient and Hausdorff distance. The proposed joint learning framework of our PC-U net is beneficial for automatic cardiac image analysis tasks because it can obtain simultaneously the 3D shape and segmentation of the LV MYO walls.
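For readers unfamiliar with the Dice coefficient used to compare segmentation masks here, a minimal sketch of the standard definition, 2|A ∩ B| / (|A| + |B|), is given below. It assumes masks are flattened to sequences of 0/1 values; the function name `dice_coefficient` and the choice to return 1.0 when both masks are empty are assumptions of this example, not details from the paper.

```python
def dice_coefficient(pred, target) -> float:
    """Dice overlap between two binary masks given as flat 0/1 sequences."""
    if len(pred) != len(target):
        raise ValueError("masks must have the same number of voxels")
    # Voxels labeled foreground in both masks.
    intersection = sum(1 for p, t in zip(pred, target) if p and t)
    # Total foreground count across both masks.
    total = sum(1 for p in pred if p) + sum(1 for t in target if t)
    if total == 0:
        # Assumed convention: two empty masks count as a perfect match.
        return 1.0
    return 2.0 * intersection / total


# Toy example: one of two predicted foreground voxels overlaps the reference.
score = dice_coefficient([1, 1, 0, 0], [1, 0, 1, 0])
```

A score of 1.0 means the masks coincide exactly and 0.0 means they share no foreground voxels.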

preprint, 2020, arXiv

Quotients of triangulated categories and Equivalences of Buchweitz, Orlov and Amiot--Guo--Keller

We give a sufficient condition for a Verdier quotient $\mathcal{T}/\mathcal{S}$ of a triangulated category $\mathcal{T}$ by a thick subcategory $\mathcal{S}$ to be realized inside of $\mathcal{T}$ as an ideal quotient. As applications, we deduce three significant results by Buchweitz, Orlov and Amiot--Guo--Keller.

preprint, 2020, arXiv

Searching Learning Strategy with Reinforcement Learning for 3D Medical Image Segmentation

Deep neural network (DNN) based approaches have been widely investigated and deployed in medical image analysis. For example, fully convolutional neural networks (FCN) achieve the state-of-the-art performance in several applications of 2D/3D medical image segmentation. Even the baseline neural network models (U-Net, V-Net, etc.) have been proven to be very effective and efficient when the training process is set up properly. Nevertheless, to fully exploit the potentials of neural networks, we propose an automated searching approach for the optimal training strategy with reinforcement learning. The proposed approach can be utilized for tuning hyper-parameters, and selecting necessary data augmentation with certain probabilities. The proposed approach is validated on several tasks of 3D medical image segmentation. The performance of the baseline model is boosted after searching, and it can achieve comparable accuracy to other manually-tuned state-of-the-art segmentation approaches.

preprint, 2020, arXiv

Some examples of $t$-structures for finite-dimensional algebras

We describe the heart of the canonical $t$-structure on the perfect derived category of a strictly positive graded algebra as the module category over the quadratic dual. Applying this result we obtain examples showing new phenomena on $t$-structures on derived categories of finite-dimensional algebras.

preprint, 2020, arXiv

Time-aware Graph Embedding: A temporal smoothness and task-oriented approach

Knowledge graph embedding, which aims to learn the low-dimensional representations of entities and relationships, has attracted considerable research efforts recently. However, most knowledge graph embedding methods focus on the structural relationships in fixed triples while ignoring the temporal information. Currently, existing time-aware graph embedding methods only focus on the factual plausibility, while ignoring the temporal smoothness which models the interactions between a fact and its contexts, and thus can capture fine-granularity temporal relationships. This leads to the limited performance of embedding related applications. To solve this problem, this paper presents a Robustly Time-aware Graph Embedding (RTGE) method by incorporating temporal smoothness. Two major innovations of our paper are presented here. At first, RTGE integrates a measure of temporal smoothness in the learning process of the time-aware graph embedding. Via the proposed additional smoothing factor, RTGE can preserve both structural information and evolutionary patterns of a given graph. Secondly, RTGE provides a general task-oriented negative sampling strategy associated with temporally-aware information, which further improves the adaptive ability of the proposed algorithm and plays an essential role in obtaining superior performance in various tasks. Extensive experiments conducted on multiple benchmark tasks show that RTGE can increase performance in entity/relationship/temporal scoping prediction tasks.

preprint2020arXiv

Uncertainty-aware multi-view co-training for semi-supervised medical image segmentation and domain adaptation

Although they have achieved great success in medical image segmentation, deep learning-based approaches usually require large amounts of well-annotated data, which can be extremely expensive to obtain in the field of medical image analysis. Unlabeled data, on the other hand, is much easier to acquire. Semi-supervised learning and unsupervised domain adaptation both take advantage of unlabeled data, and they are closely related to each other. In this paper, we propose uncertainty-aware multi-view co-training (UMCT), a unified framework that addresses these two tasks for volumetric medical image segmentation. Our framework is capable of efficiently utilizing unlabeled data for better performance. We first rotate and permute the 3D volumes into multiple views and train a 3D deep network on each view. We then apply co-training by enforcing multi-view consistency on unlabeled data, where an uncertainty estimation of each view is utilized to achieve accurate labeling. Experiments on the NIH pancreas segmentation dataset and a multi-organ segmentation dataset show state-of-the-art performance of the proposed framework on semi-supervised medical image segmentation. Under unsupervised domain adaptation settings, we validate the effectiveness of this work by adapting our multi-organ segmentation model to two pathological organs from the Medical Segmentation Decathlon Datasets. Additionally, we show that our UMCT-DA model can even effectively handle the challenging situation where labeled source data is inaccessible, demonstrating strong potential for real-world applications.
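
The co-training step described in this abstract can be illustrated with a small sketch. The abstract does not specify the fusion rule, so what follows is only one plausible reading: the pseudo-label for a target view is an average of the other views' predictions, weighted by inverse uncertainty. The function name `fuse_pseudo_label`, the inverse-uncertainty weighting, and the per-voxel scalar inputs are all assumptions made for illustration, not the authors' implementation.

```python
from typing import List


def fuse_pseudo_label(
    probs: List[float],
    uncertainties: List[float],
    target_view: int,
    eps: float = 1e-8,
) -> float:
    """Fuse the other views' foreground probabilities for one voxel.

    Assumption (not from the abstract): each view is weighted by the
    inverse of its uncertainty estimate, so confident views dominate.
    """
    if len(probs) != len(uncertainties):
        raise ValueError("probs and uncertainties must have the same length")

    weights = []
    values = []
    for view, (p, u) in enumerate(zip(probs, uncertainties)):
        if view == target_view:
            continue  # a view never supervises itself
        weights.append(1.0 / (u + eps))
        values.append(p)

    if not weights:
        raise ValueError("at least two views are required")

    total = sum(weights)
    return sum(w * p for w, p in zip(weights, values)) / total
```

With three views, the pseudo-label for view 0 would be computed as `fuse_pseudo_label([0.9, 0.2, 0.8], [0.1, 0.5, 0.1], target_view=0)`, which leans toward the more confident of the two remaining views.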

preprint2019arXiv

Weakly supervised segmentation from extreme points

Annotation of medical images has been a major bottleneck for the development of accurate and robust machine learning models. Annotation is costly and time-consuming and typically requires expert knowledge, especially in the medical domain. Here, we propose to use minimal user interaction in the form of extreme point clicks in order to train a segmentation model that can, in turn, be used to speed up the annotation of medical images. We use extreme points in each dimension of a 3D medical image to constrain an initial segmentation based on the random walker algorithm. This segmentation is then used as a weak supervisory signal to train a fully convolutional network that can segment the organ of interest based on the provided user clicks. We show that the network's predictions can be refined through several iterations of training and prediction using the same weakly annotated data. Ultimately, our method has the potential to speed up the generation process of new training datasets for the development of new machine learning and deep learning-based models for, but not exclusively, medical image analysis.
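
For readers unfamiliar with extreme points, the sketch below shows one way such clicks might be simulated from an existing binary 3D mask: for each axis, pick a foreground voxel with the smallest index and one with the largest. The function name `extreme_points`, the nested-list mask format, and the `(z, y, x)` ordering are assumptions for illustration; the abstract only states that extreme points in each dimension are used.

```python
from typing import Dict, List, Tuple

Voxel = Tuple[int, int, int]


def extreme_points(mask: List[List[List[int]]]) -> Dict[str, Voxel]:
    """Return one foreground voxel at the min and max index along each axis.

    `mask` is indexed as mask[z][y][x]; any truthy value counts as foreground.
    Ties are broken by scan order (first voxel encountered).
    """
    voxels = [
        (z, y, x)
        for z, plane in enumerate(mask)
        for y, row in enumerate(plane)
        for x, value in enumerate(row)
        if value
    ]
    if not voxels:
        raise ValueError("mask has no foreground voxels")

    points: Dict[str, Voxel] = {}
    for axis, name in enumerate(("z", "y", "x")):
        points[f"{name}_min"] = min(voxels, key=lambda v: v[axis])
        points[f"{name}_max"] = max(voxels, key=lambda v: v[axis])
    return points
```

The six returned voxels play the role of the user clicks; in the described method they would then constrain the initial random-walker segmentation.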

preprint2016arXiv

Classical capacities of quantum channels with environment assistance

A quantum channel physically is a unitary interaction between the information carrying system and an environment, which is initialized in a pure state before the interaction. Conventionally, this state, as also the parameters of the interaction, is assumed to be fixed and known to the sender and receiver. Here, following the model introduced by us earlier [Karumanchi et al., arXiv[quant-ph]:1407.8160], we consider a benevolent third party, i.e. a helper, controlling the environment state, and how the helper's presence changes the communication game. In particular, we define and study the classical capacity of a unitary interaction with helper, indeed two variants, one where the helper can only prepare separable states across many channel uses, and one without this restriction. Furthermore, the two even more powerful scenarios of pre-shared entanglement between helper and receiver, and of classical communication between sender and helper (making them conferencing encoders) are considered.

preprint2016arXiv

Ladders and simplicity of derived module categories

Recollements of derived module categories are investigated, using a new technique, ladders of recollements, which are mutation sequences. The position in the ladder is shown to control whether a recollement restricts from unbounded to another level of derived category. Ladders also turn out to control derived simplicity on all levels. An algebra is derived simple if its derived category cannot be deconstructed, that is, if it is not the middle term of a non-trivial recollement whose outer terms are again derived categories of algebras. Derived simplicity on each level is characterised in terms of heights of ladders. These results are complemented by providing new classes of examples of derived simple algebras, in particular indecomposable commutative rings, as well as by a finite-dimensional counterexample to the Jordan--Hölder property for derived module categories. Moreover, recollements are used to compute homological and K-theoretic invariants.

preprint2016arXiv

Operational Resource Theory of Coherence

We establish an operational theory of coherence (or of superposition) in quantum systems, by focusing on the optimal rate of performance of certain tasks. Namely, we introduce the two basic concepts, "coherence distillation" and "coherence cost", in the processing of quantum states under so-called incoherent operations [Baumgratz/Cramer/Plenio, Phys. Rev. Lett. 113:140401 (2014)]. We then show that in the asymptotic limit of many copies of a state, both are given by simple single-letter formulas: the distillable coherence is given by the relative entropy of coherence (in other words, we give the relative entropy of coherence its operational interpretation), and the coherence cost by the coherence of formation, which is an optimization over convex decompositions of the state. An immediate corollary is that there exists no bound coherent state, in the sense that one would need to consume coherence to create the state but no coherence could be distilled from it. Further, we demonstrate that the coherence theory is generically an irreversible theory by a simple criterion that completely characterizes all reversible states.
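
The two single-letter formulas named in this abstract can be written out as follows. The symbols are a notational assumption, since the abstract fixes none: $S$ denotes the von Neumann entropy, $\Delta$ the map that deletes off-diagonal entries in the fixed incoherent basis, and $C_d$, $C_c$ the distillable coherence and coherence cost.

```latex
% Relative entropy of coherence (equals the distillable coherence)
C_d(\rho) = C_r(\rho) = S\bigl(\Delta(\rho)\bigr) - S(\rho)

% Coherence of formation (equals the coherence cost):
% minimize over pure-state decompositions \rho = \sum_i p_i |\psi_i\rangle\langle\psi_i|
C_c(\rho) = C_f(\rho) = \min_{\{p_i,\,\psi_i\}} \sum_i p_i \, S\bigl(\Delta(|\psi_i\rangle\langle\psi_i|)\bigr)
```

For a pure state the two quantities coincide, since $S(\rho)=0$ and the minimization is trivial; the irreversibility mentioned in the abstract concerns states where $C_r(\rho) < C_f(\rho)$.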

preprint2016arXiv

Potential capacities of quantum channels

We introduce potential capacities of quantum channels in an operational way and provide upper bounds for these quantities, which quantify the ultimate limit of usefulness of a channel for a given task in the best possible context. Unfortunately, except for a few isolated cases, potential capacities seem to be as hard to compute as their "plain" analogues. We thus study upper bounds on some potential capacities: For the classical capacity, we give an upper bound in terms of the entanglement of formation. To establish a bound for the quantum and private capacity, we first "lift" the channel to a Hadamard channel and then prove that the quantum and private capacity of a Hadamard channel is strongly additive, implying that for these channels, potential and plain capacity are equal. Employing these upper bounds we show that if a channel is noisy, however close it is to the noiseless channel, then it cannot be activated into the noiseless channel by any other contextual channel; this conclusion holds for all the three capacities. We also discuss the so-called environment-assisted quantum capacity, because we are able to characterize its "potential" version.

preprint2016arXiv

Recollements and stratifying ideals

Surjective homological epimorphisms with stratifying kernel can be used to construct recollements of derived module categories. These 'stratifying' recollements are derived from recollements of module categories. Can every recollement be put in this form, up to equivalence? A negative answer will be given after providing a characterisation of recollements equivalent to stratifying ones. Moreover, criteria for a ring epimorphism to be 'stratifying' will be presented as well as constructions of such epimorphisms.

preprint2016arXiv

Relative singularity categories I: Auslander resolutions

Let $R$ be an isolated Gorenstein singularity with a non-commutative resolution $A=End_R(R\oplus M)$. In this paper, we show that the relative singularity category $\Delta_R(A)$ of $A$ has a number of pleasant properties, such as being Hom-finite. Moreover, it determines the classical singularity category $D_{sg}(R)$ of Buchweitz and Orlov as a certain canonical quotient category. If $R$ has finite CM type, which includes for example Kleinian singularities, then we show the much more surprising result that $D_{sg}(R)$ determines $\Delta_R(Aus(R))$, where $Aus(R)$ is the corresponding Auslander algebra. The proofs of these results use dg algebras, $A_\infty$ Koszul duality, and the new concept of dg Auslander algebras, which may be of independent interest.

preprint2014arXiv

A novel hohlraum with ultrathin depleted-uranium-nitride coating layer for low hard x-ray emission and high radiation temperature

An ultra-thin layer of uranium nitrides (UN) has been coated on the inner surface of the depleted uranium hohlraum (DUH); our experiment shows that this coating effectively prevents the oxidization of uranium (U). Comparative experiments between the novel depleted uranium hohlraum and a pure gold (Au) hohlraum are carried out on the Shenguang III prototype laser facility. Under a laser intensity of $6\times10^{14}$ W/cm$^2$, we observe that the hard x-ray (> 1.8 keV) fraction of this uranium hohlraum decreases by 61% and the peak intensity of the total x-ray flux (0.1 keV ~ 5.0 keV) increases by 5%. The radiation hydrodynamic code LARED is used to interpret these observations. Our result indicates, for the first time, the advantage of the UN-coated DUH in generating a uniform x-ray field with a quasi-Planckian spectrum, and thus has important implications for optimizing the ignition hohlraum design.

preprint2014arXiv

Collaborative Discriminant Locality Preserving Projections With its Application to Face Recognition

We present a novel Discriminant Locality Preserving Projections (DLPP) algorithm named Collaborative Discriminant Locality Preserving Projection (CDLPP). In our algorithm, the discriminating power of DLPP is further exploited from two aspects. On the one hand, the global optimum of class scattering is guaranteed by using the between-class scatter matrix to replace the original denominator of DLPP. On the other hand, motivated by collaborative representation, an $L_2$-norm constraint is imposed on the projections to discover the collaborations of dimensions in the sample space. We apply our algorithm to face recognition. Three popular face databases, namely AR, ORL and LFW-A, are employed for evaluating the performance of CDLPP. Extensive experimental results demonstrate that CDLPP significantly improves the discriminating power of DLPP and outperforms state-of-the-art methods.

preprint2014arXiv

Homotopy categories, Leavitt path algebras and Gorenstein projective modules

For a finite quiver without sources or sinks, we prove that the homotopy category of acyclic complexes of injective modules over the corresponding finite dimensional algebra with radical square zero is triangle equivalent to the derived category of the Leavitt path algebra viewed as a differential graded algebra with trivial differential, which is further triangle equivalent to the stable category of Gorenstein projective modules over the trivial extension algebra of a von Neumann regular algebra by an invertible bimodule. A related, but different, result for the homotopy category of acyclic complexes of projective modules is given. Restricting these equivalences to compact objects, we obtain various descriptions of the singularity category of a finite dimensional algebra with radical square zero, which contain previous results.

preprint2014arXiv

Intermediate co-t-structures, two-term silting objects, tau-tilting modules, and torsion classes

If $(A,B)$ and $(A',B')$ are co-t-structures of a triangulated category, then $(A',B')$ is called intermediate if $A \subseteq A' \subseteq \Sigma A$. Our main results show that intermediate co-t-structures are in bijection with two-term silting subcategories, and also with support tau-tilting subcategories under some assumptions. We also show that support tau-tilting subcategories are in bijection with certain finitely generated torsion classes. These generalise results by Adachi, Iyama, and Reiten.

preprint2014arXiv

Quantum Channel Capacities with Passive Environment Assistance

We initiate the study of passive environment-assisted communication via a quantum channel, modeled as a unitary interaction between the information carrying system and an environment. In this model, the environment is controlled by a benevolent helper who can set its initial state such as to assist sender and receiver of the communication link. (The case of a malicious environment, also known as jammer, or arbitrarily varying channel, is essentially well-understood and comprehensively reviewed.) Here, after setting out precise definitions, focussing on the problem of quantum communication, we show that entanglement plays a crucial role in this problem: indeed, the assisted capacity where the helper is restricted to product states between channel uses is different from the one with unrestricted helper. Furthermore, prior shared entanglement between the helper and the receiver makes a difference, too.

preprint2014arXiv

Stratifications of algebras with two simple modules

Let $A$ be a finite-dimensional algebra with two simple modules. It is shown that if the derived category of $A$ admits a stratification with simple factors being the base field $k$, then $A$ is derived equivalent to a quasi-hereditary algebra. As a consequence, if further $k$ is algebraically closed and $A$ has finite global dimension, then $A$ is either derived simple or derived equivalent to a quasi-hereditary algebra.

preprint2014arXiv

Strong converse for the classical capacity of entanglement-breaking and Hadamard channels via a sandwiched Renyi relative entropy

A strong converse theorem for the classical capacity of a quantum channel states that the probability of correctly decoding a classical message converges exponentially fast to zero in the limit of many channel uses if the rate of communication exceeds the classical capacity of the channel. Along with a corresponding achievability statement for rates below the capacity, such a strong converse theorem enhances our understanding of the capacity as a very sharp dividing line between achievable and unachievable rates of communication. Here, we show that such a strong converse theorem holds for the classical capacity of all entanglement-breaking channels and all Hadamard channels (the complementary channels of the former). These results follow by bounding the success probability in terms of a "sandwiched" Renyi relative entropy, by showing that this quantity is subadditive for all entanglement-breaking and Hadamard channels, and by relating this quantity to the Holevo capacity. Prior results regarding strong converse theorems for particular covariant channels emerge as a special case of our results.
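
For reference, the "sandwiched" Rényi relative entropy mentioned in this abstract is usually defined as below. The notation ($\widetilde{D}_\alpha$, $\rho$, $\sigma$, $\alpha$) is the standard one from the literature and is an assumption here, as the abstract itself gives no formula.

```latex
% Sandwiched Renyi relative entropy of order \alpha \in (0,1) \cup (1,\infty),
% for a density operator \rho and a positive semidefinite operator \sigma
% (for \alpha > 1 it is finite only when supp(\rho) \subseteq supp(\sigma))
\widetilde{D}_\alpha(\rho \,\|\, \sigma)
  = \frac{1}{\alpha - 1}
    \log \operatorname{Tr}\!\left[
      \left( \sigma^{\frac{1-\alpha}{2\alpha}} \, \rho \, \sigma^{\frac{1-\alpha}{2\alpha}} \right)^{\alpha}
    \right]
```

When $\rho$ and $\sigma$ commute, this reduces to the classical Rényi divergence, and in the limit $\alpha \to 1$ it recovers the quantum relative entropy.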

preprint2013arXiv

Entangled inputs cannot make imperfect quantum channels perfect

Entangled inputs can enhance the capacity of quantum channels, this being one of the consequences of the celebrated result showing the non-additivity of several quantities relevant for quantum information science. In this work, we answer the converse question (whether entangled inputs can ever give noisy quantum channels maximum capacity) in the negative: no entangled input, however sophisticated, of any quantum channel can ever enhance the capacity to the maximum possible value, a result that holds true for all channels, both for the classical and for the quantum capacity. This result can hence be seen as a bound on how "non-additive quantum information can be". As a main result, we find the first practical and remarkably simple computable single-shot bounds on capacities, related to entanglement measures. As examples, we discuss the qubit amplitude damping channel and identify the first meaningful bound for its classical capacity.

preprint2013arXiv

Glueing silting objects

Recent results by Keller and Nicolás and by Koenig and Yang have shown bijective correspondences between suitable classes of t-structures and co-t-structures with certain objects of the derived category: silting objects. On the other hand, the techniques of glueing (co-)t-structures along a recollement play an important role in the understanding of derived module categories. Using the above correspondence with silting objects, we present explicit constructions of glueing of silting objects and, furthermore, we answer the question of when the glued silting object is tilting.

preprint2013arXiv

Silting objects, simple-minded collections, $t$-structures and co-$t$-structures for finite-dimensional algebras

Bijective correspondences are established between (1) silting objects, (2) simple-minded collections, (3) bounded $t$-structures with length heart and (4) bounded co-$t$-structures. These correspondences are shown to commute with mutations. The results are valid for finite-dimensional algebras. A concrete example is given to illustrate how these correspondences help to compute the space of Bridgeland's stability conditions.

preprint2011arXiv

Blocks of group algebras are derived simple

A derived version of Maschke's theorem for finite groups is proved: the derived categories, bounded or unbounded, of all blocks of the group algebra of a finite group are simple, in the sense that they admit no nontrivial recollements. This result is independent of the characteristic of the base field.

preprint2011arXiv

Endomorphism algebras of maximal rigid objects in cluster tubes

Given a maximal rigid object $T$ of the cluster tube, we determine the objects finitely presented by $T$. We then use the method of Keller and Reiten to show that the endomorphism algebra of $T$ is Gorenstein and of finite representation type, as first shown by Vatne. This algebra turns out to be the Jacobian algebra of a certain quiver with potential, when the characteristic of the base field is not 3. We study how this quiver with potential changes when $T$ is mutated. We also provide a derived equivalence classification for the endomorphism algebras of maximal rigid objects.

preprint2011arXiv

Sparseness of t-structures and negative Calabi-Yau dimension in triangulated categories generated by a spherical object

Let $k$ be an algebraically closed field and let $T$ be the $k$-linear algebraic triangulated category generated by a $w$-spherical object for an integer $w$. For certain values of $w$ this category is classical. For instance, if $w = 0$ then it is the compact derived category of the dual numbers over $k$. As main results of the paper we show that for $w \leq 0$, the category $T$ has no non-trivial t-structures, but does have one family of non-trivial co-t-structures, whereas for $w \geq 1$ the opposite statement holds. Moreover, without any claim to originality, we observe that for $w \leq -1$, the category $T$ is a candidate to have negative Calabi-Yau dimension since $\Sigma^w$ is the unique power of the suspension functor which is a Serre functor.

preprint2011arXiv

The Ringel--Hall Lie algebra of a spherical object

For an integer $w$, let $\mathcal{S}_w$ be the algebraic triangulated category generated by a $w$-spherical object. We determine the Picard group of $\mathcal{S}_w$ and show that each orbit category of $\mathcal{S}_w$ is triangulated and is triangle equivalent to a certain orbit category of the bounded derived category of a standard tube. When $n=2$, the orbit category $\mathcal{S}_w/\Sigma^2$ is 2-periodic triangulated, and we characterize the associated Ringel--Hall Lie algebra in the sense of Peng and Xiao.

preprint2010arXiv

On tilting complexes providing derived equivalences that send simple-minded objects to simple objects

Given a set of 'simple-minded' objects in a derived category, Rickard constructed a complex, which over a symmetric algebra provides a derived equivalence sending the 'simple-minded' objects to simple ones. We characterise, in terms of $t$-structures, when this complex is a tilting complex, show that there is an associated natural $t$-structure, and provide an alternative construction of this complex in terms of A-infinity-structures. Our approach is similar to that of Keller--Nicolás.

preprint2010arXiv

Recollements from generalized tilting

Let $\mathcal{A}$ be a small dg category over a field $k$ and let $\mathcal{U}$ be a small full subcategory of the derived category $\mathcal{D}\mathcal{A}$ which generates all free dg $\mathcal{A}$-modules. Let $(\mathcal{B},X)$ be a standard lift of $\mathcal{U}$. We show that there is a recollement such that its middle term is $\mathcal{D}\mathcal{B}$, its right term is $\mathcal{D}\mathcal{A}$, and the three functors on its right side are constructed from $X$. This applies to the pair $(A,T)$, where $A$ is a $k$-algebra and $T$ is a good $n$-tilting module, and we obtain a result of Bazzoni--Mantese--Tonolo. This also applies to the pair $(\mathcal{A},\mathcal{U})$, where $\mathcal{A}$ is an augmented dg category and $\mathcal{U}$ is the category of 'simple' modules, e.g. $\mathcal{A}$ is a finite-dimensional algebra or the Kontsevich--Soibelman $A_\infty$-category associated to a quiver with potential.

preprint2009arXiv

Entanglement combing

We show that all multi-partite pure states can, under local operations, be transformed into bi-partite pairwise entangled states in a "lossless fashion": An arbitrary distinguished party will keep pairwise entanglement with all other parties after the asymptotic protocol - decorrelating all other parties from each other - in a way that the degree of entanglement of this party with respect to the rest will remain entirely unchanged. The set of possible entanglement distributions of bi-partite pairs is also classified. Finally, we point out several applications of this protocol as a useful primitive in quantum information theory.

preprint2009arXiv

Squashed entanglement for multipartite states and entanglement measures based on the mixed convex roof

New measures of multipartite entanglement are constructed based on two definitions of multipartite information and different methods of optimizing over extensions of the states. One is a generalization of the squashed entanglement where one takes the mutual information of parties conditioned on the state's extension and takes the infimum over such extensions. Additivity of the multipartite squashed entanglement is proved for both versions of the multipartite information, which turn out to be related. The second one is based on taking classical extensions. This scheme is generalized, which enables the construction of measures of entanglement based on the mixed convex roof of a quantity, which in contrast to the standard convex roof method involves optimization over all decompositions of a density matrix rather than just the decompositions into pure states. As one of the possible applications of these results, we prove that any multipartite monotone is an upper bound on the amount of multipartite distillable key. The findings are finally related to analogous results in classical key agreement.

preprint2008arXiv

The Hall algebra of a spherical object

We determine the Hall algebra, in the sense of Toen, of the algebraic triangulated category generated by a spherical object.

preprint2001arXiv

Optimally Conclusive Discrimination of Non-orthogonal Entangled States Locally

We consider one copy of a quantum system prepared with equal prior probability in one of two non-orthogonal entangled states of a multipartite system distributed among separated parties. We demonstrate that these two states can be optimally distinguished, in the sense of conclusive discrimination, by local operations and classical communication (LOCC) alone. This rigorously proves the conjecture that Virmani et al. [8] confirmed numerically and analytically. In general, the optimal protocol requires local POVM operations, which are explicitly constructed. The result shows that the distinguishing information is obtained only, and completely, at the last operation, and that all prior operations give no information about the state.
